From hzhang at mcs.anl.gov Wed Nov 1 09:33:33 2023 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Wed, 1 Nov 2023 14:33:33 +0000 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: References: Message-ID: Victoria, "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed Ordering based on METIS" Try parmetis. Hong ________________________________ From: petsc-users on behalf of Victoria Rolandi Sent: Tuesday, October 31, 2023 10:30 PM To: petsc-users at mcs.anl.gov Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS Hi, I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). I have configured PETSc with the following options: --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis and the installation didn't give any problems. Could you help me understand why metis is not working? Thank you in advance, Victoria Error: ****** ANALYSIS STEP ******** ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed Processing a graph of size: 699150 with 69238690 edges Ordering based on METIS 510522 37081376 [100] [10486 699150] Error! Unknown CType: -1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Wed Nov 1 11:17:59 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Wed, 1 Nov 2023 17:17:59 +0100 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: References: Message-ID: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> > On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users wrote: > > Victoria, > "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed > Ordering based on METIS" This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. (I?m not saying switching to ParMETIS will not make the issue go away) Thanks, Pierre $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 executing #MPI = 2, without OMP ================================================= MUMPS compiled with option -Dmetis MUMPS compiled with option -Dparmetis MUMPS compiled with option -Dpord MUMPS compiled with option -Dptscotch MUMPS compiled with option -Dscotch ================================================= L U Solver for unsymmetric matrices Type of parallelism: Working host ****** ANALYSIS STEP ******** ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed Processing a graph of size: 56 with 194 edges Ordering based on AMF WARNING: Largest root node of size 26 not selected for parallel execution Leaving analysis phase with ... INFOG(1) = 0 INFOG(2) = 0 [?] > Try parmetis. > Hong > From: petsc-users on behalf of Victoria Rolandi > Sent: Tuesday, October 31, 2023 10:30 PM > To: petsc-users at mcs.anl.gov > Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS > > Hi, > > I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. 
I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). > > I have configured PETSc with the following options: > > --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis > > and the installation didn't give any problems. > > Could you help me understand why metis is not working? > > Thank you in advance, > Victoria > > Error: > > ****** ANALYSIS STEP ******** > ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed > Processing a graph of size: 699150 with 69238690 edges > Ordering based on METIS > 510522 37081376 [100] [10486 699150] > Error! Unknown CType: -1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 1 11:52:35 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 1 Nov 2023 12:52:35 -0400 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> Message-ID: <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> Pierre, Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation so as to not trigger the confusing warning message from MUMPS? Barry > On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: > > > >> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users wrote: >> >> Victoria, >> "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >> Ordering based on METIS" > > This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. > (I?m not saying switching to ParMETIS will not make the issue go away) > > Thanks, > Pierre > > $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 > Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 > executing #MPI = 2, without OMP > > ================================================= > MUMPS compiled with option -Dmetis > MUMPS compiled with option -Dparmetis > MUMPS compiled with option -Dpord > MUMPS compiled with option -Dptscotch > MUMPS compiled with option -Dscotch > ================================================= > L U Solver for unsymmetric matrices > Type of parallelism: Working host > > ****** ANALYSIS STEP ******** > > ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed > Processing a graph of size: 56 with 194 edges > Ordering based on AMF > WARNING: Largest root node of size 26 not selected for parallel execution > > Leaving analysis phase with ... > INFOG(1) = 0 > INFOG(2) = 0 > [?] > >> Try parmetis. >> Hong >> From: petsc-users on behalf of Victoria Rolandi >> Sent: Tuesday, October 31, 2023 10:30 PM >> To: petsc-users at mcs.anl.gov >> Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS >> >> Hi, >> >> I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. 
Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). >> >> I have configured PETSc with the following options: >> >> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >> >> and the installation didn't give any problems. >> >> Could you help me understand why metis is not working? >> >> Thank you in advance, >> Victoria >> >> Error: >> >> ****** ANALYSIS STEP ******** >> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >> Processing a graph of size: 699150 with 69238690 edges >> Ordering based on METIS >> 510522 37081376 [100] [10486 699150] >> Error! Unknown CType: -1 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Wed Nov 1 12:33:27 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Wed, 1 Nov 2023 18:33:27 +0100 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> Message-ID: Victoria, please keep the list in copy. > I am not understanding how can I switch to ParMetis if it does not appear in the options of -mat_mumps_icntl_7.In the options I only have Metis and not ParMetis. You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 Barry, I don?t think we can programmatically shut off this warning, it?s guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are only settable/gettable by people with access to consortium releases. I?ll ask the MUMPS people for confirmation. Note that this warning is only printed to screen with the option -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. Thanks, Pierre > On 1 Nov 2023, at 5:52?PM, Barry Smith wrote: > > > Pierre, > > Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation so as to not trigger the confusing warning message from MUMPS? > > Barry > >> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: >> >> >> >>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users wrote: >>> >>> Victoria, >>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>> Ordering based on METIS" >> >> This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. 
>> (I?m not saying switching to ParMETIS will not make the issue go away) >> >> Thanks, >> Pierre >> >> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 >> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >> executing #MPI = 2, without OMP >> >> ================================================= >> MUMPS compiled with option -Dmetis >> MUMPS compiled with option -Dparmetis >> MUMPS compiled with option -Dpord >> MUMPS compiled with option -Dptscotch >> MUMPS compiled with option -Dscotch >> ================================================= >> L U Solver for unsymmetric matrices >> Type of parallelism: Working host >> >> ****** ANALYSIS STEP ******** >> >> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >> Processing a graph of size: 56 with 194 edges >> Ordering based on AMF >> WARNING: Largest root node of size 26 not selected for parallel execution >> >> Leaving analysis phase with ... >> INFOG(1) = 0 >> INFOG(2) = 0 >> [?] >> >>> Try parmetis. >>> Hong >>> From: petsc-users on behalf of Victoria Rolandi >>> Sent: Tuesday, October 31, 2023 10:30 PM >>> To: petsc-users at mcs.anl.gov >>> Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS >>> >>> Hi, >>> >>> I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). >>> >>> I have configured PETSc with the following options: >>> >>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>> >>> and the installation didn't give any problems. >>> >>> Could you help me understand why metis is not working? >>> >>> Thank you in advance, >>> Victoria >>> >>> Error: >>> >>> ****** ANALYSIS STEP ******** >>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>> Processing a graph of size: 699150 with 69238690 edges >>> Ordering based on METIS >>> 510522 37081376 [100] [10486 699150] >>> Error! Unknown CType: -1 >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 1 14:02:21 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 1 Nov 2023 15:02:21 -0400 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> Message-ID: <2DFC7870-75E7-4664-9F06-F3E78764AEC1@petsc.dev> Pierre, Sorry, I was not clear. What I meant was that the PETSc code that calls MUMPS could change the value of ICNTL(6) under certain conditions before calling MUMPS, thus the MUMPS warning might not be triggered. I am basing this on a guess from looking at the MUMPS manual and the warning message that the particular value of ICNTL(6) is incompatible with the given matrix state. But I could easily be wrong. Barry > On Nov 1, 2023, at 1:33?PM, Pierre Jolivet wrote: > > Victoria, please keep the list in copy. > >> I am not understanding how can I switch to ParMetis if it does not appear in the options of -mat_mumps_icntl_7.In the options I only have Metis and not ParMetis. 
> > > You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 > > Barry, I don?t think we can programmatically shut off this warning, it?s guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are only settable/gettable by people with access to consortium releases. > I?ll ask the MUMPS people for confirmation. > Note that this warning is only printed to screen with the option -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. > > Thanks, > Pierre > >> On 1 Nov 2023, at 5:52?PM, Barry Smith wrote: >> >> >> Pierre, >> >> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation so as to not trigger the confusing warning message from MUMPS? >> >> Barry >> >>> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: >>> >>> >>> >>>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users wrote: >>>> >>>> Victoria, >>>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>> Ordering based on METIS" >>> >>> This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. >>> (I?m not saying switching to ParMETIS will not make the issue go away) >>> >>> Thanks, >>> Pierre >>> >>> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 >>> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >>> executing #MPI = 2, without OMP >>> >>> ================================================= >>> MUMPS compiled with option -Dmetis >>> MUMPS compiled with option -Dparmetis >>> MUMPS compiled with option -Dpord >>> MUMPS compiled with option -Dptscotch >>> MUMPS compiled with option -Dscotch >>> ================================================= >>> L U Solver for unsymmetric matrices >>> Type of parallelism: Working host >>> >>> ****** ANALYSIS STEP ******** >>> >>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>> Processing a graph of size: 56 with 194 edges >>> Ordering based on AMF >>> WARNING: Largest root node of size 26 not selected for parallel execution >>> >>> Leaving analysis phase with ... >>> INFOG(1) = 0 >>> INFOG(2) = 0 >>> [?] >>> >>>> Try parmetis. >>>> Hong >>>> From: petsc-users on behalf of Victoria Rolandi >>>> Sent: Tuesday, October 31, 2023 10:30 PM >>>> To: petsc-users at mcs.anl.gov >>>> Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS >>>> >>>> Hi, >>>> >>>> I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). >>>> >>>> I have configured PETSc with the following options: >>>> >>>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>>> >>>> and the installation didn't give any problems. >>>> >>>> Could you help me understand why metis is not working? >>>> >>>> Thank you in advance, >>>> Victoria >>>> >>>> Error: >>>> >>>> ****** ANALYSIS STEP ******** >>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>> Processing a graph of size: 699150 with 69238690 edges >>>> Ordering based on METIS >>>> 510522 37081376 [100] [10486 699150] >>>> Error! 
Unknown CType: -1 >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremy.theler-ext at ansys.com Wed Nov 1 14:49:43 2023 From: jeremy.theler-ext at ansys.com (Jeremy Theler (External)) Date: Wed, 1 Nov 2023 19:49:43 +0000 Subject: [petsc-users] Non-manifold dmplex Message-ID: Hello all but especially Matt, I am trying to build this "non-manifold" DMplex that has already been brought up in these two threads: https://lists.mcs.anl.gov/pipermail/petsc-users/2021-December/045091.html https://lists.mcs.anl.gov/pipermail/petsc-users/2021-October/044743.html In this new thread I'm opening as in the two above, the background is solving elasticity with FEM using a mixture of solid, shells/plates and/or beam/truss elements. My first attempt was to create a cube and a line connecting one of the cube's corners to a point in Gmsh (cube-line.geo and cube-line-geo.png) and then mesh this geometry so as to have a single hex8 and a single line2, both defined using 9 nodes. (cube-line.msh and cube-line-mesh.png). But DMPlexCreateGmshFromFile() did not pick up the line2 element because there's this logic that "cell" means "element with topological dimension equal to that of the mesh", i.e. 3. So, I then tried to create the DM from a DAG with the same information that could have been parsed from the msh file: PetscInt num_points[4] = {9, 1, 0, 1}; PetscInt cone_size[11] = {8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2}; PetscInt cones[11] = {3,4,2,1,7,5,6,8, 6,9}; PetscInt cone_orientations[11] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; PetscReal vertex_coords[3*9] = { 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 2, 0, 0 }; DM dm; PetscCall(DMCreate(PETSC_COMM_WORLD, &dm)); PetscCall(PetscObjectSetName((PetscObject)dm, "cubeline-fromdag")); PetscCall(DMSetType(dm, DMPLEX)); PetscCall(DMSetDimension(dm, 3)); PetscCall(DMPlexCreateFromDAG(dm, 3, num_points, cone_size, cones, cone_orientations, vertex_coords)); The DM is now created successfully. With some slight modifications in DMPlexVTKGetCellType_Internal() I can even get a VTK out of the DM which shows the geometry I intended to have (cube-line-dag.vtk and cube-line-vtk.png). However, when I want to interpolate the DM I get an error which is not the same as the one given in the existing threads: [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Cone position 1 of point 10 is not in the valid range [0, 1) [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.20.1-96-g294d477ba62 GIT Date: 2023-10-31 00:56:56 -0500 [0]PETSC ERROR: ./main on a arch-linux-c-debug named tom by gtheler Wed Nov 1 16:17:15 2023 [0]PETSC ERROR: Configure options [0]PETSC ERROR: #1 DMPlexInsertCone() at /home/gtheler/libs/petsc/src/dm/impls/plex/plex.c:3384 [0]PETSC ERROR: #2 DMPlexInterpolateFaces_Internal() at /home/gtheler/libs/petsc/src/dm/impls/plex/plexinterpolate.c:691 [0]PETSC ERROR: #3 DMPlexInterpolate() at /home/gtheler/libs/petsc/src/dm/impls/plex/plexinterpolate.c:1499 I tried to understand a little bit what was going on and the first thing I noticed is that the depth returned by DMPlexGetDepth() is 1 even when I created the DM with an explicit depth of 3. 
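For what it is worth, the check is just this (a minimal sketch of what I call right after DMPlexCreateFromDAG()):

  PetscInt depth;
  PetscCall(DMPlexGetDepth(dm, &depth));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "DMPlexGetDepth() = %" PetscInt_FMT "\n", depth));

and it prints 1, not the 3 I passed as the depth argument of DMPlexCreateFromDAG().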
Thus, I then hardcoded the depth to 3 in DMPlexInterpolateFaces_Internal() and it went a little bit further but failed during stratification: [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: New depth 3 range [11,19) overlaps with depth 1 range [11,19) [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.20.1-96-g294d477ba62 GIT Date: 2023-10-31 00:56:56 -0500 [0]PETSC ERROR: ./main on a arch-linux-c-debug named tom by gtheler Wed Nov 1 16:19:40 2023 [0]PETSC ERROR: Configure options [0]PETSC ERROR: #1 DMPlexCreateDepthStratum() at /home/gtheler/libs/petsc/src/dm/impls/plex/plex.c:4232 [0]PETSC ERROR: #2 DMPlexStratify() at /home/gtheler/libs/petsc/src/dm/impls/plex/plex.c:4356 [0]PETSC ERROR: #3 DMPlexInterpolateFaces_Internal() at /home/gtheler/libs/petsc/src/dm/impls/plex/plexinterpolate.c:711 [0]PETSC ERROR: #4 DMPlexInterpolate() at /home/gtheler/libs/petsc/src/dm/impls/plex/plexinterpolate.c:1499 Find also attached cube-line.c I followed the old thread on the archives. Here, Matt suggests that "if you assign cell types, you can even get Plex to automatically interpolate.": https://lists.mcs.anl.gov/pipermail/petsc-users/2021-December/045100.html But here he says it won't work because "We use depth in the DAG as a proxy for cell dimension, but this will no longer work if faces are not part of a volume." https://lists.mcs.anl.gov/pipermail/petsc-users/2021-October/044785.html Questions: 1. Is it ok to read back a depth of 1 instead of 3 or is this something that could be fixed? 2. Is there a way to make the interpolation algorithm able to handle this kind of DAGs? Thanks -- jeremy -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-line.geo Type: application/octet-stream Size: 345 bytes Desc: cube-line.geo URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-line-geo.png Type: image/png Size: 74330 bytes Desc: cube-line-geo.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-line.msh Type: application/octet-stream Size: 2354 bytes Desc: cube-line.msh URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-line-mesh.png Type: image/png Size: 15330 bytes Desc: cube-line-mesh.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-line-dag.vtk Type: application/octet-stream Size: 563 bytes Desc: cube-line-dag.vtk URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-line-vtk.png Type: image/png Size: 155722 bytes Desc: cube-line-vtk.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube-line.c Type: text/x-csrc Size: 1653 bytes Desc: cube-line.c URL: From qtang at lanl.gov Wed Nov 1 14:55:53 2023 From: qtang at lanl.gov (Tang, Qi) Date: Wed, 1 Nov 2023 19:55:53 +0000 Subject: [petsc-users] Fieldsplit on MATNEST Message-ID: Hi, I have a block matrix with type of MATNEST, but can I call -pc_fieldsplit_detect_saddle_point to change its IS? I assume it is not possible. Also, I notice there is a small but important typo in the new fieldsplit doc: https://petsc.org/release/manualpages/PC/PCFIELDSPLIT/ The first matrix of the full factorization misses A10. 
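Just to spell out the identity I am comparing against (my reading of the standard block-LDU factorization, with ksp(A00) denoting the approximate action of the inverse of A00 and S = A11 - A10 ksp(A00) A01), the full factorization of the inverse would be

[ I  -ksp(A00) A01 ] [ ksp(A00)     0    ] [       I          0 ]
[ 0        I       ] [     0      ksp(S) ] [ -A10 ksp(A00)    I ]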
I believe it should be [I -ksp(A00) A01] [ I ] Best, Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 1 20:32:43 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 1 Nov 2023 21:32:43 -0400 Subject: [petsc-users] Fieldsplit on MATNEST In-Reply-To: References: Message-ID: On Wed, Nov 1, 2023 at 3:59?PM Tang, Qi via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I have a block matrix with type of MATNEST, but can I call > -pc_fieldsplit_detect_saddle_point to change its IS? I assume it is not > possible. > The detection part will work since MatGetDiagonal() is supported, but we do not have code in there to take arbitrary blocks from MATNEST since the whole point is to get no-copy access. Detection is not needed in the case that the blocks line up. > Also, I notice there is a small but important typo in the new fieldsplit > doc: > https://petsc.org/release/manualpages/PC/PCFIELDSPLIT/ > The first matrix of the full factorization misses A10. I believe it should > be > [I -ksp(A00) A01] > [ I ] > Yes, that is right. Thanks, Matt > Best, > Qi > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Nov 2 01:08:37 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 2 Nov 2023 07:08:37 +0100 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: <2DFC7870-75E7-4664-9F06-F3E78764AEC1@petsc.dev> References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> <2DFC7870-75E7-4664-9F06-F3E78764AEC1@petsc.dev> Message-ID: > On 1 Nov 2023, at 8:02?PM, Barry Smith wrote: > > > Pierre, > > Sorry, I was not clear. What I meant was that the PETSc code that calls MUMPS could change the value of ICNTL(6) under certain conditions before calling MUMPS, thus the MUMPS warning might not be triggered. Again, I?m not sure it is possible, as the message is not guarded by the value of ICNTL(6), but by some other internal parameters. Thanks, Pierre $ for i in {1..7} do echo "ICNTL(6) = ${i}" ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 -mat_mumps_icntl_6 ${i} | grep -i "not allowed" done ICNTL(6) = 1 ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed ICNTL(6) = 2 ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed ICNTL(6) = 3 ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed ICNTL(6) = 4 ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed ICNTL(6) = 5 ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed ICNTL(6) = 6 ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed ICNTL(6) = 7 ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed > I am basing this on a guess from looking at the MUMPS manual and the warning message that the particular value of ICNTL(6) is incompatible with the given matrix state. But I could easily be wrong. > > Barry > > >> On Nov 1, 2023, at 1:33?PM, Pierre Jolivet wrote: >> >> Victoria, please keep the list in copy. 
>> >>> I am not understanding how can I switch to ParMetis if it does not appear in the options of -mat_mumps_icntl_7.In the options I only have Metis and not ParMetis. >> >> >> You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 >> >> Barry, I don?t think we can programmatically shut off this warning, it?s guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are only settable/gettable by people with access to consortium releases. >> I?ll ask the MUMPS people for confirmation. >> Note that this warning is only printed to screen with the option -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. >> >> Thanks, >> Pierre >> >>> On 1 Nov 2023, at 5:52?PM, Barry Smith wrote: >>> >>> >>> Pierre, >>> >>> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation so as to not trigger the confusing warning message from MUMPS? >>> >>> Barry >>> >>>> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: >>>> >>>> >>>> >>>>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users wrote: >>>>> >>>>> Victoria, >>>>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>> Ordering based on METIS" >>>> >>>> This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. >>>> (I?m not saying switching to ParMETIS will not make the issue go away) >>>> >>>> Thanks, >>>> Pierre >>>> >>>> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 >>>> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >>>> executing #MPI = 2, without OMP >>>> >>>> ================================================= >>>> MUMPS compiled with option -Dmetis >>>> MUMPS compiled with option -Dparmetis >>>> MUMPS compiled with option -Dpord >>>> MUMPS compiled with option -Dptscotch >>>> MUMPS compiled with option -Dscotch >>>> ================================================= >>>> L U Solver for unsymmetric matrices >>>> Type of parallelism: Working host >>>> >>>> ****** ANALYSIS STEP ******** >>>> >>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>> Processing a graph of size: 56 with 194 edges >>>> Ordering based on AMF >>>> WARNING: Largest root node of size 26 not selected for parallel execution >>>> >>>> Leaving analysis phase with ... >>>> INFOG(1) = 0 >>>> INFOG(2) = 0 >>>> [?] >>>> >>>>> Try parmetis. >>>>> Hong >>>>> From: petsc-users on behalf of Victoria Rolandi >>>>> Sent: Tuesday, October 31, 2023 10:30 PM >>>>> To: petsc-users at mcs.anl.gov >>>>> Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS >>>>> >>>>> Hi, >>>>> >>>>> I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). >>>>> >>>>> I have configured PETSc with the following options: >>>>> >>>>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>>>> >>>>> and the installation didn't give any problems. >>>>> >>>>> Could you help me understand why metis is not working? 
>>>>> >>>>> Thank you in advance, >>>>> Victoria >>>>> >>>>> Error: >>>>> >>>>> ****** ANALYSIS STEP ******** >>>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>> Processing a graph of size: 699150 with 69238690 edges >>>>> Ordering based on METIS >>>>> 510522 37081376 [100] [10486 699150] >>>>> Error! Unknown CType: -1 >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maruthinh at gmail.com Thu Nov 2 10:20:51 2023 From: maruthinh at gmail.com (Maruthi NH) Date: Thu, 2 Nov 2023 20:50:51 +0530 Subject: [petsc-users] error while compiling PETSc on windows using cygwin Message-ID: Hi all, I get the following error while trying to compile PETSc version 3.20.1 on Windows \petsc\include\petsc/private/cpp/unordered_map.hpp(309): error C2938: 'std::enable_if_t' : Failed to specialize alias template This is the configuration file I used to compile PETSc #!/usr/bin/python import os petsc_hash_pkgs=os.path.join(os.getenv('HOME'),'petsc-hash-pkgs') oadirf='"/cygdrive/c/Program Files (x86)/Intel/oneAPI"' oadir=os.popen('cygpath -u '+os.popen('cygpath -ms '+oadirf).read()).read().strip() oamkldir=oadir+'/mkl/2022.1.0/lib/intel64' oampidir=oadir+'/mpi/2021.6.0' if __name__ == '__main__': import sys import os sys.path.insert(0, os.path.abspath('config')) import configure configure_options = [ '--package-prefix-hash='+petsc_hash_pkgs, '--with-debugging=0', '--with-shared-libraries=0', '--with-blaslapack-lib=-L'+oamkldir+' mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib', '--with-cc=win32fe cl', '--with-cxx=win32fe cl', '--with-fc=win32fe ifort', 'FOPTFLGS=-O3 -fp-model=precise', '--with-mpi-include='+oampidir+'/include', '--with-mpi-lib='+oampidir+'/lib/release/impi.lib', '-with-mpiexec='+oampidir+'/bin/mpiexec -localonly', ] configure.petsc_configure(configure_options) Regards, Maruthi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Thu Nov 2 11:03:19 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 2 Nov 2023 12:03:19 -0400 Subject: [petsc-users] error while compiling PETSc on windows using cygwin In-Reply-To: References: Message-ID: <5A46A349-3201-4A5A-AE62-282D1E98F34F@petsc.dev> It could be you would benefit from having the latest Microsoft compilers If you do not need C++ you could use --with-cxx=0 Otherwise please send configure.log to petsc-maint at mcs.anl.gov > On Nov 2, 2023, at 11:20?AM, Maruthi NH wrote: > > Hi all, > > I get the following error while trying to compile PETSc version 3.20.1 on Windows > > \petsc\include\petsc/private/cpp/unordered_map.hpp(309): error C2938: 'std::enable_if_t' : Failed to specialize alias template > > This is the configuration file I used to compile PETSc > > #!/usr/bin/python > > import os > petsc_hash_pkgs=os.path.join(os.getenv('HOME'),'petsc-hash-pkgs') > > oadirf='"/cygdrive/c/Program Files (x86)/Intel/oneAPI"' > oadir=os.popen('cygpath -u '+os.popen('cygpath -ms '+oadirf).read()).read().strip() > oamkldir=oadir+'/mkl/2022.1.0/lib/intel64' > oampidir=oadir+'/mpi/2021.6.0' > > if __name__ == '__main__': > import sys > import os > sys.path.insert(0, os.path.abspath('config')) > import configure > configure_options = [ > '--package-prefix-hash='+petsc_hash_pkgs, > '--with-debugging=0', > '--with-shared-libraries=0', > '--with-blaslapack-lib=-L'+oamkldir+' mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib', > '--with-cc=win32fe cl', > '--with-cxx=win32fe cl', > '--with-fc=win32fe ifort', > 'FOPTFLGS=-O3 -fp-model=precise', > '--with-mpi-include='+oampidir+'/include', > '--with-mpi-lib='+oampidir+'/lib/release/impi.lib', > '-with-mpiexec='+oampidir+'/bin/mpiexec -localonly', > ] > configure.petsc_configure(configure_options) > > Regards, > Maruthi From victoria.rolandi93 at gmail.com Thu Nov 2 11:29:01 2023 From: victoria.rolandi93 at gmail.com (Victoria Rolandi) Date: Thu, 2 Nov 2023 09:29:01 -0700 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> Message-ID: Pierre, Yes, sorry, I'll keep the list in copy. Launching with those options (-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2) I get an error during the analysis step. I also launched increasing the memory and I still have the error. *The calculations stops at :* Entering CMUMPS 5.4.1 from C interface with JOB, N = 1 699150 executing #MPI = 2, without OMP ================================================= MUMPS compiled with option -Dmetis MUMPS compiled with option -Dparmetis ================================================= L U Solver for unsymmetric matrices Type of parallelism: Working host ****** ANALYSIS STEP ******** ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed Using ParMETIS for parallel ordering Structural symmetry is: 90% *The error:* [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. 
[0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.17.0, unknown [0]PETSC ERROR: ./charlin.exe on a named n1056 by vrolandi Wed Nov 1 11:38:28 2023 [0]PETSC ERROR: Configure options --prefix=/u/home/v/vrolandi/CODES/LIBRARY/packages/petsc/installationDir --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort CXXOPTFLAGS=-O3 --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis [0]PETSC ERROR: #1 User provided function() at unknown file:0 [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash. Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 Thanks, Victoria Il giorno mer 1 nov 2023 alle ore 10:33 Pierre Jolivet ha scritto: > Victoria, please keep the list in copy. > > I am not understanding how can I switch to ParMetis if it does not appear > in the options of -mat_mumps_icntl_7.In the options I only have Metis and > not ParMetis. > > > You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 > > Barry, I don?t think we can programmatically shut off this warning, it?s > guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are > only settable/gettable by people with access to consortium releases. > I?ll ask the MUMPS people for confirmation. > Note that this warning is only printed to screen with the option > -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. > > Thanks, > Pierre > > On 1 Nov 2023, at 5:52?PM, Barry Smith wrote: > > > Pierre, > > Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation > so as to not trigger the confusing warning message from MUMPS? > > Barry > > On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: > > > > On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Victoria, > "** Maximum transversal (ICNTL(6)) not allowed because matrix is > distributed > Ordering based on METIS" > > > This warning is benign and appears for every run using a sequential > partitioner in MUMPS with a MATMPIAIJ. > (I?m not saying switching to ParMETIS will not make the issue go away) > > Thanks, > Pierre > > $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu > -mat_mumps_icntl_4 2 > Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 > executing #MPI = 2, without OMP > > ================================================= > MUMPS compiled with option -Dmetis > MUMPS compiled with option -Dparmetis > MUMPS compiled with option -Dpord > MUMPS compiled with option -Dptscotch > MUMPS compiled with option -Dscotch > ================================================= > L U Solver for unsymmetric matrices > Type of parallelism: Working host > > ****** ANALYSIS STEP ******** > > ** Maximum transversal (ICNTL(6)) not allowed because matrix is > distributed > Processing a graph of size: 56 with 194 edges > Ordering based on AMF > WARNING: Largest root node of size 26 not selected for parallel > execution > > Leaving analysis phase with ... > INFOG(1) = 0 > INFOG(2) = 0 > [?] > > Try parmetis. 
> Hong > ------------------------------ > *From:* petsc-users on behalf of > Victoria Rolandi > *Sent:* Tuesday, October 31, 2023 10:30 PM > *To:* petsc-users at mcs.anl.gov > *Subject:* [petsc-users] Error using Metis with PETSc installed with MUMPS > > Hi, > > I'm solving a large sparse linear system in parallel and I am using PETSc > with MUMPS. I am trying to test different options, like the ordering of the > matrix. Everything works if I use the *-mat_mumps_icntl_7 2 *or *-mat_mumps_icntl_7 > 0 *options (with the first one, AMF, performing better than AMD), however > when I test METIS *-mat_mumps_icntl_7** 5 *I get an error (reported at > the end of the email). > > I have configured PETSc with the following options: > > --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort > --with-scalar-type=complex --with-debugging=0 --with-precision=single > --download-mumps --download-scalapack --download-parmetis --download-metis > > and the installation didn't give any problems. > > Could you help me understand why metis is not working? > > Thank you in advance, > Victoria > > Error: > > ****** ANALYSIS STEP ******** > ** Maximum transversal (ICNTL(6)) not allowed because matrix is > distributed > Processing a graph of size: 699150 with 69238690 edges > Ordering based on METIS > 510522 37081376 [100] [10486 699150] > Error! Unknown CType: -1 > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Nov 2 11:35:00 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 2 Nov 2023 17:35:00 +0100 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> Message-ID: <57FF63B6-B33D-405D-BF3C-CB87EF051A33@joliv.et> > On 2 Nov 2023, at 5:29?PM, Victoria Rolandi wrote: > > Pierre, > Yes, sorry, I'll keep the list in copy. > Launching with those options (-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2) I get an error during the analysis step. I also launched increasing the memory and I still have the error. Oh, OK, that?s bad. Would you be willing to give SCOTCH and/or PT-SCOTCH a try? You?d need to reconfigure/recompile with --download-ptscotch (and maybe --download-bison depending on your system). Then, the option would become either -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 (PT-SCOTCH) or -mat_mumps_icntl_7 3 (SCOTCH). It may be worth updating PETSc as well (you are using 3.17.0, we are at 3.20.1), though I?m not sure we updated the METIS/ParMETIS snapshots since then, so it may not fix the present issue. 
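Concretely, adapting your current configure line, something along these lines should do (untested on my side; keep your usual --prefix and optimization flags, and you can leave --download-metis --download-parmetis in there if you still want to compare):

./configure --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-ptscotch --download-bison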
Thanks, Pierre > The calculations stops at : > > Entering CMUMPS 5.4.1 from C interface with JOB, N = 1 699150 > executing #MPI = 2, without OMP > > ================================================= > MUMPS compiled with option -Dmetis > MUMPS compiled with option -Dparmetis > ================================================= > L U Solver for unsymmetric matrices > Type of parallelism: Working host > > ****** ANALYSIS STEP ******** > > ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed > Using ParMETIS for parallel ordering > Structural symmetry is: 90% > > > The error: > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.17.0, unknown > [0]PETSC ERROR: ./charlin.exe on a named n1056 by vrolandi Wed Nov 1 11:38:28 2023 > [0]PETSC ERROR: Configure options --prefix=/u/home/v/vrolandi/CODES/LIBRARY/packages/petsc/installationDir --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort CXXOPTFLAGS=-O3 --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis > > [0]PETSC ERROR: #1 User provided function() at unknown file:0 > [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash. > Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > Thanks, > Victoria > > Il giorno mer 1 nov 2023 alle ore 10:33 Pierre Jolivet > ha scritto: >> Victoria, please keep the list in copy. >> >>> I am not understanding how can I switch to ParMetis if it does not appear in the options of -mat_mumps_icntl_7.In the options I only have Metis and not ParMetis. >> >> >> You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 >> >> Barry, I don?t think we can programmatically shut off this warning, it?s guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are only settable/gettable by people with access to consortium releases. >> I?ll ask the MUMPS people for confirmation. >> Note that this warning is only printed to screen with the option -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. >> >> Thanks, >> Pierre >> >>> On 1 Nov 2023, at 5:52?PM, Barry Smith > wrote: >>> >>> >>> Pierre, >>> >>> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation so as to not trigger the confusing warning message from MUMPS? >>> >>> Barry >>> >>>> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet > wrote: >>>> >>>> >>>> >>>>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users > wrote: >>>>> >>>>> Victoria, >>>>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>> Ordering based on METIS" >>>> >>>> This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. 
>>>> (I?m not saying switching to ParMETIS will not make the issue go away) >>>> >>>> Thanks, >>>> Pierre >>>> >>>> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 >>>> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >>>> executing #MPI = 2, without OMP >>>> >>>> ================================================= >>>> MUMPS compiled with option -Dmetis >>>> MUMPS compiled with option -Dparmetis >>>> MUMPS compiled with option -Dpord >>>> MUMPS compiled with option -Dptscotch >>>> MUMPS compiled with option -Dscotch >>>> ================================================= >>>> L U Solver for unsymmetric matrices >>>> Type of parallelism: Working host >>>> >>>> ****** ANALYSIS STEP ******** >>>> >>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>> Processing a graph of size: 56 with 194 edges >>>> Ordering based on AMF >>>> WARNING: Largest root node of size 26 not selected for parallel execution >>>> >>>> Leaving analysis phase with ... >>>> INFOG(1) = 0 >>>> INFOG(2) = 0 >>>> [?] >>>> >>>>> Try parmetis. >>>>> Hong >>>>> From: petsc-users > on behalf of Victoria Rolandi > >>>>> Sent: Tuesday, October 31, 2023 10:30 PM >>>>> To: petsc-users at mcs.anl.gov > >>>>> Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS >>>>> >>>>> Hi, >>>>> >>>>> I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). >>>>> >>>>> I have configured PETSc with the following options: >>>>> >>>>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>>>> >>>>> and the installation didn't give any problems. >>>>> >>>>> Could you help me understand why metis is not working? >>>>> >>>>> Thank you in advance, >>>>> Victoria >>>>> >>>>> Error: >>>>> >>>>> ****** ANALYSIS STEP ******** >>>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>> Processing a graph of size: 699150 with 69238690 edges >>>>> Ordering based on METIS >>>>> 510522 37081376 [100] [10486 699150] >>>>> Error! Unknown CType: -1 >>>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From maruthinh at gmail.com Thu Nov 2 13:56:21 2023 From: maruthinh at gmail.com (Maruthi NH) Date: Fri, 3 Nov 2023 00:26:21 +0530 Subject: [petsc-users] error while compiling PETSc on windows using cygwin In-Reply-To: <5A46A349-3201-4A5A-AE62-282D1E98F34F@petsc.dev> References: <5A46A349-3201-4A5A-AE62-282D1E98F34F@petsc.dev> Message-ID: Hi Barry, Thanks for the suggestion. It worked after updating the compilers. 
Regards, Maruthi On Thu, 2 Nov 2023 at 9:33 PM, Barry Smith wrote: > > It could be you would benefit from having the latest Microsoft compilers > > If you do not need C++ you could use --with-cxx=0 > > Otherwise please send configure.log to petsc-maint at mcs.anl.gov > > > > > On Nov 2, 2023, at 11:20?AM, Maruthi NH wrote: > > > > Hi all, > > > > I get the following error while trying to compile PETSc version 3.20.1 > on Windows > > > > \petsc\include\petsc/private/cpp/unordered_map.hpp(309): error C2938: > 'std::enable_if_t' : Failed to specialize alias template > > > > This is the configuration file I used to compile PETSc > > > > #!/usr/bin/python > > > > import os > > petsc_hash_pkgs=os.path.join(os.getenv('HOME'),'petsc-hash-pkgs') > > > > oadirf='"/cygdrive/c/Program Files (x86)/Intel/oneAPI"' > > oadir=os.popen('cygpath -u '+os.popen('cygpath -ms > '+oadirf).read()).read().strip() > > oamkldir=oadir+'/mkl/2022.1.0/lib/intel64' > > oampidir=oadir+'/mpi/2021.6.0' > > > > if __name__ == '__main__': > > import sys > > import os > > sys.path.insert(0, os.path.abspath('config')) > > import configure > > configure_options = [ > > '--package-prefix-hash='+petsc_hash_pkgs, > > '--with-debugging=0', > > '--with-shared-libraries=0', > > '--with-blaslapack-lib=-L'+oamkldir+' mkl_intel_lp64_dll.lib > mkl_sequential_dll.lib mkl_core_dll.lib', > > '--with-cc=win32fe cl', > > '--with-cxx=win32fe cl', > > '--with-fc=win32fe ifort', > > 'FOPTFLGS=-O3 -fp-model=precise', > > '--with-mpi-include='+oampidir+'/include', > > '--with-mpi-lib='+oampidir+'/lib/release/impi.lib', > > '-with-mpiexec='+oampidir+'/bin/mpiexec -localonly', > > ] > > configure.petsc_configure(configure_options) > > > > Regards, > > Maruthi > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sajidsyed2021 at u.northwestern.edu Thu Nov 2 15:36:59 2023 From: sajidsyed2021 at u.northwestern.edu (Sajid Ali) Date: Thu, 2 Nov 2023 15:36:59 -0500 Subject: [petsc-users] Status of PETScSF failures with GPU-aware MPI on Perlmutter Message-ID: Hi PETSc-developers, I had posted about crashes within PETScSF when using GPU-aware MPI on Perlmutter a while ago ( https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2022-February/045585.html). Now that the software stacks have stabilized, I was wondering if there was a fix for the same as I am still observing similar crashes. I am attaching the trace of the latest crash (with PETSc-3.20.0) for reference. Thank You, Sajid Ali (he/him) | Research Associate Data Science, Simulation, and Learning Division Fermi National Accelerator Laboratory s-sajid-ali.github.io -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 2_gpu_crash Type: application/octet-stream Size: 11301 bytes Desc: not available URL: From junchao.zhang at gmail.com Thu Nov 2 16:01:18 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 2 Nov 2023 16:01:18 -0500 Subject: [petsc-users] Status of PETScSF failures with GPU-aware MPI on Perlmutter In-Reply-To: References: Message-ID: Hi, Sajid, Do you have a test example to reproduce the error? --Junchao Zhang On Thu, Nov 2, 2023 at 3:37?PM Sajid Ali wrote: > Hi PETSc-developers, > > I had posted about crashes within PETScSF when using GPU-aware MPI on > Perlmutter a while ago ( > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2022-February/045585.html). 
> Now that the software stacks have stabilized, I was wondering if there was > a fix for the same as I am still observing similar crashes. > > I am attaching the trace of the latest crash (with PETSc-3.20.0) for > reference. > > Thank You, > Sajid Ali (he/him) | Research Associate > Data Science, Simulation, and Learning Division > Fermi National Accelerator Laboratory > s-sajid-ali.github.io > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Nov 2 16:02:01 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 02 Nov 2023 15:02:01 -0600 Subject: [petsc-users] Status of PETScSF failures with GPU-aware MPI on Perlmutter In-Reply-To: References: Message-ID: <87r0l7swc6.fsf@jedbrown.org> What modules do you have loaded. I don't know if it currently works with cuda-11.7. I assume you're following these instructions carefully. https://docs.nersc.gov/development/programming-models/mpi/cray-mpich/#cuda-aware-mpi In our experience, GPU-aware MPI continues to be brittle on these machines. Maybe you can inquire with NERSC exactly which CUDA versions are tested with GPU-aware MPI. Sajid Ali writes: > Hi PETSc-developers, > > I had posted about crashes within PETScSF when using GPU-aware MPI on > Perlmutter a while ago ( > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2022-February/045585.html). > Now that the software stacks have stabilized, I was wondering if there was > a fix for the same as I am still observing similar crashes. > > I am attaching the trace of the latest crash (with PETSc-3.20.0) for > reference. > > Thank You, > Sajid Ali (he/him) | Research Associate > Data Science, Simulation, and Learning Division > Fermi National Accelerator Laboratory > s-sajid-ali.github.io From mmolinos at us.es Fri Nov 3 04:19:03 2023 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Fri, 3 Nov 2023 09:19:03 +0000 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics Message-ID: Dear all, I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated. Best regards, Miguel From knepley at gmail.com Fri Nov 3 09:42:00 2023 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Nov 2023 10:42:00 -0400 Subject: [petsc-users] Fieldsplit on MATNEST In-Reply-To: References: Message-ID: On Wed, Nov 1, 2023 at 9:32?PM Matthew Knepley wrote: > On Wed, Nov 1, 2023 at 3:59?PM Tang, Qi via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi, >> >> I have a block matrix with type of MATNEST, but can I call >> -pc_fieldsplit_detect_saddle_point to change its IS? I assume it is not >> possible. >> > > The detection part will work since MatGetDiagonal() is supported, but we > do not have code in there > to take arbitrary blocks from MATNEST since the whole point is to get > no-copy access. Detection is > not needed in the case that the blocks line up. > > >> Also, I notice there is a small but important typo in the new fieldsplit >> doc: >> https://petsc.org/release/manualpages/PC/PCFIELDSPLIT/ >> The first matrix of the full factorization misses A10. I believe it >> should be >> [I -ksp(A00) A01] >> [ I ] >> > > Yes, that is right. 
> Fixed: https://gitlab.com/petsc/petsc/-/merge_requests/6993 THanks, Matt > Thanks, > > Matt > > >> Best, >> Qi >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria.rolandi93 at gmail.com Fri Nov 3 13:28:48 2023 From: victoria.rolandi93 at gmail.com (Victoria Rolandi) Date: Fri, 3 Nov 2023 11:28:48 -0700 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: <57FF63B6-B33D-405D-BF3C-CB87EF051A33@joliv.et> References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> <57FF63B6-B33D-405D-BF3C-CB87EF051A33@joliv.et> Message-ID: Pierre, Sure, I have now installed PETSc with MUMPS and PT-SCHOTCH, I got some errors at the beginning but then it worked adding --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" to the configuration. Also, I have compilation errors when I try to use newer versions, so I kept the 3.17.0 for the moment. Now the parallel ordering works with PT-SCOTCH, however, is it normal that I do not see any difference in the performance compared to sequential ordering ? Also, could the error using Metis/Parmetis be due to the fact that my main code (to which I linked PETSc) uses a different ParMetis than the one separately installed by PETSC during the configuration? Hence should I configure PETSc linking ParMetis to the same library used by my main code? Thanks, Victoria Il giorno gio 2 nov 2023 alle ore 09:35 Pierre Jolivet ha scritto: > > On 2 Nov 2023, at 5:29?PM, Victoria Rolandi > wrote: > > Pierre, > Yes, sorry, I'll keep the list in copy. > Launching with those options (-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2) > I get an error during the analysis step. I also launched increasing the > memory and I still have the error. > > > Oh, OK, that?s bad. > Would you be willing to give SCOTCH and/or PT-SCOTCH a try? > You?d need to reconfigure/recompile with --download-ptscotch (and maybe > --download-bison depending on your system). > Then, the option would become either -mat_mumps_icntl_28 2 > -mat_mumps_icntl_29 2 (PT-SCOTCH) or -mat_mumps_icntl_7 3 (SCOTCH). > It may be worth updating PETSc as well (you are using 3.17.0, we are at > 3.20.1), though I?m not sure we updated the METIS/ParMETIS snapshots since > then, so it may not fix the present issue. 
> > Thanks, > Pierre > > *The calculations stops at :* > > Entering CMUMPS 5.4.1 from C interface with JOB, N = 1 699150 > executing #MPI = 2, without OMP > > ================================================= > MUMPS compiled with option -Dmetis > MUMPS compiled with option -Dparmetis > ================================================= > L U Solver for unsymmetric matrices > Type of parallelism: Working host > > ****** ANALYSIS STEP ******** > > ** Maximum transversal (ICNTL(6)) not allowed because matrix is > distributed > Using ParMETIS for parallel ordering > Structural symmetry is: 90% > > > *The error:* > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS > to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.17.0, unknown > [0]PETSC ERROR: ./charlin.exe on a named n1056 by vrolandi Wed Nov 1 > 11:38:28 2023 > [0]PETSC ERROR: Configure options > --prefix=/u/home/v/vrolandi/CODES/LIBRARY/packages/petsc/installationDir > --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort CXXOPTFLAGS=-O3 > --with-scalar-type=complex --with-debugging=0 --with-precision=single > --download-mumps --download-scalapack --download-parmetis --download-metis > > [0]PETSC ERROR: #1 User provided function() at unknown file:0 > [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is > causing the crash. > Abort(59) on node 0 (rank 0 in comm 0): application called > MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > Thanks, > Victoria > > Il giorno mer 1 nov 2023 alle ore 10:33 Pierre Jolivet > ha scritto: > >> Victoria, please keep the list in copy. >> >> I am not understanding how can I switch to ParMetis if it does not appear >> in the options of -mat_mumps_icntl_7.In the options I only have Metis and >> not ParMetis. >> >> >> You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 >> >> Barry, I don?t think we can programmatically shut off this warning, it?s >> guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are >> only settable/gettable by people with access to consortium releases. >> I?ll ask the MUMPS people for confirmation. >> Note that this warning is only printed to screen with the option >> -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. >> >> Thanks, >> Pierre >> >> On 1 Nov 2023, at 5:52?PM, Barry Smith wrote: >> >> >> Pierre, >> >> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation >> so as to not trigger the confusing warning message from MUMPS? >> >> Barry >> >> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: >> >> >> >> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >> Victoria, >> "** Maximum transversal (ICNTL(6)) not allowed because matrix is >> distributed >> Ordering based on METIS" >> >> >> This warning is benign and appears for every run using a sequential >> partitioner in MUMPS with a MATMPIAIJ. 
>> (I?m not saying switching to ParMETIS will not make the issue go away) >> >> Thanks, >> Pierre >> >> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu >> -mat_mumps_icntl_4 2 >> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >> executing #MPI = 2, without OMP >> >> ================================================= >> MUMPS compiled with option -Dmetis >> MUMPS compiled with option -Dparmetis >> MUMPS compiled with option -Dpord >> MUMPS compiled with option -Dptscotch >> MUMPS compiled with option -Dscotch >> ================================================= >> L U Solver for unsymmetric matrices >> Type of parallelism: Working host >> >> ****** ANALYSIS STEP ******** >> >> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >> distributed >> Processing a graph of size: 56 with 194 edges >> Ordering based on AMF >> WARNING: Largest root node of size 26 not selected for parallel >> execution >> >> Leaving analysis phase with ... >> INFOG(1) = 0 >> INFOG(2) = 0 >> [?] >> >> Try parmetis. >> Hong >> ------------------------------ >> *From:* petsc-users on behalf of >> Victoria Rolandi >> *Sent:* Tuesday, October 31, 2023 10:30 PM >> *To:* petsc-users at mcs.anl.gov >> *Subject:* [petsc-users] Error using Metis with PETSc installed with >> MUMPS >> >> Hi, >> >> I'm solving a large sparse linear system in parallel and I am using PETSc >> with MUMPS. I am trying to test different options, like the ordering of the >> matrix. Everything works if I use the *-mat_mumps_icntl_7 2 *or *-mat_mumps_icntl_7 >> 0 *options (with the first one, AMF, performing better than AMD), >> however when I test METIS *-mat_mumps_icntl_7** 5 *I get an error >> (reported at the end of the email). >> >> I have configured PETSc with the following options: >> >> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort >> --with-scalar-type=complex --with-debugging=0 --with-precision=single >> --download-mumps --download-scalapack --download-parmetis --download-metis >> >> and the installation didn't give any problems. >> >> Could you help me understand why metis is not working? >> >> Thank you in advance, >> Victoria >> >> Error: >> >> ****** ANALYSIS STEP ******** >> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >> distributed >> Processing a graph of size: 699150 with 69238690 edges >> Ordering based on METIS >> 510522 37081376 [100] [10486 699150] >> Error! Unknown CType: -1 >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Fri Nov 3 13:33:50 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 3 Nov 2023 19:33:50 +0100 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> <57FF63B6-B33D-405D-BF3C-CB87EF051A33@joliv.et> Message-ID: <6B8137E2-2F13-4F4A-86B7-CF7AD75666CD@joliv.et> > On 3 Nov 2023, at 7:28?PM, Victoria Rolandi wrote: > > Pierre, > > Sure, I have now installed PETSc with MUMPS and PT-SCHOTCH, I got some errors at the beginning but then it worked adding --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" to the configuration. > Also, I have compilation errors when I try to use newer versions, so I kept the 3.17.0 for the moment. You should ask for assistance to get the latest version. (Par)METIS snapshots may have not changed, but the MUMPS one did, with performance improvements. 
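A hedged sketch of a configure line for a current PETSc with the packages discussed in this thread (the prefix is a placeholder and the compiler wrappers should match the MPI stack of the host code):

  ./configure --prefix=$HOME/petsc-install --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort \
      --with-scalar-type=complex --with-precision=single --with-debugging=0 \
      --download-mumps --download-scalapack --download-metis --download-parmetis \
      --download-ptscotch --download-bison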
> Now the parallel ordering works with PT-SCOTCH, however, is it normal that I do not see any difference in the performance compared to sequential ordering ? Impossible to tell without you providing actual figures (number of nnz, number of processes, timings with sequential ordering, etc.), but 699k is not that big of a problem, so that is not extremely surprising. > Also, could the error using Metis/Parmetis be due to the fact that my main code (to which I linked PETSc) uses a different ParMetis than the one separately installed by PETSC during the configuration? Yes. > Hence should I configure PETSc linking ParMetis to the same library used by my main code? Yes. Thanks, Pierre > Thanks, > Victoria > > Il giorno gio 2 nov 2023 alle ore 09:35 Pierre Jolivet > ha scritto: >> >>> On 2 Nov 2023, at 5:29?PM, Victoria Rolandi > wrote: >>> >>> Pierre, >>> Yes, sorry, I'll keep the list in copy. >>> Launching with those options (-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2) I get an error during the analysis step. I also launched increasing the memory and I still have the error. >> >> Oh, OK, that?s bad. >> Would you be willing to give SCOTCH and/or PT-SCOTCH a try? >> You?d need to reconfigure/recompile with --download-ptscotch (and maybe --download-bison depending on your system). >> Then, the option would become either -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 (PT-SCOTCH) or -mat_mumps_icntl_7 3 (SCOTCH). >> It may be worth updating PETSc as well (you are using 3.17.0, we are at 3.20.1), though I?m not sure we updated the METIS/ParMETIS snapshots since then, so it may not fix the present issue. >> >> Thanks, >> Pierre >> >>> The calculations stops at : >>> >>> Entering CMUMPS 5.4.1 from C interface with JOB, N = 1 699150 >>> executing #MPI = 2, without OMP >>> >>> ================================================= >>> MUMPS compiled with option -Dmetis >>> MUMPS compiled with option -Dparmetis >>> ================================================= >>> L U Solver for unsymmetric matrices >>> Type of parallelism: Working host >>> >>> ****** ANALYSIS STEP ******** >>> >>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>> Using ParMETIS for parallel ordering >>> Structural symmetry is: 90% >>> >>> >>> The error: >>> >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind >>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS to find memory corruption errors >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>> [0]PETSC ERROR: to get more information on the crash. >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Signal received >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.17.0, unknown >>> [0]PETSC ERROR: ./charlin.exe on a named n1056 by vrolandi Wed Nov 1 11:38:28 2023 >>> [0]PETSC ERROR: Configure options --prefix=/u/home/v/vrolandi/CODES/LIBRARY/packages/petsc/installationDir --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort CXXOPTFLAGS=-O3 --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>> >>> [0]PETSC ERROR: #1 User provided function() at unknown file:0 >>> [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash. >>> Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>> >>> >>> Thanks, >>> Victoria >>> >>> Il giorno mer 1 nov 2023 alle ore 10:33 Pierre Jolivet > ha scritto: >>>> Victoria, please keep the list in copy. >>>> >>>>> I am not understanding how can I switch to ParMetis if it does not appear in the options of -mat_mumps_icntl_7.In the options I only have Metis and not ParMetis. >>>> >>>> >>>> You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 >>>> >>>> Barry, I don?t think we can programmatically shut off this warning, it?s guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are only settable/gettable by people with access to consortium releases. >>>> I?ll ask the MUMPS people for confirmation. >>>> Note that this warning is only printed to screen with the option -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> On 1 Nov 2023, at 5:52?PM, Barry Smith > wrote: >>>>> >>>>> >>>>> Pierre, >>>>> >>>>> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation so as to not trigger the confusing warning message from MUMPS? >>>>> >>>>> Barry >>>>> >>>>>> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet > wrote: >>>>>> >>>>>> >>>>>> >>>>>>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users > wrote: >>>>>>> >>>>>>> Victoria, >>>>>>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>>>> Ordering based on METIS" >>>>>> >>>>>> This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. >>>>>> (I?m not saying switching to ParMETIS will not make the issue go away) >>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 >>>>>> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >>>>>> executing #MPI = 2, without OMP >>>>>> >>>>>> ================================================= >>>>>> MUMPS compiled with option -Dmetis >>>>>> MUMPS compiled with option -Dparmetis >>>>>> MUMPS compiled with option -Dpord >>>>>> MUMPS compiled with option -Dptscotch >>>>>> MUMPS compiled with option -Dscotch >>>>>> ================================================= >>>>>> L U Solver for unsymmetric matrices >>>>>> Type of parallelism: Working host >>>>>> >>>>>> ****** ANALYSIS STEP ******** >>>>>> >>>>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>>> Processing a graph of size: 56 with 194 edges >>>>>> Ordering based on AMF >>>>>> WARNING: Largest root node of size 26 not selected for parallel execution >>>>>> >>>>>> Leaving analysis phase with ... >>>>>> INFOG(1) = 0 >>>>>> INFOG(2) = 0 >>>>>> [?] >>>>>> >>>>>>> Try parmetis. 
>>>>>>> Hong >>>>>>> From: petsc-users > on behalf of Victoria Rolandi > >>>>>>> Sent: Tuesday, October 31, 2023 10:30 PM >>>>>>> To: petsc-users at mcs.anl.gov > >>>>>>> Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). >>>>>>> >>>>>>> I have configured PETSc with the following options: >>>>>>> >>>>>>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>>>>>> >>>>>>> and the installation didn't give any problems. >>>>>>> >>>>>>> Could you help me understand why metis is not working? >>>>>>> >>>>>>> Thank you in advance, >>>>>>> Victoria >>>>>>> >>>>>>> Error: >>>>>>> >>>>>>> ****** ANALYSIS STEP ******** >>>>>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>>>> Processing a graph of size: 699150 with 69238690 edges >>>>>>> Ordering based on METIS >>>>>>> 510522 37081376 [100] [10486 699150] >>>>>>> Error! Unknown CType: -1 >>>>>> >>>>> >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Nov 4 07:40:05 2023 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 4 Nov 2023 08:40:05 -0400 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: Message-ID: Hi MIGUEL, This might be a good place to start: https://petsc.org/main/manual/vec/ Feel free to ask more specific questions, but the docs are a good place to start. Thanks, Mark On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ wrote: > Dear all, > > I am currently working on the development of a in-house molecular dynamics > code using PETSc and C++. So far the code works great, however it is a > little bit slow since I am not exploiting MPI for PETSc vectors. I was > wondering if there is a way to perform the domain decomposition efficiently > using some PETSc functionality. Any feedback is highly appreciated. > > Best regards, > Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Nov 4 07:54:39 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 4 Nov 2023 08:54:39 -0400 Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics In-Reply-To: References: Message-ID: On Sat, Nov 4, 2023 at 8:40?AM Mark Adams wrote: > Hi MIGUEL, > > This might be a good place to start: https://petsc.org/main/manual/vec/ > Feel free to ask more specific questions, but the docs are a good place to > start. > > Thanks, > Mark > > On Fri, Nov 3, 2023 at 5:19?AM MIGUEL MOLINOS PEREZ > wrote: > >> Dear all, >> >> I am currently working on the development of a in-house molecular >> dynamics code using PETSc and C++. So far the code works great, however it >> is a little bit slow since I am not exploiting MPI for PETSc vectors. I was >> wondering if there is a way to perform the domain decomposition efficiently >> using some PETSc functionality. Any feedback is highly appreciated. 
>>
> It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. You can either use that, or if it does not work for you, do the same things with your class.

  Thanks,

     Matt

> Best regards,
>> Miguel

-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mmolinos at us.es Sat Nov 4 09:50:44 2023
From: mmolinos at us.es (MIGUEL MOLINOS PEREZ)
Date: Sat, 4 Nov 2023 14:50:44 +0000
Subject: [petsc-users] Domain decomposition in PETSc for Molecular Dynamics
Message-ID: <2BEA961D-00D5-4880-A162-7262E398C048@us.es>

Thank you Mark! I will have a look to it.

Best,
Miguel

On 4 Nov 2023, at 13:54, Matthew Knepley wrote:

On Sat, Nov 4, 2023 at 8:40 AM Mark Adams > wrote:
Hi MIGUEL,

This might be a good place to start: https://petsc.org/main/manual/vec/
Feel free to ask more specific questions, but the docs are a good place to start.

Thanks,
Mark

On Fri, Nov 3, 2023 at 5:19 AM MIGUEL MOLINOS PEREZ > wrote:
Dear all,

I am currently working on the development of a in-house molecular dynamics code using PETSc and C++. So far the code works great, however it is a little bit slow since I am not exploiting MPI for PETSc vectors. I was wondering if there is a way to perform the domain decomposition efficiently using some PETSc functionality. Any feedback is highly appreciated.

It sounds like you mean "is there a way to specify a communication construct that can send my particle information automatically". We use PetscSF for that. You can see how this works with the DMSwarm class, which represents a particle discretization. You can either use that, or if it does not work for you, do the same things with your class.

Thanks,
Matt

Best regards,
Miguel

-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From yc17470 at connect.um.edu.mo Sun Nov 5 20:54:30 2023
From: yc17470 at connect.um.edu.mo (Gong Yujie)
Date: Mon, 6 Nov 2023 02:54:30 +0000
Subject: [petsc-users] Performance problem about output mesh in vtk format
Message-ID: 

Dear PETSc developers,

I'm trying to output a result data in vtk format and find that it is quite slow. Then I try to check this issue by a simple test code:

      PetscCall(PetscInitialize(&argc,&argv,(char*)0,NULL));
      DM        dm,dmParallel,dmAux;
      PetscBool interpolate=PETSC_TRUE;
      PetscCall(DMPlexCreateExodusFromFile(PETSC_COMM_WORLD,"artery_plaque.exo",interpolate,&dm));
      PetscCall(DMViewFromOptions(dm,NULL,"-dm_view"));
      PetscCall(PetscFinalize());

and run with ./dm_test -dm_view vtk:./ksp_data/abc.vtk -log_view

It took about 600s to output the mesh. I'm not sure if there is something wrong in my code or my configuration of PETSc. Could you please give me some advice on this?

Best Regards,
Yujie

P.S.
The result for log_view **************************************************************************************************************************************************************** *** WIDEN YOUR WINDOW TO 160 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** **************************************************************************************************************************************************************** ------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ ./dm_test on a arch-linux-c-opt named DESKTOP-0H8HCOD with 1 processor, by qingfeng Mon Nov 6 10:43:31 2023 Using Petsc Release Version 3.19.5, unknown Max Max/Min Avg Total Time (sec): 6.286e+02 1.000 6.286e+02 Objects: 1.400e+02 1.000 1.400e+02 Flops: 0.000e+00 0.000 0.000e+00 0.000e+00 Flops/sec: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 6.2859e+02 100.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage DMPlexInterp 1 1.0 3.1186e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexStratify 3 1.0 4.2802e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexSymmetrize 3 1.0 1.0806e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Object Type Creations Destructions. Reports information only for process 0. 
--- Event Stage 0: Main Stage Container 2 1 Distributed Mesh 5 3 DM Label 20 8 Index Set 64 52 Section 17 12 Star Forest Graph 10 7 Discrete System 7 5 Weak Form 7 5 GraphPartitioner 3 2 Matrix 2 1 Vector 1 0 Viewer 2 1 ======================================================================================================================== Average time to get PetscTime(): 1.8e-08 #PETSc Option Table entries: -dm_view vtk:./ksp_data/abc.vtk # (source: command line) -log_view # (source: command line) #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --with-debugging=0 --with-strict-petscerrorcode --download-openmpi --download-metis --download-exodusii --download-parmetis --download-netcdf --download-pnetcdf --download-hdf5 --download-zlib --download-superlu --download-superlu_dist --download-triangle --download-cmake --download-fblaslapack --download-slepc ----------------------------------------- Libraries compiled on 2023-09-15 02:34:25 on DESKTOP-0H8HCOD Machine characteristics: Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.29 Using PETSc directory: /home/qingfeng/petsc/optpetsc3-19-5/petsc Using PETSc arch: arch-linux-c-opt ----------------------------------------- Using C compiler: /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch -fstack-protector -fvisibility=hidden -g -O Using Fortran compiler: /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpif90 -fPIC -Wall -ffree-line-length-none -ffree-line-length-0 -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O ----------------------------------------- Using include paths: -I/home/qingfeng/petsc/optpetsc3-19-5/petsc/include -I/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/include ----------------------------------------- Using C linker: /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpicc Using Fortran linker: /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpif90 Using libraries: -Wl,-rpath,/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib -L/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib -lpetsc -Wl,-rpath,/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib -L/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/9 -L/usr/lib/gcc/x86_64-linux-gnu/9 -lsuperlu -lsuperlu_dist -lflapack -lfblas -lexoIIv2for32 -lexodus -lnetcdf -lpnetcdf -lhdf5_hl -lhdf5 -lparmetis -lmetis -ltriangle -lm -lz -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl ----------------------------------------- The mesh information: DM Object: Created by ICEMCFD - EXODUS II Interface 1 MPI process type: plex Created by ICEMCFD - EXODUS II Interface in 3 dimensions: Number of 0-cells per rank: 134549 Number of 1-cells per rank: 841756 Number of 2-cells per rank: 1366008 Number of 3-cells per rank: 658801 Labels: celltype: 4 strata with value/size (0 (134549), 6 (658801), 3 (1366008), 1 (841756)) depth: 4 strata with value/size (0 (134549), 1 (841756), 2 (1366008), 3 (658801)) Cell Sets: 2 strata with value/size (1 (604426), 2 (54375)) Vertex Sets: 5 strata with value/size (3 (481), 4 (27248), 5 (20560), 6 (653), 7 (2370)) Face Sets: 5 strata with value/size 
(8 (740), 9 (54206), 10 (40857), 11 (999), 12 (4534)) SMALLER: 1 strata with value/size (8 (740)) OUTER: 1 strata with value/size (9 (54206)) INNER: 1 strata with value/size (10 (40857)) BIGGER: 1 strata with value/size (11 (999)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Nov 5 21:02:16 2023 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 5 Nov 2023 22:02:16 -0500 Subject: [petsc-users] Performance problem about output mesh in vtk format In-Reply-To: References: Message-ID: On Sun, Nov 5, 2023 at 9:54?PM Gong Yujie wrote: > Dear PETSc developers, > > I'm trying to output a result data in vtk format and find that it is quite > slow. Then I try to check this issue by a simple test code: > > * PetscCall(PetscInitialize(&argc,&argv,(char*)0,NULL));* > * DM dm,dmParallel,dmAux;* > * PetscBool interpolate=PETSC_TRUE;* > * > PetscCall(DMPlexCreateExodusFromFile(PETSC_COMM_WORLD,"artery_plaque.exo",interpolate,&dm));* > * PetscCall(DMViewFromOptions(dm,NULL,"-dm_view"));* > * PetscCall(PetscFinalize());* > and run with *./dm_test -dm_view vtk:./ksp_data/abc.vtk -log_view* > > It took about 600s to output the mesh. I'm not sure if there is something > wrong in my code or my configuration of PETSc. Could you please give me > some advice on this? > VTK is an ASCII format, and the mesh is not small. The file size may be causing problems on your system. What if you choose VTU instead? I now mostly use HDF5, and the utility that creates an XDMF to match it. Thanks, Matt > Best Regards, > Yujie > > P.S. The result for log_view > **************************************************************************************************************************************************************** > > *** WIDEN YOUR WINDOW TO 160 CHARACTERS. > Use 'enscript -r -fCourier9' to print this document > *** > > **************************************************************************************************************************************************************** > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > ./dm_test on a arch-linux-c-opt named DESKTOP-0H8HCOD with 1 processor, by > qingfeng Mon Nov 6 10:43:31 2023 > Using Petsc Release Version 3.19.5, unknown > > Max Max/Min Avg Total > Time (sec): 6.286e+02 1.000 6.286e+02 > Objects: 1.400e+02 1.000 1.400e+02 > Flops: 0.000e+00 0.000 0.000e+00 0.000e+00 > Flops/sec: 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Msg Count: 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Msg Len (bytes): 0.000e+00 0.000 0.000e+00 0.000e+00 > MPI Reductions: 0.000e+00 0.000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flops > and VecAXPY() for complex vectors of length N > --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 6.2859e+02 100.0% 0.0000e+00 0.0% 0.000e+00 > 0.0% 0.000e+00 0.0% 0.000e+00 0.0% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. 
> Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). > %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > DMPlexInterp 1 1.0 3.1186e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DMPlexStratify 3 1.0 4.2802e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DMPlexSymmetrize 3 1.0 1.0806e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > ------------------------------------------------------------------------------------------------------------------------ > > Object Type Creations Destructions. Reports information only > for process 0. > > --- Event Stage 0: Main Stage > > Container 2 1 > Distributed Mesh 5 3 > DM Label 20 8 > Index Set 64 52 > Section 17 12 > Star Forest Graph 10 7 > Discrete System 7 5 > Weak Form 7 5 > GraphPartitioner 3 2 > Matrix 2 1 > Vector 1 0 > Viewer 2 1 > > ======================================================================================================================== > Average time to get PetscTime(): 1.8e-08 > #PETSc Option Table entries: > -dm_view vtk:./ksp_data/abc.vtk # (source: command line) > -log_view # (source: command line) > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --with-debugging=0 --with-strict-petscerrorcode > --download-openmpi --download-metis --download-exodusii --download-parmetis > --download-netcdf --download-pnetcdf --download-hdf5 --download-zlib > --download-superlu --download-superlu_dist --download-triangle > --download-cmake --download-fblaslapack --download-slepc > ----------------------------------------- > Libraries compiled on 2023-09-15 02:34:25 on DESKTOP-0H8HCOD > Machine characteristics: > Linux-5.10.16.3-microsoft-standard-WSL2-x86_64-with-glibc2.29 > Using PETSc directory: /home/qingfeng/petsc/optpetsc3-19-5/petsc > Using PETSc arch: arch-linux-c-opt > ----------------------------------------- > > Using C compiler: > /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpicc -fPIC > -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-lto-type-mismatch > -fstack-protector -fvisibility=hidden -g -O > Using Fortran compiler: > /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpif90 > -fPIC -Wall -ffree-line-length-none -ffree-line-length-0 > -Wno-lto-type-mismatch -Wno-unused-dummy-argument -g -O > > 
----------------------------------------- > > Using include paths: -I/home/qingfeng/petsc/optpetsc3-19-5/petsc/include > -I/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/include > ----------------------------------------- > > Using C linker: > /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpicc > Using Fortran linker: > /home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/bin/mpif90 > Using libraries: > -Wl,-rpath,/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib > -L/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib -lpetsc > -Wl,-rpath,/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib > -L/home/qingfeng/petsc/optpetsc3-19-5/petsc/arch-linux-c-opt/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/9 > -L/usr/lib/gcc/x86_64-linux-gnu/9 -lsuperlu -lsuperlu_dist -lflapack > -lfblas -lexoIIv2for32 -lexodus -lnetcdf -lpnetcdf -lhdf5_hl -lhdf5 > -lparmetis -lmetis -ltriangle -lm -lz -ldl -lmpi_usempif08 > -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm > -lgcc_s -lquadmath -lpthread -lstdc++ -ldl > ----------------------------------------- > > The mesh information: > DM Object: Created by ICEMCFD - EXODUS II Interface 1 MPI process > type: plex > Created by ICEMCFD - EXODUS II Interface in 3 dimensions: > Number of 0-cells per rank: 134549 > Number of 1-cells per rank: 841756 > Number of 2-cells per rank: 1366008 > Number of 3-cells per rank: 658801 > Labels: > celltype: 4 strata with value/size (0 (134549), 6 (658801), 3 (1366008), > 1 (841756)) > depth: 4 strata with value/size (0 (134549), 1 (841756), 2 (1366008), 3 > (658801)) > Cell Sets: 2 strata with value/size (1 (604426), 2 (54375)) > Vertex Sets: 5 strata with value/size (3 (481), 4 (27248), 5 (20560), 6 > (653), 7 (2370)) > Face Sets: 5 strata with value/size (8 (740), 9 (54206), 10 (40857), 11 > (999), 12 (4534)) > SMALLER: 1 strata with value/size (8 (740)) > OUTER: 1 strata with value/size (9 (54206)) > INNER: 1 strata with value/size (10 (40857)) > BIGGER: 1 strata with value/size (11 (999)) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria.rolandi93 at gmail.com Tue Nov 7 13:47:40 2023 From: victoria.rolandi93 at gmail.com (Victoria Rolandi) Date: Tue, 7 Nov 2023 11:47:40 -0800 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: <6B8137E2-2F13-4F4A-86B7-CF7AD75666CD@joliv.et> References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> <57FF63B6-B33D-405D-BF3C-CB87EF051A33@joliv.et> <6B8137E2-2F13-4F4A-86B7-CF7AD75666CD@joliv.et> Message-ID: Hi Pierre, Thanks for your reply. I am now trying to configure PETSc with the same METIS/ParMETIS of my main code. I get the following error, and I still get it even if I change the option --with-precision=double/--with-precision=single Metis specified is incompatible! IDXTYPEWIDTH=64 metis build appears to be specified for a default 32-bit-indices build of PETSc. 
Suggest using --download-metis for a compatible metis ******************************************************************************* In the cofigure.log I have: compilation aborted for /tmp/petsc-yxtl_gwd/config.packages.metis/conftest.c (code 2) Source: #include "confdefs.h" #include "conffix.h" #include "metis.h" int main() { #if (IDXTYPEWIDTH != 32) #error incompatible IDXTYPEWIDTH #endif; return 0; } How could I proceed? Thanks, Victoria Il giorno ven 3 nov 2023 alle ore 11:34 Pierre Jolivet ha scritto: > > > On 3 Nov 2023, at 7:28?PM, Victoria Rolandi > wrote: > > Pierre, > > Sure, I have now installed PETSc with MUMPS and PT-SCHOTCH, I got some > errors at the beginning but then it worked adding --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" > to the configuration. > Also, I have compilation errors when I try to use newer versions, so I > kept the 3.17.0 for the moment. > > > You should ask for assistance to get the latest version. > (Par)METIS snapshots may have not changed, but the MUMPS one did, with > performance improvements. > > Now the parallel ordering works with PT-SCOTCH, however, is it normal that > I do not see any difference in the performance compared to sequential > ordering ? > > > Impossible to tell without you providing actual figures (number of nnz, > number of processes, timings with sequential ordering, etc.), but 699k is > not that big of a problem, so that is not extremely surprising. > > Also, could the error using Metis/Parmetis be due to the fact that my main > code (to which I linked PETSc) uses a different ParMetis than the one > separately installed by PETSC during the configuration? > > > Yes. > > Hence should I configure PETSc linking ParMetis to the same library used > by my main code? > > > Yes. > > Thanks, > Pierre > > Thanks, > Victoria > > Il giorno gio 2 nov 2023 alle ore 09:35 Pierre Jolivet > ha scritto: > >> >> On 2 Nov 2023, at 5:29?PM, Victoria Rolandi >> wrote: >> >> Pierre, >> Yes, sorry, I'll keep the list in copy. >> Launching with those options (-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 >> 2) I get an error during the analysis step. I also launched increasing the >> memory and I still have the error. >> >> >> Oh, OK, that?s bad. >> Would you be willing to give SCOTCH and/or PT-SCOTCH a try? >> You?d need to reconfigure/recompile with --download-ptscotch (and maybe >> --download-bison depending on your system). >> Then, the option would become either -mat_mumps_icntl_28 2 >> -mat_mumps_icntl_29 2 (PT-SCOTCH) or -mat_mumps_icntl_7 3 (SCOTCH). >> It may be worth updating PETSc as well (you are using 3.17.0, we are at >> 3.20.1), though I?m not sure we updated the METIS/ParMETIS snapshots since >> then, so it may not fix the present issue. 
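As an aside, the same ICNTL values can also be set from code instead of the command line; a minimal hedged sketch (it assumes Mat A and Vec b, x already exist and that PETSc was configured with MUMPS):

  #include <petscksp.h>

  KSP ksp;
  PC  pc;
  Mat F; /* the MUMPS factor matrix */

  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPPREONLY));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCLU));
  PetscCall(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS));
  PetscCall(PCFactorSetUpMatSolverType(pc)); /* create F so ICNTL can be set before the analysis */
  PetscCall(PCFactorGetMatrix(pc, &F));
  PetscCall(MatMumpsSetIcntl(F, 7, 5));      /* sequential ordering: METIS */
  /* or: MatMumpsSetIcntl(F, 28, 2); MatMumpsSetIcntl(F, 29, 2); for parallel ordering */
  PetscCall(KSPSolve(ksp, b, x));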
>> >> Thanks, >> Pierre >> >> *The calculations stops at :* >> >> Entering CMUMPS 5.4.1 from C interface with JOB, N = 1 699150 >> executing #MPI = 2, without OMP >> >> ================================================= >> MUMPS compiled with option -Dmetis >> MUMPS compiled with option -Dparmetis >> ================================================= >> L U Solver for unsymmetric matrices >> Type of parallelism: Working host >> >> ****** ANALYSIS STEP ******** >> >> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >> distributed >> Using ParMETIS for parallel ordering >> Structural symmetry is: 90% >> >> >> *The error:* >> >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS >> to find memory corruption errors >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> [0]PETSC ERROR: to get more information on the crash. >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.17.0, unknown >> [0]PETSC ERROR: ./charlin.exe on a named n1056 by vrolandi Wed Nov 1 >> 11:38:28 2023 >> [0]PETSC ERROR: Configure options >> --prefix=/u/home/v/vrolandi/CODES/LIBRARY/packages/petsc/installationDir >> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort CXXOPTFLAGS=-O3 >> --with-scalar-type=complex --with-debugging=0 --with-precision=single >> --download-mumps --download-scalapack --download-parmetis --download-metis >> >> [0]PETSC ERROR: #1 User provided function() at unknown file:0 >> [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is >> causing the crash. >> Abort(59) on node 0 (rank 0 in comm 0): application called >> MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> >> Thanks, >> Victoria >> >> Il giorno mer 1 nov 2023 alle ore 10:33 Pierre Jolivet >> ha scritto: >> >>> Victoria, please keep the list in copy. >>> >>> I am not understanding how can I switch to ParMetis if it does not >>> appear in the options of -mat_mumps_icntl_7.In the options I only have >>> Metis and not ParMetis. >>> >>> >>> You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 >>> >>> Barry, I don?t think we can programmatically shut off this warning, it?s >>> guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are >>> only settable/gettable by people with access to consortium releases. >>> I?ll ask the MUMPS people for confirmation. >>> Note that this warning is only printed to screen with the option >>> -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. >>> >>> Thanks, >>> Pierre >>> >>> On 1 Nov 2023, at 5:52?PM, Barry Smith wrote: >>> >>> >>> Pierre, >>> >>> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this >>> situation so as to not trigger the confusing warning message from MUMPS? 
>>> >>> Barry >>> >>> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: >>> >>> >>> >>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>> Victoria, >>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is >>> distributed >>> Ordering based on METIS" >>> >>> >>> This warning is benign and appears for every run using a sequential >>> partitioner in MUMPS with a MATMPIAIJ. >>> (I?m not saying switching to ParMETIS will not make the issue go away) >>> >>> Thanks, >>> Pierre >>> >>> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu >>> -mat_mumps_icntl_4 2 >>> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >>> executing #MPI = 2, without OMP >>> >>> ================================================= >>> MUMPS compiled with option -Dmetis >>> MUMPS compiled with option -Dparmetis >>> MUMPS compiled with option -Dpord >>> MUMPS compiled with option -Dptscotch >>> MUMPS compiled with option -Dscotch >>> ================================================= >>> L U Solver for unsymmetric matrices >>> Type of parallelism: Working host >>> >>> ****** ANALYSIS STEP ******** >>> >>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >>> distributed >>> Processing a graph of size: 56 with 194 edges >>> Ordering based on AMF >>> WARNING: Largest root node of size 26 not selected for parallel >>> execution >>> >>> Leaving analysis phase with ... >>> INFOG(1) = 0 >>> INFOG(2) = 0 >>> [?] >>> >>> Try parmetis. >>> Hong >>> ------------------------------ >>> *From:* petsc-users on behalf of >>> Victoria Rolandi >>> *Sent:* Tuesday, October 31, 2023 10:30 PM >>> *To:* petsc-users at mcs.anl.gov >>> *Subject:* [petsc-users] Error using Metis with PETSc installed with >>> MUMPS >>> >>> Hi, >>> >>> I'm solving a large sparse linear system in parallel and I am using >>> PETSc with MUMPS. I am trying to test different options, like the ordering >>> of the matrix. Everything works if I use the *-mat_mumps_icntl_7 2 *or *-mat_mumps_icntl_7 >>> 0 *options (with the first one, AMF, performing better than AMD), >>> however when I test METIS *-mat_mumps_icntl_7** 5 *I get an error >>> (reported at the end of the email). >>> >>> I have configured PETSc with the following options: >>> >>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort >>> --with-scalar-type=complex --with-debugging=0 --with-precision=single >>> --download-mumps --download-scalapack --download-parmetis --download-metis >>> >>> and the installation didn't give any problems. >>> >>> Could you help me understand why metis is not working? >>> >>> Thank you in advance, >>> Victoria >>> >>> Error: >>> >>> ****** ANALYSIS STEP ******** >>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >>> distributed >>> Processing a graph of size: 699150 with 69238690 edges >>> Ordering based on METIS >>> 510522 37081376 [100] [10486 699150] >>> Error! Unknown CType: -1 >>> >>> >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pierre at joliv.et Tue Nov 7 14:24:50 2023 From: pierre at joliv.et (Pierre Jolivet) Date: Tue, 7 Nov 2023 21:24:50 +0100 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> <57FF63B6-B33D-405D-BF3C-CB87EF051A33@joliv.et> <6B8137E2-2F13-4F4A-86B7-CF7AD75666CD@joliv.et> Message-ID: <3357568E-81F5-49C3-B422-3CC32ADAB99F@joliv.et> > On 7 Nov 2023, at 8:47?PM, Victoria Rolandi wrote: > > Hi Pierre, > > Thanks for your reply. I am now trying to configure PETSc with the same METIS/ParMETIS of my main code. > > I get the following error, and I still get it even if I change the option --with-precision=double/--with-precision=single > > Metis specified is incompatible! > IDXTYPEWIDTH=64 metis build appears to be specified for a default 32-bit-indices build of PETSc. > Suggest using --download-metis for a compatible metis > ******************************************************************************* > > In the cofigure.log I have: > > compilation aborted for /tmp/petsc-yxtl_gwd/config.packages.metis/conftest.c (code 2) > Source: > #include "confdefs.h" > #include "conffix.h" > #include "metis.h" > > int main() { > #if (IDXTYPEWIDTH != 32) > #error incompatible IDXTYPEWIDTH > #endif; > return 0; > } > > > How could I proceed? I would use --download-metis and then have your code use METIS from PETSc, not the other way around. Thanks, Pierre > Thanks, > Victoria > > > > Il giorno ven 3 nov 2023 alle ore 11:34 Pierre Jolivet > ha scritto: >> >> >>> On 3 Nov 2023, at 7:28?PM, Victoria Rolandi > wrote: >>> >>> Pierre, >>> >>> Sure, I have now installed PETSc with MUMPS and PT-SCHOTCH, I got some errors at the beginning but then it worked adding --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" to the configuration. >>> Also, I have compilation errors when I try to use newer versions, so I kept the 3.17.0 for the moment. >> >> You should ask for assistance to get the latest version. >> (Par)METIS snapshots may have not changed, but the MUMPS one did, with performance improvements. >> >>> Now the parallel ordering works with PT-SCOTCH, however, is it normal that I do not see any difference in the performance compared to sequential ordering ? >> >> Impossible to tell without you providing actual figures (number of nnz, number of processes, timings with sequential ordering, etc.), but 699k is not that big of a problem, so that is not extremely surprising. >> >>> Also, could the error using Metis/Parmetis be due to the fact that my main code (to which I linked PETSc) uses a different ParMetis than the one separately installed by PETSC during the configuration? >> >> Yes. >> >>> Hence should I configure PETSc linking ParMetis to the same library used by my main code? >> >> Yes. >> >> Thanks, >> Pierre >> >>> Thanks, >>> Victoria >>> >>> Il giorno gio 2 nov 2023 alle ore 09:35 Pierre Jolivet > ha scritto: >>>> >>>>> On 2 Nov 2023, at 5:29?PM, Victoria Rolandi > wrote: >>>>> >>>>> Pierre, >>>>> Yes, sorry, I'll keep the list in copy. >>>>> Launching with those options (-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2) I get an error during the analysis step. I also launched increasing the memory and I still have the error. >>>> >>>> Oh, OK, that?s bad. >>>> Would you be willing to give SCOTCH and/or PT-SCOTCH a try? >>>> You?d need to reconfigure/recompile with --download-ptscotch (and maybe --download-bison depending on your system). 
>>>> Then, the option would become either -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 (PT-SCOTCH) or -mat_mumps_icntl_7 3 (SCOTCH). >>>> It may be worth updating PETSc as well (you are using 3.17.0, we are at 3.20.1), though I?m not sure we updated the METIS/ParMETIS snapshots since then, so it may not fix the present issue. >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> The calculations stops at : >>>>> >>>>> Entering CMUMPS 5.4.1 from C interface with JOB, N = 1 699150 >>>>> executing #MPI = 2, without OMP >>>>> >>>>> ================================================= >>>>> MUMPS compiled with option -Dmetis >>>>> MUMPS compiled with option -Dparmetis >>>>> ================================================= >>>>> L U Solver for unsymmetric matrices >>>>> Type of parallelism: Working host >>>>> >>>>> ****** ANALYSIS STEP ******** >>>>> >>>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>> Using ParMETIS for parallel ordering >>>>> Structural symmetry is: 90% >>>>> >>>>> >>>>> The error: >>>>> >>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind >>>>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS to find memory corruption errors >>>>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>>> [0]PETSC ERROR: to get more information on the crash. >>>>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>>> [0]PETSC ERROR: Signal received >>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >>>>> [0]PETSC ERROR: Petsc Release Version 3.17.0, unknown >>>>> [0]PETSC ERROR: ./charlin.exe on a named n1056 by vrolandi Wed Nov 1 11:38:28 2023 >>>>> [0]PETSC ERROR: Configure options --prefix=/u/home/v/vrolandi/CODES/LIBRARY/packages/petsc/installationDir --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort CXXOPTFLAGS=-O3 --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>>>> >>>>> [0]PETSC ERROR: #1 User provided function() at unknown file:0 >>>>> [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash. >>>>> Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>>>> >>>>> >>>>> Thanks, >>>>> Victoria >>>>> >>>>> Il giorno mer 1 nov 2023 alle ore 10:33 Pierre Jolivet > ha scritto: >>>>>> Victoria, please keep the list in copy. >>>>>> >>>>>>> I am not understanding how can I switch to ParMetis if it does not appear in the options of -mat_mumps_icntl_7.In the options I only have Metis and not ParMetis. >>>>>> >>>>>> >>>>>> You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 >>>>>> >>>>>> Barry, I don?t think we can programmatically shut off this warning, it?s guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which are only settable/gettable by people with access to consortium releases. >>>>>> I?ll ask the MUMPS people for confirmation. >>>>>> Note that this warning is only printed to screen with the option -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. 
>>>>>> >>>>>> Thanks, >>>>>> Pierre >>>>>> >>>>>>> On 1 Nov 2023, at 5:52?PM, Barry Smith > wrote: >>>>>>> >>>>>>> >>>>>>> Pierre, >>>>>>> >>>>>>> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this situation so as to not trigger the confusing warning message from MUMPS? >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet > wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users > wrote: >>>>>>>>> >>>>>>>>> Victoria, >>>>>>>>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>>>>>> Ordering based on METIS" >>>>>>>> >>>>>>>> This warning is benign and appears for every run using a sequential partitioner in MUMPS with a MATMPIAIJ. >>>>>>>> (I?m not saying switching to ParMETIS will not make the issue go away) >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Pierre >>>>>>>> >>>>>>>> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type lu -mat_mumps_icntl_4 2 >>>>>>>> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >>>>>>>> executing #MPI = 2, without OMP >>>>>>>> >>>>>>>> ================================================= >>>>>>>> MUMPS compiled with option -Dmetis >>>>>>>> MUMPS compiled with option -Dparmetis >>>>>>>> MUMPS compiled with option -Dpord >>>>>>>> MUMPS compiled with option -Dptscotch >>>>>>>> MUMPS compiled with option -Dscotch >>>>>>>> ================================================= >>>>>>>> L U Solver for unsymmetric matrices >>>>>>>> Type of parallelism: Working host >>>>>>>> >>>>>>>> ****** ANALYSIS STEP ******** >>>>>>>> >>>>>>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>>>>> Processing a graph of size: 56 with 194 edges >>>>>>>> Ordering based on AMF >>>>>>>> WARNING: Largest root node of size 26 not selected for parallel execution >>>>>>>> >>>>>>>> Leaving analysis phase with ... >>>>>>>> INFOG(1) = 0 >>>>>>>> INFOG(2) = 0 >>>>>>>> [?] >>>>>>>> >>>>>>>>> Try parmetis. >>>>>>>>> Hong >>>>>>>>> From: petsc-users > on behalf of Victoria Rolandi > >>>>>>>>> Sent: Tuesday, October 31, 2023 10:30 PM >>>>>>>>> To: petsc-users at mcs.anl.gov > >>>>>>>>> Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS >>>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm solving a large sparse linear system in parallel and I am using PETSc with MUMPS. I am trying to test different options, like the ordering of the matrix. Everything works if I use the -mat_mumps_icntl_7 2 or -mat_mumps_icntl_7 0 options (with the first one, AMF, performing better than AMD), however when I test METIS -mat_mumps_icntl_7 5 I get an error (reported at the end of the email). >>>>>>>>> >>>>>>>>> I have configured PETSc with the following options: >>>>>>>>> >>>>>>>>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --download-parmetis --download-metis >>>>>>>>> >>>>>>>>> and the installation didn't give any problems. >>>>>>>>> >>>>>>>>> Could you help me understand why metis is not working? >>>>>>>>> >>>>>>>>> Thank you in advance, >>>>>>>>> Victoria >>>>>>>>> >>>>>>>>> Error: >>>>>>>>> >>>>>>>>> ****** ANALYSIS STEP ******** >>>>>>>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is distributed >>>>>>>>> Processing a graph of size: 699150 with 69238690 edges >>>>>>>>> Ordering based on METIS >>>>>>>>> 510522 37081376 [100] [10486 699150] >>>>>>>>> Error! 
Unknown CType: -1 >>>>>>>> >>>>>>> >>>>>> >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria.rolandi93 at gmail.com Tue Nov 7 16:27:53 2023 From: victoria.rolandi93 at gmail.com (Victoria Rolandi) Date: Tue, 7 Nov 2023 14:27:53 -0800 Subject: [petsc-users] Error using Metis with PETSc installed with MUMPS In-Reply-To: <3357568E-81F5-49C3-B422-3CC32ADAB99F@joliv.et> References: <27CBA36D-D273-4C9B-84F3-DEB2D73B12AA@joliv.et> <6CA490E5-A44B-415A-9B41-A9441FB96A47@petsc.dev> <57FF63B6-B33D-405D-BF3C-CB87EF051A33@joliv.et> <6B8137E2-2F13-4F4A-86B7-CF7AD75666CD@joliv.et> <3357568E-81F5-49C3-B422-3CC32ADAB99F@joliv.et> Message-ID: Great! It compiles now and both the commands -mat_mumps_icntl_7 2 and -mat_mumps_icntl_29 2 work, and perform better compared to the other ordering types. Thank you Pierre! As you suggested, I'll also send a new email concerning the errors I have with newer PETSc versions. Best, Victoria Il giorno mar 7 nov 2023 alle ore 12:25 Pierre Jolivet ha scritto: > > > On 7 Nov 2023, at 8:47?PM, Victoria Rolandi > wrote: > > Hi Pierre, > > Thanks for your reply. I am now trying to configure PETSc with the same > METIS/ParMETIS of my main code. > > I get the following error, and I still get it even if I change the option > --with-precision=double/--with-precision=single > > Metis specified is incompatible! > IDXTYPEWIDTH=64 metis build appears to be specified for a default > 32-bit-indices build of PETSc. > Suggest using --download-metis for a compatible metis > > ******************************************************************************* > > In the cofigure.log I have: > > compilation aborted for > /tmp/petsc-yxtl_gwd/config.packages.metis/conftest.c (code 2) > Source: > #include "confdefs.h" > #include "conffix.h" > #include "metis.h" > > int main() { > #if (IDXTYPEWIDTH != 32) > #error incompatible IDXTYPEWIDTH > #endif; > return 0; > } > > > How could I proceed? > > > I would use --download-metis and then have your code use METIS from PETSc, > not the other way around. > > Thanks, > Pierre > > Thanks, > Victoria > > > > Il giorno ven 3 nov 2023 alle ore 11:34 Pierre Jolivet > ha scritto: > >> >> >> On 3 Nov 2023, at 7:28?PM, Victoria Rolandi >> wrote: >> >> Pierre, >> >> Sure, I have now installed PETSc with MUMPS and PT-SCHOTCH, I got some >> errors at the beginning but then it worked adding --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" >> to the configuration. >> Also, I have compilation errors when I try to use newer versions, so I >> kept the 3.17.0 for the moment. >> >> >> You should ask for assistance to get the latest version. >> (Par)METIS snapshots may have not changed, but the MUMPS one did, with >> performance improvements. >> >> Now the parallel ordering works with PT-SCOTCH, however, is it normal >> that I do not see any difference in the performance compared to sequential >> ordering ? >> >> >> Impossible to tell without you providing actual figures (number of nnz, >> number of processes, timings with sequential ordering, etc.), but 699k is >> not that big of a problem, so that is not extremely surprising. >> >> Also, could the error using Metis/Parmetis be due to the fact that my >> main code (to which I linked PETSc) uses a different ParMetis than the one >> separately installed by PETSC during the configuration? >> >> >> Yes. >> >> Hence should I configure PETSc linking ParMetis to the same library used >> by my main code? >> >> >> Yes. 
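>> For instance (paths are placeholders), something along the lines of:
>> ./configure --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-scalar-type=complex --with-debugging=0 --with-precision=single --download-mumps --download-scalapack --with-metis-dir=/path/to/your/metis --with-parmetis-dir=/path/to/your/parmetis
>> i.e. drop --download-metis --download-parmetis and point --with-metis-dir/--with-parmetis-dir at the build your code already links against, keeping in mind that it has to match PETSc's default 32-bit indices (a METIS built with IDXTYPEWIDTH=32).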
>> >> Thanks, >> Pierre >> >> Thanks, >> Victoria >> >> Il giorno gio 2 nov 2023 alle ore 09:35 Pierre Jolivet >> ha scritto: >> >>> >>> On 2 Nov 2023, at 5:29?PM, Victoria Rolandi < >>> victoria.rolandi93 at gmail.com> wrote: >>> >>> Pierre, >>> Yes, sorry, I'll keep the list in copy. >>> Launching with those options (-mat_mumps_icntl_28 2 -mat_mumps_icntl_29 >>> 2) I get an error during the analysis step. I also launched increasing the >>> memory and I still have the error. >>> >>> >>> Oh, OK, that?s bad. >>> Would you be willing to give SCOTCH and/or PT-SCOTCH a try? >>> You?d need to reconfigure/recompile with --download-ptscotch (and maybe >>> --download-bison depending on your system). >>> Then, the option would become either -mat_mumps_icntl_28 2 >>> -mat_mumps_icntl_29 2 (PT-SCOTCH) or -mat_mumps_icntl_7 3 (SCOTCH). >>> It may be worth updating PETSc as well (you are using 3.17.0, we are at >>> 3.20.1), though I?m not sure we updated the METIS/ParMETIS snapshots since >>> then, so it may not fix the present issue. >>> >>> Thanks, >>> Pierre >>> >>> *The calculations stops at :* >>> >>> Entering CMUMPS 5.4.1 from C interface with JOB, N = 1 699150 >>> executing #MPI = 2, without OMP >>> >>> ================================================= >>> MUMPS compiled with option -Dmetis >>> MUMPS compiled with option -Dparmetis >>> ================================================= >>> L U Solver for unsymmetric matrices >>> Type of parallelism: Working host >>> >>> ****** ANALYSIS STEP ******** >>> >>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >>> distributed >>> Using ParMETIS for parallel ordering >>> Structural symmetry is: 90% >>> >>> >>> *The error:* >>> >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind >>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple MacOS >>> to find memory corruption errors >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> [0]PETSC ERROR: to get more information on the crash. >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: Signal received >>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >>> [0]PETSC ERROR: Petsc Release Version 3.17.0, unknown >>> [0]PETSC ERROR: ./charlin.exe on a named n1056 by vrolandi Wed Nov 1 >>> 11:38:28 2023 >>> [0]PETSC ERROR: Configure options >>> --prefix=/u/home/v/vrolandi/CODES/LIBRARY/packages/petsc/installationDir >>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort CXXOPTFLAGS=-O3 >>> --with-scalar-type=complex --with-debugging=0 --with-precision=single >>> --download-mumps --download-scalapack --download-parmetis --download-metis >>> >>> [0]PETSC ERROR: #1 User provided function() at unknown file:0 >>> [0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is >>> causing the crash. >>> Abort(59) on node 0 (rank 0 in comm 0): application called >>> MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>> >>> >>> Thanks, >>> Victoria >>> >>> Il giorno mer 1 nov 2023 alle ore 10:33 Pierre Jolivet >>> ha scritto: >>> >>>> Victoria, please keep the list in copy. 
>>>> >>>> I am not understanding how can I switch to ParMetis if it does not >>>> appear in the options of -mat_mumps_icntl_7.In the options I only have >>>> Metis and not ParMetis. >>>> >>>> >>>> You need to use -mat_mumps_icntl_28 2 -mat_mumps_icntl_29 2 >>>> >>>> Barry, I don?t think we can programmatically shut off this warning, >>>> it?s guarded by a bunch of KEEP() values, see src/dana_driver.F:4707, which >>>> are only settable/gettable by people with access to consortium releases. >>>> I?ll ask the MUMPS people for confirmation. >>>> Note that this warning is only printed to screen with the option >>>> -mat_mumps_icntl_4 2 (or higher), so this won?t show up for standard runs. >>>> >>>> Thanks, >>>> Pierre >>>> >>>> On 1 Nov 2023, at 5:52?PM, Barry Smith wrote: >>>> >>>> >>>> Pierre, >>>> >>>> Could the PETSc MUMPS interface "turn-off" ICNTL(6) in this >>>> situation so as to not trigger the confusing warning message from MUMPS? >>>> >>>> Barry >>>> >>>> On Nov 1, 2023, at 12:17?PM, Pierre Jolivet wrote: >>>> >>>> >>>> >>>> On 1 Nov 2023, at 3:33?PM, Zhang, Hong via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>> Victoria, >>>> "** Maximum transversal (ICNTL(6)) not allowed because matrix is >>>> distributed >>>> Ordering based on METIS" >>>> >>>> >>>> This warning is benign and appears for every run using a sequential >>>> partitioner in MUMPS with a MATMPIAIJ. >>>> (I?m not saying switching to ParMETIS will not make the issue go away) >>>> >>>> Thanks, >>>> Pierre >>>> >>>> $ ../../../../arch-darwin-c-debug-real/bin/mpirun -n 2 ./ex2 -pc_type >>>> lu -mat_mumps_icntl_4 2 >>>> Entering DMUMPS 5.6.2 from C interface with JOB, N = 1 56 >>>> executing #MPI = 2, without OMP >>>> >>>> ================================================= >>>> MUMPS compiled with option -Dmetis >>>> MUMPS compiled with option -Dparmetis >>>> MUMPS compiled with option -Dpord >>>> MUMPS compiled with option -Dptscotch >>>> MUMPS compiled with option -Dscotch >>>> ================================================= >>>> L U Solver for unsymmetric matrices >>>> Type of parallelism: Working host >>>> >>>> ****** ANALYSIS STEP ******** >>>> >>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >>>> distributed >>>> Processing a graph of size: 56 with 194 edges >>>> Ordering based on AMF >>>> WARNING: Largest root node of size 26 not selected for parallel >>>> execution >>>> >>>> Leaving analysis phase with ... >>>> INFOG(1) = 0 >>>> INFOG(2) = 0 >>>> [?] >>>> >>>> Try parmetis. >>>> Hong >>>> ------------------------------ >>>> *From:* petsc-users on behalf of >>>> Victoria Rolandi >>>> *Sent:* Tuesday, October 31, 2023 10:30 PM >>>> *To:* petsc-users at mcs.anl.gov >>>> *Subject:* [petsc-users] Error using Metis with PETSc installed with >>>> MUMPS >>>> >>>> Hi, >>>> >>>> I'm solving a large sparse linear system in parallel and I am using >>>> PETSc with MUMPS. I am trying to test different options, like the ordering >>>> of the matrix. Everything works if I use the *-mat_mumps_icntl_7 2 *or >>>> *-mat_mumps_icntl_7 0 *options (with the first one, AMF, performing >>>> better than AMD), however when I test METIS *-mat_mumps_icntl_7** 5 *I >>>> get an error (reported at the end of the email). 
>>>> >>>> I have configured PETSc with the following options: >>>> >>>> --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort >>>> --with-scalar-type=complex --with-debugging=0 --with-precision=single >>>> --download-mumps --download-scalapack --download-parmetis --download-metis >>>> >>>> and the installation didn't give any problems. >>>> >>>> Could you help me understand why metis is not working? >>>> >>>> Thank you in advance, >>>> Victoria >>>> >>>> Error: >>>> >>>> ****** ANALYSIS STEP ******** >>>> ** Maximum transversal (ICNTL(6)) not allowed because matrix is >>>> distributed >>>> Processing a graph of size: 699150 with 69238690 edges >>>> Ordering based on METIS >>>> 510522 37081376 [100] [10486 699150] >>>> Error! Unknown CType: -1 >>>> >>>> >>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bastian.loehrer at tu-dresden.de Wed Nov 8 07:30:35 2023 From: bastian.loehrer at tu-dresden.de (=?UTF-8?Q?Bastian_L=C3=B6hrer?=) Date: Wed, 8 Nov 2023 14:30:35 +0100 Subject: [petsc-users] Fortran, Hypre, PETSc, 64bit-integers Message-ID: An HTML attachment was scrubbed... URL: From ctchengben at mail.scut.edu.cn Tue Nov 7 23:20:21 2023 From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Wed, 8 Nov 2023 13:20:21 +0800 (GMT+08:00) Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI Message-ID: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn> Hello, Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: 1. PETSc: version 3.19.2 2. VS: version 2022 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit 4. Cygwin: see the picture attatched (see picture cygwin) And the compiler option in configuration is: ./configure --prefix=/cygdrive/g/mypetsc/petsc2023 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --with-shared-libraries=0 --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/libTherefore, I write this e-mail to look for your help. /release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly but there return an error: ********************************************************************************************* OSError while running ./configure --------------------------------------------------------------------------------------------- Cannot run executables created with FC. If this machine uses a batch system to submit jobs you will need to configure using ./configure with the additional option --with-batch. Otherwise there is problem with the compilers. Can you compile and run code with your compiler '/cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win32fe ifort'? See https://petsc.org/release/faq/#error-libimf ********************************************************************************************* Then I try to open configure.log in petsc, but there turnout an error that I can't open it.(see picture 1) And then I right click on properties and click safety,it just turnout "The permissions on test directory are incorrectly ordered.which may cause some entries to be ineffective." (see picture 2) And it also likely seen ?NULL SID? 
as the top entry in permission lists (see picture 3). Then I followed this blog (https://blog.dhampir.no/content/forcing-cygwin-to-create-sane-permissions-on-windows) to edit /etc/fstab in Cygwin and add 'noacl' to the mount options for /cygdrive.
But it's not working.
So I can't send configure.log to you; it seems the Cygwin installation on my computer has some problem. Maybe the error in the PETSc configure happens just because of this.
So I write this email to report my problem and ask for your help.
Looking forward to your reply!
Sincerely,
Cheng.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cygwin.png Type: image/png Size: 106545 bytes Desc: not available URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: picture1.jpg Type: image/jpeg Size: 44718 bytes Desc: not available URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: picture2.png Type: image/png Size: 4164 bytes Desc: not available URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: picture3.jpg Type: image/jpeg Size: 90655 bytes Desc: not available URL: 
From ctchengben at mail.scut.edu.cn Wed Nov 8 01:20:49 2023
From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=)
Date: Wed, 8 Nov 2023 15:20:49 +0800 (GMT+08:00)
Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI
In-Reply-To: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn>
References: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn>
Message-ID: <455a3dae.6a1e.18badce213b.Coremail.ctchengben@mail.scut.edu.cn>

Sorry, the configure command is:
./configure --prefix=/cygdrive/g/mypetsc/petsc2023 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --with-shared-libraries=0 --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly

----- Original Message -----
From: 程奔 (Cheng)
Sent: 2023-11-08 13:20:21 (Wednesday)
To: petsc-users at mcs.anl.gov
Subject: Error in configuring PETSc with Cygwin on Windows by using Intel MPI

Hello,
Recently I tried to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on the Windows 10 platform. For the sake of clarity, I first list the software/packages used below:
1. PETSc: version 3.19.2
2. VS: version 2022
3. Intel MPI: downloaded the Intel oneAPI Base Toolkit and HPC Toolkit
4. Cygwin: see the attached picture (see picture cygwin)
And the compiler options in the configuration are:
./configure --prefix=/cygdrive/g/mypetsc/petsc2023 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --with-shared-libraries=0 --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/lib/release/impi.lib --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly
but it returns an error:
*********************************************************************************************
OSError while running ./configure
---------------------------------------------------------------------------------------------
Cannot run executables created with FC. 
If this machine uses a batch system to submit jobs you will need to configure using ./configure with the additional option --with-batch. Otherwise there is problem with the compilers. Can you compile and run code with your compiler '/cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win32fe ifort'? See https://petsc.org/release/faq/#error-libimf ********************************************************************************************* Then I try to open configure.log in petsc, but there turnout an error that I can't open it.(see picture 1) And then I right click on properties and click safety,it just turnout "The permissions on test directory are incorrectly ordered.which may cause some entries to be ineffective." (see picture 2) And it also likely seen ?NULL SID? as the top entry in permission lists.(see picture 3) Then i follow this blog(https://blog.dhampir.no/content/forcing-cygwin-to-create-sane-permissions-on-windows) to edit /etc/fstab in Cygwin, and add ?noacl? to the mount options for /cygdrive. But it's not working. So I can't sent configure.log to you guys, it seems cygwin that installed in my computer happened to some problem. Mayebe the error happened in the configure on petsc just because of this reason. So I wrrit this email to report my problem and ask for your help. Looking forward your reply! sinserely, Cheng. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramoni.zsedano at gmail.com Wed Nov 8 08:53:18 2023 From: ramoni.zsedano at gmail.com (Ramoni Z. Sedano Azevedo) Date: Wed, 8 Nov 2023 11:53:18 -0300 Subject: [petsc-users] Better solver and preconditioner to use multiple GPU Message-ID: Hey! I am using PETSC in Fortran code and we apply the MPI process to parallelize the code. At the moment, the options that have been used are -ksp_monitor_true_residual -ksp_type bcgs -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 3 -sub_pc_factor_fill 6 Now, we want to use multiple GPUs and I would like to know if there is a better solver and preconditioner pair to apply in this case. Yours sincerely, Ramoni Z. S . Azevedo -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 8 09:42:30 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 8 Nov 2023 10:42:30 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn> References: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: Send the file $PETSC_ARCH/lib/petsc/conf/configure.log > On Nov 8, 2023, at 12:20?AM, ?? wrote: > > Hello, > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > 1. PETSc: version 3.19.2 > 2. VS: version 2022 > 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit > 4. Cygwin: see the picture attatched (see picture cygwin) > > > And the compiler option in configuration is: > ./configure --prefix=/cygdrive/g/mypetsc/petsc2023 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --with-shared-libraries=0 > --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include > --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/libTherefore, I write this e-mail to look for your help. 
> /release/impi.lib > --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly > > > > > but there return an error: > ********************************************************************************************* > OSError while running ./configure > --------------------------------------------------------------------------------------------- > Cannot run executables created with FC. If this machine uses a batch system > to submit jobs you will need to configure using ./configure with the additional option > --with-batch. > Otherwise there is problem with the compilers. Can you compile and run code with your > compiler '/cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win32fe ifort'? > See https://petsc.org/release/faq/#error-libimf > ********************************************************************************************* > > > > Then I try to open configure.log in petsc, but there turnout an error that I can't open it.(see picture 1) > > And then I right click on properties and click safety,it just turnout "The permissions on test directory are incorrectly ordered.which may cause some entries to be ineffective." (see picture 2) > > And it also likely seen ?NULL SID? as the top entry in permission lists.(see picture 3) > > Then i follow this blog(https://blog.dhampir.no/content/forcing-cygwin-to-create-sane-permissions-on-windows) to edit /etc/fstab in Cygwin, and add ?noacl? to the mount options for /cygdrive. > > But it's not working. > > So I can't sent configure.log to you guys, it seems cygwin that installed in my computer happened to some problem. > > Mayebe the error happened in the configure on petsc just because of this reason. > > > > So I wrrit this email to report my problem and ask for your help. > > > Looking forward your reply! > > > sinserely, > Cheng. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 8 09:46:32 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 8 Nov 2023 10:46:32 -0500 Subject: [petsc-users] Better solver and preconditioner to use multiple GPU In-Reply-To: References: Message-ID: <8546E9E1-2B55-494F-91B1-2ED96C626822@petsc.dev> Unfortunately, ILU(3) is not something that runs well on GPUs so ideally, we should find a preconditioner that works well in terms of iteration count but also runs well on GPUs. You can start by saying a bit about the nature of your problem. Is it a PDE? What type of discretization? Barry > On Nov 8, 2023, at 9:53?AM, Ramoni Z. Sedano Azevedo wrote: > > > Hey! > > I am using PETSC in Fortran code and we apply the MPI process to parallelize the code. > > At the moment, the options that have been used are > -ksp_monitor_true_residual > -ksp_type bcgs > -pc_type bjacobi > -sub_pc_type ilu > -sub_pc_factor_levels 3 > -sub_pc_factor_fill 6 > > Now, we want to use multiple GPUs and I would like to know if there is a better solver and preconditioner pair to apply in this case. > > Yours sincerely, > Ramoni Z. S . Azevedo > > From jed at jedbrown.org Wed Nov 8 10:22:12 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 08 Nov 2023 09:22:12 -0700 Subject: [petsc-users] Better solver and preconditioner to use multiple GPU In-Reply-To: References: Message-ID: <87h6lwp64r.fsf@jedbrown.org> What sort of problem are you solving? Algebraic multigrid like gamg or hypre are good choices for elliptic problems. Sparse triangular solves have horrific efficiency even on one GPU so you generally want to do your best to stay away from them. "Ramoni Z. 
Sedano Azevedo" writes: > Hey! > > I am using PETSC in Fortran code and we apply the MPI process to > parallelize the code. > > At the moment, the options that have been used are > -ksp_monitor_true_residual > -ksp_type bcgs > -pc_type bjacobi > -sub_pc_type ilu > -sub_pc_factor_levels 3 > -sub_pc_factor_fill 6 > > Now, we want to use multiple GPUs and I would like to know if there is a > better solver and preconditioner pair to apply in this case. > > Yours sincerely, > Ramoni Z. S . Azevedo From zs1996 at sjtu.edu.cn Wed Nov 8 10:46:26 2023 From: zs1996 at sjtu.edu.cn (=?gb2312?B?1cXKpA==?=) Date: Thu, 9 Nov 2023 00:46:26 +0800 (CST) Subject: [petsc-users] error in configuring PETSc Message-ID: <222159542.4317947.1699461986571.JavaMail.zimbra@sjtu.edu.cn> Dear PETSc developer, I use the following commands to configure petsc, but errors occur: ./configure --with-cc=gcc-11 --with-cxx=g++-11 --with-fc=gfortran-11 --download-fftw --download-openmpi --download-fblaslapack --free-line-length-0 -g -fallow-argument-mismatch --enable-shared --enable-parallel --enable-fortran --with-zlibs=yes --with-szlib=no --with-cxx-dialect=C++11 --with-c2html=0 --with-x=0 --download-hdf5-fortran-bindings=1 --download-hdf5 I tried many times but cannot fix it. So I ask help for you. Thanks in advance. Best regards, Sheng Zhang Ph.D School of Materials Science and Engineering Shanghai Jiao Tong University 800 Dongchuan Road Shanghai, 200240 China -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot from 2023-11-09 00-44-32.png Type: image/png Size: 33039 bytes Desc: not available URL: From balay at mcs.anl.gov Wed Nov 8 10:52:54 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 8 Nov 2023 10:52:54 -0600 (CST) Subject: [petsc-users] error in configuring PETSc In-Reply-To: <222159542.4317947.1699461986571.JavaMail.zimbra@sjtu.edu.cn> References: <222159542.4317947.1699461986571.JavaMail.zimbra@sjtu.edu.cn> Message-ID: Suggest attaching text logs (copy/paste) - instead of screenshots. Try: ./configure --with-cc=gcc-11 --with-cxx=g++-11 --with-fc=gfortran-11 --download-fftw --download-openmpi --download-fblaslapack --with-zlibs=yes --with-szlib=no --with-c2html=0 --with-x=0 --download-hdf5-fortran-bindings=1 --download-hdf5 --download-sowing-cxx=g++-11 If you still have issues - send configure.log for this failure Satish On Thu, 9 Nov 2023, ?? wrote: > Dear PETSc developer, > > I use the following commands to configure petsc, but errors occur: > ./configure --with-cc=gcc-11 --with-cxx=g++-11 --with-fc=gfortran-11 --download-fftw --download-openmpi --download-fblaslapack --free-line-length-0 -g -fallow-argument-mismatch --enable-shared --enable-parallel --enable-fortran --with-zlibs=yes --with-szlib=no --with-cxx-dialect=C++11 --with-c2html=0 --with-x=0 --download-hdf5-fortran-bindings=1 --download-hdf5 > > > I tried many times but cannot fix it. So I ask help for you. Thanks in advance. > Best regards, > Sheng Zhang > > Ph.D > School of Materials Science and Engineering > Shanghai Jiao Tong University > 800 Dongchuan Road > Shanghai, 200240 China > From s.roongta at mpie.de Wed Nov 8 12:13:16 2023 From: s.roongta at mpie.de (Sharan Roongta) Date: Wed, 8 Nov 2023 19:13:16 +0100 Subject: [petsc-users] DMPlex and Gmsh Message-ID: <0a3f9f9c-737e-4d31-aa8a-c700fc9a2fcd@mpie.de> Dear Petsc team, I want to load a .msh file generated using Gmsh software into the DMPlex object. There are several things I would want to clarify, but I would like to start with "Physical tags". 
If I have defined "Physical Points", "Physical Surface", and "Physical Volume" in my .geo file, I get the physical tags in the ".msh" file. When I load this mesh in DMPlex, and view the DM: call DMView(globalMesh, PETSC_VIEWER_STDOUT_WORLD,err_PETSc) ? CHKERRQ(err_PETSc) This is the output I get: DM Object: n/a 1 MPI process ? type: plex n/a in 3 dimensions: ? Number of 0-cells per rank: 14 ? Number of 1-cells per rank: 49 ? Number of 2-cells per rank: 60 ? Number of 3-cells per rank: 24 Labels: ? celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) ? depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) ? Cell Sets: 1 strata with value/size (8 (24)) ? Face Sets: 6 strata with value/size (2 (4), 3 (4), 4 (4), 5 (4), 6 (4), 7 (4)) I was expecting to get the "Node Sets" or "Vertex Sets" also. Is my assumption wrong? If yes, then how can one figure out the boundary nodes and their tags where I want to apply certain boundary conditions? Currently we apply boundary conditions on faces, therefore "Face Sets" was enough. But now we want to apply displacements on certain boundary nodes. I have also attached the .geo and .msh file (hope you can open it) The Petsc version I am using is 3.18.6. Thanks and Regards, Sharan Roongta ----------------------------------------------- Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 ------------------------------------------------- Please consider that invitations and e-mails of our institute are only valid if they end with ... at mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ... at mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube_check.msh Type: application/octet-stream Size: 2082 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: cube_check.geo Type: application/octet-stream Size: 1175 bytes Desc: not available URL: From bldenton at buffalo.edu Wed Nov 8 13:40:29 2023 From: bldenton at buffalo.edu (Brandon Denton) Date: Wed, 8 Nov 2023 19:40:29 +0000 Subject: [petsc-users] Storing Values using a Triplet for using later Message-ID: Good Afternoon, Is there a structure within PETSc that allows storage of a value using a triple similar to PetscHMapIJSet with the key using a struct{PetscScalar i, j, k;}? I'm trying to access mesh information (the shape function coefficients I will calculate prior to their use) who's values I want to store in the auxiliary array available in the Residual Functions of PETSc's FEM infrastructure. After some trial and error work, I've come to the realization that the coordinates (x[]) available in the auxiliary functions is the centroid of the cell/element currently being evaluated. This triplet is unique for each cell/element for a valid mesh so I think it's reasonable to use this triplet as a key for looking up stored values unique to each cell/element. My plan is to attached the map to the Application Context, also available to Auxiliary Functions, to enable these calculations. Does such a map infrastructure exist within PETSc? If so, could you point me to a reference for it? If not, does anyone have any suggestions on how to solve this problem? Thank you in advance for your time. Brandon Denton -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Nov 8 13:47:25 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 08 Nov 2023 12:47:25 -0700 Subject: [petsc-users] Storing Values using a Triplet for using later In-Reply-To: References: Message-ID: <87a5roowmq.fsf@jedbrown.org> I don't think you want to hash floating point values, but I've had a number of reasons to want spatial hashing for near-neighbor queries in PETSc and that would be a great contribution. (Spatial hashes have a length scale and compute integer bins.) Brandon Denton via petsc-users writes: > Good Afternoon, > > Is there a structure within PETSc that allows storage of a value using a triple similar to PetscHMapIJSet with the key using a struct{PetscScalar i, j, k;}? > > I'm trying to access mesh information (the shape function coefficients I will calculate prior to their use) who's values I want to store in the auxiliary array available in the Residual Functions of PETSc's FEM infrastructure. After some trial and error work, I've come to the realization that the coordinates (x[]) available in the auxiliary functions is the centroid of the cell/element currently being evaluated. This triplet is unique for each cell/element for a valid mesh so I think it's reasonable to use this triplet as a key for looking up stored values unique to each cell/element. My plan is to attached the map to the Application Context, also available to Auxiliary Functions, to enable these calculations. > > Does such a map infrastructure exist within PETSc? If so, could you point me to a reference for it? If not, does anyone have any suggestions on how to solve this problem? > > Thank you in advance for your time. 
> Brandon Denton From bsmith at petsc.dev Wed Nov 8 14:20:54 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 8 Nov 2023 15:20:54 -0500 Subject: [petsc-users] Storing Values using a Triplet for using later In-Reply-To: <87a5roowmq.fsf@jedbrown.org> References: <87a5roowmq.fsf@jedbrown.org> Message-ID: <601471C7-6DB5-4D88-A64F-4FD5766A49C5@petsc.dev> PETSc hashing is done by leveraging the great package khash. Take a look at include/petsc/private/hashtable.h and hashijkey.h for how we use kash to automatically generate the code for a particular key. Just make your tolerance smaller than the smallest diameters of your cells. Barry > On Nov 8, 2023, at 2:47?PM, Jed Brown wrote: > > I don't think you want to hash floating point values, but I've had a number of reasons to want spatial hashing for near-neighbor queries in PETSc and that would be a great contribution. (Spatial hashes have a length scale and compute integer bins.) > > Brandon Denton via petsc-users writes: > >> Good Afternoon, >> >> Is there a structure within PETSc that allows storage of a value using a triple similar to PetscHMapIJSet with the key using a struct{PetscScalar i, j, k;}? >> >> I'm trying to access mesh information (the shape function coefficients I will calculate prior to their use) who's values I want to store in the auxiliary array available in the Residual Functions of PETSc's FEM infrastructure. After some trial and error work, I've come to the realization that the coordinates (x[]) available in the auxiliary functions is the centroid of the cell/element currently being evaluated. This triplet is unique for each cell/element for a valid mesh so I think it's reasonable to use this triplet as a key for looking up stored values unique to each cell/element. My plan is to attached the map to the Application Context, also available to Auxiliary Functions, to enable these calculations. >> >> Does such a map infrastructure exist within PETSc? If so, could you point me to a reference for it? If not, does anyone have any suggestions on how to solve this problem? >> >> Thank you in advance for your time. >> Brandon Denton From bourdin at mcmaster.ca Wed Nov 8 14:32:26 2023 From: bourdin at mcmaster.ca (Blaise Bourdin) Date: Wed, 8 Nov 2023 20:32:26 +0000 Subject: [petsc-users] DMPlex and Gmsh In-Reply-To: <0a3f9f9c-737e-4d31-aa8a-c700fc9a2fcd@mpie.de> References: <0a3f9f9c-737e-4d31-aa8a-c700fc9a2fcd@mpie.de> Message-ID: <122F8693-7176-44FC-A1BF-1D73FAE1769C@mcmaster.ca> An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 8 15:18:03 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 8 Nov 2023 16:18:03 -0500 Subject: [petsc-users] Storing Values using a Triplet for using later In-Reply-To: References: Message-ID: On Wed, Nov 8, 2023 at 2:40?PM Brandon Denton via petsc-users < petsc-users at mcs.anl.gov> wrote: > Good Afternoon, > > Is there a structure within PETSc that allows storage of a value using a > triple similar to PetscHMapIJSet with the key using a struct{PetscScalar i, > j, k;}? > > I'm trying to access mesh information (the shape function coefficients I > will calculate prior to their use) who's values I want to store in the > auxiliary array available in the Residual Functions of PETSc's FEM > infrastructure. After some trial and error work, I've come to the > realization that the coordinates (x[]) available in the auxiliary functions > is the centroid of the cell/element currently being evaluated. 
This triplet > is unique for each cell/element for a valid mesh so I think it's reasonable > to use this triplet as a key for looking up stored values unique to each > cell/element. My plan is to attached the map to the Application Context, > also available to Auxiliary Functions, to enable these calculations. > > Does such a map infrastructure exist within PETSc? If so, could you point > me to a reference for it? If not, does anyone have any suggestions on how > to solve this problem? > As Jed says, this is a spatial hash. I have a primitive spatial hash now. You can use DMLocatePoints() to find the cell containing a point (like the centroid). Let me know if this does not work or if I misunderstand the problem. Thanks! Matt > Thank you in advance for your time. > Brandon Denton > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 8 16:08:56 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 8 Nov 2023 17:08:56 -0500 Subject: [petsc-users] DMPlex and Gmsh In-Reply-To: <122F8693-7176-44FC-A1BF-1D73FAE1769C@mcmaster.ca> References: <0a3f9f9c-737e-4d31-aa8a-c700fc9a2fcd@mpie.de> <122F8693-7176-44FC-A1BF-1D73FAE1769C@mcmaster.ca> Message-ID: On Wed, Nov 8, 2023 at 4:50?PM Blaise Bourdin wrote: > Hi, > > I think that you need to use the magical keyword ? > -dm_plex_gmsh_mark_vertices? for that > I try to describe the options here: https://petsc.org/main/manualpages/DMPlex/DMPlexCreateGmsh/ Thanks, Matt > Blaise > > On Nov 8, 2023, at 1:13?PM, Sharan Roongta wrote: > > Caution: External email. > > Dear Petsc team, > > I want to load a .msh file generated using Gmsh software into the DMPlex > object. There are several things I would want to clarify, but I would like > to start with ?Physical tags?. > > If I have defined ?Physical Points?, ?Physical Surface?, and ?Physical > Volume? in my .geo file, I get the physical tags in the ?.msh? file. > When I load this mesh in DMPlex, and view the DM: > > call DMView(globalMesh, PETSC_VIEWER_STDOUT_WORLD,err_PETSc) > CHKERRQ(err_PETSc) > > This is the output I get: > > DM Object: n/a 1 MPI process > type: plex > n/a in 3 dimensions: > Number of 0-cells per rank: 14 > Number of 1-cells per rank: 49 > Number of 2-cells per rank: 60 > Number of 3-cells per rank: 24 > Labels: > celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) > depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) > Cell Sets: 1 strata with value/size (8 (24)) > Face Sets: 6 strata with value/size (2 (4), 3 (4), 4 (4), 5 (4), 6 (4), > 7 (4)) > > I was expecting to get the ?Node Sets? or ?Vertex Sets? also. Is my > assumption wrong? > > If yes, then how can one figure out the boundary nodes and their tags > where I want to apply certain boundary conditions? > Currently we apply boundary conditions on faces, therefore ?Face Sets? was > enough. But now we want to apply displacements on certain boundary nodes. > > I have also attached the .geo and .msh file (hope you can open it) > The Petsc version I am using is 3.18.6. > > > Thanks and Regards, > Sharan Roongta > > ----------------------------------------------- > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. 
Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > ------------------------------------------------- > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > > > > ------------------------------ > ------------------------------------------------- > Stay up to date and follow us on LinkedIn, Twitter and YouTube. > > Max-Planck-Institut f?r Eisenforschung GmbH > Max-Planck-Stra?e 1 > D-40237 D?sseldorf > > Handelsregister B 2533 > Amtsgericht D?sseldorf > > Gesch?ftsf?hrung > Prof. Dr. Gerhard Dehm > Prof. Dr. J?rg Neugebauer > Prof. Dr. Dierk Raabe > Dr. Kai de Weldige > > Ust.-Id.-Nr.: DE 11 93 58 514 > Steuernummer: 105 5891 1000 > > > Please consider that invitations and e-mails of our institute are > only valid if they end with ?@mpie.de. > If you are not sure of the validity please contact rco at mpie.de > > Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails > aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. > In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de > ------------------------------------------------- > > > > ? > Canada Research Chair in Mathematical and Computational Aspects of Solid > Mechanics (Tier 1) > Professor, Department of Mathematics & Statistics > Hamilton Hall room 409A, McMaster University > 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada > https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zs1996 at sjtu.edu.cn Wed Nov 8 18:28:16 2023 From: zs1996 at sjtu.edu.cn (=?gb2312?B?1cXKpA==?=) Date: Thu, 9 Nov 2023 08:28:16 +0800 (CST) Subject: [petsc-users] error in configuring PETSc In-Reply-To: References: <222159542.4317947.1699461986571.JavaMail.zimbra@sjtu.edu.cn> Message-ID: <911854850.4347434.1699489696737.JavaMail.zimbra@sjtu.edu.cn> Thank you for your suggestion. However, similar error still occurs. The configure.log has been attached. ----- ???? ----- ???: "Satish Balay" ???: "zs1996" ??: "petsc-users" ????: ???, 2023? 11 ? 09? ?? 12:52:54 ??: Re: [petsc-users] error in configuring PETSc Suggest attaching text logs (copy/paste) - instead of screenshots. Try: ./configure --with-cc=gcc-11 --with-cxx=g++-11 --with-fc=gfortran-11 --download-fftw --download-openmpi --download-fblaslapack --with-zlibs=yes --with-szlib=no --with-c2html=0 --with-x=0 --download-hdf5-fortran-bindings=1 --download-hdf5 --download-sowing-cxx=g++-11 If you still have issues - send configure.log for this failure Satish On Thu, 9 Nov 2023, ?? 
wrote: > Dear PETSc developer, > > I use the following commands to configure petsc, but errors occur: > ./configure --with-cc=gcc-11 --with-cxx=g++-11 --with-fc=gfortran-11 --download-fftw --download-openmpi --download-fblaslapack --free-line-length-0 -g -fallow-argument-mismatch --enable-shared --enable-parallel --enable-fortran --with-zlibs=yes --with-szlib=no --with-cxx-dialect=C++11 --with-c2html=0 --with-x=0 --download-hdf5-fortran-bindings=1 --download-hdf5 > > > I tried many times but cannot fix it. So I ask help for you. Thanks in advance. > Best regards, > Sheng Zhang > > Ph.D > School of Materials Science and Engineering > Shanghai Jiao Tong University > 800 Dongchuan Road > Shanghai, 200240 China > -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 3278657 bytes Desc: not available URL: From bsmith at petsc.dev Wed Nov 8 19:38:05 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 8 Nov 2023 20:38:05 -0500 Subject: [petsc-users] error in configuring PETSc In-Reply-To: <911854850.4347434.1699489696737.JavaMail.zimbra@sjtu.edu.cn> References: <222159542.4317947.1699461986571.JavaMail.zimbra@sjtu.edu.cn> <911854850.4347434.1699489696737.JavaMail.zimbra@sjtu.edu.cn> Message-ID: You need --download-sowing-cc=gcc-11 --download-sowing-cxx=g++-11 > On Nov 8, 2023, at 7:28?PM, ?? wrote: > > Thank you for your suggestion. However, similar error still occurs. The configure.log has been attached. > > ----- ???? ----- > ???: "Satish Balay" > ???: "zs1996" > ??: "petsc-users" > ????: ???, 2023? 11 ? 09? ?? 12:52:54 > ??: Re: [petsc-users] error in configuring PETSc > > Suggest attaching text logs (copy/paste) - instead of screenshots. > > Try: > > ./configure --with-cc=gcc-11 --with-cxx=g++-11 --with-fc=gfortran-11 --download-fftw --download-openmpi --download-fblaslapack --with-zlibs=yes --with-szlib=no --with-c2html=0 --with-x=0 --download-hdf5-fortran-bindings=1 --download-hdf5 --download-sowing-cxx=g++-11 > > If you still have issues - send configure.log for this failure > > Satish > > On Thu, 9 Nov 2023, ?? wrote: > >> Dear PETSc developer, >> >> I use the following commands to configure petsc, but errors occur: >> ./configure --with-cc=gcc-11 --with-cxx=g++-11 --with-fc=gfortran-11 --download-fftw --download-openmpi --download-fblaslapack --free-line-length-0 -g -fallow-argument-mismatch --enable-shared --enable-parallel --enable-fortran --with-zlibs=yes --with-szlib=no --with-cxx-dialect=C++11 --with-c2html=0 --with-x=0 --download-hdf5-fortran-bindings=1 --download-hdf5 >> >> >> I tried many times but cannot fix it. So I ask help for you. Thanks in advance. >> Best regards, >> Sheng Zhang >> >> Ph.D >> School of Materials Science and Engineering >> Shanghai Jiao Tong University >> 800 Dongchuan Road >> Shanghai, 200240 China > From s.roongta at mpie.de Thu Nov 9 03:45:34 2023 From: s.roongta at mpie.de (Sharan Roongta) Date: Thu, 9 Nov 2023 10:45:34 +0100 Subject: [petsc-users] [BULK] Re: DMPlex and Gmsh In-Reply-To: References: <0a3f9f9c-737e-4d31-aa8a-c700fc9a2fcd@mpie.de> <122F8693-7176-44FC-A1BF-1D73FAE1769C@mcmaster.ca> Message-ID: <43ff718c-3694-4fbf-a4df-74ffb26e9e99@mpie.de> Hello, Thank you for the response. I shall try it out. 
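For the record, the check I plan to run is something along these lines (the executable name is just a placeholder for our solver):
./our_solver -dm_plex_filename cube_check.msh -dm_plex_gmsh_mark_vertices -dm_view
and, if I read the manual page correctly, the DMView output should then also list a "Vertex Sets" label carrying the physical point tags, which we can query with DMGetLabel/DMGetStratumIS to pick out the boundary nodes.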
Regards, Sharan From: Matthew Knepley [mailto:knepley at gmail.com] Sent: Wednesday, 8 November 2023 23:09 To: Blaise Bourdin Cc: Sharan Roongta ; petsc-users at mcs.anl.gov Subject: [BULK] Re: [petsc-users] DMPlex and Gmsh Importance: Low On Wed, Nov 8, 2023 at 4:50?PM Blaise Bourdin wrote: Hi, I think that you need to use the magical keyword ?-dm_plex_gmsh_mark_vertices? for that I try to describe the options here: https://petsc.org/main/manualpages/DMPlex/DMPlexCreateGmsh/ Thanks, Matt Blaise On Nov 8, 2023, at 1:13?PM, Sharan Roongta wrote: Caution: External email. Dear Petsc team, I want to load a .msh file generated using Gmsh software into the DMPlex object. There are several things I would want to clarify, but I would like to start with ?Physical tags?. If I have defined ?Physical Points?, ?Physical Surface?, and ?Physical Volume? in my .geo file, I get the physical tags in the ?.msh? file. When I load this mesh in DMPlex, and view the DM: call DMView(globalMesh, PETSC_VIEWER_STDOUT_WORLD,err_PETSc) CHKERRQ(err_PETSc) This is the output I get: DM Object: n/a 1 MPI process type: plex n/a in 3 dimensions: Number of 0-cells per rank: 14 Number of 1-cells per rank: 49 Number of 2-cells per rank: 60 Number of 3-cells per rank: 24 Labels: celltype: 4 strata with value/size (0 (14), 6 (24), 3 (60), 1 (49)) depth: 4 strata with value/size (0 (14), 1 (49), 2 (60), 3 (24)) Cell Sets: 1 strata with value/size (8 (24)) Face Sets: 6 strata with value/size (2 (4), 3 (4), 4 (4), 5 (4), 6 (4), 7 (4)) I was expecting to get the ?Node Sets? or ?Vertex Sets? also. Is my assumption wrong? If yes, then how can one figure out the boundary nodes and their tags where I want to apply certain boundary conditions? Currently we apply boundary conditions on faces, therefore ?Face Sets? was enough. But now we want to apply displacements on certain boundary nodes. I have also attached the .geo and .msh file (hope you can open it) The Petsc version I am using is 3.18.6. Thanks and Regards, Sharan Roongta ----------------------------------------------- Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 ------------------------------------------------- Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. 
In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- ? Canada Research Chair in Mathematical and Computational Aspects of Solid Mechanics (Tier 1) Professor, Department of Mathematics & Statistics Hamilton Hall room 409A, McMaster University 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada https://www.math.mcmaster.ca/bourdin | +1 (905) 525 9140 ext. 27243 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From pmoschopoulos at outlook.com Thu Nov 9 05:57:49 2023 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Thu, 9 Nov 2023 11:57:49 +0000 Subject: [petsc-users] PETSC breaks when using HYPRE preconditioner Message-ID: Hello everyone, I am trying to use Petsc coupled with Hypre BoomerAMG as preconditioner in our in-house code to simulate the transient motion of complex fluid with finite elements. The problem is that after a random number of iterations, an error arises when the Hypre is called. The error that I get in the terminal is the following: [16]PETSC ERROR: ------------------------------------------------------------------------ [16]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [16]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [16]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/ [16]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [16]PETSC ERROR: The line numbers in the error traceback are not always exact. 
[16]PETSC ERROR: #1 Hypre solve
[16]PETSC ERROR: #2 PCApply_HYPRE() at /home/pmosx/Libraries/petsc/src/ksp/pc/impls/hypre/hypre.c:451
[16]PETSC ERROR: #3 PCApply() at /home/pmosx/Libraries/petsc/src/ksp/pc/interface/precon.c:486
[16]PETSC ERROR: #4 PCApplyBAorAB() at /home/pmosx/Libraries/petsc/src/ksp/pc/interface/precon.c:756
[16]PETSC ERROR: #5 KSP_PCApplyBAorAB() at /home/pmosx/Libraries/petsc/include/petsc/private/kspimpl.h:443
[16]PETSC ERROR: #6 KSPGMRESCycle() at /home/pmosx/Libraries/petsc/src/ksp/ksp/impls/gmres/gmres.c:146
[16]PETSC ERROR: #7 KSPSolve_GMRES() at /home/pmosx/Libraries/petsc/src/ksp/ksp/impls/gmres/gmres.c:227
[16]PETSC ERROR: #8 KSPSolve_Private() at /home/pmosx/Libraries/petsc/src/ksp/ksp/interface/itfunc.c:910
[16]PETSC ERROR: #9 KSPSolve() at /home/pmosx/Libraries/petsc/src/ksp/ksp/interface/itfunc.c:1082

At the same time, I use valgrind, and when the program stops it reports the following:
==1261647== Invalid read of size 8
==1261647== at 0x4841C74: _intel_fast_memcpy (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1261647== by 0x16231F73: hypre_GaussElimSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so)
==1261647== by 0x1622DB4F: hypre_BoomerAMGCycle (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so)
==1261647== by 0x1620002E: hypre_BoomerAMGSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so)
==1261647== by 0x12B6F8F8: PCApply_HYPRE (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== by 0x12C38785: PCApply (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== by 0x12C36A39: PCApplyBAorAB (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== by 0x126299E1: KSPGMRESCycle (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== by 0x12628051: KSPSolve_GMRES (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== by 0x127A532E: KSPSolve_Private (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== by 0x127A3C8A: KSPSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== by 0x12C50AF1: kspsolve_ (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1)
==1261647== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==1261647==

This is indeed a very peculiar error. I cannot understand why it happens. In our solution procedure, we split the equations and solve them in a segregated manner. I create two different KSPs (1_ksp and 2_ksp) using KSPSetOptionsPrefix. Might this choice create confusion and result in this error?

Any help is much appreciated.

Pantelis
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From bldenton at buffalo.edu Thu Nov 9 07:08:23 2023
From: bldenton at buffalo.edu (Brandon Denton)
Date: Thu, 9 Nov 2023 13:08:23 +0000
Subject: [petsc-users] Storing Values using a Triplet for using later
In-Reply-To: 
References: 
Message-ID: 

Good Morning,

Thank you Matt, Jed, and Barry. I will look into each of these suggestions and report back.
-Brandon ________________________________ From: Matthew Knepley Sent: Wednesday, November 8, 2023 4:18 PM To: Brandon Denton Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Storing Values using a Triplet for using later On Wed, Nov 8, 2023 at 2:40?PM Brandon Denton via petsc-users > wrote: Good Afternoon, Is there a structure within PETSc that allows storage of a value using a triple similar to PetscHMapIJSet with the key using a struct{PetscScalar i, j, k;}? I'm trying to access mesh information (the shape function coefficients I will calculate prior to their use) who's values I want to store in the auxiliary array available in the Residual Functions of PETSc's FEM infrastructure. After some trial and error work, I've come to the realization that the coordinates (x[]) available in the auxiliary functions is the centroid of the cell/element currently being evaluated. This triplet is unique for each cell/element for a valid mesh so I think it's reasonable to use this triplet as a key for looking up stored values unique to each cell/element. My plan is to attached the map to the Application Context, also available to Auxiliary Functions, to enable these calculations. Does such a map infrastructure exist within PETSc? If so, could you point me to a reference for it? If not, does anyone have any suggestions on how to solve this problem? As Jed says, this is a spatial hash. I have a primitive spatial hash now. You can use DMLocatePoints() to find the cell containing a point (like the centroid). Let me know if this does not work or if I misunderstand the problem. Thanks! Matt Thank you in advance for your time. Brandon Denton -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 9 09:53:13 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 9 Nov 2023 10:53:13 -0500 Subject: [petsc-users] PETSC breaks when using HYPRE preconditioner In-Reply-To: References: Message-ID: <0BF74B89-F1B7-49DD-BB79-C3D04D1E7A53@petsc.dev> Pantelis If you can set an X Windows DISPLAY variable that works you can run with -on_error_attach_debugger and gdb should pop up in an Xterm on MPI rank 16 showing the code where it is crashing (based on Valgrind Address 0x0 is not stack'd, malloc'd or (recently) free'd there will be pointer of 0 that should not be). Or if the computer system has some parallel debugger you can use that directly. For lldb use -on_error_attach_debugger lldb If you have some compiler optimizations set when you ./configure PETSc you might try making another PETSC_ARCH without optimizations (this is PETSc's default when you do not use --with-debugging=0). Does it still crash with no optimizations? Perhaps also try with a different compiler? Barry > On Nov 9, 2023, at 6:57?AM, Pantelis Moschopoulos wrote: > > Hello everyone, > > I am trying to use Petsc coupled with Hypre BoomerAMG as preconditioner in our in-house code to simulate the transient motion of complex fluid with finite elements. The problem is that after a random number of iterations, an error arises when the Hypre is called. 
> > The error that I get in the terminal is the following: > [16]PETSC ERROR: ------------------------------------------------------------------------ > [16]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [16]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [16]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/ > [16]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [16]PETSC ERROR: The line numbers in the error traceback are not always exact. > [16]PETSC ERROR: #1 Hypre solve > [16]PETSC ERROR: #2 PCApply_HYPRE() at /home/pmosx/Libraries/petsc/src/ksp/pc/impls/hypre/hypre.c:451 > [16]PETSC ERROR: #3 PCApply() at /home/pmosx/Libraries/petsc/src/ksp/pc/interface/precon.c:486 > [16]PETSC ERROR: #4 PCApplyBAorAB() at /home/pmosx/Libraries/petsc/src/ksp/pc/interface/precon.c:756 > [16]PETSC ERROR: #5 KSP_PCApplyBAorAB() at /home/pmosx/Libraries/petsc/include/petsc/private/kspimpl.h:443 > [16]PETSC ERROR: #6 KSPGMRESCycle() at /home/pmosx/Libraries/petsc/src/ksp/ksp/impls/gmres/gmres.c:146 > [16]PETSC ERROR: #7 KSPSolve_GMRES() at /home/pmosx/Libraries/petsc/src/ksp/ksp/impls/gmres/gmres.c:227 > [16]PETSC ERROR: #8 KSPSolve_Private() at /home/pmosx/Libraries/petsc/src/ksp/ksp/interface/itfunc.c:910 > [16]PETSC ERROR: #9 KSPSolve() at /home/pmosx/Libraries/petsc/src/ksp/ksp/interface/itfunc.c:1082 > > On the same time, I use valgrind and when the program stops, it reports the following: > ==1261647== Invalid read of size 8 > ==1261647== at 0x4841C74: _intel_fast_memcpy (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so ) > ==1261647== by 0x16231F73: hypre_GaussElimSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so ) > ==1261647== by 0x1622DB4F: hypre_BoomerAMGCycle (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so ) > ==1261647== by 0x1620002E: hypre_BoomerAMGSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so ) > ==1261647== by 0x12B6F8F8: PCApply_HYPRE (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== by 0x12C38785: PCApply (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== by 0x12C36A39: PCApplyBAorAB (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== by 0x126299E1: KSPGMRESCycle (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== by 0x12628051: KSPSolve_GMRES (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== by 0x127A532E: KSPSolve_Private (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== by 0x127A3C8A: KSPSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== by 0x12C50AF1: kspsolve_ (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so .3.20.1) > ==1261647== Address 0x0 is not stack'd, malloc'd or (recently) free'd > ==1261647== > > This is indeed a very peculiar error. I cannot understand why it happens. In our solution procedure, we split the equations and we solve them segregated. I create two different ksp (1_ksp and 2_ksp) using KSPSetOptionsPrefix. Might this choice create a confusion and results in this error? > > Any help is much appreciated. > > Pantelis -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pmoschopoulos at outlook.com Thu Nov 9 10:06:13 2023 From: pmoschopoulos at outlook.com (Pantelis Moschopoulos) Date: Thu, 9 Nov 2023 16:06:13 +0000 Subject: [petsc-users] Re: PETSC breaks when using HYPRE preconditioner In-Reply-To: <0BF74B89-F1B7-49DD-BB79-C3D04D1E7A53@petsc.dev> References: <0BF74B89-F1B7-49DD-BB79-C3D04D1E7A53@petsc.dev> Message-ID: Barry, I configured PETSC with --with-debugging=yes. I think this is enough to block any optimizations, right? I tried both Intel and GNU compilers. The error persists. I tried to change the preconditioner and use PILUT instead of BoomerAMG from Hypre. Still, the error appears. I noticed that I do not have any problem when only one ksp is present. I am afraid that this is the culprit. I will try your suggestion and set up an X Windows DISPLAY variable and send back the results. Thanks for your time, Pantelis ________________________________ From: Barry Smith Sent: Thursday, 9 November 2023 5:53 PM To: Pantelis Moschopoulos Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] PETSC breaks when using HYPRE preconditioner Pantelis If you can set an X Windows DISPLAY variable that works you can run with -on_error_attach_debugger and gdb should pop up in an Xterm on MPI rank 16 showing the code where it is crashing (based on Valgrind Address 0x0 is not stack'd, malloc'd or (recently) free'd there will be pointer of 0 that should not be). Or if the computer system has some parallel debugger you can use that directly. For lldb use -on_error_attach_debugger lldb If you have some compiler optimizations set when you ./configure PETSc you might try making another PETSC_ARCH without optimizations (this is PETSc's default when you do not use --with-debugging=0). Does it still crash with no optimizations? Perhaps also try with a different compiler? Barry On Nov 9, 2023, at 6:57 AM, Pantelis Moschopoulos wrote: Hello everyone, I am trying to use Petsc coupled with Hypre BoomerAMG as preconditioner in our in-house code to simulate the transient motion of complex fluid with finite elements. The problem is that after a random number of iterations, an error arises when the Hypre is called. The error that I get in the terminal is the following: [16]PETSC ERROR: ------------------------------------------------------------------------ [16]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [16]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [16]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/ [16]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [16]PETSC ERROR: The line numbers in the error traceback are not always exact.
[16]PETSC ERROR: #1 Hypre solve [16]PETSC ERROR: #2 PCApply_HYPRE() at /home/pmosx/Libraries/petsc/src/ksp/pc/impls/hypre/hypre.c:451 [16]PETSC ERROR: #3 PCApply() at /home/pmosx/Libraries/petsc/src/ksp/pc/interface/precon.c:486 [16]PETSC ERROR: #4 PCApplyBAorAB() at /home/pmosx/Libraries/petsc/src/ksp/pc/interface/precon.c:756 [16]PETSC ERROR: #5 KSP_PCApplyBAorAB() at /home/pmosx/Libraries/petsc/include/petsc/private/kspimpl.h:443 [16]PETSC ERROR: #6 KSPGMRESCycle() at /home/pmosx/Libraries/petsc/src/ksp/ksp/impls/gmres/gmres.c:146 [16]PETSC ERROR: #7 KSPSolve_GMRES() at /home/pmosx/Libraries/petsc/src/ksp/ksp/impls/gmres/gmres.c:227 [16]PETSC ERROR: #8 KSPSolve_Private() at /home/pmosx/Libraries/petsc/src/ksp/ksp/interface/itfunc.c:910 [16]PETSC ERROR: #9 KSPSolve() at /home/pmosx/Libraries/petsc/src/ksp/ksp/interface/itfunc.c:1082 On the same time, I use valgrind and when the program stops, it reports the following: ==1261647== Invalid read of size 8 ==1261647== at 0x4841C74: _intel_fast_memcpy (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) ==1261647== by 0x16231F73: hypre_GaussElimSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so) ==1261647== by 0x1622DB4F: hypre_BoomerAMGCycle (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so) ==1261647== by 0x1620002E: hypre_BoomerAMGSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libHYPRE-2.29.0.so) ==1261647== by 0x12B6F8F8: PCApply_HYPRE (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== by 0x12C38785: PCApply (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== by 0x12C36A39: PCApplyBAorAB (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== by 0x126299E1: KSPGMRESCycle (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== by 0x12628051: KSPSolve_GMRES (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== by 0x127A532E: KSPSolve_Private (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== by 0x127A3C8A: KSPSolve (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== by 0x12C50AF1: kspsolve_ (in /home/pmosx/Libraries/PETSC_INS_DIR_INTELDebug/lib/libpetsc.so.3.20.1) ==1261647== Address 0x0 is not stack'd, malloc'd or (recently) free'd ==1261647== This is indeed a very peculiar error. I cannot understand why it happens. In our solution procedure, we split the equations and we solve them segregated. I create two different ksp (1_ksp and 2_ksp) using KSPSetOptionsPrefix. Might this choice create a confusion and results in this error? Any help is much appreciated. Pantelis -------------- next part -------------- An HTML attachment was scrubbed... URL: From ramoni.zsedano at gmail.com Thu Nov 9 12:54:36 2023 From: ramoni.zsedano at gmail.com (Ramoni Z. Sedano Azevedo) Date: Thu, 9 Nov 2023 15:54:36 -0300 Subject: [petsc-users] Better solver and preconditioner to use multiple GPU In-Reply-To: <87h6lwp64r.fsf@jedbrown.org> References: <87h6lwp64r.fsf@jedbrown.org> Message-ID: We are solving the Direct Problem of Controlled Source Electromagnetics (CSEM) using finite difference discretization. Em qua., 8 de nov. de 2023 ?s 13:22, Jed Brown escreveu: > What sort of problem are you solving? Algebraic multigrid like gamg or > hypre are good choices for elliptic problems. 
Sparse triangular solves have > horrific efficiency even on one GPU so you generally want to do your best > to stay away from them. > > "Ramoni Z. Sedano Azevedo" writes: > > > Hey! > > > > I am using PETSC in Fortran code and we apply the MPI process to > > parallelize the code. > > > > At the moment, the options that have been used are > > -ksp_monitor_true_residual > > -ksp_type bcgs > > -pc_type bjacobi > > -sub_pc_type ilu > > -sub_pc_factor_levels 3 > > -sub_pc_factor_fill 6 > > > > Now, we want to use multiple GPUs and I would like to know if there is a > > better solver and preconditioner pair to apply in this case. > > > > Yours sincerely, > > Ramoni Z. S . Azevedo > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Thu Nov 9 13:31:15 2023 From: rlmackie862 at gmail.com (Randall Mackie) Date: Thu, 9 Nov 2023 11:31:15 -0800 Subject: [petsc-users] Better solver and preconditioner to use multiple GPU In-Reply-To: References: <87h6lwp64r.fsf@jedbrown.org> Message-ID: <5998C67B-9708-4FEC-B7BE-E2AC573E6F7A@gmail.com> Hi Ramoni, All EM induction methods solved numerically like finite differences are difficult already because of the null-space of the curl-curl equations and then adding air layers on top of your model also introduce another singularity. These have been dealt with in the past by adding in some sort of divergence condition. Solving the curl-curl equations with a direct solution is fine, but iterative solutions are difficult. There is no easy out of the box solution to this, but you can look at using multi-grid as a PC but this requires special care, for example: https://academic.oup.com/gji/article-pdf/207/3/1554/6623047/ggw352.pdf A good way to stabilize curl curl solutions is by explicit inclusion of grad-div J: https://academic.oup.com/gji/article/216/2/906/5154929 Good luck Randy Mackie > On Nov 9, 2023, at 10:54 AM, Ramoni Z. Sedano Azevedo wrote: > > We are solving the Direct Problem of Controlled Source Electromagnetics (CSEM) using finite difference discretization. > > Em qua., 8 de nov. de 2023 ?s 13:22, Jed Brown > escreveu: > What sort of problem are you solving? Algebraic multigrid like gamg or hypre are good choices for elliptic problems. Sparse triangular solves have horrific efficiency even on one GPU so you generally want to do your best to stay away from them. > > "Ramoni Z. Sedano Azevedo" > writes: > > > Hey! > > > > I am using PETSC in Fortran code and we apply the MPI process to > > parallelize the code. > > > > At the moment, the options that have been used are > > -ksp_monitor_true_residual > > -ksp_type bcgs > > -pc_type bjacobi > > -sub_pc_type ilu > > -sub_pc_factor_levels 3 > > -sub_pc_factor_fill 6 > > > > Now, we want to use multiple GPUs and I would like to know if there is a > > better solver and preconditioner pair to apply in this case. > > > > Yours sincerely, > > Ramoni Z. S . Azevedo -------------- next part -------------- An HTML attachment was scrubbed... URL: From Donald.Planalp at colorado.edu Fri Nov 10 20:53:41 2023 From: Donald.Planalp at colorado.edu (Donald Rex Planalp) Date: Fri, 10 Nov 2023 19:53:41 -0700 Subject: [petsc-users] Petsc4py Simulation: Mat.axpy() slow Message-ID: Hello, I am trying to use petsc4py to conduct a quantum mechanics simulation. I've been able to construct all of the relevant matrices, however I am reaching a gigantic bottleneck. For the simplest problem I am running I have a few matrices each about 5000x5000. 
In order to begin time propagation I need to add these matrices together. However, on 6 cores of my local machine it is taking approximately 1-2 seconds per addition. Since I need to do this for each time step in my simulation it is prohibitively slow since there could be upwards of 10K time steps. Below is the relevant code: structure = structure=PETSc.Mat.Structure.DIFFERENT_NONZERO_PATTERN if test2: def makeLeft(S,MIX,ANG,ATOM,i): S.axpy(-Field.pulse[i],MIX,structure) S.axpy(-Field.pulse[i],ANG,structure) S.axpy(-1,ATOM,structure) return S def makeRight(S,MIX,ANG,ATOM,i): S.axpy(Field.pulse[i],MIX,structure) S.axpy(Field.pulse[i],ANG,structure) S.axpy(1,ATOM,structure) return S H_mix = Int.H_mix H_mix.scale(1j * dt /2) H_ang = Int.H_ang H_ang.scale(1j * dt /2) H_atom = Int.H_atom H_atom.scale(1j * dt /2) S = Int.S_total psi_initial = psi.psi_initial.copy() ksp = PETSc.KSP().create(PETSc.COMM_WORLD) for i,t in enumerate(box.t): print(i,L) O_L = makeLeft(S,H_mix,H_ang,H_atom,i) O_R = makeRight(S,H_mix,H_ang,H_atom,i) if i == 0: known = O_R.getVecRight() sol = O_L.getVecRight() O_R.mult(psi_initial,known) ksp.setOperators(O_L) ksp.solve(known,sol) psi_initial.copy(sol) I need to clean it up a bit, but the main point is that I need to add the matrices many times for a single time step. I can't preallocate memory very well since some of the matrices aren't the most sparse either. It seems if I cannot speed up the addition it will be difficult to continue so I was wondering if you had any insights. Thank you for your time -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Nov 11 12:22:12 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 11 Nov 2023 13:22:12 -0500 Subject: [petsc-users] Petsc4py Simulation: Mat.axpy() slow In-Reply-To: References: Message-ID: <7C56CC59-3639-43C8-AC6D-5F8452005E5E@petsc.dev> DIFFERENT_NONZERO_PATTERN will always be slow because it needs to determine the nonzero pattern of the result for each computation. SAME_NONZERO_PATTERN can obviously run fast but may not be practical for your problem. SUBSET_NONZERO_PATTERN (which means in A += b*B we know that B has NO nonzero locations that A does not have) does not need to reallocate anything in A so it can be reasonably fast (we can do more optimizations on the current code to improve it for you). So, can you construct the S nonzero structure initially to have the union of the nonzero structures of all the matrices that accumulate into it? This is the same nonzero structure that it "ends up with" in the current code. With this nonzero structure, SUBSET_NONZERO_PATTERN can be used. Depending on how sparse all the matrices that get accumulated into S are, you could perhaps go further and make sure they all have the same nonzero structure (sure, it uses more memory and will do extra operations, but the actual computation will be so fast it may be worth the extra computations. Barry > On Nov 10, 2023, at 9:53?PM, Donald Rex Planalp wrote: > > Hello, > > I am trying to use petsc4py to conduct a quantum mechanics simulation. I've been able to construct all of the relevant matrices, however I am reaching a gigantic bottleneck. > > For the simplest problem I am running I have a few matrices each about 5000x5000. In order to begin time propagation I need to add these matrices together. However, on 6 cores of my local machine it is taking approximately 1-2 seconds per addition. 
Since I need to do this for each time step in my simulation it is prohibitively slow since there could be upwards of 10K time steps. > > Below is the relevant code: > > structure = structure=PETSc.Mat.Structure.DIFFERENT_NONZERO_PATTERN > if test2: > def makeLeft(S,MIX,ANG,ATOM,i): > > > S.axpy(-Field.pulse[i],MIX,structure) > S.axpy(-Field.pulse[i],ANG,structure) > S.axpy(-1,ATOM,structure) > return S > def makeRight(S,MIX,ANG,ATOM,i): > > > > S.axpy(Field.pulse[i],MIX,structure) > S.axpy(Field.pulse[i],ANG,structure) > S.axpy(1,ATOM,structure) > > return S > > H_mix = Int.H_mix > H_mix.scale(1j * dt /2) > > H_ang = Int.H_ang > H_ang.scale(1j * dt /2) > > H_atom = Int.H_atom > H_atom.scale(1j * dt /2) > > S = Int.S_total > > psi_initial = psi.psi_initial.copy() > ksp = PETSc.KSP().create(PETSc.COMM_WORLD) > > > for i,t in enumerate(box.t): > print(i,L) > > > > O_L = makeLeft(S,H_mix,H_ang,H_atom,i) > O_R = makeRight(S,H_mix,H_ang,H_atom,i) > > > > if i == 0: > known = O_R.getVecRight() > sol = O_L.getVecRight() > > O_R.mult(psi_initial,known) > > ksp.setOperators(O_L) > > > ksp.solve(known,sol) > > > > > psi_initial.copy(sol) > > > I need to clean it up a bit, but the main point is that I need to add the matrices many times for a single time step. I can't preallocate memory very well since some of the matrices aren't the most sparse either. It seems if I cannot speed up the addition it will be difficult to continue so I was wondering if you had any insights. > > Thank you for your time -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Nov 11 15:54:15 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 11 Nov 2023 16:54:15 -0500 Subject: [petsc-users] Petsc4py Simulation: Mat.axpy() slow In-Reply-To: References: <7C56CC59-3639-43C8-AC6D-5F8452005E5E@petsc.dev> Message-ID: <28607A10-CF8B-44D6-857E-4B2DBC89CBB6@petsc.dev> Here is the code for MPIBAIJ with SAME_NONZERO_STRUCTURE. (the other formats are similar) PetscErrorCode MatAXPY_SeqBAIJ(Mat Y, PetscScalar a, Mat X, MatStructure str) { Mat_SeqBAIJ *x = (Mat_SeqBAIJ *)X->data, *y = (Mat_SeqBAIJ *)Y->data; PetscInt bs = Y->rmap->bs, bs2 = bs * bs; PetscBLASInt one = 1; .... if (str == SAME_NONZERO_PATTERN) { PetscScalar alpha = a; PetscBLASInt bnz; PetscCall(PetscBLASIntCast(x->nz * bs2, &bnz)); PetscCallBLAS("BLASaxpy", BLASaxpy_(&bnz, &alpha, x->a, &one, y->a, &one)); PetscCall(PetscObjectStateIncrease((PetscObject)Y)); It directly adds the nonzero values from the two matrices together (the nonzero structure plays no role) and it uses the BLAS so it should perform as "fast as possible" given the hardware (are you configured --with-debugging=0 ?, are you using binding with mpiexec to ensure MPI is using the best combination of cores? https://petsc.org/release/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup) Can you run with -log_view and send the output so we can see directly the performance? > On Nov 11, 2023, at 4:25?PM, Donald Rex Planalp wrote: > > Hello, and thank you for the quick reply. > > For some context, all of these matrices are produced from a kronecker product in parallel between an "angular" and a "radial" matrix. In this case S_total and H_atom use the same angular matrix which is the identity, while their > radial matrices obey different sparsity. 
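As a minimal petsc4py sketch of the suggestion above (not the poster's code): it assumes S_union, H_mix, H_ang and H_atom were all assembled on one shared union nonzero pattern, for example by inserting explicit zeros where a term has no entry, keeps S_union as an untouched template, and folds the i*dt/2 scaling into the axpy coefficients so every update is a straight value-array axpy with SAME_NONZERO_PATTERN:

from petsc4py import PETSc

SAME = PETSc.Mat.Structure.SAME_NONZERO_PATTERN

# work matrices allocated once, outside the time loop
O_L = S_union.duplicate(copy=True)
O_R = S_union.duplicate(copy=True)

def build_operators(i, dt, pulse):
    S_union.copy(O_L, structure=SAME)                   # reset from the template, no reallocation
    O_L.axpy(-1j*dt/2*pulse[i], H_mix,  structure=SAME)
    O_L.axpy(-1j*dt/2*pulse[i], H_ang,  structure=SAME)
    O_L.axpy(-1j*dt/2,          H_atom, structure=SAME)
    S_union.copy(O_R, structure=SAME)
    O_R.axpy( 1j*dt/2*pulse[i], H_mix,  structure=SAME)
    O_R.axpy( 1j*dt/2*pulse[i], H_ang,  structure=SAME)
    O_R.axpy( 1j*dt/2,          H_atom, structure=SAME)
    return O_L, O_R

Resetting from an untouched template also avoids accumulating into S itself, which in the earlier snippet carries over from one call, and one time step, to the next.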
> > Meanwhile H_mix and H_ang are constructed from two different angular matrices, however both are only nonzero along the two off diagonals, and the radial matrices have the same sparse structure. > > So S_total and H_atom have the same block sparse structure, while H_mix and H_ang have the same block sparse structure. However even just adding H_mix and H_ang for the simplest case takes around half a second which is still too slow even when using SAME_NONZERO_PATTERN . > > This confuses me the most because another researcher wrote a similar code a few years back using finite difference, so his matrices are on the order 250kx250k and larger, yet the matrix additions are far faster for him. > > If you would like I can show more code snippets but im not sure what you would want to see. > > Thank you for your time > > > > On Sat, Nov 11, 2023 at 11:22?AM Barry Smith > wrote: >> >> >> DIFFERENT_NONZERO_PATTERN will always be slow because it needs to determine the nonzero pattern of the result for each computation. >> >> SAME_NONZERO_PATTERN can obviously run fast but may not be practical for your problem. >> >> SUBSET_NONZERO_PATTERN (which means in A += b*B we know that B has NO nonzero locations that A does not have) does not need to reallocate anything in A so it can be reasonably fast (we can do more optimizations on the current code to improve it for you). >> >> So, can you construct the S nonzero structure initially to have the union of the nonzero structures of all the matrices that accumulate into it? This is the same >> nonzero structure that it "ends up with" in the current code. With this nonzero structure, SUBSET_NONZERO_PATTERN can be used. Depending on how sparse all the matrices that get accumulated into S are, you could perhaps go further and make sure they all have the same nonzero structure (sure, it uses more memory and will do extra operations, but the actual computation will be so fast it may be worth the extra computations. >> >> Barry >> >> >> >>> On Nov 10, 2023, at 9:53?PM, Donald Rex Planalp > wrote: >>> >>> Hello, >>> >>> I am trying to use petsc4py to conduct a quantum mechanics simulation. I've been able to construct all of the relevant matrices, however I am reaching a gigantic bottleneck. >>> >>> For the simplest problem I am running I have a few matrices each about 5000x5000. In order to begin time propagation I need to add these matrices together. However, on 6 cores of my local machine it is taking approximately 1-2 seconds per addition. Since I need to do this for each time step in my simulation it is prohibitively slow since there could be upwards of 10K time steps. 
>>> >>> Below is the relevant code: >>> >>> structure = structure=PETSc.Mat.Structure.DIFFERENT_NONZERO_PATTERN >>> if test2: >>> def makeLeft(S,MIX,ANG,ATOM,i): >>> >>> >>> S.axpy(-Field.pulse[i],MIX,structure) >>> S.axpy(-Field.pulse[i],ANG,structure) >>> S.axpy(-1,ATOM,structure) >>> return S >>> def makeRight(S,MIX,ANG,ATOM,i): >>> >>> >>> >>> S.axpy(Field.pulse[i],MIX,structure) >>> S.axpy(Field.pulse[i],ANG,structure) >>> S.axpy(1,ATOM,structure) >>> >>> return S >>> >>> H_mix = Int.H_mix >>> H_mix.scale(1j * dt /2) >>> >>> H_ang = Int.H_ang >>> H_ang.scale(1j * dt /2) >>> >>> H_atom = Int.H_atom >>> H_atom.scale(1j * dt /2) >>> >>> S = Int.S_total >>> >>> psi_initial = psi.psi_initial.copy() >>> ksp = PETSc.KSP().create(PETSc.COMM_WORLD) >>> >>> >>> for i,t in enumerate(box.t): >>> print(i,L) >>> >>> >>> >>> O_L = makeLeft(S,H_mix,H_ang,H_atom,i) >>> O_R = makeRight(S,H_mix,H_ang,H_atom,i) >>> >>> >>> >>> if i == 0: >>> known = O_R.getVecRight() >>> sol = O_L.getVecRight() >>> >>> O_R.mult(psi_initial,known) >>> >>> ksp.setOperators(O_L) >>> >>> >>> ksp.solve(known,sol) >>> >>> >>> >>> >>> psi_initial.copy(sol) >>> >>> >>> I need to clean it up a bit, but the main point is that I need to add the matrices many times for a single time step. I can't preallocate memory very well since some of the matrices aren't the most sparse either. It seems if I cannot speed up the addition it will be difficult to continue so I was wondering if you had any insights. >>> >>> Thank you for your time >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Nov 11 18:58:46 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 11 Nov 2023 19:58:46 -0500 Subject: [petsc-users] Petsc4py Simulation: Mat.axpy() slow In-Reply-To: References: <7C56CC59-3639-43C8-AC6D-5F8452005E5E@petsc.dev> <28607A10-CF8B-44D6-857E-4B2DBC89CBB6@petsc.dev> Message-ID: <8D99140A-7B3B-44D0-94DD-0A0CDCD2EEF2@petsc.dev> How many MPI processes did you use? Please try with just one to get a base line MatSetValues 1298 1.0 1.2313e-01 1.1 1.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 32 0 0 0 1 32 0 0 0 669 MatGetRow 1298 1.0 1.6896e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 1 1.0 1.7225e-01 1.0 1.18e+07 1.0 8.4e+01 1.3e+03 6.0e+00 1 32 5 14 0 1 32 5 14 0 478 This is concerning: there are 1298 MatSetValues and MatGetRow, are you calling them? If not that means the MatAXPY is calling them (if SAME_NONZERO_PATTERN is not used these are used in the MatAXPY). > I'm still somewhat new to parallel computing, so I'm not sure what you mean by binding. Does this lock in certain processes to certain cores? Yes, it is usually the right thing to do for numerical computing. I included a link to some discussion of it on petsc.org > On Nov 11, 2023, at 6:38?PM, Donald Rex Planalp wrote: > > Hello again, > > I've run the simulation with profiling again. In this setup I only ran the necessary methods to construct the matrices, and then at the end I added H_mix and H_ang using the Mataxpy method. 
Below are the results > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > BuildTwoSided 350 1.0 6.9605e-01 1.9 0.00e+00 0.0 3.9e+02 5.6e+00 3.5e+02 3 0 24 0 22 3 0 24 0 22 0 > BuildTwoSidedF 236 1.0 6.9569e-01 1.9 0.00e+00 0.0 2.3e+02 1.1e+01 2.4e+02 3 0 14 0 15 3 0 14 0 15 0 > SFSetGraph 114 1.0 2.0555e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFSetUp 116 1.0 1.2123e-03 1.1 0.00e+00 0.0 6.2e+02 8.9e+02 1.1e+02 0 0 38 71 7 0 0 38 71 7 0 > SFPack 312 1.0 6.2350e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFUnpack 312 1.0 2.2470e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecView 26 1.0 4.0207e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecCopy 13 1.0 7.6100e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyBegin 14 1.0 2.9690e-04 2.1 0.00e+00 0.0 2.3e+02 1.1e+01 1.4e+01 0 0 14 0 1 0 0 14 0 1 0 > VecAssemblyEnd 14 1.0 2.9400e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 312 1.0 6.0288e-04 1.6 0.00e+00 0.0 7.3e+02 3.1e+02 1.0e+02 0 0 45 29 6 0 0 45 29 7 0 > VecScatterEnd 312 1.0 9.1886e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSetRandom 2 1.0 3.6500e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatMult 104 1.0 2.5355e-02 1.0 2.65e+05 1.1 8.0e+02 2.8e+02 2.2e+02 0 1 49 29 14 0 1 49 29 14 70 > MatSolve 104 1.0 2.4431e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 1.1e+02 0 0 49 29 7 0 0 49 29 7 36 > MatLUFactorSym 2 1.0 1.2584e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 2 1.0 7.7804e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatScale 2 1.0 4.5068e-02 1.0 2.36e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 65 0 0 0 0 65 0 0 0 3657 > MatAssemblyBegin 169 1.0 4.2296e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02 2 0 0 0 7 2 0 0 0 7 0 > MatAssemblyEnd 169 1.0 3.6979e+00 1.0 0.00e+00 0.0 5.9e+02 9.4e+02 7.8e+02 22 0 36 71 48 22 0 36 71 49 0 > MatSetValues 1298 1.0 1.2313e-01 1.1 1.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 32 0 0 0 1 32 0 0 0 669 > MatGetRow 1298 1.0 1.6896e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAXPY 1 1.0 1.7225e-01 1.0 1.18e+07 1.0 8.4e+01 1.3e+03 6.0e+00 1 32 5 14 0 1 32 5 14 0 478 > PCSetUp 2 1.0 2.0848e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 > PCApply 104 1.0 2.4459e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 1.1e+02 0 0 49 29 7 0 0 49 29 7 36 > KSPSetUp 2 1.0 1.8400e-06 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 104 1.0 2.5008e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 2.2e+02 0 0 49 29 14 0 0 49 29 14 35 > EPSSetUp 2 1.0 2.4027e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0 > EPSSolve 2 1.0 3.0789e-02 1.0 1.09e+06 1.1 8.0e+02 2.8e+02 4.5e+02 0 3 49 29 28 0 3 49 29 28 242 > STSetUp 2 1.0 2.1294e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 > STComputeOperatr 2 1.0 2.1100e-06 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > STApply 104 1.0 2.5290e-02 1.0 2.65e+05 1.1 8.0e+02 2.8e+02 2.2e+02 0 1 49 29 14 0 1 49 
29 14 70 > STMatSolve 104 1.0 2.5081e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 2.2e+02 0 0 49 29 14 0 0 49 29 14 35 > BVCopy 20 1.0 2.8770e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > BVMultVec 206 1.0 1.8736e-04 1.5 3.13e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 11442 > BVMultInPlace 11 1.0 9.2730e-05 1.2 1.49e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11026 > BVDotVec 206 1.0 1.0389e-03 1.2 3.35e+05 1.1 0.0e+00 0.0e+00 2.1e+02 0 1 0 0 13 0 1 0 0 13 2205 > BVOrthogonalizeV 106 1.0 1.2649e-03 1.1 6.84e+05 1.1 0.0e+00 0.0e+00 2.1e+02 0 2 0 0 13 0 2 0 0 13 3706 > BVScale 106 1.0 9.0329e-05 1.9 5.51e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 418 > BVSetRandom 2 1.0 1.4080e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > BVMatMultVec 104 1.0 2.5468e-02 1.0 2.65e+05 1.1 8.0e+02 2.8e+02 2.2e+02 0 1 49 29 14 0 1 49 29 14 70 > DSSolve 9 1.0 1.2452e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DSVectors 24 1.0 1.5345e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DSOther 27 1.0 1.6735e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > I believe that the only matrix addition I did is shown there to take about 0.17 seconds. Doing so twice per timestep in addition to scaling, this ends up taking about 0.4 seconds > during the full calculation so the timestep is far too slow. > > However it seems that the number of flops is on the order of 10^7. Would this imply that > I am perhaps memory bandwidth bound on my home pc? > > I'm still somewhat new to parallel computing, so I'm not sure what you mean by binding. Does this lock in certain processes to certain cores? > > Thank you for your time and assistance > > On Sat, Nov 11, 2023 at 2:54?PM Barry Smith > wrote: >> >> Here is the code for MPIBAIJ with SAME_NONZERO_STRUCTURE. (the other formats are similar) >> >> PetscErrorCode MatAXPY_SeqBAIJ(Mat Y, PetscScalar a, Mat X, MatStructure str) >> { >> Mat_SeqBAIJ *x = (Mat_SeqBAIJ *)X->data, *y = (Mat_SeqBAIJ *)Y->data; >> PetscInt bs = Y->rmap->bs, bs2 = bs * bs; >> PetscBLASInt one = 1; >> .... >> if (str == SAME_NONZERO_PATTERN) { >> PetscScalar alpha = a; >> PetscBLASInt bnz; >> PetscCall(PetscBLASIntCast(x->nz * bs2, &bnz)); >> PetscCallBLAS("BLASaxpy", BLASaxpy_(&bnz, &alpha, x->a, &one, y->a, &one)); >> PetscCall(PetscObjectStateIncrease((PetscObject)Y)); >> >> It directly adds the nonzero values from the two matrices together (the nonzero structure plays no role) and it uses the BLAS so it should perform as "fast as possible" given the hardware (are you configured --with-debugging=0 ?, are you using binding with mpiexec to ensure MPI is using the best combination of cores? https://petsc.org/release/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup ) >> >> Can you run with -log_view and send the output so we can see directly the performance? >> >> >> >> >>> On Nov 11, 2023, at 4:25?PM, Donald Rex Planalp > wrote: >>> >>> Hello, and thank you for the quick reply. >>> >>> For some context, all of these matrices are produced from a kronecker product in parallel between an "angular" and a "radial" matrix. In this case S_total and H_atom use the same angular matrix which is the identity, while their >>> radial matrices obey different sparsity. 
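One way to look at the memory-bandwidth and binding questions above is PETSc's STREAMS target together with the MPI launcher's binding flag; the exact flag spelling depends on the MPI implementation, and main.py below stands in for the actual petsc4py script:

cd $PETSC_DIR
make streams NPMAX=6                          # reports how achievable memory bandwidth scales with ranks
mpiexec -n 6 -bind-to core python main.py     # MPICH/hydra spelling
mpiexec -n 6 --bind-to core python main.py    # Open MPI spelling

If the STREAMS numbers stop improving after two or three ranks, bandwidth-limited operations such as MatAXPY will not get much faster with more cores either.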
>>> >>> Meanwhile H_mix and H_ang are constructed from two different angular matrices, however both are only nonzero along the two off diagonals, and the radial matrices have the same sparse structure. >>> >>> So S_total and H_atom have the same block sparse structure, while H_mix and H_ang have the same block sparse structure. However even just adding H_mix and H_ang for the simplest case takes around half a second which is still too slow even when using SAME_NONZERO_PATTERN . >>> >>> This confuses me the most because another researcher wrote a similar code a few years back using finite difference, so his matrices are on the order 250kx250k and larger, yet the matrix additions are far faster for him. >>> >>> If you would like I can show more code snippets but im not sure what you would want to see. >>> >>> Thank you for your time >>> >>> >>> >>> On Sat, Nov 11, 2023 at 11:22?AM Barry Smith > wrote: >>>> >>>> >>>> DIFFERENT_NONZERO_PATTERN will always be slow because it needs to determine the nonzero pattern of the result for each computation. >>>> >>>> SAME_NONZERO_PATTERN can obviously run fast but may not be practical for your problem. >>>> >>>> SUBSET_NONZERO_PATTERN (which means in A += b*B we know that B has NO nonzero locations that A does not have) does not need to reallocate anything in A so it can be reasonably fast (we can do more optimizations on the current code to improve it for you). >>>> >>>> So, can you construct the S nonzero structure initially to have the union of the nonzero structures of all the matrices that accumulate into it? This is the same >>>> nonzero structure that it "ends up with" in the current code. With this nonzero structure, SUBSET_NONZERO_PATTERN can be used. Depending on how sparse all the matrices that get accumulated into S are, you could perhaps go further and make sure they all have the same nonzero structure (sure, it uses more memory and will do extra operations, but the actual computation will be so fast it may be worth the extra computations. >>>> >>>> Barry >>>> >>>> >>>> >>>>> On Nov 10, 2023, at 9:53?PM, Donald Rex Planalp > wrote: >>>>> >>>>> Hello, >>>>> >>>>> I am trying to use petsc4py to conduct a quantum mechanics simulation. I've been able to construct all of the relevant matrices, however I am reaching a gigantic bottleneck. >>>>> >>>>> For the simplest problem I am running I have a few matrices each about 5000x5000. In order to begin time propagation I need to add these matrices together. However, on 6 cores of my local machine it is taking approximately 1-2 seconds per addition. Since I need to do this for each time step in my simulation it is prohibitively slow since there could be upwards of 10K time steps. 
>>>>> >>>>> Below is the relevant code: >>>>> >>>>> structure = structure=PETSc.Mat.Structure.DIFFERENT_NONZERO_PATTERN >>>>> if test2: >>>>> def makeLeft(S,MIX,ANG,ATOM,i): >>>>> >>>>> >>>>> S.axpy(-Field.pulse[i],MIX,structure) >>>>> S.axpy(-Field.pulse[i],ANG,structure) >>>>> S.axpy(-1,ATOM,structure) >>>>> return S >>>>> def makeRight(S,MIX,ANG,ATOM,i): >>>>> >>>>> >>>>> >>>>> S.axpy(Field.pulse[i],MIX,structure) >>>>> S.axpy(Field.pulse[i],ANG,structure) >>>>> S.axpy(1,ATOM,structure) >>>>> >>>>> return S >>>>> >>>>> H_mix = Int.H_mix >>>>> H_mix.scale(1j * dt /2) >>>>> >>>>> H_ang = Int.H_ang >>>>> H_ang.scale(1j * dt /2) >>>>> >>>>> H_atom = Int.H_atom >>>>> H_atom.scale(1j * dt /2) >>>>> >>>>> S = Int.S_total >>>>> >>>>> psi_initial = psi.psi_initial.copy() >>>>> ksp = PETSc.KSP().create(PETSc.COMM_WORLD) >>>>> >>>>> >>>>> for i,t in enumerate(box.t): >>>>> print(i,L) >>>>> >>>>> >>>>> >>>>> O_L = makeLeft(S,H_mix,H_ang,H_atom,i) >>>>> O_R = makeRight(S,H_mix,H_ang,H_atom,i) >>>>> >>>>> >>>>> >>>>> if i == 0: >>>>> known = O_R.getVecRight() >>>>> sol = O_L.getVecRight() >>>>> >>>>> O_R.mult(psi_initial,known) >>>>> >>>>> ksp.setOperators(O_L) >>>>> >>>>> >>>>> ksp.solve(known,sol) >>>>> >>>>> >>>>> >>>>> >>>>> psi_initial.copy(sol) >>>>> >>>>> >>>>> I need to clean it up a bit, but the main point is that I need to add the matrices many times for a single time step. I can't preallocate memory very well since some of the matrices aren't the most sparse either. It seems if I cannot speed up the addition it will be difficult to continue so I was wondering if you had any insights. >>>>> >>>>> Thank you for your time >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Nov 11 20:28:43 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 11 Nov 2023 21:28:43 -0500 Subject: [petsc-users] Petsc4py Simulation: Mat.axpy() slow In-Reply-To: References: <7C56CC59-3639-43C8-AC6D-5F8452005E5E@petsc.dev> <28607A10-CF8B-44D6-857E-4B2DBC89CBB6@petsc.dev> <8D99140A-7B3B-44D0-94DD-0A0CDCD2EEF2@petsc.dev> Message-ID: The flop rate of 123 megaflops is absurdly small, something unexected must be happening. Any chance you can send me small code that reproduces the below? > On Nov 11, 2023, at 9:19?PM, Donald Rex Planalp wrote: > > I've isolated the profiling to only run during the addition of the two matrices. The output is > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatAssemblyBegin 1 1.0 4.5000e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 1 1.0 7.4370e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 > MatAXPY 1 1.0 6.6962e-01 1.0 8.24e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 31 0 0 0 100 100 0 0 0 123 > ------------------------------------------------------------------------------------------------------------------------ > > On Sat, Nov 11, 2023 at 6:20?PM Donald Rex Planalp > wrote: >> My apologies I missed your other question, >> >> Yes I imagine most of those setValue and getRow are due to the construction of the matrices in question. 
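For the small reproducer asked for above, something along these lines should do (a sketch, not the poster's code: two sequential 5000x5000 AIJ matrices assembled with the same tridiagonal pattern, then one timed axpy per MatStructure flag; run it on a single process first to get the baseline):

import time
from petsc4py import PETSc

n = 5000
A = PETSc.Mat().createAIJ([n, n], nnz=3, comm=PETSc.COMM_SELF)
B = PETSc.Mat().createAIJ([n, n], nnz=3, comm=PETSc.COMM_SELF)
for M in (A, B):
    for i in range(n):
        cols = [j for j in (i - 1, i, i + 1) if 0 <= j < n]
        M.setValues([i], cols, [1.0] * len(cols))
    M.assemble()

t0 = time.time()
A.axpy(2.0, B, structure=PETSc.Mat.Structure.SAME_NONZERO_PATTERN)
t1 = time.time()
A.axpy(2.0, B, structure=PETSc.Mat.Structure.DIFFERENT_NONZERO_PATTERN)
t2 = time.time()
print("SAME:", t1 - t0, "s   DIFFERENT:", t2 - t1, "s")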
I Ended up writing my own "semi-parallel" kronecker product function which involves setting and getting >> a lot of values. I imagine I should optimize that much more, but I consider it an upfront cost to time propagation, so I haven't touched it since I got it working. >> >> Thank you for the quick replies >> >> >> >> On Sat, Nov 11, 2023 at 6:12?PM Donald Rex Planalp > wrote: >>> My apologies, that was run with 7 cores, this is the result for n = 1. >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> BuildTwoSided 13 1.0 1.4770e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> BuildTwoSidedF 13 1.0 2.3150e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecView 24 1.0 3.1525e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecCopy 12 1.0 4.1000e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAssemblyBegin 13 1.0 5.1290e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAssemblyEnd 13 1.0 7.0200e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSetRandom 2 1.0 6.2500e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatMult 112 1.0 1.7071e-03 1.0 1.41e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 8268 >>> MatSolve 112 1.0 8.3160e-04 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 8486 >>> MatLUFactorSym 2 1.0 1.2660e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatLUFactorNum 2 1.0 2.2895e-04 1.0 1.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4646 >>> MatScale 2 1.0 4.4903e-02 1.0 1.65e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 61 0 0 0 0 61 0 0 0 3671 >>> MatAssemblyBegin 169 1.0 4.5040e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAssemblyEnd 169 1.0 2.2126e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 40 0 0 0 0 40 0 0 0 0 0 >>> MatSetValues 9078 1.0 6.1440e-01 1.0 8.24e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 31 0 0 0 1 31 0 0 0 134 >>> MatGetRow 9078 1.0 5.8936e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetRowIJ 2 1.0 9.0900e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatGetOrdering 2 1.0 3.6880e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAXPY 1 1.0 6.4112e-01 1.0 8.24e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 31 0 0 0 1 31 0 0 0 129 >>> PCSetUp 2 1.0 4.6308e-04 1.0 1.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2297 >>> PCApply 112 1.0 8.4632e-04 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 8339 >>> KSPSetUp 2 1.0 4.5000e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 112 1.0 8.9241e-04 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 7908 >>> EPSSetUp 2 1.0 6.5904e-04 1.0 1.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1614 >>> EPSSolve 2 1.0 4.4982e-03 1.0 2.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 4687 >>> STSetUp 2 1.0 5.0441e-04 1.0 1.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2109 >>> STComputeOperatr 2 1.0 2.5700e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 
0 0 >>> STApply 112 1.0 1.6852e-03 1.0 1.41e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 8375 >>> STMatSolve 112 1.0 9.1930e-04 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 7677 >>> BVCopy 20 1.0 1.7930e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> BVMultVec 219 1.0 2.3467e-04 1.0 2.31e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 9861 >>> BVMultInPlace 12 1.0 1.1088e-04 1.0 1.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9857 >>> BVDotVec 219 1.0 2.9956e-04 1.0 2.47e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 8245 >>> BVOrthogonalizeV 114 1.0 5.8952e-04 1.0 4.82e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 8181 >>> BVScale 114 1.0 2.9150e-05 1.0 4.06e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1392 >>> BVSetRandom 2 1.0 9.6900e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> BVMatMultVec 112 1.0 1.7464e-03 1.0 1.41e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 8082 >>> DSSolve 10 1.0 9.1354e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DSVectors 24 1.0 1.1908e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DSOther 30 1.0 1.2798e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> On Sat, Nov 11, 2023 at 5:59?PM Barry Smith > wrote: >>>> >>>> How many MPI processes did you use? Please try with just one to get a base line >>>> >>>> MatSetValues 1298 1.0 1.2313e-01 1.1 1.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 32 0 0 0 1 32 0 0 0 669 >>>> MatGetRow 1298 1.0 1.6896e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatAXPY 1 1.0 1.7225e-01 1.0 1.18e+07 1.0 8.4e+01 1.3e+03 6.0e+00 1 32 5 14 0 1 32 5 14 0 478 >>>> >>>> This is concerning: there are 1298 MatSetValues and MatGetRow, are you calling them? If not that means the MatAXPY is calling them (if SAME_NONZERO_PATTERN is not used these are used in the MatAXPY). >>>> >>>>> I'm still somewhat new to parallel computing, so I'm not sure what you mean by binding. Does this lock in certain processes to certain cores? >>>> >>>> >>>> Yes, it is usually the right thing to do for numerical computing. I included a link to some discussion of it on petsc.org >>>> >>>>> On Nov 11, 2023, at 6:38?PM, Donald Rex Planalp > wrote: >>>>> >>>>> Hello again, >>>>> >>>>> I've run the simulation with profiling again. In this setup I only ran the necessary methods to construct the matrices, and then at the end I added H_mix and H_ang using the Mataxpy method. 
Below are the results >>>>> >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> >>>>> --- Event Stage 0: Main Stage >>>>> >>>>> BuildTwoSided 350 1.0 6.9605e-01 1.9 0.00e+00 0.0 3.9e+02 5.6e+00 3.5e+02 3 0 24 0 22 3 0 24 0 22 0 >>>>> BuildTwoSidedF 236 1.0 6.9569e-01 1.9 0.00e+00 0.0 2.3e+02 1.1e+01 2.4e+02 3 0 14 0 15 3 0 14 0 15 0 >>>>> SFSetGraph 114 1.0 2.0555e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> SFSetUp 116 1.0 1.2123e-03 1.1 0.00e+00 0.0 6.2e+02 8.9e+02 1.1e+02 0 0 38 71 7 0 0 38 71 7 0 >>>>> SFPack 312 1.0 6.2350e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> SFUnpack 312 1.0 2.2470e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> VecView 26 1.0 4.0207e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> VecCopy 13 1.0 7.6100e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> VecAssemblyBegin 14 1.0 2.9690e-04 2.1 0.00e+00 0.0 2.3e+02 1.1e+01 1.4e+01 0 0 14 0 1 0 0 14 0 1 0 >>>>> VecAssemblyEnd 14 1.0 2.9400e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> VecScatterBegin 312 1.0 6.0288e-04 1.6 0.00e+00 0.0 7.3e+02 3.1e+02 1.0e+02 0 0 45 29 6 0 0 45 29 7 0 >>>>> VecScatterEnd 312 1.0 9.1886e-04 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> VecSetRandom 2 1.0 3.6500e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> MatMult 104 1.0 2.5355e-02 1.0 2.65e+05 1.1 8.0e+02 2.8e+02 2.2e+02 0 1 49 29 14 0 1 49 29 14 70 >>>>> MatSolve 104 1.0 2.4431e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 1.1e+02 0 0 49 29 7 0 0 49 29 7 36 >>>>> MatLUFactorSym 2 1.0 1.2584e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> MatLUFactorNum 2 1.0 7.7804e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> MatScale 2 1.0 4.5068e-02 1.0 2.36e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 65 0 0 0 0 65 0 0 0 3657 >>>>> MatAssemblyBegin 169 1.0 4.2296e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02 2 0 0 0 7 2 0 0 0 7 0 >>>>> MatAssemblyEnd 169 1.0 3.6979e+00 1.0 0.00e+00 0.0 5.9e+02 9.4e+02 7.8e+02 22 0 36 71 48 22 0 36 71 49 0 >>>>> MatSetValues 1298 1.0 1.2313e-01 1.1 1.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 32 0 0 0 1 32 0 0 0 669 >>>>> MatGetRow 1298 1.0 1.6896e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> MatAXPY 1 1.0 1.7225e-01 1.0 1.18e+07 1.0 8.4e+01 1.3e+03 6.0e+00 1 32 5 14 0 1 32 5 14 0 478 >>>>> PCSetUp 2 1.0 2.0848e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 >>>>> PCApply 104 1.0 2.4459e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 1.1e+02 0 0 49 29 7 0 0 49 29 7 36 >>>>> KSPSetUp 2 1.0 1.8400e-06 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> KSPSolve 104 1.0 2.5008e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 2.2e+02 0 0 49 29 14 0 0 49 29 14 35 >>>>> EPSSetUp 2 1.0 2.4027e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0 >>>>> EPSSolve 2 1.0 3.0789e-02 1.0 1.09e+06 1.1 8.0e+02 2.8e+02 4.5e+02 0 3 49 29 28 0 3 49 29 28 242 >>>>> STSetUp 2 1.0 2.1294e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 0 0 0 0 0 1 0 >>>>> STComputeOperatr 2 1.0 
2.1100e-06 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> STApply 104 1.0 2.5290e-02 1.0 2.65e+05 1.1 8.0e+02 2.8e+02 2.2e+02 0 1 49 29 14 0 1 49 29 14 70 >>>>> STMatSolve 104 1.0 2.5081e-02 1.0 1.31e+05 1.2 8.0e+02 2.8e+02 2.2e+02 0 0 49 29 14 0 0 49 29 14 35 >>>>> BVCopy 20 1.0 2.8770e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> BVMultVec 206 1.0 1.8736e-04 1.5 3.13e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 11442 >>>>> BVMultInPlace 11 1.0 9.2730e-05 1.2 1.49e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11026 >>>>> BVDotVec 206 1.0 1.0389e-03 1.2 3.35e+05 1.1 0.0e+00 0.0e+00 2.1e+02 0 1 0 0 13 0 1 0 0 13 2205 >>>>> BVOrthogonalizeV 106 1.0 1.2649e-03 1.1 6.84e+05 1.1 0.0e+00 0.0e+00 2.1e+02 0 2 0 0 13 0 2 0 0 13 3706 >>>>> BVScale 106 1.0 9.0329e-05 1.9 5.51e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 418 >>>>> BVSetRandom 2 1.0 1.4080e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> BVMatMultVec 104 1.0 2.5468e-02 1.0 2.65e+05 1.1 8.0e+02 2.8e+02 2.2e+02 0 1 49 29 14 0 1 49 29 14 70 >>>>> DSSolve 9 1.0 1.2452e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> DSVectors 24 1.0 1.5345e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> DSOther 27 1.0 1.6735e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> >>>>> I believe that the only matrix addition I did is shown there to take about 0.17 seconds. Doing so twice per timestep in addition to scaling, this ends up taking about 0.4 seconds >>>>> during the full calculation so the timestep is far too slow. >>>>> >>>>> However it seems that the number of flops is on the order of 10^7. Would this imply that >>>>> I am perhaps memory bandwidth bound on my home pc? >>>>> >>>>> I'm still somewhat new to parallel computing, so I'm not sure what you mean by binding. Does this lock in certain processes to certain cores? >>>>> >>>>> Thank you for your time and assistance >>>>> >>>>> On Sat, Nov 11, 2023 at 2:54?PM Barry Smith > wrote: >>>>>> >>>>>> Here is the code for MPIBAIJ with SAME_NONZERO_STRUCTURE. (the other formats are similar) >>>>>> >>>>>> PetscErrorCode MatAXPY_SeqBAIJ(Mat Y, PetscScalar a, Mat X, MatStructure str) >>>>>> { >>>>>> Mat_SeqBAIJ *x = (Mat_SeqBAIJ *)X->data, *y = (Mat_SeqBAIJ *)Y->data; >>>>>> PetscInt bs = Y->rmap->bs, bs2 = bs * bs; >>>>>> PetscBLASInt one = 1; >>>>>> .... >>>>>> if (str == SAME_NONZERO_PATTERN) { >>>>>> PetscScalar alpha = a; >>>>>> PetscBLASInt bnz; >>>>>> PetscCall(PetscBLASIntCast(x->nz * bs2, &bnz)); >>>>>> PetscCallBLAS("BLASaxpy", BLASaxpy_(&bnz, &alpha, x->a, &one, y->a, &one)); >>>>>> PetscCall(PetscObjectStateIncrease((PetscObject)Y)); >>>>>> >>>>>> It directly adds the nonzero values from the two matrices together (the nonzero structure plays no role) and it uses the BLAS so it should perform as "fast as possible" given the hardware (are you configured --with-debugging=0 ?, are you using binding with mpiexec to ensure MPI is using the best combination of cores? https://petsc.org/release/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup ) >>>>>> >>>>>> Can you run with -log_view and send the output so we can see directly the performance? 
>>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> On Nov 11, 2023, at 4:25?PM, Donald Rex Planalp > wrote: >>>>>>> >>>>>>> Hello, and thank you for the quick reply. >>>>>>> >>>>>>> For some context, all of these matrices are produced from a kronecker product in parallel between an "angular" and a "radial" matrix. In this case S_total and H_atom use the same angular matrix which is the identity, while their >>>>>>> radial matrices obey different sparsity. >>>>>>> >>>>>>> Meanwhile H_mix and H_ang are constructed from two different angular matrices, however both are only nonzero along the two off diagonals, and the radial matrices have the same sparse structure. >>>>>>> >>>>>>> So S_total and H_atom have the same block sparse structure, while H_mix and H_ang have the same block sparse structure. However even just adding H_mix and H_ang for the simplest case takes around half a second which is still too slow even when using SAME_NONZERO_PATTERN . >>>>>>> >>>>>>> This confuses me the most because another researcher wrote a similar code a few years back using finite difference, so his matrices are on the order 250kx250k and larger, yet the matrix additions are far faster for him. >>>>>>> >>>>>>> If you would like I can show more code snippets but im not sure what you would want to see. >>>>>>> >>>>>>> Thank you for your time >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Sat, Nov 11, 2023 at 11:22?AM Barry Smith > wrote: >>>>>>>> >>>>>>>> >>>>>>>> DIFFERENT_NONZERO_PATTERN will always be slow because it needs to determine the nonzero pattern of the result for each computation. >>>>>>>> >>>>>>>> SAME_NONZERO_PATTERN can obviously run fast but may not be practical for your problem. >>>>>>>> >>>>>>>> SUBSET_NONZERO_PATTERN (which means in A += b*B we know that B has NO nonzero locations that A does not have) does not need to reallocate anything in A so it can be reasonably fast (we can do more optimizations on the current code to improve it for you). >>>>>>>> >>>>>>>> So, can you construct the S nonzero structure initially to have the union of the nonzero structures of all the matrices that accumulate into it? This is the same >>>>>>>> nonzero structure that it "ends up with" in the current code. With this nonzero structure, SUBSET_NONZERO_PATTERN can be used. Depending on how sparse all the matrices that get accumulated into S are, you could perhaps go further and make sure they all have the same nonzero structure (sure, it uses more memory and will do extra operations, but the actual computation will be so fast it may be worth the extra computations. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> On Nov 10, 2023, at 9:53?PM, Donald Rex Planalp > wrote: >>>>>>>>> >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> I am trying to use petsc4py to conduct a quantum mechanics simulation. I've been able to construct all of the relevant matrices, however I am reaching a gigantic bottleneck. >>>>>>>>> >>>>>>>>> For the simplest problem I am running I have a few matrices each about 5000x5000. In order to begin time propagation I need to add these matrices together. However, on 6 cores of my local machine it is taking approximately 1-2 seconds per addition. Since I need to do this for each time step in my simulation it is prohibitively slow since there could be upwards of 10K time steps. 
>>>>>>>>> >>>>>>>>> Below is the relevant code: >>>>>>>>> >>>>>>>>> structure = structure=PETSc.Mat.Structure.DIFFERENT_NONZERO_PATTERN >>>>>>>>> if test2: >>>>>>>>> def makeLeft(S,MIX,ANG,ATOM,i): >>>>>>>>> >>>>>>>>> >>>>>>>>> S.axpy(-Field.pulse[i],MIX,structure) >>>>>>>>> S.axpy(-Field.pulse[i],ANG,structure) >>>>>>>>> S.axpy(-1,ATOM,structure) >>>>>>>>> return S >>>>>>>>> def makeRight(S,MIX,ANG,ATOM,i): >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> S.axpy(Field.pulse[i],MIX,structure) >>>>>>>>> S.axpy(Field.pulse[i],ANG,structure) >>>>>>>>> S.axpy(1,ATOM,structure) >>>>>>>>> >>>>>>>>> return S >>>>>>>>> >>>>>>>>> H_mix = Int.H_mix >>>>>>>>> H_mix.scale(1j * dt /2) >>>>>>>>> >>>>>>>>> H_ang = Int.H_ang >>>>>>>>> H_ang.scale(1j * dt /2) >>>>>>>>> >>>>>>>>> H_atom = Int.H_atom >>>>>>>>> H_atom.scale(1j * dt /2) >>>>>>>>> >>>>>>>>> S = Int.S_total >>>>>>>>> >>>>>>>>> psi_initial = psi.psi_initial.copy() >>>>>>>>> ksp = PETSc.KSP().create(PETSc.COMM_WORLD) >>>>>>>>> >>>>>>>>> >>>>>>>>> for i,t in enumerate(box.t): >>>>>>>>> print(i,L) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> O_L = makeLeft(S,H_mix,H_ang,H_atom,i) >>>>>>>>> O_R = makeRight(S,H_mix,H_ang,H_atom,i) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> if i == 0: >>>>>>>>> known = O_R.getVecRight() >>>>>>>>> sol = O_L.getVecRight() >>>>>>>>> >>>>>>>>> O_R.mult(psi_initial,known) >>>>>>>>> >>>>>>>>> ksp.setOperators(O_L) >>>>>>>>> >>>>>>>>> >>>>>>>>> ksp.solve(known,sol) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> psi_initial.copy(sol) >>>>>>>>> >>>>>>>>> >>>>>>>>> I need to clean it up a bit, but the main point is that I need to add the matrices many times for a single time step. I can't preallocate memory very well since some of the matrices aren't the most sparse either. It seems if I cannot speed up the addition it will be difficult to continue so I was wondering if you had any insights. >>>>>>>>> >>>>>>>>> Thank you for your time >>>>>>>> >>>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Nov 12 21:19:23 2023 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 12 Nov 2023 22:19:23 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <49254233.76e5.18bc66a98af.Coremail.ctchengben@mail.scut.edu.cn> References: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn> <49254233.76e5.18bc66a98af.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: The configure is failing while testing the Intel Fortran compiler Executing: /cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win_ifort -c -o /tmp/petsc-61cxbt4e/config.setCompilers/conftest.o -I/tmp/petsc-61cxbt4e/config.compilers -I/tmp/petsc-61cxbt4e/config.setCompilers -MT -Z7 -Od /tmp/petsc-61cxbt4e/config.setCompilers/conftest.F90 Successful compile: Source: program main end Executing: /cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win_ifort -o /tmp/petsc-61cxbt4e/config.setCompilers/conftest.exe -MT -Z7 -Od /tmp/petsc-61cxbt4e/config.setCompilers/conftest.o stdout: LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? Linker output before filtering: LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? Linker output after filtering: LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? 
Can you please try changing the compiler to print its messages in English and then attempt to compile the trivial program above as indicated and send all the output. Barry > On Nov 12, 2023, at 9:02?PM, ?? wrote: > > Sorry for replying to your email so late, and I find configure.log > > > -----????----- > ???: "Barry Smith" > ????: 2023-11-08 23:42:30 (???) > ???: ?? > ??: petsc-users at mcs.anl.gov > ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI > > > Send the file $PETSC_ARCH/lib/petsc/conf/configure.log > > > >> On Nov 8, 2023, at 12:20?AM, ?? wrote: >> >> Hello, >> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: >> 1. PETSc: version 3.19.2 >> 2. VS: version 2022 >> 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit >> 4. Cygwin: see the picture attatched (see picture cygwin) >> >> >> And the compiler option in configuration is: >> ./configure --prefix=/cygdrive/g/mypetsc/petsc2023 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --with-shared-libraries=0 >> --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include >> --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/libTherefore, I write this e-mail to look for your help. >> /release/impi.lib >> --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly >> >> >> >> >> but there return an error: >> ********************************************************************************************* >> OSError while running ./configure >> --------------------------------------------------------------------------------------------- >> Cannot run executables created with FC. If this machine uses a batch system >> to submit jobs you will need to configure using ./configure with the additional option >> --with-batch. >> Otherwise there is problem with the compilers. Can you compile and run code with your >> compiler '/cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win32fe ifort'? >> See https://petsc.org/release/faq/#error-libimf >> ********************************************************************************************* >> >> >> >> Then I try to open configure.log in petsc, but there turnout an error that I can't open it.(see picture 1) >> >> And then I right click on properties and click safety,it just turnout "The permissions on test directory are incorrectly ordered.which may cause some entries to be ineffective." (see picture 2) >> >> And it also likely seen ?NULL SID? as the top entry in permission lists.(see picture 3) >> >> Then i follow this blog(https://blog.dhampir.no/content/forcing-cygwin-to-create-sane-permissions-on-windows) to edit /etc/fstab in Cygwin, and add ?noacl? to the mount options for /cygdrive. >> >> But it's not working. >> >> So I can't sent configure.log to you guys, it seems cygwin that installed in my computer happened to some problem. >> >> Mayebe the error happened in the configure on petsc just because of this reason. >> >> >> >> So I wrrit this email to report my problem and ask for your help. >> >> >> Looking forward your reply! >> >> >> sinserely, >> Cheng. >> > ?? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: application/octet-stream Size: 1251493 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.bkp Type: application/octet-stream Size: 1251477 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Nov 13 07:45:39 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 13 Nov 2023 08:45:39 -0500 Subject: [petsc-users] [petsc-maint] small problems In-Reply-To: <00f046f19f9b1f4355b7bdf23746be61@cryptolab.net> References: <00f046f19f9b1f4355b7bdf23746be61@cryptolab.net> Message-ID: On Mon, Nov 13, 2023 at 8:23?AM edgar via petsc-maint < petsc-maint at mcs.anl.gov> wrote: > Dear list, > > Some weeks ago, someone kindly shared a patch to improve the performance > on small problems (less than or around 1 000 dofs). I deleted the > e-mail, thinking that I would be able to find it back on the archives. > It may have been in the petsc-maint list, which means that there is no > archive. I would like to ask if someone could redirect the thread, a > link in https://lists.mcs.anl.gov/pipermail or tell me where I can find > the pull request or patch. Thank you very much in advance. > I cannot find it. What search string could we use? Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From edgarlux at cryptolab.net Mon Nov 13 07:23:31 2023 From: edgarlux at cryptolab.net (edgar) Date: Mon, 13 Nov 2023 13:23:31 +0000 Subject: [petsc-users] small problems Message-ID: <00f046f19f9b1f4355b7bdf23746be61@cryptolab.net> Dear list, Some weeks ago, someone kindly shared a patch to improve the performance on small problems (less than or around 1 000 dofs). I deleted the e-mail, thinking that I would be able to find it back on the archives. It may have been in the petsc-maint list, which means that there is no archive. I would like to ask if someone could redirect the thread, a link in https://lists.mcs.anl.gov/pipermail or tell me where I can find the pull request or patch. Thank you very much in advance. From bsmith at petsc.dev Tue Nov 14 11:17:42 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 14 Nov 2023 12:17:42 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <6a7220ed.7ccc.18bceb9edc7.Coremail.ctchengben@mail.scut.edu.cn> References: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn> <49254233.76e5.18bc66a98af.Coremail.ctchengben@mail.scut.edu.cn> <6a7220ed.7ccc.18bceb9edc7.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: You can check the build was successful manually. cd src/snes/tutorials make ex19 mpiexec -n 1 ./ex19 -snes_monitor > On Nov 14, 2023, at 11:46?AM, ?? wrote: > > Hi Barry, > > Thanks for the suggestion. It seems good after I change complier to english.Then I begin to configure and make it. 
> But unfortunate, when I try to make PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.19.2 PETSC_ARCH=arch-mswin-c-debug check > It just happen to an error: > Running check examples to verify correct installation > Using PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.19.2 and PETSC_ARCH=arch-mswin-c-debug > /usr/bin/bash: -c: line 9: unexpected EOF while looking for matching `"' > make[1]: *** [makefile:123: check] Error 2 > make: *** [GNUmakefile:17: check] Error 2 > > > So I send email to look for you help, and configure.log is attached. > > > sinserely, > Cheng. > > > > > -----????----- > ???: "Barry Smith" > > ????: 2023-11-13 11:19:23 (???) > ???: ?? > > ??: petsc-users > > ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI > > > The configure is failing while testing the Intel Fortran compiler > > Executing: /cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win_ifort -c -o /tmp/petsc-61cxbt4e/config.setCompilers/conftest.o -I/tmp/petsc-61cxbt4e/config.compilers -I/tmp/petsc-61cxbt4e/config.setCompilers -MT -Z7 -Od /tmp/petsc-61cxbt4e/config.setCompilers/conftest.F90 > Successful compile: > Source: > program main > > end > > Executing: /cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win_ifort -o /tmp/petsc-61cxbt4e/config.setCompilers/conftest.exe -MT -Z7 -Od /tmp/petsc-61cxbt4e/config.setCompilers/conftest.o > stdout: LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? > Linker output before filtering: > LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? > Linker output after filtering: > LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? > > Can you please try changing the compiler to print its messages in English and then attempt to compile the trivial program above as indicated and send all the output. > > Barry > > >> On Nov 12, 2023, at 9:02?PM, ?? > wrote: >> >> Sorry for replying to your email so late, and I find configure.log >> >> >> -----????----- >> ???: "Barry Smith" > >> ????: 2023-11-08 23:42:30 (???) >> ???: ?? > >> ??: petsc-users at mcs.anl.gov >> ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI >> >> >> Send the file $PETSC_ARCH/lib/petsc/conf/configure.log >> >> >> >>> On Nov 8, 2023, at 12:20?AM, ?? > wrote: >>> >>> Hello, >>> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: >>> 1. PETSc: version 3.19.2 >>> 2. VS: version 2022 >>> 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit >>> 4. Cygwin: see the picture attatched (see picture cygwin) >>> >>> >>> And the compiler option in configuration is: >>> ./configure --prefix=/cygdrive/g/mypetsc/petsc2023 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --with-shared-libraries=0 >>> --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include >>> --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/libTherefore, I write this e-mail to look for your help. 
>>> /release/impi.lib >>> --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly >>> >>> >>> >>> >>> but there return an error: >>> ********************************************************************************************* >>> OSError while running ./configure >>> --------------------------------------------------------------------------------------------- >>> Cannot run executables created with FC. If this machine uses a batch system >>> to submit jobs you will need to configure using ./configure with the additional option >>> --with-batch. >>> Otherwise there is problem with the compilers. Can you compile and run code with your >>> compiler '/cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win32fe ifort'? >>> See https://petsc.org/release/faq/#error-libimf >>> ********************************************************************************************* >>> >>> >>> >>> Then I try to open configure.log in petsc, but there turnout an error that I can't open it.(see picture 1) >>> >>> And then I right click on properties and click safety,it just turnout "The permissions on test directory are incorrectly ordered.which may cause some entries to be ineffective." (see picture 2) >>> >>> And it also likely seen ?NULL SID? as the top entry in permission lists.(see picture 3) >>> >>> Then i follow this blog(https://blog.dhampir.no/content/forcing-cygwin-to-create-sane-permissions-on-windows) to edit /etc/fstab in Cygwin, and add ?noacl? to the mount options for /cygdrive. >>> >>> But it's not working. >>> >>> So I can't sent configure.log to you guys, it seems cygwin that installed in my computer happened to some problem. >>> >>> Mayebe the error happened in the configure on petsc just because of this reason. >>> >>> >>> >>> So I wrrit this email to report my problem and ask for your help. >>> >>> >>> Looking forward your reply! >>> >>> >>> sinserely, >>> Cheng. >>> >> > > > ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 3135587 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From rene.chenard at icloud.com Tue Nov 14 14:50:09 2023 From: rene.chenard at icloud.com (Rene Chenard) Date: Tue, 14 Nov 2023 15:50:09 -0500 Subject: [petsc-users] Inquiry Regarding Callback Implementation for SNES in PETSc Message-ID: <828A7C8B-113F-49D8-9EF2-D81ED81FFC72@icloud.com> Dear PETSc Developers, I hope this message finds you well. My name is Ren? Chenard, and I am a Research Professional at Universit? Laval. We are currently working on implementing a wrapper for the SNES in our project and are seeking guidance on incorporating callbacks at different stages of the resolution process. Specifically, our objective is to implement a callback that triggers at the initiation of every nonlinear iteration and another that activates at the conclusion of each nonlinear iteration. In our exploration, we discovered the potential use of SNESSetUpdate and SNESSetConvergenceTest for this purpose. However, we encountered a challenge with SNESSetUpdate, as it seems to be ineffective for the ngmres and anderson solver types, the latter of which appears to be based on the implementation of ngmres. We are reaching out to seek clarification on whether this behavior is intentional and to explore alternative approaches that might better suit our needs. 
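While the wrapper in question is written in C, the hook that fires once per nonlinear iteration for every SNES type, including ngmres, is the monitor (SNESMonitorSet() in C). Below is a small petsc4py sketch of using it as an end-of-iteration callback; the solver type and the printing are illustrative only.

from petsc4py import PETSc

def post_iteration(snes, its, fnorm):
    # Called once per nonlinear iteration (including iteration 0) with the
    # residual norm of the current iterate.
    x = snes.getSolution()
    PETSc.Sys.Print(f"iter {its}: ||F|| = {fnorm:.3e}, ||x|| = {x.norm():.3e}")

snes = PETSc.SNES().create(PETSc.COMM_WORLD)
snes.setType(PETSc.SNES.Type.NGMRES)
snes.setMonitor(post_iteration)
# ... set the residual function and call snes.solve() as usual; the monitor
# runs for ngmres/anderson as well, unlike the update hook discussed above.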
To facilitate a clearer understanding of our observations, we have prepared a reproduction example in the file named test_snes.c. This file outlines the specific scenarios where we encountered challenges with SNESSetUpdate and provides a context for our inquiries. Here are our specific questions: 1. Is SNESSetUpdate designed to function uniformly across all types of SNES solvers? 2. What would be the recommended approach to implement custom callbacks around every nonlinear iteration, especially considering the apparent limitations with SNESSetUpdate for certain solver types? 3. We observed a discrepancy in the iteration/step numbering between the update function (set by SNESSetUpdate) and the convergence function (set by SNESSetConvergenceTest). Could you provide clarification on this, considering the documentation's description of SNESSetUpdate as the function "called at the beginning of every iteration of the nonlinear solve"? We genuinely appreciate your expertise in this matter, and your insights will be invaluable in guiding our implementation. Thank you in advance for your consideration and support. Warm regards, ?Ren? Chenard Research Professional at Universit? Laval rene.chenard.1 at ulaval.ca ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_snes.c Type: application/octet-stream Size: 13581 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 14 15:37:46 2023 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 14 Nov 2023 16:37:46 -0500 Subject: [petsc-users] Inquiry Regarding Callback Implementation for SNES in PETSc In-Reply-To: <828A7C8B-113F-49D8-9EF2-D81ED81FFC72@icloud.com> References: <828A7C8B-113F-49D8-9EF2-D81ED81FFC72@icloud.com> Message-ID: On Tue, Nov 14, 2023 at 3:59?PM Rene Chenard via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc Developers, > > I hope this message finds you well. My name is Ren? Chenard, and I am a > Research Professional at Universit? Laval. We are currently working on > implementing a wrapper for the SNES in our project and are seeking guidance > on incorporating callbacks at different stages of the resolution process. > > Specifically, our objective is to implement a callback that triggers at > the initiation of every nonlinear iteration and another that activates at > the conclusion of each nonlinear iteration. In our exploration, we > discovered the potential use of SNESSetUpdate and SNESSetConvergenceTest > for this purpose. > > However, we encountered a challenge with SNESSetUpdate, as it seems to be > ineffective for the ngmres and anderson solver types, the latter of which > appears to be based on the implementation of ngmres. We are reaching out to > seek clarification on whether this behavior is intentional and to explore > alternative approaches that might better suit our needs. > > To facilitate a clearer understanding of our observations, we have > prepared a reproduction example in the file named test_snes.c. This file > outlines the specific scenarios where we encountered challenges with > SNESSetUpdate and provides a context for our inquiries. > > Here are our specific questions: > > 1. Is SNESSetUpdate designed to function uniformly across all types of > SNES solvers? > Yes, this is our bug in NGMRES. It is fine if you make an MR for this, or we can do it. > 2. 
What would be the recommended approach to implement custom callbacks > around every nonlinear iteration, especially considering the apparent > limitations with SNESSetUpdate for certain solver types? > That is the right way, we just need to fix NGMRES. > 3. We observed a discrepancy in the iteration/step numbering between the > update function (set by SNESSetUpdate) and the convergence function (set by > SNESSetConvergenceTest). Could you provide clarification on this, > considering the documentation's description of SNESSetUpdate as the > function "called at the beginning of every iteration of the nonlinear > solve"? > Update is designed to be called at each iteration. The convergence test could possibly be called more than that (I think). For example, during line search you might call the convergence test to decide whether to keep searching. What do you want to do at the end of each iterate? Thanks, Matt > We genuinely appreciate your expertise in this matter, and your insights > will be invaluable in guiding our implementation. Thank you in advance for > your consideration and support. > > Warm regards, > > ?Ren? Chenard > Research Professional at Universit? Laval > rene.chenard.1 at ulaval.ca > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel.salazar at corintis.com Wed Nov 15 03:00:21 2023 From: miguel.salazar at corintis.com (Miguel Angel Salazar de Troya) Date: Wed, 15 Nov 2023 10:00:21 +0100 Subject: [petsc-users] Error handling in petsc4py Message-ID: Hello, The following simple petsc4py snippet runs out of memory, but I would like to handle it from python with the usual try-except. Is there any way to do so? How can I get the PETSc error codes in the python interface? Thanks from petsc4py import PETSc import sys, petsc4py petsc4py.init(sys.argv) try: m, n = 1000000, 1000000 A = PETSc.Mat().createAIJ([m, n], nnz=1e6) A.assemblyBegin() A.assemblyEnd() except Exception as e: print(f"An error occurred: {e}") An error occurred: error code 55 [0] MatSeqAIJSetPreallocation() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:3942 [0] MatSeqAIJSetPreallocation_SeqAIJ() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:4008 [0] PetscMallocA() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:408 [0] PetscMallocAlign() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:53 [0] Out of memory. Allocated: 0, Used by process: 59752448 [0] Memory requested 18446744064984991744 -------------- next part -------------- An HTML attachment was scrubbed... 
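On getting the error code in Python: a failed PETSc call surfaces in petsc4py as a PETSc.Error exception (a RuntimeError subclass), and the sketch below assumes its ierr attribute carries the numeric code printed in the traceback above (55 is PETSC_ERR_MEM in the C headers).

import sys
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc

try:
    m, n = 1000000, 1000000
    # Deliberately over-preallocates, reproducing the error-55 failure above.
    A = PETSc.Mat().createAIJ([m, n], nnz=1000000)
    A.assemblyBegin()
    A.assemblyEnd()
except PETSc.Error as e:
    print(f"PETSc error code: {e.ierr}")
    if e.ierr == 55:  # PETSC_ERR_MEM
        print("Preallocation ran out of memory; reduce nnz or the matrix size")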
URL: From bsmith at petsc.dev Wed Nov 15 09:18:38 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 15 Nov 2023 10:18:38 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI In-Reply-To: <4570079e.7f3e.18bd2d04793.Coremail.ctchengben@mail.scut.edu.cn> References: <77a1187d.69cc.18bad5fd697.Coremail.ctchengben@mail.scut.edu.cn> <49254233.76e5.18bc66a98af.Coremail.ctchengben@mail.scut.edu.cn> <6a7220ed.7ccc.18bceb9edc7.Coremail.ctchengben@mail.scut.edu.cn> <4570079e.7f3e.18bd2d04793.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: <939D150A-893C-46B4-9DE7-65483C6F9240@petsc.dev> It is successfully installed and you can start using it. Barry > On Nov 15, 2023, at 6:49?AM, ?? wrote: > > Hi Barry, > > I follow your suggestion and the result showed: > mpiexec -n 1 ./ex19 -snes_monitor > lid velocity = 0.0625, prandtl # = 1., grashof # = 1. > 0 SNES Function norm 2.391552133017e-01 > 1 SNES Function norm 6.839858507066e-05 > 2 SNES Function norm 8.558777232425e-11 > Number of SNES iterations = 2 > > > It seems worked, shall I try the more examples or PETSc had been installed successfully on my computer. > > sinserely, > > Cheng. > > > > > > > -----????----- > ???: "Barry Smith" > > ????: 2023-11-15 01:17:42 (???) > ???: ?? > > ??: petsc-users > > ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI > > > > You can check the build was successful manually. > > cd src/snes/tutorials > make ex19 > mpiexec -n 1 ./ex19 -snes_monitor > > > >> On Nov 14, 2023, at 11:46?AM, ?? > wrote: >> >> Hi Barry, >> >> Thanks for the suggestion. It seems good after I change complier to english.Then I begin to configure and make it. >> But unfortunate, when I try to make PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.19.2 PETSC_ARCH=arch-mswin-c-debug check >> It just happen to an error: >> Running check examples to verify correct installation >> Using PETSC_DIR=/cygdrive/g/mypetsc/petsc-3.19.2 and PETSC_ARCH=arch-mswin-c-debug >> /usr/bin/bash: -c: line 9: unexpected EOF while looking for matching `"' >> make[1]: *** [makefile:123: check] Error 2 >> make: *** [GNUmakefile:17: check] Error 2 >> >> >> So I send email to look for you help, and configure.log is attached. >> >> >> sinserely, >> Cheng. >> >> >> >> >> -----????----- >> ???: "Barry Smith" > >> ????: 2023-11-13 11:19:23 (???) >> ???: ?? > >> ??: petsc-users > >> ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI >> >> >> The configure is failing while testing the Intel Fortran compiler >> >> Executing: /cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win_ifort -c -o /tmp/petsc-61cxbt4e/config.setCompilers/conftest.o -I/tmp/petsc-61cxbt4e/config.compilers -I/tmp/petsc-61cxbt4e/config.setCompilers -MT -Z7 -Od /tmp/petsc-61cxbt4e/config.setCompilers/conftest.F90 >> Successful compile: >> Source: >> program main >> >> end >> >> Executing: /cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win_ifort -o /tmp/petsc-61cxbt4e/config.setCompilers/conftest.exe -MT -Z7 -Od /tmp/petsc-61cxbt4e/config.setCompilers/conftest.o >> stdout: LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? >> Linker output before filtering: >> LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? >> Linker output after filtering: >> LINK : ?????? G:\cygwin\tmp\PE9718~1\CONFIG~1.SET\conftest.exe ???????????????????????????????????????? 
>> >> Can you please try changing the compiler to print its messages in English and then attempt to compile the trivial program above as indicated and send all the output. >> >> Barry >> >> >>> On Nov 12, 2023, at 9:02?PM, ?? > wrote: >>> >>> Sorry for replying to your email so late, and I find configure.log >>> >>> >>> -----????----- >>> ???: "Barry Smith" > >>> ????: 2023-11-08 23:42:30 (???) >>> ???: ?? > >>> ??: petsc-users at mcs.anl.gov >>> ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using Intel MPI >>> >>> >>> Send the file $PETSC_ARCH/lib/petsc/conf/configure.log >>> >>> >>> >>>> On Nov 8, 2023, at 12:20?AM, ?? > wrote: >>>> >>>> Hello, >>>> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: >>>> 1. PETSc: version 3.19.2 >>>> 2. VS: version 2022 >>>> 3. Intel MPI: download Intel oneAPI Base Toolkit and HPC Toolkit >>>> 4. Cygwin: see the picture attatched (see picture cygwin) >>>> >>>> >>>> And the compiler option in configuration is: >>>> ./configure --prefix=/cygdrive/g/mypetsc/petsc2023 --with-cc='win32fe cl' --with-fc='win32fe ifort' --with-cxx='win32fe cl' --with-shared-libraries=0 >>>> --with-mpi-include=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/include >>>> --with-mpi-lib=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/libTherefore, I write this e-mail to look for your help. >>>> /release/impi.lib >>>> --with-mpiexec=/cygdrive/g/Intel/oneAPI/mpi/2021.10.0/bin/mpiexec -localonly >>>> >>>> >>>> >>>> >>>> but there return an error: >>>> ********************************************************************************************* >>>> OSError while running ./configure >>>> --------------------------------------------------------------------------------------------- >>>> Cannot run executables created with FC. If this machine uses a batch system >>>> to submit jobs you will need to configure using ./configure with the additional option >>>> --with-batch. >>>> Otherwise there is problem with the compilers. Can you compile and run code with your >>>> compiler '/cygdrive/g/mypetsc/petsc-3.19.2/lib/petsc/bin/win32fe/win32fe ifort'? >>>> See https://petsc.org/release/faq/#error-libimf >>>> ********************************************************************************************* >>>> >>>> >>>> >>>> Then I try to open configure.log in petsc, but there turnout an error that I can't open it.(see picture 1) >>>> >>>> And then I right click on properties and click safety,it just turnout "The permissions on test directory are incorrectly ordered.which may cause some entries to be ineffective." (see picture 2) >>>> >>>> And it also likely seen ?NULL SID? as the top entry in permission lists.(see picture 3) >>>> >>>> Then i follow this blog(https://blog.dhampir.no/content/forcing-cygwin-to-create-sane-permissions-on-windows) to edit /etc/fstab in Cygwin, and add ?noacl? to the mount options for /cygdrive. >>>> >>>> But it's not working. >>>> >>>> So I can't sent configure.log to you guys, it seems cygwin that installed in my computer happened to some problem. >>>> >>>> Mayebe the error happened in the configure on petsc just because of this reason. >>>> >>>> >>>> >>>> So I wrrit this email to report my problem and ask for your help. >>>> >>>> >>>> Looking forward your reply! >>>> >>>> >>>> sinserely, >>>> Cheng. >>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carljohanthore at gmail.com Thu Nov 16 08:02:14 2023 From: carljohanthore at gmail.com (Carl-Johan Thore) Date: Thu, 16 Nov 2023 15:02:14 +0100 Subject: [petsc-users] Get DM used to create Vec Message-ID: Hi, Given a Vec (or Mat) created at some point with DMCreate..., is it possible to retrieve from this Vec a pointer to the DM used to create it? (I could perhaps build my own Vec-type on top of PETSc's which carried with it such a pointer but that doesn't seem like a good idea) Kind regards, Carl-Johan -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Nov 16 09:24:10 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 16 Nov 2023 09:24:10 -0600 Subject: [petsc-users] Get DM used to create Vec In-Reply-To: References: Message-ID: I was wondering if you can use https://petsc.org/release/manualpages/Sys/PetscObjectCompose/ to attach the DM to the Vec. --Junchao Zhang On Thu, Nov 16, 2023 at 8:06?AM Carl-Johan Thore wrote: > Hi, > > Given a Vec (or Mat) created at some point with DMCreate..., > is it possible to retrieve from this Vec a pointer to the DM used > to create it? > > (I could perhaps build my own Vec-type on top of PETSc's > which carried with it such a pointer but that doesn't seem like > a good idea) > > Kind regards, > Carl-Johan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 16 09:24:13 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 16 Nov 2023 10:24:13 -0500 Subject: [petsc-users] Get DM used to create Vec In-Reply-To: References: Message-ID: https://petsc.org/release/manualpages/DM/VecGetDM/ Note there is also https://petsc.org/release/manualpages/Sys/PetscObjectCompose/ which along with https://petsc.org/release/manualpages/Sys/PetscContainerCreate/ allows you to attach any data you like to a Vec for later access. This approach supports layering your own higher-level vector information on top of regular Vec (it is done this way to support C and old Fortran that don't support inheritance). > On Nov 16, 2023, at 9:02?AM, Carl-Johan Thore wrote: > > Hi, > > Given a Vec (or Mat) created at some point with DMCreate..., > is it possible to retrieve from this Vec a pointer to the DM used > to create it? > > (I could perhaps build my own Vec-type on top of PETSc's > which carried with it such a pointer but that doesn't seem like > a good idea) > > Kind regards, > Carl-Johan -------------- next part -------------- An HTML attachment was scrubbed... URL: From carljohanthore at gmail.com Thu Nov 16 09:26:50 2023 From: carljohanthore at gmail.com (Carl-Johan Thore) Date: Thu, 16 Nov 2023 16:26:50 +0100 Subject: [petsc-users] Get DM used to create Vec In-Reply-To: References: Message-ID: Perfect, Thanks! On Thu, Nov 16, 2023 at 4:24?PM Barry Smith wrote: > > https://petsc.org/release/manualpages/DM/VecGetDM/ > > Note there is also > https://petsc.org/release/manualpages/Sys/PetscObjectCompose/ which along > with https://petsc.org/release/manualpages/Sys/PetscContainerCreate/ allows > you to attach any data you like to a Vec for later access. This approach > supports layering your own higher-level vector information on top of > regular Vec (it is done this way to support C and old Fortran that don't > support inheritance). 
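A petsc4py sketch of the two mechanisms quoted above, assuming petsc4py exposes Vec.getDM() for VecGetDM() and Object.compose()/query() for PetscObjectCompose()/PetscObjectQuery(); the grid size and key name are arbitrary.

from petsc4py import PETSc

da = PETSc.DMDA().create([32, 32], comm=PETSc.COMM_WORLD)
v = da.createGlobalVec()

# Counterpart of VecGetDM(): recover the DM the Vec was created from
# (assumed to be exposed as Vec.getDM()).
dm_back = v.getDM()

# Counterpart of PetscObjectCompose()/PetscObjectQuery(): attach any PETSc
# object to the Vec under a string key and fetch it back later by name.
v.compose("creating-dm", da)
da_back = v.query("creating-dm")
assert da_back.handle == da.handle  # same underlying PETSc object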
> > > > On Nov 16, 2023, at 9:02?AM, Carl-Johan Thore > wrote: > > Hi, > > Given a Vec (or Mat) created at some point with DMCreate..., > is it possible to retrieve from this Vec a pointer to the DM used > to create it? > > (I could perhaps build my own Vec-type on top of PETSc's > which carried with it such a pointer but that doesn't seem like > a good idea) > > Kind regards, > Carl-Johan > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Nov 16 09:39:50 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Nov 2023 10:39:50 -0500 Subject: [petsc-users] Get DM used to create Vec In-Reply-To: References: Message-ID: On Thu, Nov 16, 2023 at 10:30?AM Barry Smith wrote: > > https://petsc.org/release/manualpages/DM/VecGetDM/ > > Note there is also > https://petsc.org/release/manualpages/Sys/PetscObjectCompose/ which along > with https://petsc.org/release/manualpages/Sys/PetscContainerCreate/ allows > you to attach any data you like to a Vec for later access. This approach > supports layering your own higher-level vector information on top of > regular Vec (it is done this way to support C and old Fortran that don't > support inheritance). > Caution: It is not hard to create reference cycles when using PetscObjectCompose(). Thanks, Matt > On Nov 16, 2023, at 9:02?AM, Carl-Johan Thore > wrote: > > Hi, > > Given a Vec (or Mat) created at some point with DMCreate..., > is it possible to retrieve from this Vec a pointer to the DM used > to create it? > > (I could perhaps build my own Vec-type on top of PETSc's > which carried with it such a pointer but that doesn't seem like > a good idea) > > Kind regards, > Carl-Johan > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Nov 16 17:19:02 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 16 Nov 2023 18:19:02 -0500 Subject: [petsc-users] VecNorm causes program to hang Message-ID: I have a program which reads a vector from file into an array, and then uses that array to create a PETSc Vec object. The Vec is defined on the global communicator, but not all processes actually contain entries of it. For example, suppose we have 4 processors, and the vector is of size 10. Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks 2 and 3 will not have any entries of the Vec. This Vec is then used as an input to other parts of the code, and those work fine. However, if I try to take the norm of the Vec with VecNorm(), I get the error `MPI_Allreduce() called in different locations (code lines) on different processors` The stack trace shows that ranks 0 and 1 (from the above example) are still in the VecNorm() function while ranks 2 and 3 have moved on to a later part of the code. If I add a PetscBarrier() after the VecNorm(), I find that the program hangs. The funny thing is that part of the code duplicates the Vec with VecDuplicate() and assigns to the duplicated vector the result of some computations. The duplicated Vec has the same layout as the original Vec, but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), however, the copied Vec also causes VecNorm() to hang. I've printed out the original Vec, and there are no corrupted/NaN entries. 
I have a temporary workaround where I perturb the original Vec slightly before copying it to another Vec. This causes the program to successfully terminate. Any advice on how to get VecNorm() working with the original Vec? Thanks, Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 16 17:30:31 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 16 Nov 2023 18:30:31 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: Congratulations you have found a ginormous bug in PETSc! Thanks for the detail information on the problem. I will post a fix shortly. Barry > On Nov 16, 2023, at 6:19?PM, Sreeram R Venkat wrote: > > I have a program which reads a vector from file into an array, and then uses that array to create a PETSc Vec object. The Vec is defined on the global communicator, but not all processes actually contain entries of it. For example, suppose we have 4 processors, and the vector is of size 10. Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks 2 and 3 will not have any entries of the Vec. > > This Vec is then used as an input to other parts of the code, and those work fine. However, if I try to take the norm of the Vec with VecNorm(), I get the error > > `MPI_Allreduce() called in different locations (code lines) on different processors` > > The stack trace shows that ranks 0 and 1 (from the above example) are still in the VecNorm() function while ranks 2 and 3 have moved on to a later part of the code. If I add a PetscBarrier() after the VecNorm(), I find that the program hangs. > > The funny thing is that part of the code duplicates the Vec with VecDuplicate() and assigns to the duplicated vector the result of some computations. The duplicated Vec has the same layout as the original Vec, but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), however, the copied Vec also causes VecNorm() to hang. I've printed out the original Vec, and there are no corrupted/NaN entries. > > I have a temporary workaround where I perturb the original Vec slightly before copying it to another Vec. This causes the program to successfully terminate. > > Any advice on how to get VecNorm() working with the original Vec? > > Thanks, > Sreeram -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Nov 16 19:27:21 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 16 Nov 2023 20:27:21 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: On Thu, Nov 16, 2023 at 6:19?PM Sreeram R Venkat wrote: > I have a program which reads a vector from file into an array, and then > uses that array to create a PETSc Vec object. The Vec is defined on the > global communicator, but not all processes actually contain entries of it. > For example, suppose we have 4 processors, and the vector is of size 10. > Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks > 2 and 3 will not have any entries of the Vec. > > This Vec is then used as an input to other parts of the code, and those > work fine. However, if I try to take the norm of the Vec with VecNorm(), I > get the error > > `MPI_Allreduce() called in different locations (code lines) on different > processors` > > The stack trace shows that ranks 0 and 1 (from the above example) are > still in the VecNorm() function while ranks 2 and 3 have moved on to a > later part of the code. 
If I add a PetscBarrier() after the VecNorm(), I > find that the program hangs. > > The funny thing is that part of the code duplicates the Vec with > VecDuplicate() and assigns to the duplicated vector the result of some > computations. The duplicated Vec has the same layout as the original Vec, > but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), > however, the copied Vec also causes VecNorm() to hang. I've printed out the > original Vec, and there are no corrupted/NaN entries. > > I have a temporary workaround where I perturb the original Vec slightly > before copying it to another Vec. This causes the program to successfully > terminate. > > Any advice on how to get VecNorm() working with the original Vec? > Vecs with empty layouts work fine, so it must be something else about how it is created. In order to track it down, I would first make a short program that just creates the Vec as you say and see if it hangs. If so, just send it and we will debug it. If not, I would systematically cut down your program until you get something that hangs that you can send to us. Thanks, Matt > Thanks, > Sreeram > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Nov 16 19:38:02 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 16 Nov 2023 20:38:02 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: Ok, will do. It may take me a few days to get a minimal reproducible example though since the rest of the program has gotten quite large. Thanks, Sreeram On Thu, Nov 16, 2023 at 8:27?PM Matthew Knepley wrote: > On Thu, Nov 16, 2023 at 6:19?PM Sreeram R Venkat > wrote: > >> I have a program which reads a vector from file into an array, and then >> uses that array to create a PETSc Vec object. The Vec is defined on the >> global communicator, but not all processes actually contain entries of it. >> For example, suppose we have 4 processors, and the vector is of size 10. >> Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks >> 2 and 3 will not have any entries of the Vec. >> >> This Vec is then used as an input to other parts of the code, and those >> work fine. However, if I try to take the norm of the Vec with VecNorm(), I >> get the error >> >> `MPI_Allreduce() called in different locations (code lines) on different >> processors` >> >> The stack trace shows that ranks 0 and 1 (from the above example) are >> still in the VecNorm() function while ranks 2 and 3 have moved on to a >> later part of the code. If I add a PetscBarrier() after the VecNorm(), I >> find that the program hangs. >> >> The funny thing is that part of the code duplicates the Vec with >> VecDuplicate() and assigns to the duplicated vector the result of some >> computations. The duplicated Vec has the same layout as the original Vec, >> but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), >> however, the copied Vec also causes VecNorm() to hang. I've printed out the >> original Vec, and there are no corrupted/NaN entries. >> >> I have a temporary workaround where I perturb the original Vec slightly >> before copying it to another Vec. This causes the program to successfully >> terminate. 
>> >> Any advice on how to get VecNorm() working with the original Vec? >> > > Vecs with empty layouts work fine, so it must be something else about how > it is created. > > In order to track it down, I would first make a short program that just > creates the Vec as you say and see if it hangs. If so, just send it and we > will debug it. If not, I would systematically cut down your program until > you get something that hangs that you can send to us. > > Thanks, > > Matt > > >> Thanks, >> Sreeram >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Thu Nov 16 20:41:38 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Thu, 16 Nov 2023 21:41:38 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: Actually, here's a short test case I just made. I have it on a git repo: https://github.com/s769/petsc-test I put some instructions for how to build and run, but if there are issues, please let me know. In this small test code, I noticed that there are some CUDA memory errors in the VecAXPY() line if the proc_cols variable is not 1. Still trying to figure out what might be causing that, but in the meantime, the code I have up there hangs for proc_rows=3, proc_cols=1, n=10 when we try to get the norm of the Vec. Hope this helps. Thanks, Sreeram On Thu, Nov 16, 2023 at 8:38?PM Sreeram R Venkat wrote: > Ok, will do. It may take me a few days to get a minimal reproducible > example though since the rest of the program has gotten quite large. > > Thanks, > Sreeram > > On Thu, Nov 16, 2023 at 8:27?PM Matthew Knepley wrote: > >> On Thu, Nov 16, 2023 at 6:19?PM Sreeram R Venkat >> wrote: >> >>> I have a program which reads a vector from file into an array, and then >>> uses that array to create a PETSc Vec object. The Vec is defined on the >>> global communicator, but not all processes actually contain entries of it. >>> For example, suppose we have 4 processors, and the vector is of size 10. >>> Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks >>> 2 and 3 will not have any entries of the Vec. >>> >>> This Vec is then used as an input to other parts of the code, and those >>> work fine. However, if I try to take the norm of the Vec with VecNorm(), I >>> get the error >>> >>> `MPI_Allreduce() called in different locations (code lines) on different >>> processors` >>> >>> The stack trace shows that ranks 0 and 1 (from the above example) are >>> still in the VecNorm() function while ranks 2 and 3 have moved on to a >>> later part of the code. If I add a PetscBarrier() after the VecNorm(), I >>> find that the program hangs. >>> >>> The funny thing is that part of the code duplicates the Vec with >>> VecDuplicate() and assigns to the duplicated vector the result of some >>> computations. The duplicated Vec has the same layout as the original Vec, >>> but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), >>> however, the copied Vec also causes VecNorm() to hang. I've printed out the >>> original Vec, and there are no corrupted/NaN entries. >>> >>> I have a temporary workaround where I perturb the original Vec slightly >>> before copying it to another Vec. This causes the program to successfully >>> terminate. 
>>> >>> Any advice on how to get VecNorm() working with the original Vec? >>> >> >> Vecs with empty layouts work fine, so it must be something else about how >> it is created. >> >> In order to track it down, I would first make a short program that just >> creates the Vec as you say and see if it hangs. If so, just send it and we >> will debug it. If not, I would systematically cut down your program until >> you get something that hangs that you can send to us. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Sreeram >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maruthinh at gmail.com Thu Nov 16 23:57:01 2023 From: maruthinh at gmail.com (Maruthi NH) Date: Fri, 17 Nov 2023 11:27:01 +0530 Subject: [petsc-users] Error running make check with OneAPI C/C++ and Fortran compilers on Windows Message-ID: Hi all, I could successfully compile PETSc on Windows with Intel OneAPI C/C++ and Fortran compilers, however, when I tried to make a check after the successful installation, I got the following error message. I have also attached the configuration file. make PETSC_DIR=/home/ngh/petsc PETSC_ARCH=intel-petsc-tag-v3.20.1 check Running PETSc check examples to verify correct installation Using PETSC_DIR=/home/ngh/petsc and PETSC_ARCH=intel-petsc-tag-v3.20.1 Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See https://petsc.org/release/faq/ =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = RANK 0 PID 13384 RUNNING AT BLRLAP1521 = EXIT STATUS: -1073741819 (c0000005) =================================================================================== Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See https://petsc.org/release/faq/ =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = RANK 0 PID 20616 RUNNING AT BLRLAP1521 = EXIT STATUS: -1073741819 (c0000005) =================================================================================== =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = RANK 1 PID 16900 RUNNING AT BLRLAP1521 = EXIT STATUS: -1073741819 (c0000005) =================================================================================== Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See https://petsc.org/release/faq/ forrtl: severe (157): Program Exception - access violation Image PC Routine Line Source ex5f.exe 00007FF661DB569B Unknown Unknown Unknown ex5f.exe 00007FF661A760AF Unknown Unknown Unknown ex5f.exe 00007FF6619E6EBD Unknown Unknown Unknown ex5f.exe 00007FF6619C6294 Unknown Unknown Unknown ex5f.exe 00007FF6619C3152 Unknown Unknown Unknown ex5f.exe 00007FF6619C1051 Unknown Unknown Unknown ex5f.exe 00007FF662C3C16E Unknown Unknown Unknown ex5f.exe 00007FF662C3CA50 Unknown Unknown Unknown KERNEL32.DLL 00007FFB12187344 Unknown Unknown Unknown ntdll.dll 00007FFB140C26B1 Unknown Unknown Unknown Completed PETSc check examples Error while running make check make[1]: *** [makefile:132: check] Error 1 make: *** [GNUmakefile:17: check] Error 2 Regards, mnh 
-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: intel-petsc-tag-v3.20.1.py Type: text/x-python Size: 1064 bytes Desc: not available URL: From mfadams at lbl.gov Fri Nov 17 08:32:41 2023 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 17 Nov 2023 09:32:41 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: I get this error: (base) 06:30 2 login10 master= perlmutter:~/petsc-test$ bash -x buildme.sh + '[' -z '' ']' + case "$-" in + __lmod_vx=x + '[' -n x ']' + set +x Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for this output (/opt/cray/pe/lmod/lmod/init/bash) Shell debugging restarted + unset __lmod_vx + git pull Already up to date. + cmake . -- Configuring done -- Generating done -- Build files have been written to: /global/homes/m/madams/petsc-test + make -j [ 33%] Building CUDA object CMakeFiles/test.dir/main.cu.o In file included from /global/homes/m/madams/petsc/include/petscbag.h:3, from /global/homes/m/madams/petsc/include/petsc.h:6, from /global/homes/m/madams/petsc-test/shared.cuh:8, from /global/homes/m/madams/petsc-test/main.cu:1: /global/homes/m/madams/petsc/include/petscsys.h:65:12: fatal error: mpi.h: No such file or directory 65 | #include | ^~~~~~~ compilation terminated. make[2]: *** [CMakeFiles/test.dir/build.make:76: CMakeFiles/test.dir/main.cu.o] Error 1 make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/test.dir/all] Error 2 make: *** [Makefile:91: all] Error 2 (base) 06:31 2 login10 master= perlmutter:~/petsc-test$ On Thu, Nov 16, 2023 at 9:42?PM Sreeram R Venkat wrote: > Actually, here's a short test case I just made. > I have it on a git repo: https://github.com/s769/petsc-test > > I put some instructions for how to build and run, but if there are issues, > please let me know. > > In this small test code, I noticed that there are some CUDA memory errors > in the VecAXPY() line if the proc_cols variable is not 1. Still trying to > figure out what might be causing that, but in the meantime, the code I have > up there hangs for proc_rows=3, proc_cols=1, n=10 when we try to get the > norm of the Vec. > > Hope this helps. > > Thanks, > Sreeram > > On Thu, Nov 16, 2023 at 8:38?PM Sreeram R Venkat > wrote: > >> Ok, will do. It may take me a few days to get a minimal reproducible >> example though since the rest of the program has gotten quite large. >> >> Thanks, >> Sreeram >> >> On Thu, Nov 16, 2023 at 8:27?PM Matthew Knepley >> wrote: >> >>> On Thu, Nov 16, 2023 at 6:19?PM Sreeram R Venkat >>> wrote: >>> >>>> I have a program which reads a vector from file into an array, and then >>>> uses that array to create a PETSc Vec object. The Vec is defined on the >>>> global communicator, but not all processes actually contain entries of it. >>>> For example, suppose we have 4 processors, and the vector is of size 10. >>>> Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks >>>> 2 and 3 will not have any entries of the Vec. >>>> >>>> This Vec is then used as an input to other parts of the code, and those >>>> work fine. However, if I try to take the norm of the Vec with VecNorm(), I >>>> get the error >>>> >>>> `MPI_Allreduce() called in different locations (code lines) on >>>> different processors` >>>> >>>> The stack trace shows that ranks 0 and 1 (from the above example) are >>>> still in the VecNorm() function while ranks 2 and 3 have moved on to a >>>> later part of the code. 
If I add a PetscBarrier() after the VecNorm(), I >>>> find that the program hangs. >>>> >>>> The funny thing is that part of the code duplicates the Vec with >>>> VecDuplicate() and assigns to the duplicated vector the result of some >>>> computations. The duplicated Vec has the same layout as the original Vec, >>>> but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), >>>> however, the copied Vec also causes VecNorm() to hang. I've printed out the >>>> original Vec, and there are no corrupted/NaN entries. >>>> >>>> I have a temporary workaround where I perturb the original Vec slightly >>>> before copying it to another Vec. This causes the program to successfully >>>> terminate. >>>> >>>> Any advice on how to get VecNorm() working with the original Vec? >>>> >>> >>> Vecs with empty layouts work fine, so it must be something else about >>> how it is created. >>> >>> In order to track it down, I would first make a short program that just >>> creates the Vec as you say and see if it hangs. If so, just send it and we >>> will debug it. If not, I would systematically cut down your program until >>> you get something that hangs that you can send to us. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Sreeram >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Nov 17 09:56:01 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Nov 2023 10:56:01 -0500 Subject: [petsc-users] Error running make check with OneAPI C/C++ and Fortran compilers on Windows In-Reply-To: References: Message-ID: <19B25FB7-66E0-478F-9A18-4BD0455A9C48@petsc.dev> Please do cd src/snes/tutorials make ex19 mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres then with -n 2 Cut and paste all the output and include it in your email response Barry > On Nov 17, 2023, at 12:57?AM, Maruthi NH wrote: > > Hi all, > > I could successfully compile PETSc on Windows with Intel OneAPI C/C++ and Fortran compilers, however, when I tried to make a check after the successful installation, I got the following error message. I have also attached the configuration file. 
> > make PETSC_DIR=/home/ngh/petsc PETSC_ARCH=intel-petsc-tag-v3.20.1 check > Running PETSc check examples to verify correct installation > Using PETSC_DIR=/home/ngh/petsc and PETSC_ARCH=intel-petsc-tag-v3.20.1 > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > See https://petsc.org/release/faq/ > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 13384 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > =================================================================================== > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > See https://petsc.org/release/faq/ > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 20616 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > =================================================================================== > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 1 PID 16900 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > =================================================================================== > Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process > See https://petsc.org/release/faq/ > forrtl: severe (157): Program Exception - access violation > Image PC Routine Line Source > ex5f.exe 00007FF661DB569B Unknown Unknown Unknown > ex5f.exe 00007FF661A760AF Unknown Unknown Unknown > ex5f.exe 00007FF6619E6EBD Unknown Unknown Unknown > ex5f.exe 00007FF6619C6294 Unknown Unknown Unknown > ex5f.exe 00007FF6619C3152 Unknown Unknown Unknown > ex5f.exe 00007FF6619C1051 Unknown Unknown Unknown > ex5f.exe 00007FF662C3C16E Unknown Unknown Unknown > ex5f.exe 00007FF662C3CA50 Unknown Unknown Unknown > KERNEL32.DLL 00007FFB12187344 Unknown Unknown Unknown > ntdll.dll 00007FFB140C26B1 Unknown Unknown Unknown > Completed PETSc check examples > Error while running make check > make[1]: *** [makefile:132: check] Error 1 > make: *** [GNUmakefile:17: check] Error 2 > > Regards, > mnh > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Fri Nov 17 11:05:10 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 17 Nov 2023 12:05:10 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: I've updated the buildme script to specify the MPI and CUDA compilers. Please make sure those modules are loaded, and let me know if it works. Thanks, Sreeram On Fri, Nov 17, 2023 at 9:32?AM Mark Adams wrote: > I get this error: > > (base) 06:30 2 login10 master= perlmutter:~/petsc-test$ bash -x buildme.sh > + '[' -z '' ']' > + case "$-" in > + __lmod_vx=x > + '[' -n x ']' > + set +x > Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for this > output (/opt/cray/pe/lmod/lmod/init/bash) > Shell debugging restarted > + unset __lmod_vx > + git pull > Already up to date. > + cmake . 
> -- Configuring done > -- Generating done > -- Build files have been written to: /global/homes/m/madams/petsc-test > + make -j > [ 33%] Building CUDA object CMakeFiles/test.dir/main.cu.o > In file included from /global/homes/m/madams/petsc/include/petscbag.h:3, > from /global/homes/m/madams/petsc/include/petsc.h:6, > from /global/homes/m/madams/petsc-test/shared.cuh:8, > from /global/homes/m/madams/petsc-test/main.cu:1: > /global/homes/m/madams/petsc/include/petscsys.h:65:12: fatal error: mpi.h: > No such file or directory > 65 | #include > | ^~~~~~~ > compilation terminated. > make[2]: *** [CMakeFiles/test.dir/build.make:76: > CMakeFiles/test.dir/main.cu.o] Error 1 > make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/test.dir/all] Error 2 > make: *** [Makefile:91: all] Error 2 > (base) 06:31 2 login10 master= perlmutter:~/petsc-test$ > > > On Thu, Nov 16, 2023 at 9:42?PM Sreeram R Venkat > wrote: > >> Actually, here's a short test case I just made. >> I have it on a git repo: https://github.com/s769/petsc-test >> >> I put some instructions for how to build and run, but if there are >> issues, please let me know. >> >> In this small test code, I noticed that there are some CUDA memory errors >> in the VecAXPY() line if the proc_cols variable is not 1. Still trying to >> figure out what might be causing that, but in the meantime, the code I have >> up there hangs for proc_rows=3, proc_cols=1, n=10 when we try to get the >> norm of the Vec. >> >> Hope this helps. >> >> Thanks, >> Sreeram >> >> On Thu, Nov 16, 2023 at 8:38?PM Sreeram R Venkat >> wrote: >> >>> Ok, will do. It may take me a few days to get a minimal reproducible >>> example though since the rest of the program has gotten quite large. >>> >>> Thanks, >>> Sreeram >>> >>> On Thu, Nov 16, 2023 at 8:27?PM Matthew Knepley >>> wrote: >>> >>>> On Thu, Nov 16, 2023 at 6:19?PM Sreeram R Venkat >>>> wrote: >>>> >>>>> I have a program which reads a vector from file into an array, and >>>>> then uses that array to create a PETSc Vec object. The Vec is defined on >>>>> the global communicator, but not all processes actually contain entries of >>>>> it. For example, suppose we have 4 processors, and the vector is of size >>>>> 10. Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. >>>>> Ranks 2 and 3 will not have any entries of the Vec. >>>>> >>>>> This Vec is then used as an input to other parts of the code, and >>>>> those work fine. However, if I try to take the norm of the Vec with >>>>> VecNorm(), I get the error >>>>> >>>>> `MPI_Allreduce() called in different locations (code lines) on >>>>> different processors` >>>>> >>>>> The stack trace shows that ranks 0 and 1 (from the above example) are >>>>> still in the VecNorm() function while ranks 2 and 3 have moved on to a >>>>> later part of the code. If I add a PetscBarrier() after the VecNorm(), I >>>>> find that the program hangs. >>>>> >>>>> The funny thing is that part of the code duplicates the Vec with >>>>> VecDuplicate() and assigns to the duplicated vector the result of some >>>>> computations. The duplicated Vec has the same layout as the original Vec, >>>>> but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), >>>>> however, the copied Vec also causes VecNorm() to hang. I've printed out the >>>>> original Vec, and there are no corrupted/NaN entries. >>>>> >>>>> I have a temporary workaround where I perturb the original Vec >>>>> slightly before copying it to another Vec. This causes the program to >>>>> successfully terminate. 
>>>>> >>>>> Any advice on how to get VecNorm() working with the original Vec? >>>>> >>>> >>>> Vecs with empty layouts work fine, so it must be something else about >>>> how it is created. >>>> >>>> In order to track it down, I would first make a short program that just >>>> creates the Vec as you say and see if it hangs. If so, just send it and we >>>> will debug it. If not, I would systematically cut down your program until >>>> you get something that hangs that you can send to us. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Sreeram >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Nov 17 11:09:37 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Nov 2023 12:09:37 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: So the "bug" is not as ginormous as I originally thought. It will never produce incorrect results but can result in the errors you received. The problem is if (row_rank == 0) { PetscCall(VecCUDAReplaceArray(v, d_a)); } The place/replacearray routines are actually collective; and need to be called by all MPI processes that own a vector regardless of the local size. This is because the call can invalidate the previously known norm values that have been cached in the vector. If the norm values are invalidated on some MPI processes but not others you will get the error you have seen. Barry I will prepare a branch with better documentation and clearer error handling for this situation. > On Nov 16, 2023, at 6:30?PM, Barry Smith wrote: > > > Congratulations you have found a ginormous bug in PETSc! Thanks for the detail information on the problem. > > I will post a fix shortly. > > Barry > > >> On Nov 16, 2023, at 6:19?PM, Sreeram R Venkat wrote: >> >> I have a program which reads a vector from file into an array, and then uses that array to create a PETSc Vec object. The Vec is defined on the global communicator, but not all processes actually contain entries of it. For example, suppose we have 4 processors, and the vector is of size 10. Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks 2 and 3 will not have any entries of the Vec. >> >> This Vec is then used as an input to other parts of the code, and those work fine. However, if I try to take the norm of the Vec with VecNorm(), I get the error >> >> `MPI_Allreduce() called in different locations (code lines) on different processors` >> >> The stack trace shows that ranks 0 and 1 (from the above example) are still in the VecNorm() function while ranks 2 and 3 have moved on to a later part of the code. If I add a PetscBarrier() after the VecNorm(), I find that the program hangs. >> >> The funny thing is that part of the code duplicates the Vec with VecDuplicate() and assigns to the duplicated vector the result of some computations. The duplicated Vec has the same layout as the original Vec, but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), however, the copied Vec also causes VecNorm() to hang. I've printed out the original Vec, and there are no corrupted/NaN entries. >> >> I have a temporary workaround where I perturb the original Vec slightly before copying it to another Vec. 
This causes the program to successfully terminate. >> >> Any advice on how to get VecNorm() working with the original Vec? >> >> Thanks, >> Sreeram > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maruthinh at gmail.com Fri Nov 17 11:14:39 2023 From: maruthinh at gmail.com (Maruthi NH) Date: Fri, 17 Nov 2023 22:44:39 +0530 Subject: [petsc-users] Error running make check with OneAPI C/C++ and Fortran compilers on Windows In-Reply-To: <19B25FB7-66E0-478F-9A18-4BD0455A9C48@petsc.dev> References: <19B25FB7-66E0-478F-9A18-4BD0455A9C48@petsc.dev> Message-ID: Hi Barry, I get the following error. ngh at ngh1 ~/petsc/src/snes/tutorials $ make ex19 /home/ngh/petsc/lib/petsc/bin/win32fe/win32fe icl -Qwd10161 -Qstd=c99 -MT -O3 -I/home/ngh/petsc/include -I/home/ngh/petsc/intel-petsc-tag-v3.20.1/include -I/cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/include ex19.c -L/home/ngh/petsc/intel-petsc-tag-v3.20.1/lib -L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/2022.1.0/lib/intel64 -lpetsc mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib /cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/lib/release/impi.lib Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib -o ex19 ex19.c ngh at ngh1 ~/petsc/src/snes/tutorials $ mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres ================================================================================== = = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = RANK 0 PID 1768 RUNNING AT ngh1 = EXIT STATUS: -1073741819 (c0000005) ================================================================================== = ngh at ngh1 ~/petsc/src/snes/tutorials $ mpiexec -n 2 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres ================================================================================== = = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = RANK 0 PID 25936 RUNNING AT ngh1 = EXIT STATUS: -1073741819 (c0000005) ================================================================================== = ================================================================================== = = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = RANK 1 PID 5216 RUNNING AT ngh1 = EXIT STATUS: -1073741819 (c0000005) ================================================================================== = Regards, Maruthi On Fri, Nov 17, 2023 at 9:26?PM Barry Smith wrote: > > Please do > > cd src/snes/tutorials > make ex19 > mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres > > then with -n 2 > > Cut and paste all the output and include it in your email response > > Barry > > > > On Nov 17, 2023, at 12:57?AM, Maruthi NH wrote: > > Hi all, > > I could successfully compile PETSc on Windows with Intel OneAPI C/C++ and > Fortran compilers, however, when I tried to make a check after the > successful installation, I got the following error message. I have also > attached the configuration file. 
> > make PETSC_DIR=/home/ngh/petsc PETSC_ARCH=intel-petsc-tag-v3.20.1 check > Running PETSc check examples to verify correct installation > Using PETSC_DIR=/home/ngh/petsc and PETSC_ARCH=intel-petsc-tag-v3.20.1 > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > See https://petsc.org/release/faq/ > > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 13384 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > > =================================================================================== > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > See https://petsc.org/release/faq/ > > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 20616 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > > =================================================================================== > > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 1 PID 16900 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > > =================================================================================== > Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI > process > See https://petsc.org/release/faq/ > forrtl: severe (157): Program Exception - access violation > Image PC Routine Line Source > ex5f.exe 00007FF661DB569B Unknown Unknown Unknown > ex5f.exe 00007FF661A760AF Unknown Unknown Unknown > ex5f.exe 00007FF6619E6EBD Unknown Unknown Unknown > ex5f.exe 00007FF6619C6294 Unknown Unknown Unknown > ex5f.exe 00007FF6619C3152 Unknown Unknown Unknown > ex5f.exe 00007FF6619C1051 Unknown Unknown Unknown > ex5f.exe 00007FF662C3C16E Unknown Unknown Unknown > ex5f.exe 00007FF662C3CA50 Unknown Unknown Unknown > KERNEL32.DLL 00007FFB12187344 Unknown Unknown Unknown > ntdll.dll 00007FFB140C26B1 Unknown Unknown Unknown > Completed PETSc check examples > Error while running make check > make[1]: *** [makefile:132: check] Error 1 > make: *** [GNUmakefile:17: check] Error 2 > > Regards, > mnh > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Nov 17 11:54:52 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Nov 2023 12:54:52 -0500 Subject: [petsc-users] Error running make check with OneAPI C/C++ and Fortran compilers on Windows In-Reply-To: References: <19B25FB7-66E0-478F-9A18-4BD0455A9C48@petsc.dev> Message-ID: <0AD8879F-4ED7-4D5E-B5BA-9B56C8539ED8@petsc.dev> Please run mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres -info and send all the output > On Nov 17, 2023, at 12:14?PM, Maruthi NH wrote: > > Hi Barry, > > I get the following error. 
> > ngh at ngh1 ~/petsc/src/snes/tutorials > $ make ex19 > /home/ngh/petsc/lib/petsc/bin/win32fe/win32fe icl -Qwd10161 -Qstd=c99 -MT -O3 -I/home/ngh/petsc/include -I/home/ngh/petsc/intel-petsc-tag-v3.20.1/include -I/cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/include ex19.c -L/home/ngh/petsc/intel-petsc-tag-v3.20.1/lib -L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/2022.1.0/lib/intel64 -lpetsc mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib /cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/lib/release/impi.lib Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib -o ex19 > ex19.c > > ngh at ngh1 ~/petsc/src/snes/tutorials > $ mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 1768 RUNNING AT ngh1 > = EXIT STATUS: -1073741819 (c0000005) > =================================================================================== > > ngh at ngh1 ~/petsc/src/snes/tutorials > $ mpiexec -n 2 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 25936 RUNNING AT ngh1 > = EXIT STATUS: -1073741819 (c0000005) > =================================================================================== > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 1 PID 5216 RUNNING AT ngh1 > = EXIT STATUS: -1073741819 (c0000005) > =================================================================================== > > > Regards, > Maruthi > > On Fri, Nov 17, 2023 at 9:26?PM Barry Smith > wrote: >> >> Please do >> >> cd src/snes/tutorials >> make ex19 >> mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >> >> then with -n 2 >> >> Cut and paste all the output and include it in your email response >> >> Barry >> >> >> >>> On Nov 17, 2023, at 12:57?AM, Maruthi NH > wrote: >>> >>> Hi all, >>> >>> I could successfully compile PETSc on Windows with Intel OneAPI C/C++ and Fortran compilers, however, when I tried to make a check after the successful installation, I got the following error message. I have also attached the configuration file. 
>>> >>> make PETSC_DIR=/home/ngh/petsc PETSC_ARCH=intel-petsc-tag-v3.20.1 check >>> Running PETSc check examples to verify correct installation >>> Using PETSC_DIR=/home/ngh/petsc and PETSC_ARCH=intel-petsc-tag-v3.20.1 >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process >>> See https://petsc.org/release/faq/ >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 0 PID 13384 RUNNING AT BLRLAP1521 >>> = EXIT STATUS: -1073741819 (c0000005) >>> =================================================================================== >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes >>> See https://petsc.org/release/faq/ >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 0 PID 20616 RUNNING AT BLRLAP1521 >>> = EXIT STATUS: -1073741819 (c0000005) >>> =================================================================================== >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 1 PID 16900 RUNNING AT BLRLAP1521 >>> = EXIT STATUS: -1073741819 (c0000005) >>> =================================================================================== >>> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process >>> See https://petsc.org/release/faq/ >>> forrtl: severe (157): Program Exception - access violation >>> Image PC Routine Line Source >>> ex5f.exe 00007FF661DB569B Unknown Unknown Unknown >>> ex5f.exe 00007FF661A760AF Unknown Unknown Unknown >>> ex5f.exe 00007FF6619E6EBD Unknown Unknown Unknown >>> ex5f.exe 00007FF6619C6294 Unknown Unknown Unknown >>> ex5f.exe 00007FF6619C3152 Unknown Unknown Unknown >>> ex5f.exe 00007FF6619C1051 Unknown Unknown Unknown >>> ex5f.exe 00007FF662C3C16E Unknown Unknown Unknown >>> ex5f.exe 00007FF662C3CA50 Unknown Unknown Unknown >>> KERNEL32.DLL 00007FFB12187344 Unknown Unknown Unknown >>> ntdll.dll 00007FFB140C26B1 Unknown Unknown Unknown >>> Completed PETSc check examples >>> Error while running make check >>> make[1]: *** [makefile:132: check] Error 1 >>> make: *** [GNUmakefile:17: check] Error 2 >>> >>> Regards, >>> mnh >>> >>> >>> > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From maruthinh at gmail.com Fri Nov 17 11:59:42 2023 From: maruthinh at gmail.com (Maruthi NH) Date: Fri, 17 Nov 2023 23:29:42 +0530 Subject: [petsc-users] Error running make check with OneAPI C/C++ and Fortran compilers on Windows In-Reply-To: <0AD8879F-4ED7-4D5E-B5BA-9B56C8539ED8@petsc.dev> References: <19B25FB7-66E0-478F-9A18-4BD0455A9C48@petsc.dev> <0AD8879F-4ED7-4D5E-B5BA-9B56C8539ED8@petsc.dev> Message-ID: Hi Barry, It doesn't even start to run, I still get the same error. This is all the output I get. =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = RANK 0 PID 28192 RUNNING AT BLRLAP1521 = EXIT STATUS: -1073741819 (c0000005) =================================================================================== With Windows C compiler and Intel OneAPI Fortran, it works fine. But when I use the OneAPI C/C++ compiler it fails. 
Regards, Maruthi On Fri, Nov 17, 2023 at 11:25?PM Barry Smith wrote: > > Please run > > mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres -info > > and send all the output > > > > On Nov 17, 2023, at 12:14?PM, Maruthi NH wrote: > > Hi Barry, > > I get the following error. > > ngh at ngh1 ~/petsc/src/snes/tutorials > $ make ex19 > /home/ngh/petsc/lib/petsc/bin/win32fe/win32fe icl -Qwd10161 -Qstd=c99 -MT > -O3 -I/home/ngh/petsc/include > -I/home/ngh/petsc/intel-petsc-tag-v3.20.1/include > -I/cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/include ex19.c > -L/home/ngh/petsc/intel-petsc-tag-v3.20.1/lib > -L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/2022.1.0/lib/intel64 -lpetsc > mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib > /cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/lib/release/impi.lib > Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib -o ex19 > ex19.c > > ngh at ngh1 ~/petsc/src/snes/tutorials > $ mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres > > > ================================================================================== > = > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 1768 RUNNING AT ngh1 > = EXIT STATUS: -1073741819 (c0000005) > > ================================================================================== > = > > ngh at ngh1 ~/petsc/src/snes/tutorials > $ mpiexec -n 2 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres > > > ================================================================================== > = > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 25936 RUNNING AT ngh1 > = EXIT STATUS: -1073741819 (c0000005) > > ================================================================================== > = > > > ================================================================================== > = > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 1 PID 5216 RUNNING AT ngh1 > = EXIT STATUS: -1073741819 (c0000005) > > ================================================================================== > = > > > Regards, > Maruthi > > On Fri, Nov 17, 2023 at 9:26?PM Barry Smith wrote: > >> >> Please do >> >> cd src/snes/tutorials >> make ex19 >> mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >> >> then with -n 2 >> >> Cut and paste all the output and include it in your email response >> >> Barry >> >> >> >> On Nov 17, 2023, at 12:57?AM, Maruthi NH wrote: >> >> Hi all, >> >> I could successfully compile PETSc on Windows with Intel OneAPI C/C++ and >> Fortran compilers, however, when I tried to make a check after the >> successful installation, I got the following error message. I have also >> attached the configuration file. 
>> >> make PETSC_DIR=/home/ngh/petsc PETSC_ARCH=intel-petsc-tag-v3.20.1 check >> Running PETSc check examples to verify correct installation >> Using PETSC_DIR=/home/ngh/petsc and PETSC_ARCH=intel-petsc-tag-v3.20.1 >> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process >> See https://petsc.org/release/faq/ >> >> >> =================================================================================== >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 0 PID 13384 RUNNING AT BLRLAP1521 >> = EXIT STATUS: -1073741819 (c0000005) >> >> =================================================================================== >> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes >> See https://petsc.org/release/faq/ >> >> >> =================================================================================== >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 0 PID 20616 RUNNING AT BLRLAP1521 >> = EXIT STATUS: -1073741819 (c0000005) >> >> =================================================================================== >> >> >> =================================================================================== >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 1 PID 16900 RUNNING AT BLRLAP1521 >> = EXIT STATUS: -1073741819 (c0000005) >> >> =================================================================================== >> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI >> process >> See https://petsc.org/release/faq/ >> forrtl: severe (157): Program Exception - access violation >> Image PC Routine Line Source >> ex5f.exe 00007FF661DB569B Unknown Unknown >> Unknown >> ex5f.exe 00007FF661A760AF Unknown Unknown >> Unknown >> ex5f.exe 00007FF6619E6EBD Unknown Unknown >> Unknown >> ex5f.exe 00007FF6619C6294 Unknown Unknown >> Unknown >> ex5f.exe 00007FF6619C3152 Unknown Unknown >> Unknown >> ex5f.exe 00007FF6619C1051 Unknown Unknown >> Unknown >> ex5f.exe 00007FF662C3C16E Unknown Unknown >> Unknown >> ex5f.exe 00007FF662C3CA50 Unknown Unknown >> Unknown >> KERNEL32.DLL 00007FFB12187344 Unknown Unknown >> Unknown >> ntdll.dll 00007FFB140C26B1 Unknown Unknown >> Unknown >> Completed PETSc check examples >> Error while running make check >> make[1]: *** [makefile:132: check] Error 1 >> make: *** [GNUmakefile:17: check] Error 2 >> >> Regards, >> mnh >> >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srvenkat at utexas.edu Fri Nov 17 12:39:53 2023 From: srvenkat at utexas.edu (Sreeram R Venkat) Date: Fri, 17 Nov 2023 13:39:53 -0500 Subject: [petsc-users] VecNorm causes program to hang In-Reply-To: References: Message-ID: Thank you; that fixed the problem. I added an else { PetscCall(VecCUDAReplaceArray(v, NULL)); } Thanks, Sreeram On Fri, Nov 17, 2023 at 12:09?PM Barry Smith wrote: > > So the "bug" is not as ginormous as I originally thought. It will never > produce incorrect results but can result in the errors you received. > > The problem is > > if (row_rank == 0) > { > PetscCall(VecCUDAReplaceArray(v, d_a)); > } > > The place/replacearray routines are actually collective; and need to be > called by all MPI processes that own a vector regardless of the local size. > This is because the call can invalidate the previously known norm values > that have been cached in the vector. If the norm values are invalidated on > some MPI processes but not others you will get the error you have seen. 
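A minimal sketch of the resulting collective calling pattern, combining the if branch quoted above with the else branch Sreeram reports adding earlier in this message (same v, d_a, and row_rank names as in the thread; this illustrates only the calling convention, not the full application code):

    /* Every rank in the Vec's communicator makes the (collective) call,
       even ranks that hold no local entries, so that cached quantities
       such as norms are invalidated consistently on all processes. */
    if (row_rank == 0) {
      PetscCall(VecCUDAReplaceArray(v, d_a)); /* ranks that own entries supply the device array */
    } else {
      PetscCall(VecCUDAReplaceArray(v, NULL)); /* non-owning ranks still participate in the call */
    }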
> > Barry > > I will prepare a branch with better documentation and clearer error > handling for this situation. > > > > > On Nov 16, 2023, at 6:30?PM, Barry Smith wrote: > > > Congratulations you have found a ginormous bug in PETSc! Thanks for the > detail information on the problem. > > I will post a fix shortly. > > Barry > > > On Nov 16, 2023, at 6:19?PM, Sreeram R Venkat wrote: > > I have a program which reads a vector from file into an array, and then > uses that array to create a PETSc Vec object. The Vec is defined on the > global communicator, but not all processes actually contain entries of it. > For example, suppose we have 4 processors, and the vector is of size 10. > Rank 0 will contain entries 0-4 and Rank 1 will contain entries 5-9. Ranks > 2 and 3 will not have any entries of the Vec. > > This Vec is then used as an input to other parts of the code, and those > work fine. However, if I try to take the norm of the Vec with VecNorm(), I > get the error > > `MPI_Allreduce() called in different locations (code lines) on different > processors` > > The stack trace shows that ranks 0 and 1 (from the above example) are > still in the VecNorm() function while ranks 2 and 3 have moved on to a > later part of the code. If I add a PetscBarrier() after the VecNorm(), I > find that the program hangs. > > The funny thing is that part of the code duplicates the Vec with > VecDuplicate() and assigns to the duplicated vector the result of some > computations. The duplicated Vec has the same layout as the original Vec, > but taking VecNorm() on the duplicated Vec works fine. If I use VecCopy(), > however, the copied Vec also causes VecNorm() to hang. I've printed out the > original Vec, and there are no corrupted/NaN entries. > > I have a temporary workaround where I perturb the original Vec slightly > before copying it to another Vec. This causes the program to successfully > terminate. > > Any advice on how to get VecNorm() working with the original Vec? > > Thanks, > Sreeram > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Nov 17 14:41:39 2023 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Nov 2023 15:41:39 -0500 Subject: [petsc-users] Error running make check with OneAPI C/C++ and Fortran compilers on Windows In-Reply-To: References: <19B25FB7-66E0-478F-9A18-4BD0455A9C48@petsc.dev> <0AD8879F-4ED7-4D5E-B5BA-9B56C8539ED8@petsc.dev> Message-ID: <97AC950C-D7E9-4BDA-839D-DDD55EDF97CE@petsc.dev> OneAPI has two sets of compilers, old icc and new icx. Does it fail in this way for both? Can you try with the old? > On Nov 17, 2023, at 12:59?PM, Maruthi NH wrote: > > Hi Barry, > > It doesn't even start to run, I still get the same error. This is all the output I get. > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 28192 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > =================================================================================== > > With Windows C compiler and Intel OneAPI Fortran, it works fine. But when I use the OneAPI C/C++ compiler it fails. > > Regards, > Maruthi > > > On Fri, Nov 17, 2023 at 11:25?PM Barry Smith > wrote: >> >> Please run >> >> mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres -info >> >> and send all the output >> >> >> >>> On Nov 17, 2023, at 12:14?PM, Maruthi NH > wrote: >>> >>> Hi Barry, >>> >>> I get the following error. 
>>> >>> ngh at ngh1 ~/petsc/src/snes/tutorials >>> $ make ex19 >>> /home/ngh/petsc/lib/petsc/bin/win32fe/win32fe icl -Qwd10161 -Qstd=c99 -MT -O3 -I/home/ngh/petsc/include -I/home/ngh/petsc/intel-petsc-tag-v3.20.1/include -I/cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/include ex19.c -L/home/ngh/petsc/intel-petsc-tag-v3.20.1/lib -L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/2022.1.0/lib/intel64 -lpetsc mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib /cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/lib/release/impi.lib Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib -o ex19 >>> ex19.c >>> >>> ngh at ngh1 ~/petsc/src/snes/tutorials >>> $ mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 0 PID 1768 RUNNING AT ngh1 >>> = EXIT STATUS: -1073741819 (c0000005) >>> =================================================================================== >>> >>> ngh at ngh1 ~/petsc/src/snes/tutorials >>> $ mpiexec -n 2 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 0 PID 25936 RUNNING AT ngh1 >>> = EXIT STATUS: -1073741819 (c0000005) >>> =================================================================================== >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 1 PID 5216 RUNNING AT ngh1 >>> = EXIT STATUS: -1073741819 (c0000005) >>> =================================================================================== >>> >>> >>> Regards, >>> Maruthi >>> >>> On Fri, Nov 17, 2023 at 9:26?PM Barry Smith > wrote: >>>> >>>> Please do >>>> >>>> cd src/snes/tutorials >>>> make ex19 >>>> mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >>>> >>>> then with -n 2 >>>> >>>> Cut and paste all the output and include it in your email response >>>> >>>> Barry >>>> >>>> >>>> >>>>> On Nov 17, 2023, at 12:57?AM, Maruthi NH > wrote: >>>>> >>>>> Hi all, >>>>> >>>>> I could successfully compile PETSc on Windows with Intel OneAPI C/C++ and Fortran compilers, however, when I tried to make a check after the successful installation, I got the following error message. I have also attached the configuration file. 
>>>>> >>>>> make PETSC_DIR=/home/ngh/petsc PETSC_ARCH=intel-petsc-tag-v3.20.1 check >>>>> Running PETSc check examples to verify correct installation >>>>> Using PETSC_DIR=/home/ngh/petsc and PETSC_ARCH=intel-petsc-tag-v3.20.1 >>>>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process >>>>> See https://petsc.org/release/faq/ >>>>> >>>>> =================================================================================== >>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>>>> = RANK 0 PID 13384 RUNNING AT BLRLAP1521 >>>>> = EXIT STATUS: -1073741819 (c0000005) >>>>> =================================================================================== >>>>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes >>>>> See https://petsc.org/release/faq/ >>>>> >>>>> =================================================================================== >>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>>>> = RANK 0 PID 20616 RUNNING AT BLRLAP1521 >>>>> = EXIT STATUS: -1073741819 (c0000005) >>>>> =================================================================================== >>>>> >>>>> =================================================================================== >>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>>>> = RANK 1 PID 16900 RUNNING AT BLRLAP1521 >>>>> = EXIT STATUS: -1073741819 (c0000005) >>>>> =================================================================================== >>>>> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process >>>>> See https://petsc.org/release/faq/ >>>>> forrtl: severe (157): Program Exception - access violation >>>>> Image PC Routine Line Source >>>>> ex5f.exe 00007FF661DB569B Unknown Unknown Unknown >>>>> ex5f.exe 00007FF661A760AF Unknown Unknown Unknown >>>>> ex5f.exe 00007FF6619E6EBD Unknown Unknown Unknown >>>>> ex5f.exe 00007FF6619C6294 Unknown Unknown Unknown >>>>> ex5f.exe 00007FF6619C3152 Unknown Unknown Unknown >>>>> ex5f.exe 00007FF6619C1051 Unknown Unknown Unknown >>>>> ex5f.exe 00007FF662C3C16E Unknown Unknown Unknown >>>>> ex5f.exe 00007FF662C3CA50 Unknown Unknown Unknown >>>>> KERNEL32.DLL 00007FFB12187344 Unknown Unknown Unknown >>>>> ntdll.dll 00007FFB140C26B1 Unknown Unknown Unknown >>>>> Completed PETSc check examples >>>>> Error while running make check >>>>> make[1]: *** [makefile:132: check] Error 1 >>>>> make: *** [GNUmakefile:17: check] Error 2 >>>>> >>>>> Regards, >>>>> mnh >>>>> >>>>> >>>>> > >>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From SandeepHaripuria at proton.me Sun Nov 19 11:54:54 2023 From: SandeepHaripuria at proton.me (Sandeep Haripuria) Date: Sun, 19 Nov 2023 17:54:54 +0000 Subject: [petsc-users] Error in genmap Message-ID: Dear Sir I am trying to run an oscillating airfoil case. For this purpose I am trying to make use of "neknek". I created two separate geometry files , one for the airfoil and other for the outside surrounding. I have created the geometry and mesh using GMSH. I have attached the .geo and .msh files for your perusal. Now, I am able to successfully create the "naca0012.map" but I am getting an error when I try to genmap the outside.rea file. The error is shown below: [ratnavk at vamana](https://lists.mcs.anl.gov/mailman/listinfo/nek5000-users) :~/NEK/test_neknek$ ../trunk/bin/genmap Input (.rea) file name: outside Input mesh tolerance (default 0.2): NOTE: smaller is better, but generous is more forgiving for bad meshes. 0.01 reading .rea file data ... 
ERROR: error reading 1 1 406 aborting 510 in routine rdbdry. 1 quit Looking forward to your positive reply. Sandeep Haripuria [ThePlattery.Com](https://theplattery.com/) -------------- next part -------------- An HTML attachment was scrubbed... URL: From maruthinh at gmail.com Sun Nov 19 22:37:19 2023 From: maruthinh at gmail.com (Maruthi NH) Date: Mon, 20 Nov 2023 10:07:19 +0530 Subject: [petsc-users] Error running make check with OneAPI C/C++ and Fortran compilers on Windows In-Reply-To: <97AC950C-D7E9-4BDA-839D-DDD55EDF97CE@petsc.dev> References: <19B25FB7-66E0-478F-9A18-4BD0455A9C48@petsc.dev> <0AD8879F-4ED7-4D5E-B5BA-9B56C8539ED8@petsc.dev> <97AC950C-D7E9-4BDA-839D-DDD55EDF97CE@petsc.dev> Message-ID: Hi Barry, I have tried with icc 2021.6.0. I will try with icx and check. Regards, Maruthi On Sat, Nov 18, 2023 at 2:11?AM Barry Smith wrote: > > OneAPI has two sets of compilers, old icc and new icx. > > Does it fail in this way for both? Can you try with the old? > > > On Nov 17, 2023, at 12:59?PM, Maruthi NH wrote: > > Hi Barry, > > It doesn't even start to run, I still get the same error. This is all the > output I get. > > > =================================================================================== > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > = RANK 0 PID 28192 RUNNING AT BLRLAP1521 > = EXIT STATUS: -1073741819 (c0000005) > > =================================================================================== > > With Windows C compiler and Intel OneAPI Fortran, it works fine. But when > I use the OneAPI C/C++ compiler it fails. > > Regards, > Maruthi > > > On Fri, Nov 17, 2023 at 11:25?PM Barry Smith wrote: > >> >> Please run >> >> mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres -info >> >> and send all the output >> >> >> >> On Nov 17, 2023, at 12:14?PM, Maruthi NH wrote: >> >> Hi Barry, >> >> I get the following error. 
>> >> ngh at ngh1 ~/petsc/src/snes/tutorials >> $ make ex19 >> /home/ngh/petsc/lib/petsc/bin/win32fe/win32fe icl -Qwd10161 -Qstd=c99 >> -MT -O3 -I/home/ngh/petsc/include >> -I/home/ngh/petsc/intel-petsc-tag-v3.20.1/include >> -I/cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/include ex19.c >> -L/home/ngh/petsc/intel-petsc-tag-v3.20.1/lib >> -L/cygdrive/c/PROGRA~2/Intel/oneAPI/mkl/2022.1.0/lib/intel64 -lpetsc >> mkl_intel_lp64_dll.lib mkl_sequential_dll.lib mkl_core_dll.lib >> /cygdrive/c/PROGRA~2/Intel/oneAPI/mpi/2021.6.0/lib/release/impi.lib >> Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib -o ex19 >> ex19.c >> >> ngh at ngh1 ~/petsc/src/snes/tutorials >> $ mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >> >> >> ================================================================================== >> = >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 0 PID 1768 RUNNING AT ngh1 >> = EXIT STATUS: -1073741819 (c0000005) >> >> ================================================================================== >> = >> >> ngh at ngh1 ~/petsc/src/snes/tutorials >> $ mpiexec -n 2 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >> >> >> ================================================================================== >> = >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 0 PID 25936 RUNNING AT ngh1 >> = EXIT STATUS: -1073741819 (c0000005) >> >> ================================================================================== >> = >> >> >> ================================================================================== >> = >> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >> = RANK 1 PID 5216 RUNNING AT ngh1 >> = EXIT STATUS: -1073741819 (c0000005) >> >> ================================================================================== >> = >> >> >> Regards, >> Maruthi >> >> On Fri, Nov 17, 2023 at 9:26?PM Barry Smith wrote: >> >>> >>> Please do >>> >>> cd src/snes/tutorials >>> make ex19 >>> mpiexec -n 1 ./ex19 -da_refine 3 -pc_type mg -ksp_type fgmres >>> >>> then with -n 2 >>> >>> Cut and paste all the output and include it in your email response >>> >>> Barry >>> >>> >>> >>> On Nov 17, 2023, at 12:57?AM, Maruthi NH wrote: >>> >>> Hi all, >>> >>> I could successfully compile PETSc on Windows with Intel OneAPI C/C++ >>> and Fortran compilers, however, when I tried to make a check after the >>> successful installation, I got the following error message. I have also >>> attached the configuration file. 
>>> >>> make PETSC_DIR=/home/ngh/petsc PETSC_ARCH=intel-petsc-tag-v3.20.1 check >>> Running PETSc check examples to verify correct installation >>> Using PETSC_DIR=/home/ngh/petsc and PETSC_ARCH=intel-petsc-tag-v3.20.1 >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process >>> See https://petsc.org/release/faq/ >>> >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 0 PID 13384 RUNNING AT BLRLAP1521 >>> = EXIT STATUS: -1073741819 (c0000005) >>> >>> =================================================================================== >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes >>> See https://petsc.org/release/faq/ >>> >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 0 PID 20616 RUNNING AT BLRLAP1521 >>> = EXIT STATUS: -1073741819 (c0000005) >>> >>> =================================================================================== >>> >>> >>> =================================================================================== >>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES >>> = RANK 1 PID 16900 RUNNING AT BLRLAP1521 >>> = EXIT STATUS: -1073741819 (c0000005) >>> >>> =================================================================================== >>> Possible error running Fortran example src/snes/tutorials/ex5f with 1 >>> MPI process >>> See https://petsc.org/release/faq/ >>> forrtl: severe (157): Program Exception - access violation >>> Image PC Routine Line >>> Source >>> ex5f.exe 00007FF661DB569B Unknown Unknown >>> Unknown >>> ex5f.exe 00007FF661A760AF Unknown Unknown >>> Unknown >>> ex5f.exe 00007FF6619E6EBD Unknown Unknown >>> Unknown >>> ex5f.exe 00007FF6619C6294 Unknown Unknown >>> Unknown >>> ex5f.exe 00007FF6619C3152 Unknown Unknown >>> Unknown >>> ex5f.exe 00007FF6619C1051 Unknown Unknown >>> Unknown >>> ex5f.exe 00007FF662C3C16E Unknown Unknown >>> Unknown >>> ex5f.exe 00007FF662C3CA50 Unknown Unknown >>> Unknown >>> KERNEL32.DLL 00007FFB12187344 Unknown Unknown >>> Unknown >>> ntdll.dll 00007FFB140C26B1 Unknown Unknown >>> Unknown >>> Completed PETSc check examples >>> Error while running make check >>> make[1]: *** [makefile:132: check] Error 1 >>> make: *** [GNUmakefile:17: check] Error 2 >>> >>> Regards, >>> mnh >>> >>> >>> >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Nov 20 06:39:39 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Nov 2023 07:39:39 -0500 Subject: [petsc-users] Error in genmap In-Reply-To: References: Message-ID: On Sun, Nov 19, 2023 at 10:32?PM Sandeep Haripuria via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Sir > > I am trying to run an oscillating airfoil case. For this purpose I am > trying to make use of "neknek". > I created two separate geometry files , one for the airfoil and other > for the outside surrounding. I have created the geometry and mesh > using GMSH. I have attached the .geo and .msh files for your perusal. > > 1. Nothing is attached 2. Did you mean to send this to the Nek mailing list? Thanks, Matt > Now, I am able to successfully create the "naca0012.map" but I am > getting an error when I try to genmap the outside.rea file. 
The error > is shown below: > > ratnavk at vamana :~/NEK/test_neknek$ ../trunk/bin/genmap > Input (.rea) file name: > outside > Input mesh tolerance (default 0.2): > NOTE: smaller is better, but generous is more forgiving for bad meshes. > 0.01 > reading .rea file data ... > ERROR: error reading 1 1 406 > aborting 510 in routine rdbdry. > > 1 quit > > > Looking forward to your positive reply. > > > Sandeep Haripuria > ThePlattery.Com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From drwells at email.unc.edu Mon Nov 20 07:00:00 2023 From: drwells at email.unc.edu (Wells, David) Date: Mon, 20 Nov 2023 13:00:00 +0000 Subject: [petsc-users] Logging object creation and destruction counts Message-ID: Hi everyone, I just upgraded to PETSc 3.20 and read up on the new logging infrastructure - its a very nice improvement over the old version. I have some code which checks that every construction has a corresponding destruction via PetscStageLog stageLog; ierr = PetscLogGetStageLog(&stageLog); for (int i = 0; i < stageLog->stageInfo->classLog->numClasses; ++i) { if (stageLog->stageInfo->classLog->classInfo[i].destructions != stageLog->stageInfo->classLog->classInfo[i].creations) { crash(); } } This no longer works and I can't figure out how to port it. In particular, it looks like I need to get a PetscLogEvent number for creation and another for destruction to retrieve the relevant PetscEventPerfInfo objects per-class - is there some straightforward way to do that for every registered PETSc class? Best, David Wells -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Nov 20 09:00:04 2023 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Nov 2023 10:00:04 -0500 Subject: [petsc-users] Logging object creation and destruction counts In-Reply-To: References: Message-ID: On Mon, Nov 20, 2023 at 8:00?AM Wells, David wrote: > Hi everyone, > > I just upgraded to PETSc 3.20 and read up on the new logging > infrastructure - its a very nice improvement over the old version. > > I have some code which checks that every construction has a corresponding > destruction via > > > PetscStageLog stageLog; > ierr = PetscLogGetStageLog(&stageLog); > for (int i = 0; i < stageLog->stageInfo->classLog->numClasses; ++i) > { > if (stageLog->stageInfo->classLog->classInfo[i].destructions != > stageLog->stageInfo->classLog->classInfo[i].creations) > { > crash(); > } > } > > > This no longer works and I can't figure out how to port it. In particular, > it looks like I need to get a PetscLogEvent number for creation and another > for destruction to retrieve the relevant PetscEventPerfInfo objects > per-class - is there some straightforward way to do that for every > registered PETSc class? > 1. The above code seems to require that creation and destruction occur within the same stage, which might not be true. 2. https://petsc.org/main/manualpages/Profiling/PetscLogStateGetNumClasses/ gets the numbet of classes. You can recreate this loop with the code from PetscLogHandlerObjectCreate_Default() I think, which is in logdefault.c. However, as I said, this will not properly match up across stages. 3. At logdefault.c:1657 we output all the creations and destructions, so you could copy these loops, but sum across stages. 4. 
You could register a callback that just increments and decrements for each classid, and then calls PetscLogHandlerObjectCreate/Destroy_Default(), which might be cleaner. Thanks, Matt > Best, > David Wells > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From JanIzak.Vermaak at inl.gov Mon Nov 20 09:28:23 2023 From: JanIzak.Vermaak at inl.gov (Jan Izak C. Vermaak) Date: Mon, 20 Nov 2023 15:28:23 +0000 Subject: [petsc-users] Difficulty installing PETSC-3.17.0 on new macOS Sonoma Message-ID: Hi all, I am in the process of upgrading our petsc version to 3.20.1 but I really need our current version to work with 3.17.0. I am having install issues. Attached is the config log. Command Line Tools 15.0 (CLT 15.0) used to be the source of my problems with the previous OS version for which the solution was to install the old CLT 14.3, however, CLT 14.3 is not compatible with macOS Sonoma (which INL is requiring us to have). Any help will be appreciated. Regards, Jan Jan Vermaak, Ph.D. Senior Nuclear Multiphysics Engineer | Reactor Physics Methods and Analysis Department (C110) Reactor Systems Design and Analysis Division | Nuclear Science & Technology Directorate JanIzak.Vermaak at inl.gov | M 979-739-0789 Idaho National Laboratory | 1955 Fremont Ave. | Idaho Falls, ID | 83415 _______________________________ [signature_1025312815] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 15373 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 880469 bytes Desc: configure.log URL: From balay at mcs.anl.gov Mon Nov 20 10:33:39 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 20 Nov 2023 10:33:39 -0600 (CST) Subject: [petsc-users] Difficulty installing PETSC-3.17.0 on new macOS Sonoma In-Reply-To: References: Message-ID: <4439bb37-7e4a-eed7-16b9-9da1d28e5dd1@mcs.anl.gov> replied on petsc-maint xcode-15 changed considerably, (fixes are in petsc-3.20) that its not easy to backport all needed patches to 3.17.0 So best bet for petsc-3.19 and older is to use linux (remotely or via VM) - or downgrade to xcode-14. Satish On Mon, 20 Nov 2023, Jan Izak C. Vermaak via petsc-users wrote: > Hi all, > > I am in the process of upgrading our petsc version to 3.20.1 but I really need our current version to work with 3.17.0. > > I am having install issues. Attached is the config log. Command Line Tools 15.0 (CLT 15.0) used to be the source of my problems with the previous OS version for which the solution was to install the old CLT 14.3, however, CLT 14.3 is not compatible with macOS Sonoma (which INL is requiring us to have). > > Any help will be appreciated. > > Regards, > Jan > > Jan Vermaak, Ph.D. > Senior Nuclear Multiphysics Engineer | Reactor Physics Methods and Analysis Department (C110) > Reactor Systems Design and Analysis Division | Nuclear Science & Technology Directorate > JanIzak.Vermaak at inl.gov | M 979-739-0789 > Idaho National Laboratory | 1955 Fremont Ave. 
| Idaho Falls, ID | 83415 > _______________________________ > > [signature_1025312815] > > From alexlindsay239 at gmail.com Mon Nov 20 14:48:05 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Mon, 20 Nov 2023 12:48:05 -0800 Subject: [petsc-users] MUMPS valgrind error Message-ID: I recently ran into some parallel crashes and valgrind suggests the issue is with MUMPS. Has anyone else run into something similar recently? ==4022024== Invalid read of size 4 ==4022024== at 0xF961266: dmumps_dr_assemble_local (dsol_distrhs.F:301) ==4022024== by 0xF961266: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:169) ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) ==4022024== by 0xF50FD3B: PCApply (precon.c:486) ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) ==4022024== by 0xF5F339D: KSPSolve_Private (itfunc.c:910) ==4022024== Address 0x308b338c is 4 bytes before a block of size 1 alloc'd ==4022024== at 0x4C37135: malloc (vg_replace_malloc.c:381) ==4022024== by 0xF960E93: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:139) ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) ==4022024== by 0xF50FD3B: PCApply (precon.c:486) ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Nov 20 14:56:18 2023 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Mon, 20 Nov 2023 20:56:18 +0000 Subject: [petsc-users] MUMPS valgrind error In-Reply-To: References: Message-ID: Can you provide us a test code that reveals this error? Hong ________________________________ From: petsc-users on behalf of Alexander Lindsay Sent: Monday, November 20, 2023 2:48 PM To: PETSc Subject: [petsc-users] MUMPS valgrind error I recently ran into some parallel crashes and valgrind suggests the issue is with MUMPS. Has anyone else run into something similar recently? 
==4022024== Invalid read of size 4 ==4022024== at 0xF961266: dmumps_dr_assemble_local (dsol_distrhs.F:301) ==4022024== by 0xF961266: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:169) ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) ==4022024== by 0xF50FD3B: PCApply (precon.c:486) ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) ==4022024== by 0xF5F339D: KSPSolve_Private (itfunc.c:910) ==4022024== Address 0x308b338c is 4 bytes before a block of size 1 alloc'd ==4022024== at 0x4C37135: malloc (vg_replace_malloc.c:381) ==4022024== by 0xF960E93: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:139) ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) ==4022024== by 0xF50FD3B: PCApply (precon.c:486) ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Mon Nov 20 15:09:53 2023 From: liufield at gmail.com (neil liu) Date: Mon, 20 Nov 2023 16:09:53 -0500 Subject: [petsc-users] Fwd: Inquiry about DMPlexCreateSection In-Reply-To: References: Message-ID: Hello, Petsc team, Previously, following the function setupseciton in the following case https://petsc.org/main/src/dm/impls/plex/tutorials/ex7.c.html I successfully created a section layout and then imposed the Dirchlet boundary conditions. It seems DMPlexCreateSection is able to replace the above setupsection function and specify points that are located on the boundary. My question is, does DMPlexCreateSection impose boundary condition automatically, e.g., Dirichlet boundary condition (u=0). Or we should impose these boundary conditions manually, then what is the benefit of DMPlexCreateSection compared with the above setupsection function? Thanks, Xiaodong -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlindsay239 at gmail.com Mon Nov 20 15:22:11 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Mon, 20 Nov 2023 13:22:11 -0800 Subject: [petsc-users] MUMPS valgrind error In-Reply-To: References: Message-ID: This is from a MOOSE test. If I find the same error on something simpler I will let you know On Mon, Nov 20, 2023 at 12:56?PM Zhang, Hong wrote: > Can you provide us a test code that reveals this error? > Hong > ------------------------------ > *From:* petsc-users on behalf of > Alexander Lindsay > *Sent:* Monday, November 20, 2023 2:48 PM > *To:* PETSc > *Subject:* [petsc-users] MUMPS valgrind error > > I recently ran into some parallel crashes and valgrind suggests the issue > is with MUMPS. 
Has anyone else run into something similar recently? > > ==4022024== Invalid read of size 4 > > ==4022024== at 0xF961266: dmumps_dr_assemble_local (dsol_distrhs.F:301) > > ==4022024== by 0xF961266: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:169) > > ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) > > ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) > > ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) > > ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) > > ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) > > ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) > > ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) > > ==4022024== by 0xF50FD3B: PCApply (precon.c:486) > > ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) > > ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) > > ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) > > ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) > > ==4022024== by 0xF5F339D: KSPSolve_Private (itfunc.c:910) > > ==4022024== Address 0x308b338c is 4 bytes before a block of size 1 > alloc'd > > ==4022024== at 0x4C37135: malloc (vg_replace_malloc.c:381) > > ==4022024== by 0xF960E93: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:139) > > ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) > > ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) > > ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) > > ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) > > ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) > > ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) > > ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) > > ==4022024== by 0xF50FD3B: PCApply (precon.c:486) > > ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) > > ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) > > ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) > > ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Nov 20 15:42:19 2023 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 20 Nov 2023 16:42:19 -0500 Subject: [petsc-users] MUMPS valgrind error In-Reply-To: References: Message-ID: <0A6D4E5F-F912-4154-9899-18576ACC5660@petsc.dev> Looks like memory allocated down in MUMPS and then accessed incorrectly inside MUMPS. Could easily not be PETSc related > On Nov 20, 2023, at 4:22?PM, Alexander Lindsay wrote: > > This is from a MOOSE test. If I find the same error on something simpler I will let you know > > On Mon, Nov 20, 2023 at 12:56?PM Zhang, Hong > wrote: >> Can you provide us a test code that reveals this error? >> Hong >> From: petsc-users > on behalf of Alexander Lindsay > >> Sent: Monday, November 20, 2023 2:48 PM >> To: PETSc > >> Subject: [petsc-users] MUMPS valgrind error >> >> I recently ran into some parallel crashes and valgrind suggests the issue is with MUMPS. Has anyone else run into something similar recently? 
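One possible way to produce the standalone test code requested above, assuming the failing run forwards PETSc command-line options: dump the assembled system with the standard KSP binary viewers and replay the factorization and solve through the KSP tutorial ex10 under valgrind. The application name and input file below are placeholders, and the option spellings follow the generic PETSc viewer/option formats rather than anything taken from this thread, so treat the exact syntax as an assumption to verify:

    # hypothetical run of the failing case, writing the matrix and then the
    # right-hand side (appended) into one PETSc binary file
    mpiexec -n 2 ./moose-app-opt -i failing_case.i \
        -ksp_view_mat binary:system.dat -ksp_view_rhs binary:system.dat::append

    # replay just the MUMPS solve on the dumped system under valgrind
    cd $PETSC_DIR/src/ksp/ksp/tutorials && make ex10
    mpiexec -n 2 valgrind ./ex10 -f0 system.dat \
        -ksp_type gmres -pc_type lu -pc_factor_mat_solver_type mumps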
>> >> ==4022024== Invalid read of size 4 >> ==4022024== at 0xF961266: dmumps_dr_assemble_local (dsol_distrhs.F:301) >> ==4022024== by 0xF961266: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:169) >> ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) >> ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) >> ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) >> ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) >> ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) >> ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) >> ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) >> ==4022024== by 0xF50FD3B: PCApply (precon.c:486) >> ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) >> ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) >> ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) >> ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) >> ==4022024== by 0xF5F339D: KSPSolve_Private (itfunc.c:910) >> ==4022024== Address 0x308b338c is 4 bytes before a block of size 1 alloc'd >> ==4022024== at 0x4C37135: malloc (vg_replace_malloc.c:381) >> ==4022024== by 0xF960E93: dmumps_scatter_dist_rhs_ (dsol_distrhs.F:139) >> ==4022024== by 0xF952669: dmumps_solve_driver_ (dsol_driver.F:3677) >> ==4022024== by 0xF8CA64A: dmumps_ (dmumps_driver.F:2035) >> ==4022024== by 0xF85E12B: dmumps_f77_ (dmumps_f77.F:291) >> ==4022024== by 0xF85C30E: dmumps_c (mumps_c.c:485) >> ==4022024== by 0xEBA8E56: MatSolve_MUMPS (mumps.c:1493) >> ==4022024== by 0xEE85D97: MatSolve (matrix.c:3631) >> ==4022024== by 0xF31B060: PCApply_LU (lu.c:169) >> ==4022024== by 0xF50FD3B: PCApply (precon.c:486) >> ==4022024== by 0xF511204: PCApplyBAorAB (precon.c:756) >> ==4022024== by 0xF51C5F0: KSP_PCApplyBAorAB (kspimpl.h:443) >> ==4022024== by 0xF51C5F0: KSPGMRESCycle (gmres.c:146) >> ==4022024== by 0xF51C5F0: KSPSolve_GMRES (gmres.c:227) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlindsay239 at gmail.com Tue Nov 21 00:12:41 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Mon, 20 Nov 2023 22:12:41 -0800 Subject: [petsc-users] MUMPS valgrind error In-Reply-To: <0A6D4E5F-F912-4154-9899-18576ACC5660@petsc.dev> References: <0A6D4E5F-F912-4154-9899-18576ACC5660@petsc.dev> Message-ID: <595024AF-4DD3-49C1-BCFE-36E05D3FA74D@gmail.com> An HTML attachment was scrubbed... URL: From drwells at email.unc.edu Tue Nov 21 10:47:49 2023 From: drwells at email.unc.edu (Wells, David) Date: Tue, 21 Nov 2023 16:47:49 +0000 Subject: [petsc-users] Logging object creation and destruction counts In-Reply-To: References: Message-ID: Hi Matt, Thanks! I think the third point is what I need to get this working correctly. I'll report back when I have this working (or not working). Best, David ________________________________ From: Matthew Knepley Sent: Monday, November 20, 2023 10:00 AM To: Wells, David Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Logging object creation and destruction counts On Mon, Nov 20, 2023 at 8:00?AM Wells, David > wrote: Hi everyone, I just upgraded to PETSc 3.20 and read up on the new logging infrastructure - its a very nice improvement over the old version. 
I have some code which checks that every construction has a corresponding destruction via PetscStageLog stageLog; ierr = PetscLogGetStageLog(&stageLog); for (int i = 0; i < stageLog->stageInfo->classLog->numClasses; ++i) { if (stageLog->stageInfo->classLog->classInfo[i].destructions != stageLog->stageInfo->classLog->classInfo[i].creations) { crash(); } } This no longer works and I can't figure out how to port it. In particular, it looks like I need to get a PetscLogEvent number for creation and another for destruction to retrieve the relevant PetscEventPerfInfo objects per-class - is there some straightforward way to do that for every registered PETSc class? 1. The above code seems to require that creation and destruction occur within the same stage, which might not be true. 2. https://petsc.org/main/manualpages/Profiling/PetscLogStateGetNumClasses/ gets the numbet of classes. You can recreate this loop with the code from PetscLogHandlerObjectCreate_Default() I think, which is in logdefault.c. However, as I said, this will not properly match up across stages. 3. At logdefault.c:1657 we output all the creations and destructions, so you could copy these loops, but sum across stages. 4. You could register a callback that just increments and decrements for each classid, and then calls PetscLogHandlerObjectCreate/Destroy_Default(), which might be cleaner. Thanks, Matt Best, David Wells -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Wed Nov 22 10:43:23 2023 From: facklerpw at ornl.gov (Fackler, Philip) Date: Wed, 22 Nov 2023 16:43:23 +0000 Subject: [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: Message-ID: I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link Have a happy holiday weekend! Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang Sent: Monday, October 16, 2023 15:24 To: Fackler, Philip Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface Hi, Philip, That branch was merged to petsc/main today. Let me know once you have new profiling results. Thanks. --Junchao Zhang On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: Junchao, I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. 
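For the next round of profiling data it may also be worth collecting PETSc's own event summary alongside the hpcviewer databases, since it breaks the solve down by PETSc operation (MatSolve, MatLUFactorNumeric, vector work, and so on) and is easy to compare between builds. A minimal example (executable name, arguments, and output file name are placeholders):

    ./xolotl <usual arguments> -log_view :petsc-log-PSI_9-serial.txt

Adding the same option to the serial and CUDA runs of each case gives per-event times that can be set directly against the timing plots.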
Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Fackler, Philip via Xolotl-psi-development > Sent: Wednesday, October 11, 2023 11:31 To: Junchao Zhang > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface I'm on it. Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Wednesday, October 11, 2023 10:14 To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface Hi, Philip, Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? Thanks. --Junchao Zhang On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: Aha! That makes sense. Thank you. Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Thursday, October 5, 2023 17:29 To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... Let me see how to add it. --Junchao Zhang On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: Hi, Philip, I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. Do you have profiling results with COO enabled? [Screenshot 2023-10-05 at 10.55.29?AM.png] --Junchao Zhang On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: Hi, Philip, I will look into the tarballs and get back to you. Thanks. --Junchao Zhang On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. 
We've still got more optimization to do on this. The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. Any help looking into this would be appreciated. The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. Thanks for your help, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From tarot1991 at protonmail.com Wed Nov 22 15:17:43 2023 From: tarot1991 at protonmail.com (tarot1991) Date: Wed, 22 Nov 2023 21:17:43 +0000 Subject: [petsc-users] Test Message-ID: <2GmSG_X14khOJrA4LVhIydIWTbWMp7eSC0DHIOoqZZ3Ic9hmBRK_AyMCB-csUtZNPyrId53GlkINSal1dRoBpppZ3HUrY7lgdqkiILnvDEM=@protonmail.com> Test Gesendet von Proton Mail f?r Mobilger?te -------------- next part -------------- An HTML attachment was scrubbed... URL: From joauma.marichal at uclouvain.be Thu Nov 23 07:59:58 2023 From: joauma.marichal at uclouvain.be (Joauma Marichal) Date: Thu, 23 Nov 2023 13:59:58 +0000 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: Hello, My problem persists? Is there anything I could try? Thanks a lot. Best regards, Joauma De : Matthew Knepley Date : mercredi, 25 octobre 2023 ? 14:45 ? : Joauma Marichal Cc : petsc-maint at mcs.anl.gov , petsc-users at mcs.anl.gov Objet : Re: [petsc-maint] DMSwarm on multiple processors On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint > wrote: Hello, I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water. I have obtained nice results recently and wanted to perform bigger simulations. 
Unfortunately, when I increase the number of processors used to run the simulation, I get the following error: free(): invalid size [cns136:590327] *** Process received signal *** [cns136:590327] Signal: Aborted (6) [cns136:590327] Signal code: (-6) [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] [cns136:590327] [13] ./cobpor[0x4418dc] [cns136:590327] [14] ./cobpor[0x408b63] [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] [cns136:590327] [16] ./cobpor[0x40bdee] [cns136:590327] *** End of error message *** -------------------------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted. -------------------------------------------------------------------------- -------------------------------------------------------------------------- mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted). -------------------------------------------------------------------------- When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works. The problem seems to take place at this moment: DMCreate(PETSC_COMM_WORLD,swarm); DMSetType(*swarm,DMSWARM); DMSetDimension(*swarm,3); DMSwarmSetType(*swarm,DMSWARM_PIC); DMSwarmSetCellDM(*swarm,*dmcell); Thanks a lot for your help. Things that would help us track this down: 1) The smallest example where it fails 2) The smallest number of processes where it fails 3) A stack trace of the failure 4) A simple example that we can run that also fails Thanks, Matt Best regards, Joauma -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
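For item 4) in the list above, a stripped-down standalone skeleton along the following lines may be a useful starting point; it is only a sketch under assumptions (a cubic 3-D DMDA as the cell DM with one degree of freedom and stencil width one, regular point insertion, arbitrary grid size), not the actual cobpor setup:

    #include <petscdmda.h>
    #include <petscdmswarm.h>

    int main(int argc, char **argv)
    {
      DM       da, swarm;
      PetscInt n = 32; /* cells per direction; placeholder, increase to match the failing run */

      PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
      /* Cell DM: 3-D DMDA with 1 dof and stencil width 1 (assumed; adjust to the real grid) */
      PetscCall(DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                             DMDA_STENCIL_BOX, n, n, n, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                             1, 1, NULL, NULL, NULL, &da));
      PetscCall(DMSetFromOptions(da));
      PetscCall(DMSetUp(da));
      /* Swarm attached to the cell DM, as in the excerpt above */
      PetscCall(DMCreate(PETSC_COMM_WORLD, &swarm));
      PetscCall(DMSetType(swarm, DMSWARM));
      PetscCall(DMSetDimension(swarm, 3));
      PetscCall(DMSwarmSetType(swarm, DMSWARM_PIC));
      PetscCall(DMSwarmSetCellDM(swarm, da));
      PetscCall(DMSwarmFinalizeFieldRegister(swarm));
      /* Optionally populate with a few particles per cell so more of the swarm setup is exercised */
      PetscCall(DMSwarmInsertPointsUsingCellDM(swarm, DMSWARMPIC_LAYOUT_REGULAR, 2));
      PetscCall(DMDestroy(&swarm));
      PetscCall(DMDestroy(&da));
      PetscCall(PetscFinalize());
      return 0;
    }

If this already aborts at the same process count it covers items 1), 2) and 4) at once; if it does not, features of the real cell DM (grid dimensions, dof, boundary types, per-cell particle counts) can be added back one at a time until the failure reappears.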
URL: From knepley at gmail.com Thu Nov 23 08:32:41 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 23 Nov 2023 09:32:41 -0500 Subject: [petsc-users] [petsc-maint] DMSwarm on multiple processors In-Reply-To: References: Message-ID: On Thu, Nov 23, 2023 at 9:01?AM Joauma Marichal < joauma.marichal at uclouvain.be> wrote: > Hello, > > > > My problem persists? Is there anything I could try? > Yes. It appears to be failing from a call inside PetscSFSetUpRanks(). It does allocation, and the failure is in libc, and it only happens on larger examples, so I suspect some allocation problem. Can you rebuild with debugging and run this example? Then we can see if the allocation fails. Thanks, Matt > Thanks a lot. > > > > Best regards, > > > > Joauma > > > > *De : *Matthew Knepley > *Date : *mercredi, 25 octobre 2023 ? 14:45 > *? : *Joauma Marichal > *Cc : *petsc-maint at mcs.anl.gov , > petsc-users at mcs.anl.gov > *Objet : *Re: [petsc-maint] DMSwarm on multiple processors > > On Wed, Oct 25, 2023 at 8:32?AM Joauma Marichal via petsc-maint < > petsc-maint at mcs.anl.gov> wrote: > > Hello, > > > > I am using the DMSwarm library in some Eulerian-Lagrangian approach to > have vapor bubbles in water. > > I have obtained nice results recently and wanted to perform bigger > simulations. Unfortunately, when I increase the number of processors used > to run the simulation, I get the following error: > > > > free(): invalid size > > [cns136:590327] *** Process received signal *** > > [cns136:590327] Signal: Aborted (6) > > [cns136:590327] Signal code: (-6) > > [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20] > > [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f] > > [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05] > > [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037] > > [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c] > > [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac] > > [cns136:590327] [ 6] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64] > > [cns136:590327] [ 7] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642] > > [cns136:590327] [ 8] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e] > > [cns136:590327] [ 9] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde] > > [cns136:590327] [10] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8] > > [cns136:590327] [11] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448] > > [cns136:590327] [12] > /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20] > > [cns136:590327] [13] ./cobpor[0x4418dc] > > [cns136:590327] [14] ./cobpor[0x408b63] > > [cns136:590327] [15] > /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3] > > [cns136:590327] [16] ./cobpor[0x40bdee] > > [cns136:590327] *** End of error message *** > > -------------------------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code. Per user-direction, the job has been aborted. 
> > -------------------------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited > on signal 6 (Aborted). > > -------------------------------------------------------------------------- > > > > When I reduce the number of processors the error disappears and when I run > my code without the vapor bubbles it also works. > > The problem seems to take place at this moment: > > > > DMCreate(PETSC_COMM_WORLD,swarm); > > DMSetType(*swarm,DMSWARM); > > DMSetDimension(*swarm,3); > > DMSwarmSetType(*swarm,DMSWARM_PIC); > > DMSwarmSetCellDM(*swarm,*dmcell); > > > > > > Thanks a lot for your help. > > > > Things that would help us track this down: > > > > 1) The smallest example where it fails > > > > 2) The smallest number of processes where it fails > > > > 3) A stack trace of the failure > > > > 4) A simple example that we can run that also fails > > > > Thanks, > > > > Matt > > > > Best regards, > > > > Joauma > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From maitri.ksh at gmail.com Thu Nov 23 10:38:36 2023 From: maitri.ksh at gmail.com (maitri ksh) Date: Thu, 23 Nov 2023 22:08:36 +0530 Subject: [petsc-users] Segmentation Violation error using SuperLU_DIST in ex 19.c Message-ID: Hi, I ran into an error while using SuperLU_DIST in ex 19.c, I am not sure how to debug this, can anyone please help. The 'configure.log' file is attached for your reference. Thanks, Maitri -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/ > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback are not always exact. 
> [0]PETSC ERROR: #1 SuperLU_DIST:pgssvx() > [0]PETSC ERROR: #2 MatLUFactorNumeric_SuperLU_DIST() at /home/maitri.ksh/Maitri/petsc/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:582 > [0]PETSC ERROR: #3 MatLUFactorNumeric() at /home/maitri.ksh/Maitri/petsc/src/mat/interface/matrix.c:3243 > [0]PETSC ERROR: #4 PCSetUp_LU() at /home/maitri.ksh/Maitri/petsc/src/ksp/pc/impls/factor/lu/lu.c:121 > [0]PETSC ERROR: #5 PCSetUp() at /home/maitri.ksh/Maitri/petsc/src/ksp/pc/interface/precon.c:1067 > [0]PETSC ERROR: #6 KSPSetUp() at /home/maitri.ksh/Maitri/petsc/src/ksp/ksp/interface/itfunc.c:415 > [0]PETSC ERROR: #7 KSPSolve_Private() at /home/maitri.ksh/Maitri/petsc/src/ksp/ksp/interface/itfunc.c:836 > [0]PETSC ERROR: #8 KSPSolve() at /home/maitri.ksh/Maitri/petsc/src/ksp/ksp/interface/itfunc.c:1082 > [0]PETSC ERROR: #9 SNESSolve_NEWTONLS() at /home/maitri.ksh/Maitri/petsc/src/snes/impls/ls/ls.c:215 > [0]PETSC ERROR: #10 SNESSolve() at /home/maitri.ksh/Maitri/petsc/src/snes/interface/snes.c:4632 > [0]PETSC ERROR: #11 main() at ex19.c:152 > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 /home/maitri.ksh/Maitri/petsc/src/snes/tutorials Possible problem with ex19 running with superlu_dist, diffs above -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 10168561 bytes Desc: not available URL: From balay at mcs.anl.gov Thu Nov 23 10:56:49 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 23 Nov 2023 10:56:49 -0600 (CST) Subject: [petsc-users] Segmentation Violation error using SuperLU_DIST in ex 19.c In-Reply-To: References: Message-ID: Can you do a simple build with only superlu-dist and see if the error persists? ./configure PETSC_ARCH=linux-slu --with-cc=/usr/local/gcc11/bin/gcc --with-cxx=/usr/local/gcc11/bin/g++ --with-fc=gfortran --with-debugging=1 --with-scalar-type=complex --download-mpich --download-fblaslapack --download-superlu_dist make make PETSC_ARCH=linux-slu check Satish On Thu, 23 Nov 2023, maitri ksh wrote: > Hi, > I ran into an error while using SuperLU_DIST in ex 19.c, I am not sure how > to debug this, can anyone please help. The 'configure.log' file is attached > for your reference. > Thanks, > Maitri > From bsmith at petsc.dev Thu Nov 23 11:18:48 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 23 Nov 2023 12:18:48 -0500 Subject: [petsc-users] Segmentation Violation error using SuperLU_DIST in ex 19.c In-Reply-To: References: Message-ID: <95FA0FDB-3CD5-4AD2-BD4D-CB34552237F9@petsc.dev> Try option -start_in_debugger noxterm then when the debugger starts up c to continue the run. Then when the program crashes you can do bt to see exactly where the crash happens in SuperLU_DIST and print "some variable name in the routine" to check if the variables there look reasonable or if memory looks corrupted. You can also run with valgrind to check for memory corruption. https://petsc.org/release/faq/#what-does-corrupt-argument-or-caught-signal-or-segv-or-segmentation-violation-or-bus-error-mean-can-i-use-valgrind-or-cuda-memcheck-to-debug-memory-corruption-issues > On Nov 23, 2023, at 11:38?AM, maitri ksh wrote: > > Hi, > I ran into an error while using SuperLU_DIST in ex 19.c, I am not sure how to debug this, can anyone please help. The 'configure.log' file is attached for your reference. > Thanks, > Maitri > > -------------- next part -------------- An HTML attachment was scrubbed... 
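Putting Barry's debugger suggestion into a concrete command for this particular failure, the invocation would look roughly like the following; the options mirror what the superlu_dist check appears to run, but treat them as placeholders since the exact 'make check' arguments may differ:

    cd $PETSC_DIR/src/snes/tutorials
    mpiexec -n 2 ./ex19 -da_refine 3 -pc_type lu -pc_factor_mat_solver_type superlu_dist \
      -start_in_debugger noxterm

Each rank then stops in the debugger; c continues the run, and once the SEGV is caught, bt shows the frame inside pgssvx, from which the arguments passed down by MatLUFactorNumeric_SuperLU_DIST() can be inspected with print.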
URL: From maitri.ksh at gmail.com Fri Nov 24 00:59:17 2023 From: maitri.ksh at gmail.com (maitri ksh) Date: Fri, 24 Nov 2023 12:29:17 +0530 Subject: [petsc-users] Segmentation Violation error using SuperLU_DIST in ex 19.c In-Reply-To: <95FA0FDB-3CD5-4AD2-BD4D-CB34552237F9@petsc.dev> References: <95FA0FDB-3CD5-4AD2-BD4D-CB34552237F9@petsc.dev> Message-ID: ok, thanks On Thu, Nov 23, 2023 at 10:49?PM Barry Smith wrote: > > Try option -start_in_debugger noxterm > > then when the debugger starts up > > c > > to continue the run. Then when the program crashes you can do > > bt > > to see exactly where the crash happens in SuperLU_DIST > > and > > print "some variable name in the routine" > > to check if the variables there look reasonable or if memory looks > corrupted. > > You can also run with valgrind to check for memory corruption. > https://petsc.org/release/faq/#what-does-corrupt-argument-or-caught-signal-or-segv-or-segmentation-violation-or-bus-error-mean-can-i-use-valgrind-or-cuda-memcheck-to-debug-memory-corruption-issues > > > > On Nov 23, 2023, at 11:38?AM, maitri ksh wrote: > > Hi, > I ran into an error while using SuperLU_DIST in ex 19.c, I am not sure how > to debug this, can anyone please help. The 'configure.log' file is attached > for your reference. > Thanks, > Maitri > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From maitri.ksh at gmail.com Fri Nov 24 01:01:55 2023 From: maitri.ksh at gmail.com (maitri ksh) Date: Fri, 24 Nov 2023 12:31:55 +0530 Subject: [petsc-users] Segmentation Violation error using SuperLU_DIST in ex 19.c In-Reply-To: References: Message-ID: Hi Satish, Yes, that simple build works with no error. I configured petsc again with the configure options: PETSC_ARCH=linux-gnu-c-debug -start-in-debugger[noxterm] --with-cc=/usr/local/gcc11/bin/gcc --with-cxx=/usr/local/gcc11/bin/g++ --with-fc=gfortran --with-debugging=1 --with-scalar-type=complex --download-mpich --download-fblaslapack --with-matlab-dir=/usr/local/matlab --download-superlu --download-superlu_dist --download-hdf5 --download-mumps --download-scalapack --download--parmetis --download-metis --download-ptscotch --download-bison --download-cmake --download-make Now, it runs the superLU_dist test successfully but it gives an error with MATLAB engine 'Possible error running C/C++ src/vec/vec/tutorials/ex31 with MATLAB engine' and also an error with MAKE check. On Thu, Nov 23, 2023 at 10:26?PM Satish Balay wrote: > Can you do a simple build with only superlu-dist and see if the error > persists? > > ./configure PETSC_ARCH=linux-slu --with-cc=/usr/local/gcc11/bin/gcc > --with-cxx=/usr/local/gcc11/bin/g++ --with-fc=gfortran --with-debugging=1 > --with-scalar-type=complex --download-mpich --download-fblaslapack > --download-superlu_dist > make > make PETSC_ARCH=linux-slu check > > Satish > > On Thu, 23 Nov 2023, maitri ksh wrote: > > > Hi, > > I ran into an error while using SuperLU_DIST in ex 19.c, I am not sure > how > > to debug this, can anyone please help. The 'configure.log' file is > attached > > for your reference. > > Thanks, > > Maitri > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: application/octet-stream Size: 9984976 bytes Desc: not available URL: From balay at mcs.anl.gov Fri Nov 24 10:46:30 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 24 Nov 2023 10:46:30 -0600 (CST) Subject: [petsc-users] Segmentation Violation error using SuperLU_DIST in ex 19.c In-Reply-To: References: Message-ID: Do you really need this combination of pkgs? Matlab is distributed with ILP64 MKL - so it doesn't really work with LP64 blas/lapack that most external packages require - i.e you can't really use use matlab and other external-packages. [also it might not work with complex] To get a successful matlab build - try: ./configure PETSC_ARCH=arch-linux-matlab --download-mpich --with-matlab-dir=/usr/local/matlab --with-matlab-engine=1 --with-blaslapack-dir=/usr/local/matlab --known-64-bit-blas-indices=1 Satish On Fri, 24 Nov 2023, maitri ksh wrote: > Hi Satish, > Yes, that simple build works with no error. I configured petsc again with > the configure options: > > PETSC_ARCH=linux-gnu-c-debug -start-in-debugger[noxterm] > --with-cc=/usr/local/gcc11/bin/gcc --with-cxx=/usr/local/gcc11/bin/g++ > --with-fc=gfortran --with-debugging=1 --with-scalar-type=complex > --download-mpich --download-fblaslapack --with-matlab-dir=/usr/local/matlab > --download-superlu --download-superlu_dist --download-hdf5 --download-mumps > --download-scalapack --download--parmetis --download-metis > --download-ptscotch --download-bison --download-cmake --download-make > > Now, it runs the superLU_dist test successfully but it gives an error with > MATLAB engine 'Possible error running C/C++ src/vec/vec/tutorials/ex31 with > MATLAB engine' and also an error with MAKE check. > > > > On Thu, Nov 23, 2023 at 10:26?PM Satish Balay wrote: > > > Can you do a simple build with only superlu-dist and see if the error > > persists? > > > > ./configure PETSC_ARCH=linux-slu --with-cc=/usr/local/gcc11/bin/gcc > > --with-cxx=/usr/local/gcc11/bin/g++ --with-fc=gfortran --with-debugging=1 > > --with-scalar-type=complex --download-mpich --download-fblaslapack > > --download-superlu_dist > > make > > make PETSC_ARCH=linux-slu check > > > > Satish > > > > On Thu, 23 Nov 2023, maitri ksh wrote: > > > > > Hi, > > > I ran into an error while using SuperLU_DIST in ex 19.c, I am not sure > > how > > > to debug this, can anyone please help. The 'configure.log' file is > > attached > > > for your reference. > > > Thanks, > > > Maitri > > > > > > > > From maitri.ksh at gmail.com Sat Nov 25 03:44:27 2023 From: maitri.ksh at gmail.com (maitri ksh) Date: Sat, 25 Nov 2023 15:14:27 +0530 Subject: [petsc-users] Segmentation Violation error using SuperLU_DIST in ex 19.c In-Reply-To: References: Message-ID: ok, thanks. On Fri, Nov 24, 2023 at 10:16?PM Satish Balay wrote: > Do you really need this combination of pkgs? > > Matlab is distributed with ILP64 MKL - so it doesn't really work with > LP64 blas/lapack that most external packages require - i.e you can't > really use use matlab and other external-packages. > > [also it might not work with complex] > > To get a successful matlab build - try: > > ./configure PETSC_ARCH=arch-linux-matlab --download-mpich > --with-matlab-dir=/usr/local/matlab --with-matlab-engine=1 > --with-blaslapack-dir=/usr/local/matlab --known-64-bit-blas-indices=1 > > Satish > > On Fri, 24 Nov 2023, maitri ksh wrote: > > > Hi Satish, > > Yes, that simple build works with no error. 
I configured petsc again with > > the configure options: > > > > PETSC_ARCH=linux-gnu-c-debug -start-in-debugger[noxterm] > > --with-cc=/usr/local/gcc11/bin/gcc --with-cxx=/usr/local/gcc11/bin/g++ > > --with-fc=gfortran --with-debugging=1 --with-scalar-type=complex > > --download-mpich --download-fblaslapack > --with-matlab-dir=/usr/local/matlab > > --download-superlu --download-superlu_dist --download-hdf5 > --download-mumps > > --download-scalapack --download--parmetis --download-metis > > --download-ptscotch --download-bison --download-cmake --download-make > > > > Now, it runs the superLU_dist test successfully but it gives an error > with > > MATLAB engine 'Possible error running C/C++ src/vec/vec/tutorials/ex31 > with > > MATLAB engine' and also an error with MAKE check. > > > > > > > > On Thu, Nov 23, 2023 at 10:26?PM Satish Balay wrote: > > > > > Can you do a simple build with only superlu-dist and see if the error > > > persists? > > > > > > ./configure PETSC_ARCH=linux-slu --with-cc=/usr/local/gcc11/bin/gcc > > > --with-cxx=/usr/local/gcc11/bin/g++ --with-fc=gfortran > --with-debugging=1 > > > --with-scalar-type=complex --download-mpich --download-fblaslapack > > > --download-superlu_dist > > > make > > > make PETSC_ARCH=linux-slu check > > > > > > Satish > > > > > > On Thu, 23 Nov 2023, maitri ksh wrote: > > > > > > > Hi, > > > > I ran into an error while using SuperLU_DIST in ex 19.c, I am not > sure > > > how > > > > to debug this, can anyone please help. The 'configure.log' file is > > > attached > > > > for your reference. > > > > Thanks, > > > > Maitri > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel.salazar at corintis.com Mon Nov 27 02:46:08 2023 From: miguel.salazar at corintis.com (Miguel Angel Salazar de Troya) Date: Mon, 27 Nov 2023 09:46:08 +0100 Subject: [petsc-users] Error handling in petsc4py In-Reply-To: References: Message-ID: Hello, Is there any way to get the PETSc error codes in the python interface? The test I provided below is just a simple example that I know will run out of memory. Miguel On Wed, Nov 15, 2023 at 10:00?AM Miguel Angel Salazar de Troya < miguel.salazar at corintis.com> wrote: > Hello, > > The following simple petsc4py snippet runs out of memory, but I would > like to handle it from python with the usual try-except. Is there any way > to do so? How can I get the PETSc error codes in the python interface? > > Thanks > > from petsc4py import PETSc > import sys, petsc4py > petsc4py.init(sys.argv) > try: > m, n = 1000000, 1000000 > A = PETSc.Mat().createAIJ([m, n], nnz=1e6) > > A.assemblyBegin() > A.assemblyEnd() > except Exception as e: > print(f"An error occurred: {e}") > > An error occurred: error code 55 > [0] MatSeqAIJSetPreallocation() at > /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:3942 > [0] MatSeqAIJSetPreallocation_SeqAIJ() at > /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:4008 > [0] PetscMallocA() at > /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:408 > [0] PetscMallocAlign() at > /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:53 > [0] Out of memory. Allocated: 0, Used by process: 59752448 > [0] Memory requested 18446744064984991744 > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Mon Nov 27 09:41:49 2023 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 27 Nov 2023 10:41:49 -0500 Subject: [petsc-users] Error handling in petsc4py In-Reply-To: References: Message-ID: I see cdef extern from * nogil: ctypedef enum PetscErrorCode: PETSC_SUCCESS PETSC_ERR_PLIB PETSC_ERR_SUP PETSC_ERR_USER PETSC_ERR_MEM PETSC_ERR_MPI PETSC_ERR_PYTHON ctypedef enum PetscErrorType: PETSC_ERROR_INITIAL PETSC_ERROR_REPEAT cdef PetscErrorCode CHKERR(PetscErrorCode) except PETSC_ERR_PYTHON nogil in src/binding/petsc4py/src/petsc4py/PETSc.pxd I don't know enough about cython to know how it could be accessible from Python code. > On Nov 27, 2023, at 3:46?AM, Miguel Angel Salazar de Troya wrote: > > Hello, > > Is there any way to get the PETSc error codes in the python interface? The test I provided below is just a simple example that I know will run out of memory. > > Miguel > > On Wed, Nov 15, 2023 at 10:00?AM Miguel Angel Salazar de Troya > wrote: >> Hello, >> >> The following simple petsc4py snippet runs out of memory, but I would like to handle it from python with the usual try-except. Is there any way to do so? How can I get the PETSc error codes in the python interface? >> >> Thanks >> >> from petsc4py import PETSc >> import sys, petsc4py >> petsc4py.init(sys.argv) >> try: >> m, n = 1000000, 1000000 >> A = PETSc.Mat().createAIJ([m, n], nnz=1e6) >> >> A.assemblyBegin() >> A.assemblyEnd() >> except Exception as e: >> print(f"An error occurred: {e}") >> >> An error occurred: error code 55 >> [0] MatSeqAIJSetPreallocation() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:3942 >> [0] MatSeqAIJSetPreallocation_SeqAIJ() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:4008 >> [0] PetscMallocA() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:408 >> [0] PetscMallocAlign() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:53 >> [0] Out of memory. Allocated: 0, Used by process: 59752448 >> [0] Memory requested 18446744064984991744 >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Nov 27 10:55:27 2023 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 27 Nov 2023 17:55:27 +0100 Subject: [petsc-users] Error handling in petsc4py In-Reply-To: References: Message-ID: <9396F3CC-294E-46EC-9D99-1A816C39B2B0@dsic.upv.es> The exception has been caught and the execution reaches the print() statement. I think you just need to disable the PETSc error handler, try with this: PETSc.Sys.pushErrorHandler("ignore") Jose > El 27 nov 2023, a las 16:41, Barry Smith escribi?: > > > I see > > cdef extern from * nogil: > ctypedef enum PetscErrorCode: > PETSC_SUCCESS > PETSC_ERR_PLIB > PETSC_ERR_SUP > PETSC_ERR_USER > PETSC_ERR_MEM > PETSC_ERR_MPI > PETSC_ERR_PYTHON > > ctypedef enum PetscErrorType: > PETSC_ERROR_INITIAL > PETSC_ERROR_REPEAT > > cdef PetscErrorCode CHKERR(PetscErrorCode) except PETSC_ERR_PYTHON nogil > > in src/binding/petsc4py/src/petsc4py/PETSc.pxd > > I don't know enough about cython to know how it could be accessible from Python code. > > > >> On Nov 27, 2023, at 3:46?AM, Miguel Angel Salazar de Troya wrote: >> >> Hello, >> >> Is there any way to get the PETSc error codes in the python interface? The test I provided below is just a simple example that I know will run out of memory. 
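Combining the two replies: the numeric code asked about is already available on the Python side, because petsc4py raises a PETSc.Error exception, and error code 55 in the output corresponds to PETSC_ERR_MEM; pushing the "ignore" handler, as suggested, only stops the C-level error handler from printing its own traceback. A small sketch (the oversized matrix is the same as in the original snippet; that the code is exposed as the ierr attribute is an assumption about petsc4py's Error class, so fall back to repr(e) if it differs):

    import sys
    import petsc4py
    petsc4py.init(sys.argv)
    from petsc4py import PETSc

    PETSc.Sys.pushErrorHandler("ignore")   # silence the C-side traceback printing
    try:
        m, n = 1000000, 1000000
        A = PETSc.Mat().createAIJ([m, n], nnz=1e6)
        A.assemblyBegin()
        A.assemblyEnd()
    except PETSc.Error as e:
        print(f"PETSc error code: {e.ierr}")  # assumed attribute holding the PetscErrorCode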
>> >> Miguel >> >> On Wed, Nov 15, 2023 at 10:00?AM Miguel Angel Salazar de Troya wrote: >> Hello, >> >> The following simple petsc4py snippet runs out of memory, but I would like to handle it from python with the usual try-except. Is there any way to do so? How can I get the PETSc error codes in the python interface? >> >> Thanks >> >> from petsc4py import PETSc >> import sys, petsc4py >> petsc4py.init(sys.argv) >> try: >> m, n = 1000000, 1000000 >> A = PETSc.Mat().createAIJ([m, n], nnz=1e6) >> >> A.assemblyBegin() >> A.assemblyEnd() >> except Exception as e: >> print(f"An error occurred: {e}") >> >> An error occurred: error code 55 >> [0] MatSeqAIJSetPreallocation() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:3942 >> [0] MatSeqAIJSetPreallocation_SeqAIJ() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/mat/impls/aij/seq/aij.c:4008 >> [0] PetscMallocA() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:408 >> [0] PetscMallocAlign() at /Users/miguel/repos/firedrake-glacierware/src/petsc/src/sys/memory/mal.c:53 >> [0] Out of memory. Allocated: 0, Used by process: 59752448 >> [0] Memory requested 18446744064984991744 >> >> >> > From s.roongta at mpie.de Tue Nov 28 07:38:09 2023 From: s.roongta at mpie.de (Sharan Roongta) Date: Tue, 28 Nov 2023 14:38:09 +0100 Subject: [petsc-users] msh4 geenerated from neper Message-ID: <652f1672-7d8b-44c4-8946-f1f13b883164@mpie.de> Hello, I have problem loading the msh4 file generated from neper. The mesh format is 4.1 The error is: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Unexpected data in file [0]PETSC ERROR: File is not a valid Gmsh file, expecting $Entities [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc! [0]PETSC ERROR: Option left: name:-g value: test.msh4 [0]PETSC ERROR: Option left: name:-l value: tensionX.load [0]PETSC ERROR: Option left: name:-m value: material.yaml [0]PETSC ERROR: Option left: name:-n value: numerics.yaml [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.18.5, unknown [0]PETSC ERROR: DAMASK_mesh on a gfortran named mamc57x by work Mon Nov 27 21:50:06 2023 [0]PETSC ERROR: Configure options --with-mpi-f90module-visibility=0 --download-hdf5 --with-hdf5-fortran-bindings=1 --download-fftw --download-mpich --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack=1 --download-ml --download-mumps --download-scalapack --download-hypre --download-zlib [0]PETSC ERROR: #1 GmshExpect() at /home/work/petsc/src/dm/impls/plex/plexgmsh.c:270 [0]PETSC ERROR: #2 DMPlexCreateGmsh() at /home/work/petsc/src/dm/impls/plex/plexgmsh.c:1548 [0]PETSC ERROR: #3 DMPlexCreateGmshFromFile() at /home/work/petsc/src/dm/impls/plex/plexgmsh.c:1433 [0]PETSC ERROR: #4 DMPlexCreateFromFile() at /home/work/petsc/src/dm/impls/plex/plexcreate.c:5207 [0]PETSC ERROR: #5 /home/work/DAMASK/src/mesh/discretization_mesh.f90:112 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/ [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: No error traceback is available, the problem could be in the main program. application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=59 The file does contain the $Entities section (file attached). Has anyone encountered this before? Regards, Sharan P.S Sorry for the earlier mail that was sent to the incorrect mailing list. ----------------------------------------------- Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 ------------------------------------------------- Please consider that invitations and e-mails of our institute are only valid if they end with ... at mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ... at mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- Stay up to date and follow us on LinkedIn, Twitter and YouTube. Max-Planck-Institut f?r Eisenforschung GmbH Max-Planck-Stra?e 1 D-40237 D?sseldorf Handelsregister B 2533 Amtsgericht D?sseldorf Gesch?ftsf?hrung Prof. Dr. Gerhard Dehm Prof. Dr. J?rg Neugebauer Prof. Dr. Dierk Raabe Dr. Kai de Weldige Ust.-Id.-Nr.: DE 11 93 58 514 Steuernummer: 105 5891 1000 Please consider that invitations and e-mails of our institute are only valid if they end with ?@mpie.de. If you are not sure of the validity please contact rco at mpie.de Bitte beachten Sie, dass Einladungen zu Veranstaltungen und E-Mails aus unserem Haus nur mit der Endung ?@mpie.de g?ltig sind. In Zweifelsf?llen wenden Sie sich bitte an rco at mpie.de ------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
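A low-effort way to narrow this down would be to re-save the neper output with gmsh itself in a different MSH revision and try the load again; if a 2.2-format file passes DMPlexCreateGmshFromFile(), the issue is specific to how the $Entities block of this particular 4.1 file is being parsed. The command below is a sketch from memory of the gmsh command line, so the -format, -save and -o options should be checked against gmsh -help before relying on them:

    gmsh test.msh4 -format msh2 -save -o test_v2.msh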
Name: test.zip Type: application/x-zip-compressed Size: 7201 bytes Desc: not available URL: From bramkamp at nsc.liu.se Tue Nov 28 07:46:28 2023 From: bramkamp at nsc.liu.se (Frank Bramkamp) Date: Tue, 28 Nov 2023 14:46:28 +0100 Subject: [petsc-users] Fortran problem MatGetValuesLocal Message-ID: <67C3757A-3CAC-4481-85B6-00674406853C@nsc.liu.se> Dear PETSc team, We are using the latest petsc version 3.20.1, intel compiler 2023, and we found the following problem: We want to call the function MatGetValuesLocal to extract a block sub-matrix from an assembled matrix (e.g. a 5x5 blocked sub matrix). We use the matrix format MatCreateBAIJ in parallel. In particular we try to call MatGetValuesLocal in Fortran. It seems that the linked does not find the subroutine MatGetValuesLocal. The subroutine MatGetValues seems to be fine. I guess that the fortran stubs/fortran interface is missing for this routine. On the documentation side, you also write a note for developers that the fortran stubs and interface Is not automatically generated for MatGetValuesLocal. So maybe that has been forgotten to do. Unfortunately I do not have any small test example, since we just incorporated the function call into our own software. Otherwise I would first have to set a small test example for the parallel case. I think there is also an include file where one can check the fortran interfaces ?! I forgot where to look this up. Greetings, Frank Bramkamp From bsmith at petsc.dev Tue Nov 28 08:40:37 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 28 Nov 2023 09:40:37 -0500 Subject: [petsc-users] Fortran problem MatGetValuesLocal In-Reply-To: <67C3757A-3CAC-4481-85B6-00674406853C@nsc.liu.se> References: <67C3757A-3CAC-4481-85B6-00674406853C@nsc.liu.se> Message-ID: <888D23A7-F16F-4B50-92D1-835140C90AFF@petsc.dev> This is fixed in branch barry/2023-11-28/add-matsetvalueslocal-fortran/release see also https://gitlab.com/petsc/petsc/-/merge_requests/7065 > On Nov 28, 2023, at 8:46?AM, Frank Bramkamp wrote: > > Dear PETSc team, > > > We are using the latest petsc version 3.20.1, intel compiler 2023, > and we found the following problem: > > We want to call the function MatGetValuesLocal to extract a block sub-matrix > from an assembled matrix (e.g. a 5x5 blocked sub matrix). We use the matrix format MatCreateBAIJ in parallel. > In particular we try to call MatGetValuesLocal in Fortran. > > It seems that the linked does not find the subroutine MatGetValuesLocal. > The subroutine MatGetValues seems to be fine. > I guess that the fortran stubs/fortran interface is missing for this routine. > On the documentation side, you also write a note for developers that the fortran stubs and interface > Is not automatically generated for MatGetValuesLocal. So maybe that has been forgotten to do. > > > Unfortunately I do not have any small test example, since we just incorporated the function call into our own software. > Otherwise I would first have to set a small test example for the parallel case. > > I think there is also an include file where one can check the fortran interfaces ?! > I forgot where to look this up. > > > Greetings, Frank Bramkamp > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
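Until that branch reaches a release, a workaround that avoids the missing Fortran stub is to translate the local indices to global ones explicitly and then call MatGetValues(), whose Fortran interface does exist. Sketched in C for brevity (the 5x5 block and index arrays are placeholders, and it assumes the matrix already carries a local-to-global mapping, which it must for MatGetValuesLocal to be meaningful anyway):

    ISLocalToGlobalMapping rmap, cmap;
    PetscInt               lrows[5], lcols[5];  /* local row/column indices of the block */
    PetscInt               grows[5], gcols[5];
    PetscScalar            vals[25];

    PetscCall(MatGetLocalToGlobalMapping(A, &rmap, &cmap));
    PetscCall(ISLocalToGlobalMappingApply(rmap, 5, lrows, grows));
    PetscCall(ISLocalToGlobalMappingApply(cmap, 5, lcols, gcols));
    /* MatGetValues() can only return entries from rows owned by this rank */
    PetscCall(MatGetValues(A, 5, grows, 5, gcols, vals));

As far as I can tell, MatGetLocalToGlobalMapping() and ISLocalToGlobalMappingApply() both have Fortran bindings, so the same sequence should be callable directly from the Fortran code.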
URL: From mfadams at lbl.gov Tue Nov 28 08:33:40 2023 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 28 Nov 2023 06:33:40 -0800 Subject: [petsc-users] Fortran problem MatGetValuesLocal In-Reply-To: <67C3757A-3CAC-4481-85B6-00674406853C@nsc.liu.se> References: <67C3757A-3CAC-4481-85B6-00674406853C@nsc.liu.se> Message-ID: It looks like we don't have a Fortran interface for this. In the source code: 2366: Developer Note:2367: This is labeled with C so does not automatically generate Fortran stubs and interfaces2368: because it requires multiple Fortran interfaces depending on which arguments are scalar or arrays. Someone else might be able to offer a workaround. Thanks, Mark On Tue, Nov 28, 2023 at 5:46?AM Frank Bramkamp wrote: > Dear PETSc team, > > > We are using the latest petsc version 3.20.1, intel compiler 2023, > and we found the following problem: > > We want to call the function MatGetValuesLocal to extract a block > sub-matrix > from an assembled matrix (e.g. a 5x5 blocked sub matrix). We use the > matrix format MatCreateBAIJ in parallel. > In particular we try to call MatGetValuesLocal in Fortran. > > It seems that the linked does not find the subroutine MatGetValuesLocal. > The subroutine MatGetValues seems to be fine. > I guess that the fortran stubs/fortran interface is missing for this > routine. > On the documentation side, you also write a note for developers that the > fortran stubs and interface > Is not automatically generated for MatGetValuesLocal. So maybe that has > been forgotten to do. > > > Unfortunately I do not have any small test example, since we just > incorporated the function call into our own software. > Otherwise I would first have to set a small test example for the parallel > case. > > I think there is also an include file where one can check the fortran > interfaces ?! > I forgot where to look this up. > > > Greetings, Frank Bramkamp > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Nov 28 14:51:16 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 28 Nov 2023 14:51:16 -0600 Subject: [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: Message-ID: Hi, Philip, I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? [image: Screenshot 2023-11-28 at 2.43.03?PM.png] --Junchao Zhang On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip wrote: > I definitely dropped the ball on this. I'm sorry for that. I have new > profiling data using the latest (as of yesterday) of petsc/main. I've put > them in a single google drive folder linked here: > > > https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link > > Have a happy holiday weekend! 
> > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Monday, October 16, 2023 15:24 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses > switching to COO interface > > Hi, Philip, > That branch was merged to petsc/main today. Let me know once you have > new profiling results. > > Thanks. > --Junchao Zhang > > > On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: > > Junchao, > > I've attached updated timing plots (red and blue are swapped from before; > yellow is the new one). There is an improvement for the NE_3 case only with > CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI > cases, MatShift doesn't show up (I assume because we're using different > preconditioner arguments). So, there must be some other primary culprit. > I'll try to get updated profiling data to you soon. > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Fackler, Philip via Xolotl-psi-development < > xolotl-psi-development at lists.sourceforge.net> > *Sent:* Wednesday, October 11, 2023 11:31 > *To:* Junchao Zhang > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net> > *Subject:* Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] > Unexpected performance losses switching to COO interface > > I'm on it. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Wednesday, October 11, 2023 10:14 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses > switching to COO interface > > Hi, Philip, > Could you try this branch > jczhang/2023-10-05/feature-support-matshift-aijkokkos ? > > Thanks. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip wrote: > > Aha! That makes sense. Thank you. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Thursday, October 5, 2023 17:29 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* [EXTERNAL] Re: [petsc-users] Unexpected performance losses > switching to COO interface > > Wait a moment, it seems it was because we do not have a GPU implementation > of MatShift... > Let me see how to add it. 
> --Junchao Zhang > > > On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: > > Hi, Philip, > I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() > instead of the COO interface? MatSetValues() needs to copy the data from > device to host and thus is expensive. > Do you have profiling results with COO enabled? > > [image: Screenshot 2023-10-05 at 10.55.29?AM.png] > > > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: > > Hi, Philip, > I will look into the tarballs and get back to you. > Thanks. > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > We finally have xolotl ported to use the new COO interface and the > aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port > to our previous version (using MatSetValuesStencil and the default Mat and > Vec implementations), we expected to see an improvement in performance for > both the "serial" and "cuda" builds (here I'm referring to the kokkos > configuration). > > Attached are two plots that show timings for three different cases. All of > these were run on Ascent (the Summit-like training system) with 6 MPI tasks > (on a single node). The CUDA cases were given one GPU per task (and used > CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases > we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent > as possible. > > The performance of RHSJacobian (where the bulk of computation happens in > xolotl) behaved basically as expected (better than expected in the serial > build). NE_3 case in CUDA was the only one that performed worse, but not > surprisingly, since its workload for the GPUs is much smaller. We've still > got more optimization to do on this. > > The real surprise was how much worse the overall solve times were. This > seems to be due simply to switching to the kokkos-based implementation. I'm > wondering if there are any changes we can make in configuration or runtime > arguments to help with PETSc's performance here. Any help looking into this > would be appreciated. > > The tarballs linked here > > and here > > are profiling databases which, once extracted, can be viewed with > hpcviewer. I don't know how helpful that will be, but hopefully it can give > you some direction. > > Thanks for your help, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-11-28 at 2.43.03?PM.png Type: image/png Size: 192934 bytes Desc: not available URL: From facklerpw at ornl.gov Tue Nov 28 15:16:31 2023 From: facklerpw at ornl.gov (Fackler, Philip) Date: Tue, 28 Nov 2023 21:16:31 +0000 Subject: [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: Message-ID: That makes sense. Here are the arguments that I think are relevant: -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit -pc_fieldsplit_detect_coupling? What would you suggest to make this better? Also, note that the cases marked "serial" are running on CPU only, that is, using only the SERIAL backend for kokkos. 
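Tying this back to the PCLU observation: with the options above, -fieldsplit_1_pc_type redundant gathers that split onto each rank and, by default, factors it with a sequential LU, which would account for the MatLUFactorNumeric_SeqAIJ time in the serial profile. Two things that might be worth experimenting with for the CUDA builds, neither guaranteed to behave as robustly for this coupling, sketched only as option changes:

    # keep redundant, but make its inner preconditioner iterative instead of a direct LU
    -fieldsplit_1_pc_type redundant -fieldsplit_1_redundant_pc_type sor

    # or replace redundant with something simpler that avoids the host-side factorization
    -fieldsplit_1_pc_type jacobi

The -fieldsplit_1_redundant_ prefix is what PCREDUNDANT is expected to prepend for its inner solver; if the installed version uses a different prefix, -ksp_view on a small run will show the actual one.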
Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang Sent: Tuesday, November 28, 2023 15:51 To: Fackler, Philip Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface Hi, Philip, I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? [Screenshot 2023-11-28 at 2.43.03?PM.png] --Junchao Zhang On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link Have a happy holiday weekend! Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Monday, October 16, 2023 15:24 To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface Hi, Philip, That branch was merged to petsc/main today. Let me know once you have new profiling results. Thanks. --Junchao Zhang On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: Junchao, I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Fackler, Philip via Xolotl-psi-development > Sent: Wednesday, October 11, 2023 11:31 To: Junchao Zhang > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface I'm on it. Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Wednesday, October 11, 2023 10:14 To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface Hi, Philip, Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? Thanks. --Junchao Zhang On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: Aha! That makes sense. Thank you. 
Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Junchao Zhang > Sent: Thursday, October 5, 2023 17:29 To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... Let me see how to add it. --Junchao Zhang On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: Hi, Philip, I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. Do you have profiling results with COO enabled? [Screenshot 2023-10-05 at 10.55.29?AM.png] --Junchao Zhang On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: Hi, Philip, I will look into the tarballs and get back to you. Thanks. --Junchao Zhang On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. We've still got more optimization to do on this. The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. Any help looking into this would be appreciated. The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. Thanks for your help, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Screenshot 2023-11-28 at 2.43.03?PM.png Type: image/png Size: 192934 bytes Desc: Screenshot 2023-11-28 at 2.43.03?PM.png URL: From junchao.zhang at gmail.com Tue Nov 28 15:59:03 2023 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 28 Nov 2023 15:59:03 -0600 Subject: [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: Message-ID: On Tue, Nov 28, 2023 at 3:16?PM Fackler, Philip wrote: > That makes sense. Here are the arguments that I think are relevant: > > -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type > fieldsplit -pc_fieldsplit_detect_coupling > > What would you suggest to make this better? > Jed, do you have suggestions > > Also, note that the cases marked "serial" are running on CPU only, that > is, using only the SERIAL backend for kokkos. > I did also look at hpcdb-NE_9-cuda, which also called MatLUFactorNumeric_SeqAIJ(). But hpcdb-NE_3-cuda called PCSetUp_GAMG(). It suggests you used different options for them. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Tuesday, November 28, 2023 15:51 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses > switching to COO interface > > Hi, Philip, > I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos > does not have a GPU LU implementation, we do it on CPU via > MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? > > [image: Screenshot 2023-11-28 at 2.43.03?PM.png] > --Junchao Zhang > > > On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: > > I definitely dropped the ball on this. I'm sorry for that. I have new > profiling data using the latest (as of yesterday) of petsc/main. I've put > them in a single google drive folder linked here: > > > https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link > > > Have a happy holiday weekend! > > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Monday, October 16, 2023 15:24 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses > switching to COO interface > > Hi, Philip, > That branch was merged to petsc/main today. Let me know once you have > new profiling results. > > Thanks. > --Junchao Zhang > > > On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: > > Junchao, > > I've attached updated timing plots (red and blue are swapped from before; > yellow is the new one). There is an improvement for the NE_3 case only with > CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI > cases, MatShift doesn't show up (I assume because we're using different > preconditioner arguments). So, there must be some other primary culprit. > I'll try to get updated profiling data to you soon. 
> > Thanks, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Fackler, Philip via Xolotl-psi-development < > xolotl-psi-development at lists.sourceforge.net> > *Sent:* Wednesday, October 11, 2023 11:31 > *To:* Junchao Zhang > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net> > *Subject:* Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] > Unexpected performance losses switching to COO interface > > I'm on it. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Wednesday, October 11, 2023 10:14 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses > switching to COO interface > > Hi, Philip, > Could you try this branch > jczhang/2023-10-05/feature-support-matshift-aijkokkos ? > > Thanks. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip wrote: > > Aha! That makes sense. Thank you. > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > ------------------------------ > *From:* Junchao Zhang > *Sent:* Thursday, October 5, 2023 17:29 > *To:* Fackler, Philip > *Cc:* petsc-users at mcs.anl.gov ; > xolotl-psi-development at lists.sourceforge.net < > xolotl-psi-development at lists.sourceforge.net>; Blondel, Sophie < > sblondel at utk.edu> > *Subject:* [EXTERNAL] Re: [petsc-users] Unexpected performance losses > switching to COO interface > > Wait a moment, it seems it was because we do not have a GPU implementation > of MatShift... > Let me see how to add it. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: > > Hi, Philip, > I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() > instead of the COO interface? MatSetValues() needs to copy the data from > device to host and thus is expensive. > Do you have profiling results with COO enabled? > > [image: Screenshot 2023-10-05 at 10.55.29?AM.png] > > > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: > > Hi, Philip, > I will look into the tarballs and get back to you. > Thanks. > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > We finally have xolotl ported to use the new COO interface and the > aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port > to our previous version (using MatSetValuesStencil and the default Mat and > Vec implementations), we expected to see an improvement in performance for > both the "serial" and "cuda" builds (here I'm referring to the kokkos > configuration). > > Attached are two plots that show timings for three different cases. All of > these were run on Ascent (the Summit-like training system) with 6 MPI tasks > (on a single node). 
The CUDA cases were given one GPU per task (and used > CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases > we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent > as possible. > > The performance of RHSJacobian (where the bulk of computation happens in > xolotl) behaved basically as expected (better than expected in the serial > build). NE_3 case in CUDA was the only one that performed worse, but not > surprisingly, since its workload for the GPUs is much smaller. We've still > got more optimization to do on this. > > The real surprise was how much worse the overall solve times were. This > seems to be due simply to switching to the kokkos-based implementation. I'm > wondering if there are any changes we can make in configuration or runtime > arguments to help with PETSc's performance here. Any help looking into this > would be appreciated. > > The tarballs linked here > > and here > > are profiling databases which, once extracted, can be viewed with > hpcviewer. I don't know how helpful that will be, but hopefully it can give > you some direction. > > Thanks for your help, > > > *Philip Fackler * > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > *Oak Ridge National Laboratory* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2023-11-28 at 2.43.03?PM.png Type: image/png Size: 192934 bytes Desc: not available URL: From mail2amneet at gmail.com Tue Nov 28 17:44:07 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Tue, 28 Nov 2023 15:44:07 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows Message-ID: Hi Folks, I am using MatZeroRows() to set Dirichlet boundary conditions. This works fine for the serial run and the solver produces correct results (verified through analytical solution). However, when I run the case in parallel, the simulation gets stuck at MatZeroRows(). My understanding is that this function needs to be called after the MatAssemblyBegin{End}() has been called, and should be called by all processors. Here is that bit of the code which calls MatZeroRows() after the matrix has been assembled https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 I ran the parallel code (on 3 processors) in the debugger (-start_in_debugger). Below is the call stack from the processor that gets stuck amneetb at APSB-MBP-16:~$ lldb -p 4307 (lldb) process attach --pid 4307 Process 4307 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 libsystem_kernel.dylib`: -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> 0x18a2d7510 <+12>: pacibsp 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! 0x18a2d7518 <+20>: mov x29, sp Target 0: (fo_acoustic_streaming_solver_2d) stopped. Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". Architecture set to: arm64-apple-macosx-. 
(lldb) cont Process 4307 resuming Process 4307 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: -> 0x109d281b8 <+400>: ldr w9, [x24] 0x109d281bc <+404>: cmp w8, w9 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> 0x109d281c4 <+412>: bl 0x109d28e64 ; MPID_Progress_test Target 0: (fo_acoustic_streaming_solver_2d) stopped. (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP * frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 frame #2: 0x0000000109d27b60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 frame #5: 0x00000001045ea638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235:5 frame #6: 0x00000001045f2910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, classid=1211227, class_name="PetscSF", descr="Star Forest", mansec="PetscSF", comm=-2080374782, destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 frame #7: 0x00000001049cf820 libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) at sf.c:62:3 frame #8: 0x0000000104cd3024 libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) at zerorows.c:36:5 frame #9: 0x000000010504ea50 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:768:3 frame #10: 0x0000000104d95fac libpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 frame #11: 0x000000010067d320 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016f914ed0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp :794:36 frame #12: 0x0000000100694bdc fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, x=0x000000016f91d788, (null)=0x000000016f91d690) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 frame #13: 0x000000010083232c fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp:340:5 frame #14: 0x00000001004eb230 fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at fo_acoustic_streaming_solver.cpp:400:22 frame #15: 0x0000000189fbbf28 dyld`start + 2236 Any suggestions on how to avoid this barrier? 
Here are all MAT options I am using (in the debug mode), if that is helpful: https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 Thanks, -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Nov 28 18:07:24 2023 From: jed at jedbrown.org (Jed Brown) Date: Tue, 28 Nov 2023 17:07:24 -0700 Subject: [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: Message-ID: <8734wpe81v.fsf@jedbrown.org> "Fackler, Philip via petsc-users" writes: > That makes sense. Here are the arguments that I think are relevant: > > -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit -pc_fieldsplit_detect_coupling? What sort of physics are in splits 0 and 1? SOR is not a good GPU algorithm, so we'll want to change that one way or another. Are the splits of similar size or very different? > What would you suggest to make this better? > > Also, note that the cases marked "serial" are running on CPU only, that is, using only the SERIAL backend for kokkos. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > Sent: Tuesday, November 28, 2023 15:51 > To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? > > [Screenshot 2023-11-28 at 2.43.03?PM.png] > --Junchao Zhang > > > On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: > I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: > > https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link > > Have a happy holiday weekend! > > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Monday, October 16, 2023 15:24 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > That branch was merged to petsc/main today. Let me know once you have new profiling results. > > Thanks. > --Junchao Zhang > > > On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: > Junchao, > > I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. 
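To make that concrete, one illustrative direction (a sketch only, not a recommendation until the questions above are answered) is to swap the SOR split for a pointwise smoother that maps well to GPUs, e.g. Jacobi, which the earlier comparison runs in this thread already used for split 0:

    # illustrative only; standard PETSc options, to be revisited per the questions above
    -fieldsplit_0_pc_type jacobi      # pointwise smoother instead of sor
    -fieldsplit_1_pc_type redundant   # left unchanged in this sketch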
> > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Fackler, Philip via Xolotl-psi-development > > Sent: Wednesday, October 11, 2023 11:31 > To: Junchao Zhang > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > > Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > I'm on it. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Wednesday, October 11, 2023 10:14 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? > > Thanks. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: > Aha! That makes sense. Thank you. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Thursday, October 5, 2023 17:29 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > > Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... > Let me see how to add it. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: > Hi, Philip, > I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. > Do you have profiling results with COO enabled? > > [Screenshot 2023-10-05 at 10.55.29?AM.png] > > > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: > Hi, Philip, > I will look into the tarballs and get back to you. > Thanks. > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: > We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). > > Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. 
> > The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. We've still got more optimization to do on this. > > The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. Any help looking into this would be appreciated. > > The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. > > Thanks for your help, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory From bsmith at petsc.dev Tue Nov 28 20:42:18 2023 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 28 Nov 2023 21:42:18 -0500 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: Message-ID: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> for (int comp = 0; comp < 2; ++comp) { ....... for (Box::Iterator bc(bc_coef_box); bc; bc++) { ...... if (IBTK::abs_equal_eps(b, 0.0)) { const double diag_value = a; ierr = MatZeroRows(mat, 1, &u_dof_index, diag_value, NULL, NULL); IBTK_CHKERRQ(ierr); } } } In general, this code will not work because each process calls MatZeroRows a different number of times, so it cannot match up with all the processes. If u_dof_index is always local to the current process, you can call MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop and the MatZeroRows will not synchronize across the MPI processes (since it does not need to and you told it that). If the u_dof_index will not always be local, then you need, on each process, to list all the u_dof_index for each process in an array and then call MatZeroRows() once after the loop so it can exchange the needed information with the other MPI processes to get the row indices to the right place. Barry > On Nov 28, 2023, at 6:44?PM, Amneet Bhalla wrote: > > > Hi Folks, > > I am using MatZeroRows() to set Dirichlet boundary conditions. This works fine for the serial run and the solver produces correct results (verified through analytical solution). However, when I run the case in parallel, the simulation gets stuck at MatZeroRows(). My understanding is that this function needs to be called after the MatAssemblyBegin{End}() has been called, and should be called by all processors. Here is that bit of the code which calls MatZeroRows() after the matrix has been assembled > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 > > I ran the parallel code (on 3 processors) in the debugger (-start_in_debugger). 
Below is the call stack from the processor that gets stuck > > amneetb at APSB-MBP-16:~$ lldb -p 4307 > (lldb) process attach --pid 4307 > Process 4307 stopped > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 > libsystem_kernel.dylib`: > -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> > 0x18a2d7510 <+12>: pacibsp > 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! > 0x18a2d7518 <+20>: mov x29, sp > Target 0: (fo_acoustic_streaming_solver_2d) stopped. > Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". > Architecture set to: arm64-apple-macosx-. > (lldb) cont > Process 4307 resuming > Process 4307 stopped > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 > libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: > -> 0x109d281b8 <+400>: ldr w9, [x24] > 0x109d281bc <+404>: cmp w8, w9 > 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> > 0x109d281c4 <+412>: bl 0x109d28e64 ; MPID_Progress_test > Target 0: (fo_acoustic_streaming_solver_2d) stopped. > (lldb) bt > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > * frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 > frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 > frame #2: 0x0000000109d27b60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 > frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 > frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 > frame #5: 0x00000001045ea638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235:5 > frame #6: 0x00000001045f2910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, classid=1211227, class_name="PetscSF", descr="Star Forest", mansec="PetscSF", comm=-2080374782, destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 > frame #7: 0x00000001049cf820 libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) at sf.c:62:3 > frame #8: 0x0000000104cd3024 libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) at zerorows.c:36:5 > frame #9: 0x000000010504ea50 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:768:3 > frame #10: 0x0000000104d95fac libpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 > frame #11: 0x000000010067d320 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016f914ed0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:794:36 > frame #12: 0x0000000100694bdc fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, 
x=0x000000016f91d788, (null)=0x000000016f91d690) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 > frame #13: 0x000000010083232c fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp:340:5 > frame #14: 0x00000001004eb230 fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at fo_acoustic_streaming_solver.cpp:400:22 > frame #15: 0x0000000189fbbf28 dyld`start + 2236 > > > Any suggestions on how to avoid this barrier? Here are all MAT options I am using (in the debug mode), if that is helpful: > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 > > Thanks, > -- > --Amneet > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Tue Nov 28 21:23:43 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Tue, 28 Nov 2023 19:23:43 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: > > for (int comp = 0; comp < 2; ++comp) > { > ....... > for (Box::Iterator bc(bc_coef_box); bc; bc++) > { > ...... > if (IBTK::abs_equal_eps(b, 0.0)) > { > const double diag_value = a; > ierr = MatZeroRows(mat, 1, &u_dof_index, > diag_value, NULL, NULL); > IBTK_CHKERRQ(ierr); > } > } > } > > In general, this code will not work because each process calls MatZeroRows > a different number of times, so it cannot match up with all the processes. > > If u_dof_index is always local to the current process, you can call > MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop > and > the MatZeroRows will not synchronize across the MPI processes (since it > does not need to and you told it that). > Yes, u_dof_index is going to be local and I put a check on it a few lines before calling MatZeroRows. Can MatSetOption() be called after the matrix has been assembled? > If the u_dof_index will not always be local, then you need, on each > process, to list all the u_dof_index for each process in an array and then > call MatZeroRows() > once after the loop so it can exchange the needed information with the > other MPI processes to get the row indices to the right place. > > Barry > > > > > On Nov 28, 2023, at 6:44?PM, Amneet Bhalla wrote: > > > Hi Folks, > > I am using MatZeroRows() to set Dirichlet boundary conditions. This works > fine for the serial run and the solver produces correct results (verified > through analytical solution). However, when I run the case in parallel, the > simulation gets stuck at MatZeroRows(). My understanding is that this > function needs to be called after the MatAssemblyBegin{End}() has been > called, and should be called by all processors. Here is that bit of the > code which calls MatZeroRows() after the matrix has been assembled > > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 > > I ran the parallel code (on 3 processors) in the debugger > (-start_in_debugger). 
Below is the call stack from the processor that gets > stuck > > amneetb at APSB-MBP-16:~$ lldb -p 4307 > (lldb) process attach --pid 4307 > Process 4307 stopped > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal > + 8 > libsystem_kernel.dylib`: > -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> > 0x18a2d7510 <+12>: pacibsp > 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! > 0x18a2d7518 <+20>: mov x29, sp > Target 0: (fo_acoustic_streaming_solver_2d) stopped. > Executable module set to > "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". > Architecture set to: arm64-apple-macosx-. > (lldb) cont > Process 4307 resuming > Process 4307 stopped > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > frame #0: 0x0000000109d281b8 > libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 > libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: > -> 0x109d281b8 <+400>: ldr w9, [x24] > 0x109d281bc <+404>: cmp w8, w9 > 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> > 0x109d281c4 <+412>: bl 0x109d28e64 ; > MPID_Progress_test > Target 0: (fo_acoustic_streaming_solver_2d) stopped. > (lldb) bt > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > * frame #0: 0x0000000109d281b8 > libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 > frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + > 224 > frame #2: 0x0000000109d27b60 > libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 > frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 > frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 > frame #5: 0x00000001045ea638 > libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, > comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235:5 > frame #6: 0x00000001045f2910 > libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, > classid=1211227, class_name="PetscSF", descr="Star Forest", > mansec="PetscSF", comm=-2080374782, > destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), > view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 > frame #7: 0x00000001049cf820 > libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) > at sf.c:62:3 > frame #8: 0x0000000104cd3024 > libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, > rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) > at zerorows.c:36:5 > frame #9: 0x000000010504ea50 > libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, > rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, > b=0x0000000000000000) at mpiaij.c:768:3 > frame #10: 0x0000000104d95fac > libpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, > rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, > b=0x0000000000000000) at matrix.c:5935:3 > frame #11: 0x000000010067d320 > fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, > omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, > u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, > u_dof_index_idx=27, p_dof_index_idx=28, > patch_level=Pointer > @ 0x000000016f914ed0, > mu_interp_type=VC_HARMONIC_INTERP) at > AcousticStreamingPETScMatUtilities.cpp:794:36 > frame #12: 0x0000000100694bdc > 
fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, > x=0x000000016f91d788, (null)=0x000000016f91d690) at > FOAcousticStreamingPETScLevelSolver.cpp:149:5 > frame #13: 0x000000010083232c > fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, > x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp:340:5 > frame #14: 0x00000001004eb230 > fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at > fo_acoustic_streaming_solver.cpp:400:22 > frame #15: 0x0000000189fbbf28 dyld`start + 2236 > > > Any suggestions on how to avoid this barrier? Here are all MAT options I > am using (in the debug mode), if that is helpful: > > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 > > Thanks, > -- > --Amneet > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Tue Nov 28 23:57:35 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Tue, 28 Nov 2023 21:57:35 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: I added that option but the code still gets stuck at the same call MatZeroRows with 3 processors. On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla wrote: > > > On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: > >> >> for (int comp = 0; comp < 2; ++comp) >> { >> ....... >> for (Box::Iterator bc(bc_coef_box); bc; bc++) >> { >> ...... >> if (IBTK::abs_equal_eps(b, 0.0)) >> { >> const double diag_value = a; >> ierr = MatZeroRows(mat, 1, &u_dof_index, >> diag_value, NULL, NULL); >> IBTK_CHKERRQ(ierr); >> } >> } >> } >> >> In general, this code will not work because each process calls >> MatZeroRows a different number of times, so it cannot match up with all the >> processes. >> >> If u_dof_index is always local to the current process, you can call >> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >> and >> the MatZeroRows will not synchronize across the MPI processes (since it >> does not need to and you told it that). >> > > Yes, u_dof_index is going to be local and I put a check on it a few lines > before calling MatZeroRows. > > Can MatSetOption() be called after the matrix has been assembled? > > >> If the u_dof_index will not always be local, then you need, on each >> process, to list all the u_dof_index for each process in an array and then >> call MatZeroRows() >> once after the loop so it can exchange the needed information with the >> other MPI processes to get the row indices to the right place. >> >> Barry >> >> >> >> >> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla wrote: >> >> >> Hi Folks, >> >> I am using MatZeroRows() to set Dirichlet boundary conditions. This works >> fine for the serial run and the solver produces correct results (verified >> through analytical solution). However, when I run the case in parallel, the >> simulation gets stuck at MatZeroRows(). My understanding is that this >> function needs to be called after the MatAssemblyBegin{End}() has been >> called, and should be called by all processors. 
Here is that bit of the >> code which calls MatZeroRows() after the matrix has been assembled >> >> >> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >> >> I ran the parallel code (on 3 processors) in the debugger >> (-start_in_debugger). Below is the call stack from the processor that gets >> stuck >> >> amneetb at APSB-MBP-16:~$ lldb -p 4307 >> (lldb) process attach --pid 4307 >> Process 4307 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >> + 8 >> libsystem_kernel.dylib`: >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> 0x18a2d7510 <+12>: pacibsp >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> 0x18a2d7518 <+20>: mov x29, sp >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> Executable module set to >> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> Architecture set to: arm64-apple-macosx-. >> (lldb) cont >> Process 4307 resuming >> Process 4307 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> frame #0: 0x0000000109d281b8 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >> -> 0x109d281b8 <+400>: ldr w9, [x24] >> 0x109d281bc <+404>: cmp w8, w9 >> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >> 0x109d281c4 <+412>: bl 0x109d28e64 ; >> MPID_Progress_test >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> (lldb) bt >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> * frame #0: 0x0000000109d281b8 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >> frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier >> + 224 >> frame #2: 0x0000000109d27b60 >> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >> frame #5: 0x00000001045ea638 >> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235: >> 5 >> frame #6: 0x00000001045f2910 >> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >> classid=1211227, class_name="PetscSF", descr="Star Forest", >> mansec="PetscSF", comm=-2080374782, >> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 >> frame #7: 0x00000001049cf820 >> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >> at sf.c:62:3 >> frame #8: 0x0000000104cd3024 >> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >> at zerorows.c:36:5 >> frame #9: 0x000000010504ea50 >> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at mpiaij.c:768:3 >> frame #10: 0x0000000104d95fac >> libpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, >> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at matrix.c:5935:3 >> frame #11: 0x000000010067d320 >> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, 
>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >> u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, >> u_dof_index_idx=27, p_dof_index_idx=28, >> patch_level=Pointer > @ 0x000000016f914ed0, >> mu_interp_type=VC_HARMONIC_INTERP) at >> AcousticStreamingPETScMatUtilities.cpp:794:36 >> frame #12: 0x0000000100694bdc >> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, >> x=0x000000016f91d788, (null)=0x000000016f91d690) at >> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >> frame #13: 0x000000010083232c >> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, >> x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp:340:5 >> frame #14: 0x00000001004eb230 >> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at >> fo_acoustic_streaming_solver.cpp:400:22 >> frame #15: 0x0000000189fbbf28 dyld`start + 2236 >> >> >> Any suggestions on how to avoid this barrier? Here are all MAT options I >> am using (in the debug mode), if that is helpful: >> >> >> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 >> >> Thanks, >> -- >> --Amneet >> >> >> >> >> -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinw3 at vt.edu Wed Nov 29 07:59:47 2023 From: kevinw3 at vt.edu (Kevin G. Wang) Date: Wed, 29 Nov 2023 08:59:47 -0500 Subject: [petsc-users] Reading VTK files in PETSc Message-ID: Good morning everyone. I use the following functions to output parallel vectors --- "globalVec" in this example --- to VTK files. It works well, and is quite convenient. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ PetscViewer viewer; PetscViewerVTKOpen(PetscObjectComm((PetscObject)*dm), filename, FILE_MODE_WRITE, &viewer); VecView(globalVec, viewer); PetscViewerDestroy(&viewer); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Now, I am trying to do the opposite. I would like to read the VTK files generated by PETSc back into memory, and assign each one to a Vec. Could someone let me know how this can be done? Thanks! Kevin -- Kevin G. Wang, Ph.D. Associate Professor Kevin T. Crofton Department of Aerospace and Ocean Engineering Virginia Tech 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 Office: (540) 231-7547 | Mobile: (650) 862-2663 URL: https://www.aoe.vt.edu/people/faculty/wang.html Codes: https://github.com/kevinwgy -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 29 09:21:47 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 29 Nov 2023 10:21:47 -0500 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: > On Nov 29, 2023, at 1:16?AM, Amneet Bhalla wrote: > > BTW, I think you meant using MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE) Yes > instead ofMatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE) ?? Please try setting both flags. > However, that also did not help to overcome the MPI Barrier issue. If there is still a problem please trap all the MPI processes when they hang in the debugger and send the output from using bt on all of them. This way we can see the different places the different MPI processes are stuck at. 
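To make the two suggestions above concrete, here is a minimal sketch (not the actual IBAMR code): the names mat, u_dof_index, b, a, and IBTK_CHKERRQ follow the snippets quoted earlier in the thread, the boundary loops are elided, and <vector> is assumed to be included.

    // Informational flags, set once before the boundary-condition loop as
    // suggested above: no rank will zero or insert rows it does not own.
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE);
    IBTK_CHKERRQ(ierr);
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);
    IBTK_CHKERRQ(ierr);

    // Alternative (also correct when some rows might be off-process):
    // collect every Dirichlet row locally, then make one collective call.
    std::vector<PetscInt> dirichlet_rows;
    const double diag_value = 1.0; // placeholder for the 'a' in the quoted code
    for (int comp = 0; comp < 2; ++comp)
    {
        // ... same bc_coef_box loop as in the quoted code ...
        // if (IBTK::abs_equal_eps(b, 0.0)) dirichlet_rows.push_back(u_dof_index);
    }
    // Every rank calls this exactly once, possibly with zero rows.
    ierr = MatZeroRows(mat,
                       static_cast<PetscInt>(dirichlet_rows.size()),
                       dirichlet_rows.data(),
                       diag_value,
                       NULL,
                       NULL);
    IBTK_CHKERRQ(ierr);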
> > On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla > wrote: >> I added that option but the code still gets stuck at the same call MatZeroRows with 3 processors. >> >> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla > wrote: >>> >>> >>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith > wrote: >>>> >>>> for (int comp = 0; comp < 2; ++comp) >>>> { >>>> ....... >>>> for (Box::Iterator bc(bc_coef_box); bc; bc++) >>>> { >>>> ...... >>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>> { >>>> const double diag_value = a; >>>> ierr = MatZeroRows(mat, 1, &u_dof_index, diag_value, NULL, NULL); >>>> IBTK_CHKERRQ(ierr); >>>> } >>>> } >>>> } >>>> >>>> In general, this code will not work because each process calls MatZeroRows a different number of times, so it cannot match up with all the processes. >>>> >>>> If u_dof_index is always local to the current process, you can call MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop and >>>> the MatZeroRows will not synchronize across the MPI processes (since it does not need to and you told it that). >>> >>> Yes, u_dof_index is going to be local and I put a check on it a few lines before calling MatZeroRows. >>> >>> Can MatSetOption() be called after the matrix has been assembled? >>> >>>> >>>> If the u_dof_index will not always be local, then you need, on each process, to list all the u_dof_index for each process in an array and then call MatZeroRows() >>>> once after the loop so it can exchange the needed information with the other MPI processes to get the row indices to the right place. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla > wrote: >>>>> >>>>> >>>>> Hi Folks, >>>>> >>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This works fine for the serial run and the solver produces correct results (verified through analytical solution). However, when I run the case in parallel, the simulation gets stuck at MatZeroRows(). My understanding is that this function needs to be called after the MatAssemblyBegin{End}() has been called, and should be called by all processors. Here is that bit of the code which calls MatZeroRows() after the matrix has been assembled >>>>> >>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>> >>>>> I ran the parallel code (on 3 processors) in the debugger (-start_in_debugger). Below is the call stack from the processor that gets stuck >>>>> >>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>> (lldb) process attach --pid 4307 >>>>> Process 4307 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >>>>> libsystem_kernel.dylib`: >>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>> 0x18a2d7510 <+12>: pacibsp >>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>> Architecture set to: arm64-apple-macosx-. 
>>>>> (lldb) cont >>>>> Process 4307 resuming >>>>> Process 4307 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>> frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; MPID_Progress_test >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> (lldb) bt >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>> * frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>> frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>> frame #2: 0x0000000109d27b60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>> frame #5: 0x00000001045ea638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235:5 >>>>> frame #6: 0x00000001045f2910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, classid=1211227, class_name="PetscSF", descr="Star Forest", mansec="PetscSF", comm=-2080374782, destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 >>>>> frame #7: 0x00000001049cf820 libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) at sf.c:62:3 >>>>> frame #8: 0x0000000104cd3024 libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) at zerorows.c:36:5 >>>>> frame #9: 0x000000010504ea50 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:768:3 >>>>> frame #10: 0x0000000104d95fac libpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 >>>>> frame #11: 0x000000010067d320 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016f914ed0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:794:36 >>>>> frame #12: 0x0000000100694bdc fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, x=0x000000016f91d788, (null)=0x000000016f91d690) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>> frame #13: 0x000000010083232c fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp:340:5 >>>>> frame #14: 0x00000001004eb230 fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at fo_acoustic_streaming_solver.cpp:400:22 >>>>> frame #15: 0x0000000189fbbf28 dyld`start + 2236 >>>>> >>>>> >>>>> Any suggestions on how to avoid this barrier? 
Here are all MAT options I am using (in the debug mode), if that is helpful: >>>>> >>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 >>>>> >>>>> Thanks, >>>>> -- >>>>> --Amneet >>>>> >>>>> >>>>> >>>> >> >> >> -- >> --Amneet >> >> >> > > > -- > --Amneet > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sblondel at utk.edu Wed Nov 29 10:03:49 2023 From: sblondel at utk.edu (Blondel, Sophie) Date: Wed, 29 Nov 2023 16:03:49 +0000 Subject: [petsc-users] [Xolotl-psi-development] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: <8734wpe81v.fsf@jedbrown.org> References: <8734wpe81v.fsf@jedbrown.org> Message-ID: Hi Jed, I'm not sure I'm going to reply to your question correctly because I don't really understand how the split is done. Is it related to on diagonal and off diagonal? If so, the off-diagonal part is usually pretty small (less than 20 DOFs) and related to diffusion, the diagonal part involves thousands of DOFs for the reaction term. Let us know what we can do to answer this question more accurately. Cheers, Sophie ________________________________ From: Jed Brown Sent: Tuesday, November 28, 2023 19:07 To: Fackler, Philip ; Junchao Zhang Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface [Some people who received this message don't often get email from jed at jedbrown.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] "Fackler, Philip via petsc-users" writes: > That makes sense. Here are the arguments that I think are relevant: > > -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit -pc_fieldsplit_detect_coupling? What sort of physics are in splits 0 and 1? SOR is not a good GPU algorithm, so we'll want to change that one way or another. Are the splits of similar size or very different? > What would you suggest to make this better? > > Also, note that the cases marked "serial" are running on CPU only, that is, using only the SERIAL backend for kokkos. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > Sent: Tuesday, November 28, 2023 15:51 > To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? > > [Screenshot 2023-11-28 at 2.43.03?PM.png] > --Junchao Zhang > > > On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: > I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: > > https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link > > Have a happy holiday weekend! 
> > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Monday, October 16, 2023 15:24 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > That branch was merged to petsc/main today. Let me know once you have new profiling results. > > Thanks. > --Junchao Zhang > > > On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: > Junchao, > > I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. > > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Fackler, Philip via Xolotl-psi-development > > Sent: Wednesday, October 11, 2023 11:31 > To: Junchao Zhang > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > > Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > I'm on it. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Wednesday, October 11, 2023 10:14 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? > > Thanks. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: > Aha! That makes sense. Thank you. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Thursday, October 5, 2023 17:29 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > > Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... > Let me see how to add it. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: > Hi, Philip, > I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. > Do you have profiling results with COO enabled? 
> > [Screenshot 2023-10-05 at 10.55.29?AM.png] > > > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: > Hi, Philip, > I will look into the tarballs and get back to you. > Thanks. > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: > We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). > > Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. > > The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. We've still got more optimization to do on this. > > The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. Any help looking into this would be appreciated. > > The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. > > Thanks for your help, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory _______________________________________________ Xolotl-psi-development mailing list Xolotl-psi-development at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xolotl-psi-development -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Wed Nov 29 10:06:52 2023 From: facklerpw at ornl.gov (Fackler, Philip) Date: Wed, 29 Nov 2023 16:06:52 +0000 Subject: [petsc-users] [Xolotl-psi-development] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: <8734wpe81v.fsf@jedbrown.org> Message-ID: I'm sorry for the extra confusion. I copied those arguments from the wrong place. We're actually using jacobi? instead of sor? for fieldsplit 0. 
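One cheap way to double-check which preconditioners are actually active at runtime is PETSc's -ksp_view (prints the solver/preconditioner hierarchy that was actually built) together with -options_left (reports options that were set but never used). A rough sketch, with the executable name and the remaining arguments as placeholders and the split options taken from the messages above:

    ./xolotl-exe <usual arguments> \
        -pc_type fieldsplit -pc_fieldsplit_detect_coupling \
        -fieldsplit_0_pc_type jacobi -fieldsplit_1_pc_type redundant \
        -ksp_view -options_left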
Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Blondel, Sophie Sent: Wednesday, November 29, 2023 11:03 To: Brown, Jed ; Fackler, Philip ; Junchao Zhang Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface Hi Jed, I'm not sure I'm going to reply to your question correctly because I don't really understand how the split is done. Is it related to on diagonal and off diagonal? If so, the off-diagonal part is usually pretty small (less than 20 DOFs) and related to diffusion, the diagonal part involves thousands of DOFs for the reaction term. Let us know what we can do to answer this question more accurately. Cheers, Sophie ________________________________ From: Jed Brown Sent: Tuesday, November 28, 2023 19:07 To: Fackler, Philip ; Junchao Zhang Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface [Some people who received this message don't often get email from jed at jedbrown.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] "Fackler, Philip via petsc-users" writes: > That makes sense. Here are the arguments that I think are relevant: > > -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit -pc_fieldsplit_detect_coupling? What sort of physics are in splits 0 and 1? SOR is not a good GPU algorithm, so we'll want to change that one way or another. Are the splits of similar size or very different? > What would you suggest to make this better? > > Also, note that the cases marked "serial" are running on CPU only, that is, using only the SERIAL backend for kokkos. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > Sent: Tuesday, November 28, 2023 15:51 > To: Fackler, Philip > Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? > > [Screenshot 2023-11-28 at 2.43.03?PM.png] > --Junchao Zhang > > > On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: > I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: > > https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link > > Have a happy holiday weekend! 
> > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Monday, October 16, 2023 15:24 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > That branch was merged to petsc/main today. Let me know once you have new profiling results. > > Thanks. > --Junchao Zhang > > > On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: > Junchao, > > I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. > > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Fackler, Philip via Xolotl-psi-development > > Sent: Wednesday, October 11, 2023 11:31 > To: Junchao Zhang > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > > Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > I'm on it. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Wednesday, October 11, 2023 10:14 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > > Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Hi, Philip, > Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? > > Thanks. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: > Aha! That makes sense. Thank you. > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory > ________________________________ > From: Junchao Zhang > > Sent: Thursday, October 5, 2023 17:29 > To: Fackler, Philip > > Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > > Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface > > Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... > Let me see how to add it. > --Junchao Zhang > > > On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: > Hi, Philip, > I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. > Do you have profiling results with COO enabled? 
> > [Screenshot 2023-10-05 at 10.55.29?AM.png] > > > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: > Hi, Philip, > I will look into the tarballs and get back to you. > Thanks. > --Junchao Zhang > > > On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: > We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). > > Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. > > The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. We've still got more optimization to do on this. > > The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. Any help looking into this would be appreciated. > > The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. > > Thanks for your help, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory _______________________________________________ Xolotl-psi-development mailing list Xolotl-psi-development at lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/xolotl-psi-development -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Wed Nov 29 10:50:45 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 29 Nov 2023 08:50:45 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: Ok, I added both, but it still hangs. Here, is bt from all three tasks: Task 1: amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 (lldb) process attach --pid 44691 Process 44691 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 libsystem_kernel.dylib`: -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> 0x18a2d7510 <+12>: pacibsp 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! 0x18a2d7518 <+20>: mov x29, sp Target 0: (fo_acoustic_streaming_solver_2d) stopped. Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". Architecture set to: arm64-apple-macosx-. 
(lldb) cont Process 44691 resuming Process 44691 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x000000010ba40b60 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: -> 0x10ba40b60 <+752>: add w8, w8, #0x1 0x10ba40b64 <+756>: ldr w9, [x22] 0x10ba40b68 <+760>: cmp w8, w9 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> Target 0: (fo_acoustic_streaming_solver_2d) stopped. (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP * frame #0: 0x000000010ba40b60 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 frame #1: 0x000000010ba48528 libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 frame #2: 0x000000010ba47964 libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 frame #5: 0x0000000106d67650 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3 frame #6: 0x0000000106aadfac libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 frame #7: 0x00000001023952d0 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016dbfcec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp :799:36 frame #8: 0x00000001023acb8c fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, x=0x000000016dc05778, (null)=0x000000016dc05680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 frame #9: 0x000000010254a2dc fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340:5 frame #10: 0x0000000102202e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at fo_acoustic_streaming_solver.cpp:400:22 frame #11: 0x0000000189fbbf28 dyld`start + 2236 (lldb) Task 2: amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 (lldb) process attach --pid 44692 Process 44692 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 libsystem_kernel.dylib`: -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> 0x18a2d7510 <+12>: pacibsp 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! 0x18a2d7518 <+20>: mov x29, sp Target 0: (fo_acoustic_streaming_solver_2d) stopped. Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". Architecture set to: arm64-apple-macosx-. 
(lldb) cont Process 44692 resuming Process 44692 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x000000010e5a022c libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] 0x10e5a0230 <+520>: cmp x9, x10 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> 0x10e5a0238 <+528>: add w8, w8, #0x1 Target 0: (fo_acoustic_streaming_solver_2d) stopped. (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP * frame #0: 0x000000010e5a022c libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 frame #2: 0x000000010e59fb60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 frame #5: 0x0000000108e62638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235:5 frame #6: 0x0000000108e6a910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 frame #7: 0x000000010aa28010 libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at itcreate.c:679:3 frame #8: 0x00000001050aa2f4 fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344:12 frame #9: 0x0000000104d62e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at fo_acoustic_streaming_solver.cpp:400:22 frame #10: 0x0000000189fbbf28 dyld`start + 2236 (lldb) Task 3: amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 (lldb) process attach --pid 44693 Process 44693 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 libsystem_kernel.dylib`: -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> 0x18a2d7510 <+12>: pacibsp 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! 0x18a2d7518 <+20>: mov x29, sp Target 0: (fo_acoustic_streaming_solver_2d) stopped. Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". Architecture set to: arm64-apple-macosx-. (lldb) cont Process 44693 resuming Process 44693 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x000000010e59c68c libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: -> 0x10e59c68c <+952>: ldr w9, [x21] 0x10e59c690 <+956>: cmp w8, w9 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> 0x10e59c698 <+964>: bl 0x10e59ce64 ; MPID_Progress_test Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
(lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP * frame #0: 0x000000010e59c68c libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 frame #1: 0x000000010e5a44bc libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 frame #2: 0x000000010e5a3964 libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 frame #5: 0x00000001098c3650 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3 frame #6: 0x0000000109609fac libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 frame #7: 0x0000000104ef12d0 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016b0a0ec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp :799:36 frame #8: 0x0000000104f08b8c fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, x=0x000000016b0a9778, (null)=0x000000016b0a9680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 frame #9: 0x00000001050a62dc fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340:5 frame #10: 0x0000000104d5ee5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at fo_acoustic_streaming_solver.cpp:400:22 frame #11: 0x0000000189fbbf28 dyld`start + 2236 (lldb) On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: > > > On Nov 29, 2023, at 1:16?AM, Amneet Bhalla wrote: > > BTW, I think you meant using MatSetOption(mat, *MAT_NO_OFF_PROC_ZERO_ROWS*, > PETSC_TRUE) > > > Yes > > instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ?? > > > Please try setting both flags. > > However, that also did not help to overcome the MPI Barrier issue. > > > If there is still a problem please trap all the MPI processes when they > hang in the debugger and send the output from using bt on all of them. This > way > we can see the different places the different MPI processes are stuck at. > > > > On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla > wrote: > >> I added that option but the code still gets stuck at the same call >> MatZeroRows with 3 processors. >> >> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >> wrote: >> >>> >>> >>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: >>> >>>> >>>> for (int comp = 0; comp < 2; ++comp) >>>> { >>>> ....... >>>> for (Box::Iterator bc(bc_coef_box); bc; bc++) >>>> { >>>> ...... >>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>> { >>>> const double diag_value = a; >>>> ierr = MatZeroRows(mat, 1, &u_dof_index, >>>> diag_value, NULL, NULL); >>>> IBTK_CHKERRQ(ierr); >>>> } >>>> } >>>> } >>>> >>>> In general, this code will not work because each process calls >>>> MatZeroRows a different number of times, so it cannot match up with all the >>>> processes. 
>>>> >>>> If u_dof_index is always local to the current process, you can call >>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >>>> and >>>> the MatZeroRows will not synchronize across the MPI processes (since it >>>> does not need to and you told it that). >>>> >>> >>> Yes, u_dof_index is going to be local and I put a check on it a few >>> lines before calling MatZeroRows. >>> >>> Can MatSetOption() be called after the matrix has been assembled? >>> >>> >>>> If the u_dof_index will not always be local, then you need, on each >>>> process, to list all the u_dof_index for each process in an array and then >>>> call MatZeroRows() >>>> once after the loop so it can exchange the needed information with the >>>> other MPI processes to get the row indices to the right place. >>>> >>>> Barry >>>> >>>> >>>> >>>> >>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>> wrote: >>>> >>>> >>>> Hi Folks, >>>> >>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This >>>> works fine for the serial run and the solver produces correct results >>>> (verified through analytical solution). However, when I run the case in >>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>> been called, and should be called by all processors. Here is that bit of >>>> the code which calls MatZeroRows() after the matrix has been assembled >>>> >>>> >>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>> >>>> I ran the parallel code (on 3 processors) in the debugger >>>> (-start_in_debugger). Below is the call stack from the processor that gets >>>> stuck >>>> >>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>> (lldb) process attach --pid 4307 >>>> Process 4307 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x000000018a2d750c >>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>> libsystem_kernel.dylib`: >>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>> 0x18a2d7510 <+12>: pacibsp >>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>> 0x18a2d7518 <+20>: mov x29, sp >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> Executable module set to >>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>> Architecture set to: arm64-apple-macosx-. >>>> (lldb) cont >>>> Process 4307 resuming >>>> Process 4307 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x0000000109d281b8 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>> 0x109d281bc <+404>: cmp w8, w9 >>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>> MPID_Progress_test >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>> (lldb) bt >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> * frame #0: 0x0000000109d281b8 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>> frame #1: 0x0000000109d27d14 >>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>> frame #2: 0x0000000109d27b60 >>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>> frame #5: 0x00000001045ea638 >>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c: >>>> 235:5 >>>> frame #6: 0x00000001045f2910 >>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >>>> classid=1211227, class_name="PetscSF", descr="Star Forest", >>>> mansec="PetscSF", comm=-2080374782, >>>> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >>>> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 >>>> frame #7: 0x00000001049cf820 >>>> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >>>> at sf.c:62:3 >>>> frame #8: 0x0000000104cd3024 >>>> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >>>> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >>>> at zerorows.c:36:5 >>>> frame #9: 0x000000010504ea50 >>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>> b=0x0000000000000000) at mpiaij.c:768:3 >>>> frame #10: 0x0000000104d95fac >>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, >>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>> b=0x0000000000000000) at matrix.c:5935:3 >>>> frame #11: 0x000000010067d320 >>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, >>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>> u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, >>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>> patch_level=Pointer > @ 0x000000016f914ed0, >>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>> AcousticStreamingPETScMatUtilities.cpp:794:36 >>>> frame #12: 0x0000000100694bdc >>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, >>>> x=0x000000016f91d788, (null)=0x000000016f91d690) at >>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>> frame #13: 0x000000010083232c >>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, >>>> x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp:340 >>>> :5 >>>> frame #14: 0x00000001004eb230 >>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at >>>> fo_acoustic_streaming_solver.cpp:400:22 >>>> frame #15: 0x0000000189fbbf28 dyld`start + 2236 >>>> >>>> >>>> Any suggestions on how to avoid this barrier? 
Here are all MAT options >>>> I am using (in the debug mode), if that is helpful: >>>> >>>> >>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 >>>> >>>> Thanks, >>>> -- >>>> --Amneet >>>> >>>> >>>> >>>> >>>> >> >> -- >> --Amneet >> >> >> >> > > -- > --Amneet > > > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 29 12:02:30 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Nov 2023 13:02:30 -0500 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla wrote: > Ok, I added both, but it still hangs. Here, is bt from all three tasks: > It looks like two processes are calling AllReduce, but one is not. Are all procs not calling MatZeroRows? Thanks, Matt > Task 1: > > amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 > > (lldb) process attach --pid 44691 > > Process 44691 stopped > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal > + 8 > > libsystem_kernel.dylib`: > > -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> > > 0x18a2d7510 <+12>: pacibsp > > 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! > > 0x18a2d7518 <+20>: mov x29, sp > > Target 0: (fo_acoustic_streaming_solver_2d) stopped. > > Executable module set to > "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". > > Architecture set to: arm64-apple-macosx-. > > (lldb) cont > > Process 44691 resuming > > Process 44691 stopped > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > frame #0: 0x000000010ba40b60 > libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 > > libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: > > -> 0x10ba40b60 <+752>: add w8, w8, #0x1 > > 0x10ba40b64 <+756>: ldr w9, [x22] > > 0x10ba40b68 <+760>: cmp w8, w9 > > 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> > > Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
> > (lldb) bt > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > * frame #0: 0x000000010ba40b60 > libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 > > frame #1: 0x000000010ba48528 > libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 > > frame #2: 0x000000010ba47964 > libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 > > frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 > > frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 > > frame #5: 0x0000000106d67650 > libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, > rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, > b=0x0000000000000000) at mpiaij.c:827:3 > > frame #6: 0x0000000106aadfac > libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, > rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, > b=0x0000000000000000) at matrix.c:5935:3 > > frame #7: 0x00000001023952d0 > fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, > omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, > u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, > u_dof_index_idx=27, p_dof_index_idx=28, > patch_level=Pointer > @ 0x000000016dbfcec0, > mu_interp_type=VC_HARMONIC_INTERP) at > AcousticStreamingPETScMatUtilities.cpp:799:36 > > frame #8: 0x00000001023acb8c > fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, > x=0x000000016dc05778, (null)=0x000000016dc05680) at > FOAcousticStreamingPETScLevelSolver.cpp:149:5 > > frame #9: 0x000000010254a2dc > fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, > x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340:5 > > frame #10: 0x0000000102202e5c > fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at > fo_acoustic_streaming_solver.cpp:400:22 > > frame #11: 0x0000000189fbbf28 dyld`start + 2236 > > (lldb) > > > Task 2: > > amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 > > (lldb) process attach --pid 44692 > > Process 44692 stopped > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal > + 8 > > libsystem_kernel.dylib`: > > -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> > > 0x18a2d7510 <+12>: pacibsp > > 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! > > 0x18a2d7518 <+20>: mov x29, sp > > Target 0: (fo_acoustic_streaming_solver_2d) stopped. > > Executable module set to > "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". > > Architecture set to: arm64-apple-macosx-. > > (lldb) cont > > Process 44692 resuming > > Process 44692 stopped > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > frame #0: 0x000000010e5a022c > libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 > > libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: > > -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] > > 0x10e5a0230 <+520>: cmp x9, x10 > > 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> > > 0x10e5a0238 <+528>: add w8, w8, #0x1 > > Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
> > (lldb) bt > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > * frame #0: 0x000000010e5a022c > libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 > > frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + > 224 > > frame #2: 0x000000010e59fb60 > libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 > > frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 > > frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 > > frame #5: 0x0000000108e62638 > libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, > comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235:5 > > frame #6: 0x0000000108e6a910 > libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, > classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", > comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), > view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 > > frame #7: 0x000000010aa28010 > libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at > itcreate.c:679:3 > > frame #8: 0x00000001050aa2f4 > fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, > x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344:12 > > frame #9: 0x0000000104d62e5c > fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at > fo_acoustic_streaming_solver.cpp:400:22 > > frame #10: 0x0000000189fbbf28 dyld`start + 2236 > > (lldb) > > > Task 3: > > amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 > > (lldb) process attach --pid 44693 > > Process 44693 stopped > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal > + 8 > > libsystem_kernel.dylib`: > > -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> > > 0x18a2d7510 <+12>: pacibsp > > 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! > > 0x18a2d7518 <+20>: mov x29, sp > > Target 0: (fo_acoustic_streaming_solver_2d) stopped. > > Executable module set to > "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". > > Architecture set to: arm64-apple-macosx-. > > (lldb) cont > > Process 44693 resuming > > Process 44693 stopped > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > frame #0: 0x000000010e59c68c > libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 > > libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: > > -> 0x10e59c68c <+952>: ldr w9, [x21] > > 0x10e59c690 <+956>: cmp w8, w9 > > 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> > > 0x10e59c698 <+964>: bl 0x10e59ce64 ; > MPID_Progress_test > > Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
> > (lldb) bt > > * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP > > * frame #0: 0x000000010e59c68c > libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 > > frame #1: 0x000000010e5a44bc > libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 > > frame #2: 0x000000010e5a3964 > libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 > > frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 > > frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 > > frame #5: 0x00000001098c3650 > libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, > rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, > b=0x0000000000000000) at mpiaij.c:827:3 > > frame #6: 0x0000000109609fac > libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, > rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, > b=0x0000000000000000) at matrix.c:5935:3 > > frame #7: 0x0000000104ef12d0 > fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, > omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, > u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, > u_dof_index_idx=27, p_dof_index_idx=28, > patch_level=Pointer > @ 0x000000016b0a0ec0, > mu_interp_type=VC_HARMONIC_INTERP) at > AcousticStreamingPETScMatUtilities.cpp:799:36 > > frame #8: 0x0000000104f08b8c > fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, > x=0x000000016b0a9778, (null)=0x000000016b0a9680) at > FOAcousticStreamingPETScLevelSolver.cpp:149:5 > > frame #9: 0x00000001050a62dc > fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, > x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340:5 > > frame #10: 0x0000000104d5ee5c > fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at > fo_acoustic_streaming_solver.cpp:400:22 > > frame #11: 0x0000000189fbbf28 dyld`start + 2236 > > (lldb) > > > On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: > >> >> >> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla wrote: >> >> BTW, I think you meant using MatSetOption(mat, >> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >> >> >> Yes >> >> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ?? >> >> >> Please try setting both flags. >> >> However, that also did not help to overcome the MPI Barrier issue. >> >> >> If there is still a problem please trap all the MPI processes when they >> hang in the debugger and send the output from using bt on all of them. This >> way >> we can see the different places the different MPI processes are stuck at. >> >> >> >> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >> wrote: >> >>> I added that option but the code still gets stuck at the same call >>> MatZeroRows with 3 processors. >>> >>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >>> wrote: >>> >>>> >>>> >>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: >>>> >>>>> >>>>> for (int comp = 0; comp < 2; ++comp) >>>>> { >>>>> ....... >>>>> for (Box::Iterator bc(bc_coef_box); bc; bc++) >>>>> { >>>>> ...... 
>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>> { >>>>> const double diag_value = a; >>>>> ierr = MatZeroRows(mat, 1, &u_dof_index, >>>>> diag_value, NULL, NULL); >>>>> IBTK_CHKERRQ(ierr); >>>>> } >>>>> } >>>>> } >>>>> >>>>> In general, this code will not work because each process calls >>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>> processes. >>>>> >>>>> If u_dof_index is always local to the current process, you can call >>>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >>>>> and >>>>> the MatZeroRows will not synchronize across the MPI processes (since >>>>> it does not need to and you told it that). >>>>> >>>> >>>> Yes, u_dof_index is going to be local and I put a check on it a few >>>> lines before calling MatZeroRows. >>>> >>>> Can MatSetOption() be called after the matrix has been assembled? >>>> >>>> >>>>> If the u_dof_index will not always be local, then you need, on each >>>>> process, to list all the u_dof_index for each process in an array and then >>>>> call MatZeroRows() >>>>> once after the loop so it can exchange the needed information with the >>>>> other MPI processes to get the row indices to the right place. >>>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>> wrote: >>>>> >>>>> >>>>> Hi Folks, >>>>> >>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This >>>>> works fine for the serial run and the solver produces correct results >>>>> (verified through analytical solution). However, when I run the case in >>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>> been called, and should be called by all processors. Here is that bit of >>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>> >>>>> >>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>> >>>>> I ran the parallel code (on 3 processors) in the debugger >>>>> (-start_in_debugger). Below is the call stack from the processor that gets >>>>> stuck >>>>> >>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>> (lldb) process attach --pid 4307 >>>>> Process 4307 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x000000018a2d750c >>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>> libsystem_kernel.dylib`: >>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>> 0x18a2d7510 <+12>: pacibsp >>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> Executable module set to >>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>> Architecture set to: arm64-apple-macosx-. >>>>> (lldb) cont >>>>> Process 4307 resuming >>>>> Process 4307 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x0000000109d281b8 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>> MPID_Progress_test >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>>> (lldb) bt >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> * frame #0: 0x0000000109d281b8 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>> frame #1: 0x0000000109d27d14 >>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>> frame #2: 0x0000000109d27b60 >>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>> frame #5: 0x00000001045ea638 >>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c: >>>>> 235:5 >>>>> frame #6: 0x00000001045f2910 >>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >>>>> classid=1211227, class_name="PetscSF", descr="Star Forest", >>>>> mansec="PetscSF", comm=-2080374782, >>>>> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >>>>> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 >>>>> frame #7: 0x00000001049cf820 >>>>> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >>>>> at sf.c:62:3 >>>>> frame #8: 0x0000000104cd3024 >>>>> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >>>>> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >>>>> at zerorows.c:36:5 >>>>> frame #9: 0x000000010504ea50 >>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >>>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>>> b=0x0000000000000000) at mpiaij.c:768:3 >>>>> frame #10: 0x0000000104d95fac >>>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, >>>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>>> b=0x0000000000000000) at matrix.c:5935:3 >>>>> frame #11: 0x000000010067d320 >>>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, >>>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>>> u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, >>>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>>> patch_level=Pointer > @ 0x000000016f914ed0, >>>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>>> AcousticStreamingPETScMatUtilities.cpp:794:36 >>>>> frame #12: 0x0000000100694bdc >>>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, >>>>> x=0x000000016f91d788, (null)=0x000000016f91d690) at >>>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>> frame #13: 0x000000010083232c >>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, >>>>> x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp: >>>>> 340:5 >>>>> frame #14: 0x00000001004eb230 >>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at >>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>> frame #15: 0x0000000189fbbf28 dyld`start + 2236 >>>>> >>>>> >>>>> Any suggestions on how to avoid this barrier? 
Here are all MAT options >>>>> I am using (in the debug mode), if that is helpful: >>>>> >>>>> >>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 >>>>> >>>>> Thanks, >>>>> -- >>>>> --Amneet >>>>> >>>>> >>>>> >>>>> >>>>> >>> >>> -- >>> --Amneet >>> >>> >>> >>> >> >> -- >> --Amneet >> >> >> >> >> > > -- > --Amneet > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 29 12:50:37 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 29 Nov 2023 13:50:37 -0500 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: <6A3DA004-D944-43D2-A66C-3075F902883E@petsc.dev> What PETSc version are you using? > On Nov 29, 2023, at 1:02?PM, Matthew Knepley wrote: > > On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla > wrote: >> Ok, I added both, but it still hangs. Here, is bt from all three tasks: > > It looks like two processes are calling AllReduce, but one is not. Are all procs not calling MatZeroRows? > > Thanks, > > Matt > >> Task 1: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >> (lldb) process attach --pid 44691 >> Process 44691 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >> libsystem_kernel.dylib`: >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> 0x18a2d7510 <+12>: pacibsp >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> 0x18a2d7518 <+20>: mov x29, sp >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> Architecture set to: arm64-apple-macosx-. >> (lldb) cont >> Process 44691 resuming >> Process 44691 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> frame #0: 0x000000010ba40b60libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >> 0x10ba40b64 <+756>: ldr w9, [x22] >> 0x10ba40b68 <+760>: cmp w8, w9 >> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>> (lldb) bt >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> * frame #0: 0x000000010ba40b60libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >> frame #1: 0x000000010ba48528libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >> frame #2: 0x000000010ba47964libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >> frame #5: 0x0000000106d67650libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3 >> frame #6: 0x0000000106aadfaclibpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 >> frame #7: 0x00000001023952d0fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016dbfcec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:799:36 >> frame #8: 0x00000001023acb8cfo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, x=0x000000016dc05778, (null)=0x000000016dc05680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 >> frame #9: 0x000000010254a2dcfo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340:5 >> frame #10: 0x0000000102202e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at fo_acoustic_streaming_solver.cpp:400:22 >> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >> (lldb) >> >> >> Task 2: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >> (lldb) process attach --pid 44692 >> Process 44692 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >> libsystem_kernel.dylib`: >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> 0x18a2d7510 <+12>: pacibsp >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> 0x18a2d7518 <+20>: mov x29, sp >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> Architecture set to: arm64-apple-macosx-. >> (lldb) cont >> Process 44692 resuming >> Process 44692 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> frame #0: 0x000000010e5a022clibpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >> 0x10e5a0230 <+520>: cmp x9, x10 >> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >> 0x10e5a0238 <+528>: add w8, w8, #0x1 >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>> (lldb) bt >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> * frame #0: 0x000000010e5a022clibpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >> frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >> frame #2: 0x000000010e59fb60libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >> frame #5: 0x0000000108e62638libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235:5 >> frame #6: 0x0000000108e6a910libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 >> frame #7: 0x000000010aa28010 libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at itcreate.c:679:3 >> frame #8: 0x00000001050aa2f4fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344:12 >> frame #9: 0x0000000104d62e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at fo_acoustic_streaming_solver.cpp:400:22 >> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >> (lldb) >> >> >> Task 3: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >> (lldb) process attach --pid 44693 >> Process 44693 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >> libsystem_kernel.dylib`: >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> 0x18a2d7510 <+12>: pacibsp >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> 0x18a2d7518 <+20>: mov x29, sp >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> Architecture set to: arm64-apple-macosx-. >> (lldb) cont >> Process 44693 resuming >> Process 44693 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> frame #0: 0x000000010e59c68clibpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >> -> 0x10e59c68c <+952>: ldr w9, [x21] >> 0x10e59c690 <+956>: cmp w8, w9 >> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >> 0x10e59c698 <+964>: bl 0x10e59ce64 ; MPID_Progress_test >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>> (lldb) bt >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >> * frame #0: 0x000000010e59c68clibpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >> frame #1: 0x000000010e5a44bclibpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >> frame #2: 0x000000010e5a3964libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >> frame #5: 0x00000001098c3650libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3 >> frame #6: 0x0000000109609faclibpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 >> frame #7: 0x0000000104ef12d0fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016b0a0ec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:799:36 >> frame #8: 0x0000000104f08b8cfo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, x=0x000000016b0a9778, (null)=0x000000016b0a9680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 >> frame #9: 0x00000001050a62dcfo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340:5 >> frame #10: 0x0000000104d5ee5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at fo_acoustic_streaming_solver.cpp:400:22 >> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >> (lldb) >> >> >> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith > wrote: >>> >>> >>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla > wrote: >>>> >>>> BTW, I think you meant using MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE) >>> >>> Yes >>> >>>> instead ofMatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE) ?? >>> >>> Please try setting both flags. >>> >>>> However, that also did not help to overcome the MPI Barrier issue. >>> >>> If there is still a problem please trap all the MPI processes when they hang in the debugger and send the output from using bt on all of them. This way >>> we can see the different places the different MPI processes are stuck at. >>> >>> >>>> >>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla > wrote: >>>>> I added that option but the code still gets stuck at the same call MatZeroRows with 3 processors. >>>>> >>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla > wrote: >>>>>> >>>>>> >>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith > wrote: >>>>>>> >>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>> { >>>>>>> ....... >>>>>>> for (Box::Iterator bc(bc_coef_box); bc; bc++) >>>>>>> { >>>>>>> ...... 
>>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>> { >>>>>>> const double diag_value = a; >>>>>>> ierr = MatZeroRows(mat, 1, &u_dof_index, diag_value, NULL, NULL); >>>>>>> IBTK_CHKERRQ(ierr); >>>>>>> } >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> In general, this code will not work because each process calls MatZeroRows a different number of times, so it cannot match up with all the processes. >>>>>>> >>>>>>> If u_dof_index is always local to the current process, you can call MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop and >>>>>>> the MatZeroRows will not synchronize across the MPI processes (since it does not need to and you told it that). >>>>>> >>>>>> Yes, u_dof_index is going to be local and I put a check on it a few lines before calling MatZeroRows. >>>>>> >>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>> >>>>>>> >>>>>>> If the u_dof_index will not always be local, then you need, on each process, to list all the u_dof_index for each process in an array and then call MatZeroRows() >>>>>>> once after the loop so it can exchange the needed information with the other MPI processes to get the row indices to the right place. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla > wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hi Folks, >>>>>>>> >>>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This works fine for the serial run and the solver produces correct results (verified through analytical solution). However, when I run the case in parallel, the simulation gets stuck at MatZeroRows(). My understanding is that this function needs to be called after the MatAssemblyBegin{End}() has been called, and should be called by all processors. Here is that bit of the code which calls MatZeroRows() after the matrix has been assembled >>>>>>>> >>>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>>> >>>>>>>> I ran the parallel code (on 3 processors) in the debugger (-start_in_debugger). Below is the call stack from the processor that gets stuck >>>>>>>> >>>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>>> (lldb) process attach --pid 4307 >>>>>>>> Process 4307 stopped >>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>>>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>>> libsystem_kernel.dylib`: >>>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>>> Architecture set to: arm64-apple-macosx-. >>>>>>>> (lldb) cont >>>>>>>> Process 4307 resuming >>>>>>>> Process 4307 stopped >>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>>>>> frame #0: 0x0000000109d281b8libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; MPID_Progress_test >>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>>>>>> (lldb) bt >>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>>>>> * frame #0: 0x0000000109d281b8libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>> frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>>> frame #2: 0x0000000109d27b60libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>>>> frame #5: 0x00000001045ea638libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235:5 >>>>>>>> frame #6: 0x00000001045f2910libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, classid=1211227, class_name="PetscSF", descr="Star Forest", mansec="PetscSF", comm=-2080374782, destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 >>>>>>>> frame #7: 0x00000001049cf820libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) at sf.c:62:3 >>>>>>>> frame #8: 0x0000000104cd3024libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) at zerorows.c:36:5 >>>>>>>> frame #9: 0x000000010504ea50libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:768:3 >>>>>>>> frame #10: 0x0000000104d95faclibpetsc.3.17.dylib`MatZeroRows(mat=0x00000001170c1270, numRows=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 >>>>>>>> frame #11: 0x000000010067d320fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016f91c178, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016f91c3a8, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016f914ed0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:794:36 >>>>>>>> frame #12: 0x0000000100694bdcfo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016f91c028, x=0x000000016f91d788, (null)=0x000000016f91d690) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>>>>> frame #13: 0x000000010083232cfo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016f91c028, x=0x000000016f91d788, b=0x000000016f91d690) at PETScLevelSolver.cpp:340:5 >>>>>>>> frame #14: 0x00000001004eb230fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016f91f460) at fo_acoustic_streaming_solver.cpp:400:22 >>>>>>>> frame #15: 0x0000000189fbbf28 dyld`start + 2236 >>>>>>>> >>>>>>>> >>>>>>>> Any suggestions on how to avoid this barrier? 
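For reference, a minimal sketch of the aggregated approach Barry describes above: record the locally owned Dirichlet rows inside the loop, then make a single MatZeroRows() call after it. The loop structure and the names a, b, u_dof_index, bc_coef_box, mat, ierr and IBTK_CHKERRQ are taken from the code quoted above; the std::vector, the single shared diag_value and the restored Box<NDIM> template argument are assumptions made for illustration, not a drop-in patch:

    std::vector<PetscInt> dirichlet_rows;
    double diag_value = 1.0; // assumes 'a' is the same for every Dirichlet row;
                             // if it varies, group rows by value or set the
                             // diagonal separately with MatSetValues()
    for (int comp = 0; comp < 2; ++comp)
    {
        // ... existing per-component setup ...
        for (Box<NDIM>::Iterator bc(bc_coef_box); bc; bc++)
        {
            // ... existing code that produces a, b and u_dof_index ...
            if (IBTK::abs_equal_eps(b, 0.0))
            {
                diag_value = a;
                dirichlet_rows.push_back(u_dof_index); // only record the row here
            }
        }
    }
    // Every rank reaches this point exactly once, so the collective work inside
    // MatZeroRows() matches up even on ranks that collected no rows.
    ierr = MatZeroRows(mat,
                       static_cast<PetscInt>(dirichlet_rows.size()),
                       dirichlet_rows.empty() ? nullptr : dirichlet_rows.data(),
                       diag_value, NULL, NULL);
    IBTK_CHKERRQ(ierr);

With every rank making exactly one call, the number of calls no longer depends on how many boundary points each rank owns.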
Here are all MAT options I am using (in the debug mode), if that is helpful: >>>>>>>> >>>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L453-L458 >>>>>>>> >>>>>>>> Thanks, >>>>>>>> -- >>>>>>>> --Amneet >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>>> >>>>> -- >>>>> --Amneet >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> --Amneet >>>> >>>> >>>> >>> >> >> >> -- >> --Amneet >> >> >> > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Wed Nov 29 12:55:43 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 29 Nov 2023 10:55:43 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: So the code logic is after the matrix is assembled, I iterate over all distributed patches in the domain to see which of the patch is abutting a Dirichlet boundary. Depending upon which patch abuts a physical and Dirichlet boundary, a processor will call this routine. However, that same processor is ?owning? that DoF, which would be on its diagonal. I think Barry already mentioned this is not going to work unless I use the flag to not communicate explicitly. However, that flag is not working as it should over here for some reason. I can always change the matrix coefficients for Dirichlet rows during MatSetValues. However, that would lengthen my code and I was trying to avoid that. On Wed, Nov 29, 2023 at 10:02?AM Matthew Knepley wrote: > On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla > wrote: > >> Ok, I added both, but it still hangs. Here, is bt from all three tasks: >> > > It looks like two processes are calling AllReduce, but one is not. Are all > procs not calling MatZeroRows? > > Thanks, > > Matt > > >> Task 1: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >> >> (lldb) process attach --pid 44691 >> >> Process 44691 stopped >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >> + 8 >> >> libsystem_kernel.dylib`: >> >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> >> 0x18a2d7510 <+12>: pacibsp >> >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> >> 0x18a2d7518 <+20>: mov x29, sp >> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> >> Executable module set to >> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> >> Architecture set to: arm64-apple-macosx-. >> >> (lldb) cont >> >> Process 44691 resuming >> >> Process 44691 stopped >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> frame #0: 0x000000010ba40b60 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >> >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >> >> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >> >> 0x10ba40b64 <+756>: ldr w9, [x22] >> >> 0x10ba40b68 <+760>: cmp w8, w9 >> >> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
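A rough sketch, for comparison, of the MatSetValues() alternative Amneet mentions above. This is done during assembly (before MatAssemblyBegin/End) and touches only locally owned rows, so no extra collective call is needed afterwards. That the row's coefficients are inserted in one place with INSERT_VALUES, and where exactly that happens, are assumptions here; the names u_dof_index, a, b, mat, ierr and IBTK_CHKERRQ are from the code quoted earlier:

    if (IBTK::abs_equal_eps(b, 0.0))
    {
        // Dirichlet row: insert only the diagonal entry and skip the stencil
        // coefficients for this row entirely.
        const PetscScalar diag_value = a;
        ierr = MatSetValues(mat, 1, &u_dof_index, 1, &u_dof_index,
                            &diag_value, INSERT_VALUES);
        IBTK_CHKERRQ(ierr);
    }
    else
    {
        // ... existing stencil assembly for this row via MatSetValues() ...
    }

As noted in the thread, this lengthens the assembly code, which is why the MatZeroRows() route was being tried instead.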
>> >> (lldb) bt >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> * frame #0: 0x000000010ba40b60 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >> >> frame #1: 0x000000010ba48528 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >> >> frame #2: 0x000000010ba47964 >> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >> >> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >> >> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >> >> frame #5: 0x0000000106d67650 >> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, >> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at mpiaij.c:827:3 >> >> frame #6: 0x0000000106aadfac >> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, >> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at matrix.c:5935:3 >> >> frame #7: 0x00000001023952d0 >> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >> u_dof_index_idx=27, p_dof_index_idx=28, >> patch_level=Pointer > @ 0x000000016dbfcec0, >> mu_interp_type=VC_HARMONIC_INTERP) at >> AcousticStreamingPETScMatUtilities.cpp:799:36 >> >> frame #8: 0x00000001023acb8c >> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >> x=0x000000016dc05778, (null)=0x000000016dc05680) at >> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >> >> frame #9: 0x000000010254a2dc >> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340:5 >> >> frame #10: 0x0000000102202e5c >> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at >> fo_acoustic_streaming_solver.cpp:400:22 >> >> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >> >> (lldb) >> >> >> Task 2: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >> >> (lldb) process attach --pid 44692 >> >> Process 44692 stopped >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >> + 8 >> >> libsystem_kernel.dylib`: >> >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> >> 0x18a2d7510 <+12>: pacibsp >> >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> >> 0x18a2d7518 <+20>: mov x29, sp >> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> >> Executable module set to >> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> >> Architecture set to: arm64-apple-macosx-. >> >> (lldb) cont >> >> Process 44692 resuming >> >> Process 44692 stopped >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> frame #0: 0x000000010e5a022c >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >> >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >> >> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >> >> 0x10e5a0230 <+520>: cmp x9, x10 >> >> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >> >> 0x10e5a0238 <+528>: add w8, w8, #0x1 >> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>> >> (lldb) bt >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> * frame #0: 0x000000010e5a022c >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >> >> frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier >> + 224 >> >> frame #2: 0x000000010e59fb60 >> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >> >> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >> >> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >> >> frame #5: 0x0000000108e62638 >> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235: >> 5 >> >> frame #6: 0x0000000108e6a910 >> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 >> >> frame #7: 0x000000010aa28010 >> libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at >> itcreate.c:679:3 >> >> frame #8: 0x00000001050aa2f4 >> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344: >> 12 >> >> frame #9: 0x0000000104d62e5c >> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at >> fo_acoustic_streaming_solver.cpp:400:22 >> >> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >> >> (lldb) >> >> >> Task 3: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >> >> (lldb) process attach --pid 44693 >> >> Process 44693 stopped >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >> + 8 >> >> libsystem_kernel.dylib`: >> >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> >> 0x18a2d7510 <+12>: pacibsp >> >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> >> 0x18a2d7518 <+20>: mov x29, sp >> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> >> Executable module set to >> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> >> Architecture set to: arm64-apple-macosx-. >> >> (lldb) cont >> >> Process 44693 resuming >> >> Process 44693 stopped >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> frame #0: 0x000000010e59c68c >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >> >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >> >> -> 0x10e59c68c <+952>: ldr w9, [x21] >> >> 0x10e59c690 <+956>: cmp w8, w9 >> >> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >> >> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >> MPID_Progress_test >> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>> >> (lldb) bt >> >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> >> * frame #0: 0x000000010e59c68c >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >> >> frame #1: 0x000000010e5a44bc >> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >> >> frame #2: 0x000000010e5a3964 >> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >> >> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >> >> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >> >> frame #5: 0x00000001098c3650 >> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, >> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at mpiaij.c:827:3 >> >> frame #6: 0x0000000109609fac >> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, >> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at matrix.c:5935:3 >> >> frame #7: 0x0000000104ef12d0 >> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >> u_dof_index_idx=27, p_dof_index_idx=28, >> patch_level=Pointer > @ 0x000000016b0a0ec0, >> mu_interp_type=VC_HARMONIC_INTERP) at >> AcousticStreamingPETScMatUtilities.cpp:799:36 >> >> frame #8: 0x0000000104f08b8c >> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >> >> frame #9: 0x00000001050a62dc >> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340:5 >> >> frame #10: 0x0000000104d5ee5c >> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at >> fo_acoustic_streaming_solver.cpp:400:22 >> >> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >> >> (lldb) >> >> >> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: >> >>> >>> >>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>> wrote: >>> >>> BTW, I think you meant using MatSetOption(mat, >>> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>> >>> >>> Yes >>> >>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ?? >>> >>> >>> Please try setting both flags. >>> >>> However, that also did not help to overcome the MPI Barrier issue. >>> >>> >>> If there is still a problem please trap all the MPI processes when >>> they hang in the debugger and send the output from using bt on all of them. >>> This way >>> we can see the different places the different MPI processes are stuck at. >>> >>> >>> >>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >>> wrote: >>> >>>> I added that option but the code still gets stuck at the same call >>>> MatZeroRows with 3 processors. >>>> >>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >>>> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: >>>>> >>>>>> >>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>> { >>>>>> ....... >>>>>> for (Box::Iterator bc(bc_coef_box); bc; >>>>>> bc++) >>>>>> { >>>>>> ...... 
>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>> { >>>>>> const double diag_value = a; >>>>>> ierr = MatZeroRows(mat, 1, &u_dof_index, >>>>>> diag_value, NULL, NULL); >>>>>> IBTK_CHKERRQ(ierr); >>>>>> } >>>>>> } >>>>>> } >>>>>> >>>>>> In general, this code will not work because each process calls >>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>> processes. >>>>>> >>>>>> If u_dof_index is always local to the current process, you can call >>>>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >>>>>> and >>>>>> the MatZeroRows will not synchronize across the MPI processes (since >>>>>> it does not need to and you told it that). >>>>>> >>>>> >>>>> Yes, u_dof_index is going to be local and I put a check on it a few >>>>> lines before calling MatZeroRows. >>>>> >>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>> >>>>> >>>>>> If the u_dof_index will not always be local, then you need, on each >>>>>> process, to list all the u_dof_index for each process in an array and then >>>>>> call MatZeroRows() >>>>>> once after the loop so it can exchange the needed information with >>>>>> the other MPI processes to get the row indices to the right place. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>>> wrote: >>>>>> >>>>>> >>>>>> Hi Folks, >>>>>> >>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This >>>>>> works fine for the serial run and the solver produces correct results >>>>>> (verified through analytical solution). However, when I run the case in >>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>> been called, and should be called by all processors. Here is that bit of >>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>> >>>>>> >>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>> >>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>> (-start_in_debugger). Below is the call stack from the processor that gets >>>>>> stuck >>>>>> >>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>> (lldb) process attach --pid 4307 >>>>>> Process 4307 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000018a2d750c >>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>> libsystem_kernel.dylib`: >>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> Executable module set to >>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>> Architecture set to: arm64-apple-macosx-. 
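For completeness, the two options discussed above would be set on the matrix before any MatZeroRows() calls; a minimal sketch using the mat and ierr names from the quoted code (whether setting them only after assembly behaves identically was asked above but not settled in this excerpt):

    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE);
    IBTK_CHKERRQ(ierr);
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);
    IBTK_CHKERRQ(ierr);

As reported earlier in the thread, setting both flags did not remove the hang here: the backtraces still end in an MPI_Allreduce inside MatZeroRows_MPIAIJ, so the per-point calls remain mismatched across ranks.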
>>>>>> (lldb) cont >>>>>> Process 4307 resuming >>>>>> Process 4307 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x0000000109d281b8 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>> MPID_Progress_test >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> (lldb) bt >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> * frame #0: 0x0000000109d281b8 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>> frame #1: 0x0000000109d27d14 >>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>> frame #2: 0x0000000109d27b60 >>>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>> frame #5: 0x00000001045ea638 >>>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c: >>>>>> 235:5 >>>>>> frame #6: 0x00000001045f2910 >>>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >>>>>> classid=1211227, class_name="PetscSF", descr="Star Forest", >>>>>> mansec="PetscSF", comm=-2080374782, >>>>>> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >>>>>> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 >>>>>> frame #7: 0x00000001049cf820 >>>>>> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >>>>>> at sf.c:62:3 >>>>>> frame #8: 0x0000000104cd3024 >>>>>> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >>>>>> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >>>>>> at zerorows.c:36:5 >>>>>> frame #9: 0x000000010504ea50 >>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >>>>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>>>> b=0x0000000000000000) at mpiaij.c:768:3 >>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Wed Nov 29 12:56:44 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 29 Nov 2023 10:56:44 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: <6A3DA004-D944-43D2-A66C-3075F902883E@petsc.dev> References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> <6A3DA004-D944-43D2-A66C-3075F902883E@petsc.dev> Message-ID: I am using 3.17 On Wed, Nov 29, 2023 at 10:50?AM Barry Smith wrote: > > What PETSc version are you using? > > > On Nov 29, 2023, at 1:02?PM, Matthew Knepley wrote: > > On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla > wrote: > >> Ok, I added both, but it still hangs. Here, is bt from all three tasks: >> > > It looks like two processes are calling AllReduce, but one is not. Are all > procs not calling MatZeroRows? 
> > Thanks, > > Matt > > >> Task 1: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >> (lldb) process attach --pid 44691 >> Process 44691 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >> + 8 >> libsystem_kernel.dylib`: >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> 0x18a2d7510 <+12>: pacibsp >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> 0x18a2d7518 <+20>: mov x29, sp >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> Executable module set to >> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> Architecture set to: arm64-apple-macosx-. >> (lldb) cont >> Process 44691 resuming >> Process 44691 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> frame #0: 0x000000010ba40b60libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release >> + 752 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >> 0x10ba40b64 <+756>: ldr w9, [x22] >> 0x10ba40b68 <+760>: cmp w8, w9 >> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> (lldb) bt >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> * frame #0: 0x000000010ba40b60libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release >> + 752 >> frame #1: 0x000000010ba48528libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather >> + 1088 >> frame #2: 0x000000010ba47964libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma >> + 368 >> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >> frame #5: 0x0000000106d67650libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, >> N=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at mpiaij.c:827:3 >> frame #6: 0x0000000106aadfaclibpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, >> numRows=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at matrix.c:5935:3 >> frame #7: 0x00000001023952d0fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >> u_dof_index_idx=27, p_dof_index_idx=28, >> patch_level=Pointer > @ 0x000000016dbfcec0, >> mu_interp_type=VC_HARMONIC_INTERP) at >> AcousticStreamingPETScMatUtilities.cpp:799:36 >> frame #8: 0x00000001023acb8cfo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >> x=0x000000016dc05778, (null)=0x000000016dc05680) at >> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >> frame #9: 0x000000010254a2dcfo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340:5 >> frame #10: 0x0000000102202e5c fo_acoustic_streaming_solver_2d`main(argc=11, >> argv=0x000000016dc07450) at fo_acoustic_streaming_solver.cpp:400:22 >> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >> (lldb) >> >> >> Task 2: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >> (lldb) process attach --pid 44692 >> Process 44692 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal 
>> SIGSTOP >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >> + 8 >> libsystem_kernel.dylib`: >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> 0x18a2d7510 <+12>: pacibsp >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> 0x18a2d7518 <+20>: mov x29, sp >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> Executable module set to >> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> Architecture set to: arm64-apple-macosx-. >> (lldb) cont >> Process 44692 resuming >> Process 44692 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> frame #0: 0x000000010e5a022clibpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >> + 516 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >> 0x10e5a0230 <+520>: cmp x9, x10 >> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >> 0x10e5a0238 <+528>: add w8, w8, #0x1 >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> (lldb) bt >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> * frame #0: 0x000000010e5a022clibpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >> + 516 >> frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier >> + 224 >> frame #2: 0x000000010e59fb60libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha >> + 44 >> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >> frame #5: 0x0000000108e62638libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235: >> 5 >> frame #6: 0x0000000108e6a910libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 >> frame #7: 0x000000010aa28010 libpetsc.3.17.dylib`KSPCreate(comm=1140850688, >> inksp=0x000000016b0a4160) at itcreate.c:679:3 >> frame #8: 0x00000001050aa2f4fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344: >> 12 >> frame #9: 0x0000000104d62e5c fo_acoustic_streaming_solver_2d`main(argc=11, >> argv=0x000000016b0a7450) at fo_acoustic_streaming_solver.cpp:400:22 >> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >> (lldb) >> >> >> Task 3: >> >> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >> (lldb) process attach --pid 44693 >> Process 44693 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >> + 8 >> libsystem_kernel.dylib`: >> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >> 0x18a2d7510 <+12>: pacibsp >> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >> 0x18a2d7518 <+20>: mov x29, sp >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> Executable module set to >> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >> Architecture set to: arm64-apple-macosx-. 
>> (lldb) cont >> Process 44693 resuming >> Process 44693 stopped >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> frame #0: 0x000000010e59c68clibpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather >> + 952 >> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >> -> 0x10e59c68c <+952>: ldr w9, [x21] >> 0x10e59c690 <+956>: cmp w8, w9 >> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >> MPID_Progress_test >> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >> (lldb) bt >> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >> SIGSTOP >> * frame #0: 0x000000010e59c68clibpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather >> + 952 >> frame #1: 0x000000010e5a44bclibpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather >> + 980 >> frame #2: 0x000000010e5a3964libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma >> + 368 >> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >> frame #5: 0x00000001098c3650libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, >> N=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at mpiaij.c:827:3 >> frame #6: 0x0000000109609faclibpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, >> numRows=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >> b=0x0000000000000000) at matrix.c:5935:3 >> frame #7: 0x0000000104ef12d0fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >> u_dof_index_idx=27, p_dof_index_idx=28, >> patch_level=Pointer > @ 0x000000016b0a0ec0, >> mu_interp_type=VC_HARMONIC_INTERP) at >> AcousticStreamingPETScMatUtilities.cpp:799:36 >> frame #8: 0x0000000104f08b8cfo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >> frame #9: 0x00000001050a62dcfo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340:5 >> > frame #10: 0x0000000104d5ee5c fo_acoustic_streaming_solver_2d`main(argc=11, >> argv=0x000000016b0ab450) at fo_acoustic_streaming_solver.cpp:400:22 >> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >> (lldb) >> > >> >> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: >> > >>> >>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>> wrote: >>> >>> BTW, I think you meant using MatSetOption(mat, >>> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>> >>> >>> Yes >>> >>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ?? >>> >>> >>> Please try setting both flags. >>> >>> However, that also did not help to overcome the MPI Barrier issue. >>> >>> >>> If there is still a problem please trap all the MPI processes when >>> they hang in the debugger and send the output from using bt on all of them. >>> This way >>> we can see the different places the different MPI processes are stuck at. >>> >>> >>> >>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >>> wrote: >>> >>> I added that option but the code still gets stuck at the same call >>>> MatZeroRows with 3 processors. 
>>>> >>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >>>> wrote: >>>> >>> >>>>> >>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: >>>>> >>>>>> >>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>> { >>>>>> ....... >>>>>> for (Box::Iterator bc(bc_coef_box); bc; >>>>>> bc++) >>>>>> { >>>>>> ...... >>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>> { >>>>>> const double diag_value = a; >>>>>> ierr = MatZeroRows(mat, 1, &u_dof_index, >>>>>> diag_value, NULL, NULL); >>>>>> IBTK_CHKERRQ(ierr); >>>>>> } >>>>>> } >>>>>> } >>>>>> >>>>>> In general, this code will not work because each process calls >>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>> processes. >>>>>> >>>>>> If u_dof_index is always local to the current process, you can call >>>>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >>>>>> and >>>>>> the MatZeroRows will not synchronize across the MPI processes (since >>>>>> it does not need to and you told it that). >>>>>> >>>>> >>>>> Yes, u_dof_index is going to be local and I put a check on it a few >>>>> lines before calling MatZeroRows. >>>>> >>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>> >>>>> >>>>>> If the u_dof_index will not always be local, then you need, on each >>>>>> process, to list all the u_dof_index for each process in an array and then >>>>>> call MatZeroRows() >>>>>> once after the loop so it can exchange the needed information with >>>>>> the other MPI processes to get the row indices to the right place. >>>>>> >>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>>> wrote: >>>>>> >>>>>> >>>>>> Hi Folks, >>>>>> >>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This >>>>>> works fine for the serial run and the solver produces correct results >>>>>> (verified through analytical solution). However, when I run the case in >>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>> been called, and should be called by all processors. Here is that bit of >>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>> >>>>>> >>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>> >>>>>> >>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>> (-start_in_debugger). Below is the call stack from the processor that gets >>>>>> stuck >>>>>> >>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>> (lldb) process attach --pid 4307 >>>>>> Process 4307 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >>>>>> + 8 >>>>>> libsystem_kernel.dylib`: >>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> Executable module set to >>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>> Architecture set to: arm64-apple-macosx-. 
>>>>>> (lldb) cont >>>>>> Process 4307 resuming >>>>>> Process 4307 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> >>>>>> frame #0: 0x0000000109d281b8libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >>>>>> + 400 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>> MPID_Progress_test >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> (lldb) bt >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> * frame #0: 0x0000000109d281b8libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >>>>>> + 400 >>>>>> frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier >>>>>> + 224 >>>>>> frame #2: 0x0000000109d27b60libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha >>>>>> + 44 >>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>> frame #5: 0x00000001045ea638libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c: >>>>>> 235 >>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Wed Nov 29 12:59:00 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 29 Nov 2023 10:59:00 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> <6A3DA004-D944-43D2-A66C-3075F902883E@petsc.dev> Message-ID: Actually it is 3.17.5 On Wed, Nov 29, 2023 at 10:56?AM Amneet Bhalla wrote: > I am using 3.17 > > On Wed, Nov 29, 2023 at 10:50?AM Barry Smith wrote: > >> >> What PETSc version are you using? >> >> >> On Nov 29, 2023, at 1:02?PM, Matthew Knepley wrote: >> >> On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla >> wrote: >> >>> Ok, I added both, but it still hangs. Here, is bt from all three tasks: >>> >> >> It looks like two processes are calling AllReduce, but one is not. Are >> all procs not calling MatZeroRows? >> >> Thanks, >> >> Matt >> >> >>> Task 1: >>> >>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >>> (lldb) process attach --pid 44691 >>> Process 44691 stopped >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >>> + 8 >>> libsystem_kernel.dylib`: >>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>> 0x18a2d7510 <+12>: pacibsp >>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>> 0x18a2d7518 <+20>: mov x29, sp >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> Executable module set to >>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>> Architecture set to: arm64-apple-macosx-. 
>>> (lldb) cont >>> Process 44691 resuming >>> Process 44691 stopped >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> frame #0: 0x000000010ba40b60libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release >>> + 752 >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >>> 0x10ba40b64 <+756>: ldr w9, [x22] >>> 0x10ba40b68 <+760>: cmp w8, w9 >>> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> (lldb) bt >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> * frame #0: 0x000000010ba40b60libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release >>> + 752 >>> frame #1: 0x000000010ba48528libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather >>> + 1088 >>> frame #2: 0x000000010ba47964libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma >>> + 368 >>> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >>> frame #5: 0x0000000106d67650libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, >>> N=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at mpiaij.c:827:3 >>> frame #6: 0x0000000106aadfaclibpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, >>> numRows=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at matrix.c:5935:3 >>> frame #7: 0x00000001023952d0fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >>> u_dof_index_idx=27, p_dof_index_idx=28, >>> patch_level=Pointer > @ 0x000000016dbfcec0, >>> mu_interp_type=VC_HARMONIC_INTERP) at >>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>> frame #8: 0x00000001023acb8cfo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >>> x=0x000000016dc05778, (null)=0x000000016dc05680) at >>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>> frame #9: 0x000000010254a2dcfo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >>> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340: >>> 5 >>> frame #10: 0x0000000102202e5c fo_acoustic_streaming_solver_2d`main(argc=11, >>> argv=0x000000016dc07450) at fo_acoustic_streaming_solver.cpp:400:22 >>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>> (lldb) >>> >>> >>> Task 2: >>> >>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >>> (lldb) process attach --pid 44692 >>> Process 44692 stopped >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >>> + 8 >>> libsystem_kernel.dylib`: >>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>> 0x18a2d7510 <+12>: pacibsp >>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>> 0x18a2d7518 <+20>: mov x29, sp >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> Executable module set to >>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>> Architecture set to: arm64-apple-macosx-. 
>>> (lldb) cont >>> Process 44692 resuming >>> Process 44692 stopped >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> frame #0: 0x000000010e5a022clibpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >>> + 516 >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >>> 0x10e5a0230 <+520>: cmp x9, x10 >>> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >>> 0x10e5a0238 <+528>: add w8, w8, #0x1 >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> (lldb) bt >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> * frame #0: 0x000000010e5a022clibpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >>> + 516 >>> frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier >>> + 224 >>> frame #2: 0x000000010e59fb60libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha >>> + 44 >>> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >>> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >>> frame #5: 0x0000000108e62638libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >>> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235 >>> :5 >>> frame #6: 0x0000000108e6a910libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >>> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >>> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >>> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 >>> frame #7: 0x000000010aa28010 libpetsc.3.17.dylib`KSPCreate(comm=1140850688, >>> inksp=0x000000016b0a4160) at itcreate.c:679:3 >>> frame #8: 0x00000001050aa2f4fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >>> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344: >>> 12 >>> frame #9: 0x0000000104d62e5c fo_acoustic_streaming_solver_2d`main(argc=11, >>> argv=0x000000016b0a7450) at fo_acoustic_streaming_solver.cpp:400:22 >>> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >>> (lldb) >>> >>> >>> Task 3: >>> >>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >>> (lldb) process attach --pid 44693 >>> Process 44693 stopped >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >>> + 8 >>> libsystem_kernel.dylib`: >>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>> 0x18a2d7510 <+12>: pacibsp >>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>> 0x18a2d7518 <+20>: mov x29, sp >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> Executable module set to >>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>> Architecture set to: arm64-apple-macosx-. >>> (lldb) cont >>> Process 44693 resuming >>> Process 44693 stopped >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> frame #0: 0x000000010e59c68clibpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather >>> + 952 >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >>> -> 0x10e59c68c <+952>: ldr w9, [x21] >>> 0x10e59c690 <+956>: cmp w8, w9 >>> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >>> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >>> MPID_Progress_test >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>> (lldb) bt >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> * frame #0: 0x000000010e59c68clibpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather >>> + 952 >>> frame #1: 0x000000010e5a44bclibpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather >>> + 980 >>> frame #2: 0x000000010e5a3964libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma >>> + 368 >>> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >>> frame #5: 0x00000001098c3650libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, >>> N=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at mpiaij.c:827:3 >>> frame #6: 0x0000000109609faclibpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, >>> numRows=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at matrix.c:5935:3 >>> frame #7: 0x0000000104ef12d0fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >>> u_dof_index_idx=27, p_dof_index_idx=28, >>> patch_level=Pointer > @ 0x000000016b0a0ec0, >>> mu_interp_type=VC_HARMONIC_INTERP) at >>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>> frame #8: 0x0000000104f08b8cfo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >>> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>> frame #9: 0x00000001050a62dcfo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >>> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340: >>> 5 >>> >> frame #10: 0x0000000104d5ee5c fo_acoustic_streaming_solver_2d`main(argc=11, >>> argv=0x000000016b0ab450) at fo_acoustic_streaming_solver.cpp:400:22 >>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>> (lldb) >>> >> >>> >>> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: >>> >> >>>> >>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>>> wrote: >>>> >>>> BTW, I think you meant using MatSetOption(mat, >>>> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>>> >>>> >>>> Yes >>>> >>>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ?? >>>> >>>> >>>> Please try setting both flags. >>>> >>>> However, that also did not help to overcome the MPI Barrier issue. >>>> >>>> >>>> If there is still a problem please trap all the MPI processes when >>>> they hang in the debugger and send the output from using bt on all of them. >>>> This way >>>> we can see the different places the different MPI processes are stuck >>>> at. >>>> >>>> >>>> >>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >>>> wrote: >>>> >>>> I added that option but the code still gets stuck at the same call >>>>> MatZeroRows with 3 processors. >>>>> >>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >>>>> wrote: >>>>> >>>> >>>>>> >>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>> { >>>>>>> ....... >>>>>>> for (Box::Iterator bc(bc_coef_box); bc; >>>>>>> bc++) >>>>>>> { >>>>>>> ...... 
>>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>> { >>>>>>> const double diag_value = a; >>>>>>> ierr = MatZeroRows(mat, 1, >>>>>>> &u_dof_index, diag_value, NULL, NULL); >>>>>>> IBTK_CHKERRQ(ierr); >>>>>>> } >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> In general, this code will not work because each process calls >>>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>>> processes. >>>>>>> >>>>>>> If u_dof_index is always local to the current process, you can call >>>>>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >>>>>>> and >>>>>>> the MatZeroRows will not synchronize across the MPI processes (since >>>>>>> it does not need to and you told it that). >>>>>>> >>>>>> >>>>>> Yes, u_dof_index is going to be local and I put a check on it a few >>>>>> lines before calling MatZeroRows. >>>>>> >>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>> >>>>>> >>>>>>> If the u_dof_index will not always be local, then you need, on each >>>>>>> process, to list all the u_dof_index for each process in an array and then >>>>>>> call MatZeroRows() >>>>>>> once after the loop so it can exchange the needed information with >>>>>>> the other MPI processes to get the row indices to the right place. >>>>>>> >>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Hi Folks, >>>>>>> >>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This >>>>>>> works fine for the serial run and the solver produces correct results >>>>>>> (verified through analytical solution). However, when I run the case in >>>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>>> been called, and should be called by all processors. Here is that bit of >>>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>>> >>>>>>> >>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>> >>>>>>> >>>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>>> (-start_in_debugger). Below is the call stack from the processor that gets >>>>>>> stuck >>>>>>> >>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>> (lldb) process attach --pid 4307 >>>>>>> Process 4307 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal >>>>>>> + 8 >>>>>>> libsystem_kernel.dylib`: >>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> Executable module set to >>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>> Architecture set to: arm64-apple-macosx-. 
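A minimal sketch of what setting both options discussed in this thread could look like, placed after the matrix has been assembled and above the boundary loop. Only mat is taken from the code above; MAT_NO_OFF_PROC_ZERO_ROWS and MAT_NO_OFF_PROC_ENTRIES are standard PETSc MatOption values, and the fragment is an illustration rather than the actual IBAMR change:

    PetscErrorCode ierr;
    /* Promise PETSc that no rank will zero or insert into rows it does not own. */
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE);
    IBTK_CHKERRQ(ierr);
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);
    IBTK_CHKERRQ(ierr);

As the rest of the thread establishes, these options reduce the synchronization inside MatZeroRows() but do not remove it: the call is still collective, so every rank must still reach a matching MatZeroRows() call.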
>>>>>>> (lldb) cont >>>>>>> Process 4307 resuming >>>>>>> Process 4307 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> >>>>>>> frame #0: 0x0000000109d281b8libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >>>>>>> + 400 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>>> MPID_Progress_test >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> (lldb) bt >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> * frame #0: 0x0000000109d281b8libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather >>>>>>> + 400 >>>>>>> frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier >>>>>>> + 224 >>>>>>> frame #2: 0x0000000109d27b60libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha >>>>>>> + 44 >>>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>>> frame #5: 0x00000001045ea638libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c >>>>>>> :235 >>>>>>> >>>>>>> -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 29 13:11:56 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Nov 2023 14:11:56 -0500 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: On Wed, Nov 29, 2023 at 1:55?PM Amneet Bhalla wrote: > So the code logic is after the matrix is assembled, I iterate over all > distributed patches in the domain to see which of the patch is abutting a > Dirichlet boundary. Depending upon which patch abuts a physical and > Dirichlet boundary, a processor will call this routine. However, that same > processor is ?owning? that DoF, which would be on its diagonal. > > I think Barry already mentioned this is not going to work unless I use the > flag to not communicate explicitly. However, that flag is not working as it > should over here for some reason. > Oh, I do not think that is right. Barry, when I look at the code, MPIU_Allreduce is always going to be called to fix up the nonzero_state. Am I wrong about that? Thanks, Matt > I can always change the matrix coefficients for Dirichlet rows during > MatSetValues. However, that would lengthen my code and I was trying to > avoid that. > > On Wed, Nov 29, 2023 at 10:02?AM Matthew Knepley > wrote: > >> On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla >> wrote: >> >>> Ok, I added both, but it still hangs. Here, is bt from all three tasks: >>> >> >> It looks like two processes are calling AllReduce, but one is not. Are >> all procs not calling MatZeroRows? >> >> Thanks, >> >> Matt >> >> >>> Task 1: >>> >>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >>> >>> (lldb) process attach --pid 44691 >>> >>> Process 44691 stopped >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> frame #0: 0x000000018a2d750c >>> libsystem_kernel.dylib`__semwait_signal + 8 >>> >>> libsystem_kernel.dylib`: >>> >>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>> >>> 0x18a2d7510 <+12>: pacibsp >>> >>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! 
>>> >>> 0x18a2d7518 <+20>: mov x29, sp >>> >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> >>> Executable module set to >>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>> >>> Architecture set to: arm64-apple-macosx-. >>> >>> (lldb) cont >>> >>> Process 44691 resuming >>> >>> Process 44691 stopped >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> frame #0: 0x000000010ba40b60 >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>> >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >>> >>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >>> >>> 0x10ba40b64 <+756>: ldr w9, [x22] >>> >>> 0x10ba40b68 <+760>: cmp w8, w9 >>> >>> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >>> >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> >>> (lldb) bt >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> * frame #0: 0x000000010ba40b60 >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>> >>> frame #1: 0x000000010ba48528 >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >>> >>> frame #2: 0x000000010ba47964 >>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>> >>> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>> >>> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >>> >>> frame #5: 0x0000000106d67650 >>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, >>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at mpiaij.c:827:3 >>> >>> frame #6: 0x0000000106aadfac >>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, >>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at matrix.c:5935:3 >>> >>> frame #7: 0x00000001023952d0 >>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >>> u_dof_index_idx=27, p_dof_index_idx=28, >>> patch_level=Pointer > @ 0x000000016dbfcec0, >>> mu_interp_type=VC_HARMONIC_INTERP) at >>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>> >>> frame #8: 0x00000001023acb8c >>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >>> x=0x000000016dc05778, (null)=0x000000016dc05680) at >>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>> >>> frame #9: 0x000000010254a2dc >>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >>> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340: >>> 5 >>> >>> frame #10: 0x0000000102202e5c >>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at >>> fo_acoustic_streaming_solver.cpp:400:22 >>> >>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>> >>> (lldb) >>> >>> >>> Task 2: >>> >>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >>> >>> (lldb) process attach --pid 44692 >>> >>> Process 44692 stopped >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> frame #0: 0x000000018a2d750c >>> libsystem_kernel.dylib`__semwait_signal + 8 >>> >>> libsystem_kernel.dylib`: >>> >>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>> >>> 0x18a2d7510 <+12>: pacibsp >>> >>> 0x18a2d7514 
<+16>: stp x29, x30, [sp, #-0x10]! >>> >>> 0x18a2d7518 <+20>: mov x29, sp >>> >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> >>> Executable module set to >>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>> >>> Architecture set to: arm64-apple-macosx-. >>> >>> (lldb) cont >>> >>> Process 44692 resuming >>> >>> Process 44692 stopped >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> frame #0: 0x000000010e5a022c >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>> >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>> >>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >>> >>> 0x10e5a0230 <+520>: cmp x9, x10 >>> >>> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >>> >>> 0x10e5a0238 <+528>: add w8, w8, #0x1 >>> >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> >>> (lldb) bt >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> * frame #0: 0x000000010e5a022c >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>> >>> frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier >>> + 224 >>> >>> frame #2: 0x000000010e59fb60 >>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>> >>> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >>> >>> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >>> >>> frame #5: 0x0000000108e62638 >>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >>> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235 >>> :5 >>> >>> frame #6: 0x0000000108e6a910 >>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >>> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >>> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >>> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 >>> >>> frame #7: 0x000000010aa28010 >>> libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at >>> itcreate.c:679:3 >>> >>> frame #8: 0x00000001050aa2f4 >>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >>> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344: >>> 12 >>> >>> frame #9: 0x0000000104d62e5c >>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at >>> fo_acoustic_streaming_solver.cpp:400:22 >>> >>> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >>> >>> (lldb) >>> >>> >>> Task 3: >>> >>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >>> >>> (lldb) process attach --pid 44693 >>> >>> Process 44693 stopped >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> frame #0: 0x000000018a2d750c >>> libsystem_kernel.dylib`__semwait_signal + 8 >>> >>> libsystem_kernel.dylib`: >>> >>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>> >>> 0x18a2d7510 <+12>: pacibsp >>> >>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>> >>> 0x18a2d7518 <+20>: mov x29, sp >>> >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> >>> Executable module set to >>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>> >>> Architecture set to: arm64-apple-macosx-. 
>>> >>> (lldb) cont >>> >>> Process 44693 resuming >>> >>> Process 44693 stopped >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> frame #0: 0x000000010e59c68c >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>> >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >>> >>> -> 0x10e59c68c <+952>: ldr w9, [x21] >>> >>> 0x10e59c690 <+956>: cmp w8, w9 >>> >>> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >>> >>> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >>> MPID_Progress_test >>> >>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>> >>> (lldb) bt >>> >>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>> SIGSTOP >>> >>> * frame #0: 0x000000010e59c68c >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>> >>> frame #1: 0x000000010e5a44bc >>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >>> >>> frame #2: 0x000000010e5a3964 >>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>> >>> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>> >>> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >>> >>> frame #5: 0x00000001098c3650 >>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, >>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at mpiaij.c:827:3 >>> >>> frame #6: 0x0000000109609fac >>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, >>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>> b=0x0000000000000000) at matrix.c:5935:3 >>> >>> frame #7: 0x0000000104ef12d0 >>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >>> u_dof_index_idx=27, p_dof_index_idx=28, >>> patch_level=Pointer > @ 0x000000016b0a0ec0, >>> mu_interp_type=VC_HARMONIC_INTERP) at >>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>> >>> frame #8: 0x0000000104f08b8c >>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >>> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>> >>> frame #9: 0x00000001050a62dc >>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >>> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340: >>> 5 >>> >>> frame #10: 0x0000000104d5ee5c >>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at >>> fo_acoustic_streaming_solver.cpp:400:22 >>> >>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>> >>> (lldb) >>> >>> >>> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: >>> >>>> >>>> >>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>>> wrote: >>>> >>>> BTW, I think you meant using MatSetOption(mat, >>>> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>>> >>>> >>>> Yes >>>> >>>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ?? >>>> >>>> >>>> Please try setting both flags. >>>> >>>> However, that also did not help to overcome the MPI Barrier issue. >>>> >>>> >>>> If there is still a problem please trap all the MPI processes when >>>> they hang in the debugger and send the output from using bt on all of them. 
>>>> This way >>>> we can see the different places the different MPI processes are stuck >>>> at. >>>> >>>> >>>> >>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >>>> wrote: >>>> >>>>> I added that option but the code still gets stuck at the same call >>>>> MatZeroRows with 3 processors. >>>>> >>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>> { >>>>>>> ....... >>>>>>> for (Box::Iterator bc(bc_coef_box); bc; >>>>>>> bc++) >>>>>>> { >>>>>>> ...... >>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>> { >>>>>>> const double diag_value = a; >>>>>>> ierr = MatZeroRows(mat, 1, &u_dof_index, >>>>>>> diag_value, NULL, NULL); >>>>>>> IBTK_CHKERRQ(ierr); >>>>>>> } >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> In general, this code will not work because each process calls >>>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>>> processes. >>>>>>> >>>>>>> If u_dof_index is always local to the current process, you can call >>>>>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >>>>>>> and >>>>>>> the MatZeroRows will not synchronize across the MPI processes (since >>>>>>> it does not need to and you told it that). >>>>>>> >>>>>> >>>>>> Yes, u_dof_index is going to be local and I put a check on it a few >>>>>> lines before calling MatZeroRows. >>>>>> >>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>> >>>>>> >>>>>>> If the u_dof_index will not always be local, then you need, on each >>>>>>> process, to list all the u_dof_index for each process in an array and then >>>>>>> call MatZeroRows() >>>>>>> once after the loop so it can exchange the needed information with >>>>>>> the other MPI processes to get the row indices to the right place. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Hi Folks, >>>>>>> >>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This >>>>>>> works fine for the serial run and the solver produces correct results >>>>>>> (verified through analytical solution). However, when I run the case in >>>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>>> been called, and should be called by all processors. Here is that bit of >>>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>>> >>>>>>> >>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>> >>>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>>> (-start_in_debugger). Below is the call stack from the processor that gets >>>>>>> stuck >>>>>>> >>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>> (lldb) process attach --pid 4307 >>>>>>> Process 4307 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000018a2d750c >>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>> libsystem_kernel.dylib`: >>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>>>>> Executable module set to >>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>> Architecture set to: arm64-apple-macosx-. >>>>>>> (lldb) cont >>>>>>> Process 4307 resuming >>>>>>> Process 4307 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x0000000109d281b8 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>>> MPID_Progress_test >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> (lldb) bt >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> * frame #0: 0x0000000109d281b8 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>> frame #1: 0x0000000109d27d14 >>>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>> frame #2: 0x0000000109d27b60 >>>>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>>> frame #5: 0x00000001045ea638 >>>>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c >>>>>>> :235:5 >>>>>>> frame #6: 0x00000001045f2910 >>>>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >>>>>>> classid=1211227, class_name="PetscSF", descr="Star Forest", >>>>>>> mansec="PetscSF", comm=-2080374782, >>>>>>> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >>>>>>> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62: >>>>>>> 3 >>>>>>> frame #7: 0x00000001049cf820 >>>>>>> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >>>>>>> at sf.c:62:3 >>>>>>> frame #8: 0x0000000104cd3024 >>>>>>> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >>>>>>> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >>>>>>> at zerorows.c:36:5 >>>>>>> frame #9: 0x000000010504ea50 >>>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >>>>>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>>>>> b=0x0000000000000000) at mpiaij.c:768:3 >>>>>>> >>>>>>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 29 13:31:51 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 29 Nov 2023 14:31:51 -0500 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> Message-ID: <5062E7EF-A0CE-465E-94F6-F43C761A95C1@petsc.dev> > On Nov 29, 2023, at 2:11?PM, Matthew Knepley wrote: > > On Wed, Nov 29, 2023 at 1:55?PM Amneet Bhalla > wrote: >> So the code logic is after the matrix is assembled, I iterate over all distributed patches in the domain to see which of the patch is abutting a Dirichlet boundary. Depending upon which patch abuts a physical and Dirichlet boundary, a processor will call this routine. 
However, that same processor is ?owning? that DoF, which would be on its diagonal. >> >> I think Barry already mentioned this is not going to work unless I use the flag to not communicate explicitly. However, that flag is not working as it should over here for some reason. > > Oh, I do not think that is right. > > Barry, when I look at the code, MPIU_Allreduce is always going to be called to fix up the nonzero_state. Am I wrong about that? No, you are correct. I missed that in my earlier look. Setting those flags reduce the number of MPI reductions but does not eliminate them completely. MatZeroRows is collective (as its manual page indicates) so you have to do the second thing I suggested. Inside your for loop construct an array containing all the local rows being zeroed and then make a single call by all MPI processes to MatZeroRows(). Note this is a small change of just a handful of lines of code. Barry > > Thanks, > > Matt > >> I can always change the matrix coefficients for Dirichlet rows during MatSetValues. However, that would lengthen my code and I was trying to avoid that. >> >> On Wed, Nov 29, 2023 at 10:02?AM Matthew Knepley > wrote: >>> On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla > wrote: >>>> Ok, I added both, but it still hangs. Here, is bt from all three tasks: >>> >>> It looks like two processes are calling AllReduce, but one is not. Are all procs not calling MatZeroRows? >>> >>> Thanks, >>> >>> Matt >>> >>>> Task 1: >>>> >>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >>>> (lldb) process attach --pid 44691 >>>> Process 44691 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >>>> libsystem_kernel.dylib`: >>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>> 0x18a2d7510 <+12>: pacibsp >>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>> 0x18a2d7518 <+20>: mov x29, sp >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>> Architecture set to: arm64-apple-macosx-. >>>> (lldb) cont >>>> Process 44691 resuming >>>> Process 44691 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> frame #0: 0x000000010ba40b60 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >>>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >>>> 0x10ba40b64 <+756>: ldr w9, [x22] >>>> 0x10ba40b68 <+760>: cmp w8, w9 >>>> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>> (lldb) bt >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> * frame #0: 0x000000010ba40b60 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>> frame #1: 0x000000010ba48528 libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >>>> frame #2: 0x000000010ba47964 libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>>> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>> frame #5: 0x0000000106d67650 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3 >>>> frame #6: 0x0000000106aadfac libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 >>>> frame #7: 0x00000001023952d0 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016dbfcec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:799:36 >>>> frame #8: 0x00000001023acb8c fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, x=0x000000016dc05778, (null)=0x000000016dc05680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>> frame #9: 0x000000010254a2dc fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340:5 >>>> frame #10: 0x0000000102202e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at fo_acoustic_streaming_solver.cpp:400:22 >>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>> (lldb) >>>> >>>> >>>> Task 2: >>>> >>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >>>> (lldb) process attach --pid 44692 >>>> Process 44692 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >>>> libsystem_kernel.dylib`: >>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>> 0x18a2d7510 <+12>: pacibsp >>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>> 0x18a2d7518 <+20>: mov x29, sp >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>> Architecture set to: arm64-apple-macosx-. >>>> (lldb) cont >>>> Process 44692 resuming >>>> Process 44692 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> frame #0: 0x000000010e5a022c libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >>>> 0x10e5a0230 <+520>: cmp x9, x10 >>>> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >>>> 0x10e5a0238 <+528>: add w8, w8, #0x1 >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>> (lldb) bt >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> * frame #0: 0x000000010e5a022c libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>> frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>> frame #2: 0x000000010e59fb60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >>>> frame #5: 0x0000000108e62638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235:5 >>>> frame #6: 0x0000000108e6a910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 >>>> frame #7: 0x000000010aa28010 libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at itcreate.c:679:3 >>>> frame #8: 0x00000001050aa2f4 fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344:12 >>>> frame #9: 0x0000000104d62e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at fo_acoustic_streaming_solver.cpp:400:22 >>>> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >>>> (lldb) >>>> >>>> >>>> Task 3: >>>> >>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >>>> (lldb) process attach --pid 44693 >>>> Process 44693 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >>>> libsystem_kernel.dylib`: >>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>> 0x18a2d7510 <+12>: pacibsp >>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>> 0x18a2d7518 <+20>: mov x29, sp >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>> Architecture set to: arm64-apple-macosx-. >>>> (lldb) cont >>>> Process 44693 resuming >>>> Process 44693 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> frame #0: 0x000000010e59c68c libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >>>> -> 0x10e59c68c <+952>: ldr w9, [x21] >>>> 0x10e59c690 <+956>: cmp w8, w9 >>>> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >>>> 0x10e59c698 <+964>: bl 0x10e59ce64 ; MPID_Progress_test >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>> (lldb) bt >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>> * frame #0: 0x000000010e59c68c libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>> frame #1: 0x000000010e5a44bc libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >>>> frame #2: 0x000000010e5a3964 libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>>> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>> frame #5: 0x00000001098c3650 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3 >>>> frame #6: 0x0000000109609fac libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3 >>>> frame #7: 0x0000000104ef12d0 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer > @ 0x000000016b0a0ec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:799:36 >>>> frame #8: 0x0000000104f08b8c fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, x=0x000000016b0a9778, (null)=0x000000016b0a9680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>> frame #9: 0x00000001050a62dc fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340:5 >>>> frame #10: 0x0000000104d5ee5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at fo_acoustic_streaming_solver.cpp:400:22 >>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>> (lldb) >>>> >>>> >>>> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith > wrote: >>>>> >>>>> >>>>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla > wrote: >>>>>> >>>>>> BTW, I think you meant using MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE) >>>>> >>>>> Yes >>>>> >>>>>> instead ofMatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE) ?? >>>>> >>>>> Please try setting both flags. >>>>> >>>>>> However, that also did not help to overcome the MPI Barrier issue. >>>>> >>>>> If there is still a problem please trap all the MPI processes when they hang in the debugger and send the output from using bt on all of them. This way >>>>> we can see the different places the different MPI processes are stuck at. >>>>> >>>>> >>>>>> >>>>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla > wrote: >>>>>>> I added that option but the code still gets stuck at the same call MatZeroRows with 3 processors. >>>>>>> >>>>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla > wrote: >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith > wrote: >>>>>>>>> >>>>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>>>> { >>>>>>>>> ....... >>>>>>>>> for (Box::Iterator bc(bc_coef_box); bc; bc++) >>>>>>>>> { >>>>>>>>> ...... 
>>>>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>>>> { >>>>>>>>> const double diag_value = a; >>>>>>>>> ierr = MatZeroRows(mat, 1, &u_dof_index, diag_value, NULL, NULL); >>>>>>>>> IBTK_CHKERRQ(ierr); >>>>>>>>> } >>>>>>>>> } >>>>>>>>> } >>>>>>>>> >>>>>>>>> In general, this code will not work because each process calls MatZeroRows a different number of times, so it cannot match up with all the processes. >>>>>>>>> >>>>>>>>> If u_dof_index is always local to the current process, you can call MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop and >>>>>>>>> the MatZeroRows will not synchronize across the MPI processes (since it does not need to and you told it that). >>>>>>>> >>>>>>>> Yes, u_dof_index is going to be local and I put a check on it a few lines before calling MatZeroRows. >>>>>>>> >>>>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>>>> >>>>>>>>> >>>>>>>>> If the u_dof_index will not always be local, then you need, on each process, to list all the u_dof_index for each process in an array and then call MatZeroRows() >>>>>>>>> once after the loop so it can exchange the needed information with the other MPI processes to get the row indices to the right place. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla > wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Folks, >>>>>>>>>> >>>>>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This works fine for the serial run and the solver produces correct results (verified through analytical solution). However, when I run the case in parallel, the simulation gets stuck at MatZeroRows(). My understanding is that this function needs to be called after the MatAssemblyBegin{End}() has been called, and should be called by all processors. Here is that bit of the code which calls MatZeroRows() after the matrix has been assembled >>>>>>>>>> >>>>>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>>>>> >>>>>>>>>> I ran the parallel code (on 3 processors) in the debugger (-start_in_debugger). Below is the call stack from the processor that gets stuck >>>>>>>>>> >>>>>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>>>>> (lldb) process attach --pid 4307 >>>>>>>>>> Process 4307 stopped >>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>>>>>>> frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>>>>> libsystem_kernel.dylib`: >>>>>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>>>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>>>>> Architecture set to: arm64-apple-macosx-. 
>>>>>>>>>> (lldb) cont >>>>>>>>>> Process 4307 resuming >>>>>>>>>> Process 4307 stopped >>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>>>>>>> frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; MPID_Progress_test >>>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>>>> (lldb) bt >>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP >>>>>>>>>> * frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>>> frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>>>>> frame #2: 0x0000000109d27b60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>>>>>> frame #5: 0x00000001045ea638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235:5 >>>>>>>>>> frame #6: 0x00000001045f2910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, classid=1211227, class_name="PetscSF", descr="Star Forest", mansec="PetscSF", comm=-2080374782, destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3 >>>>>>>>>> frame #7: 0x00000001049cf820 libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) at sf.c:62:3 >>>>>>>>>> frame #8: 0x0000000104cd3024 libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) at zerorows.c:36:5 >>>>>>>>>> frame #9: 0x000000010504ea50 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:768:3 > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
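To make the suggested change concrete, here is a minimal sketch of the handful of lines described above: accumulate the locally owned Dirichlet rows while looping, then issue one collective MatZeroRows() after the loop. The names dirichlet_rows, nrows and diag_value are illustrative, the usual <vector> and petscmat.h includes are assumed, and a single diagonal value is assumed to be appropriate for every boundary row; mat, u_dof_index, a, b and IBTK_CHKERRQ are as in the snippet quoted earlier in the thread:

    std::vector<PetscInt> dirichlet_rows;
    double diag_value = 1.0;
    for (int comp = 0; comp < 2; ++comp)
    {
        // ... same patch/box/bc iteration as in the original loop ...
        if (IBTK::abs_equal_eps(b, 0.0))
        {
            diag_value = a;                         // value placed on the diagonal
            dirichlet_rows.push_back(u_dof_index);  // defer the zeroing
        }
    }

    // One call made by every rank, including ranks whose list is empty.
    const PetscInt nrows = static_cast<PetscInt>(dirichlet_rows.size());
    int ierr = MatZeroRows(mat, nrows,
                           nrows ? dirichlet_rows.data() : NULL,
                           diag_value, NULL, NULL);
    IBTK_CHKERRQ(ierr);

Because MatZeroRows() is collective, the empty call on ranks that touch no Dirichlet boundary is precisely what keeps the reductions matched across processes.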
URL: From Di.Miao at synopsys.com Wed Nov 29 13:37:11 2023 From: Di.Miao at synopsys.com (Di Miao) Date: Wed, 29 Nov 2023 19:37:11 +0000 Subject: [petsc-users] petsc build could not pass make check Message-ID: Hi, I tried to compile PETSc with the following configuration: ./configure --with-debugging=0 COPTFLAGS='-O3' CXXOPTFLAGS='-O3' FOPTFLAGS='-O3' --with-clean=1 --with-make-exec=/SCRATCH/dimiao/test_space/installed/make/bin/make --with-cmake-exec=/SCRATCH/dimiao/test_space/cmake-3.27.9-linux-x86_64/bin/cmake --prefix=/SCRATCH/dimiao/test_space/installed/petsc_opt_mpi --with-mpi-dir=/SCRATCH/dimiao/test_space/installed/mpich PETSC_ARCH=petsc_opt_mpi --with-blaslapack-dir=/SCRATCH/dimiao/oneapi/mkl/latest --with-mkl_pardiso-dir=/SCRATCH/dimiao/oneapi/mkl/latest --with-x=0 I got three errors: Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process Below each error messages are nothing PETSc's performance summary. I have attached make.log, configure.log and the message from make check(make_check.log). Could you please give me some guidance on how to fix this issue? Thank you, Di -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1243534 bytes Desc: configure.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 114842 bytes Desc: make.log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make_check.log Type: application/octet-stream Size: 41589 bytes Desc: make_check.log URL: From balay at mcs.anl.gov Wed Nov 29 13:41:07 2023 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 29 Nov 2023 13:41:07 -0600 (CST) Subject: [petsc-users] petsc build could not pass make check In-Reply-To: References: Message-ID: Do you have a ~/.petscrc file - with -log_view enabled? Satish On Wed, 29 Nov 2023, Di Miao via petsc-users wrote: > Hi, > > I tried to compile PETSc with the following configuration: > > ./configure --with-debugging=0 COPTFLAGS='-O3' CXXOPTFLAGS='-O3' FOPTFLAGS='-O3' --with-clean=1 --with-make-exec=/SCRATCH/dimiao/test_space/installed/make/bin/make --with-cmake-exec=/SCRATCH/dimiao/test_space/cmake-3.27.9-linux-x86_64/bin/cmake --prefix=/SCRATCH/dimiao/test_space/installed/petsc_opt_mpi --with-mpi-dir=/SCRATCH/dimiao/test_space/installed/mpich PETSC_ARCH=petsc_opt_mpi --with-blaslapack-dir=/SCRATCH/dimiao/oneapi/mkl/latest --with-mkl_pardiso-dir=/SCRATCH/dimiao/oneapi/mkl/latest --with-x=0 > > I got three errors: > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process > > Below each error messages are nothing PETSc's performance summary. > > I have attached make.log, configure.log and the message from make check(make_check.log). Could you please give me some guidance on how to fix this issue? 
> > Thank you, > Di > > > From bsmith at petsc.dev Wed Nov 29 13:43:27 2023 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 29 Nov 2023 14:43:27 -0500 Subject: [petsc-users] petsc build could not pass make check In-Reply-To: References: Message-ID: <161F4385-7092-451D-AD0E-A4B67564A958@petsc.dev> It appears you possibly have the environmental variable PETSC_OPTIONS set to -log_view or have a .petscrc file containing -log_view that is triggering the printing of the logging information. The logging information confuses the error checker in make check to make it think there may be an error in the output when there is not. The tests ran fine. Barry > On Nov 29, 2023, at 2:37?PM, Di Miao via petsc-users wrote: > > Hi, > > I tried to compile PETSc with the following configuration: > > ./configure --with-debugging=0 COPTFLAGS='-O3' CXXOPTFLAGS='-O3' FOPTFLAGS='-O3' --with-clean=1 --with-make-exec=/SCRATCH/dimiao/test_space/installed/make/bin/make --with-cmake-exec=/SCRATCH/dimiao/test_space/cmake-3.27.9-linux-x86_64/bin/cmake --prefix=/SCRATCH/dimiao/test_space/installed/petsc_opt_mpi --with-mpi-dir=/SCRATCH/dimiao/test_space/installed/mpich PETSC_ARCH=petsc_opt_mpi --with-blaslapack-dir=/SCRATCH/dimiao/oneapi/mkl/latest --with-mkl_pardiso-dir=/SCRATCH/dimiao/oneapi/mkl/latest --with-x=0 > > I got three errors: > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process > > Below each error messages are nothing PETSc?s performance summary. > > I have attached make.log, configure.log and the message from make check(make_check.log). Could you please give me some guidance on how to fix this issue? > > Thank you, > Di > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Di.Miao at synopsys.com Wed Nov 29 14:39:43 2023 From: Di.Miao at synopsys.com (Di Miao) Date: Wed, 29 Nov 2023 20:39:43 +0000 Subject: [petsc-users] petsc build could not pass make check In-Reply-To: References: Message-ID: Yes, it is caused by the .petscrc. Thank you for your help! Di -----Original Message----- From: Satish Balay Sent: Wednesday, November 29, 2023 11:41 AM To: Di Miao Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc build could not pass make check Do you have a ~/.petscrc file - with -log_view enabled? Satish On Wed, 29 Nov 2023, Di Miao via petsc-users wrote: > Hi, > > I tried to compile PETSc with the following configuration: > > ./configure --with-debugging=0 COPTFLAGS='-O3' CXXOPTFLAGS='-O3' > FOPTFLAGS='-O3' --with-clean=1 > --with-make-exec=/SCRATCH/dimiao/test_space/installed/make/bin/make > --with-cmake-exec=/SCRATCH/dimiao/test_space/cmake-3.27.9-linux-x86_64 > /bin/cmake --prefix=/SCRATCH/dimiao/test_space/installed/petsc_opt_mpi > --with-mpi-dir=/SCRATCH/dimiao/test_space/installed/mpich > PETSC_ARCH=petsc_opt_mpi > --with-blaslapack-dir=/SCRATCH/dimiao/oneapi/mkl/latest > --with-mkl_pardiso-dir=/SCRATCH/dimiao/oneapi/mkl/latest --with-x=0 > > I got three errors: > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI > process Possible error running C/C++ src/snes/tutorials/ex19 with 2 > MPI processes Possible error running Fortran example > src/snes/tutorials/ex5f with 1 MPI process > > Below each error messages are nothing PETSc's performance summary. > > I have attached make.log, configure.log and the message from make check(make_check.log). 
Could you please give me some guidance on how to fix this issue? > > Thank you, > Di > > > From mail2amneet at gmail.com Wed Nov 29 17:48:49 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 29 Nov 2023 15:48:49 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: <5062E7EF-A0CE-465E-94F6-F43C761A95C1@petsc.dev> References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> <5062E7EF-A0CE-465E-94F6-F43C761A95C1@petsc.dev> Message-ID: Thanks Barry! I tried that and it seems to be working. This is what I did. It would be great if you could take a look at it and let me know if this is what you had in mind. 1. Collected Dirichlet rows locally https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L731 https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L797 2. MPI_allgatherv Dirichlet rows https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L805-L810 3. Called the MatZeroRows function https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L812-L814 On Wed, Nov 29, 2023 at 11:32?AM Barry Smith wrote: > > > On Nov 29, 2023, at 2:11?PM, Matthew Knepley wrote: > > On Wed, Nov 29, 2023 at 1:55?PM Amneet Bhalla > wrote: > >> So the code logic is after the matrix is assembled, I iterate over all >> distributed patches in the domain to see which of the patch is abutting a >> Dirichlet boundary. Depending upon which patch abuts a physical and >> Dirichlet boundary, a processor will call this routine. However, that same >> processor is ?owning? that DoF, which would be on its diagonal. >> >> I think Barry already mentioned this is not going to work unless I use >> the flag to not communicate explicitly. However, that flag is not working >> as it should over here for some reason. >> > > Oh, I do not think that is right. > > Barry, when I look at the code, MPIU_Allreduce is always going to be > called to fix up the nonzero_state. Am I wrong about that? > > > No, you are correct. I missed that in my earlier look. Setting those > flags reduce the number of MPI reductions but does not eliminate them > completely. > > MatZeroRows is collective (as its manual page indicates) so you have to > do the second thing I suggested. Inside your for loop construct an array > containing all the local > rows being zeroed and then make a single call by all MPI processes to > MatZeroRows(). Note this is a small change of just a handful of lines of > code. > > Barry > > > Thanks, > > Matt > > >> I can always change the matrix coefficients for Dirichlet rows during >> MatSetValues. However, that would lengthen my code and I was trying to >> avoid that. >> >> On Wed, Nov 29, 2023 at 10:02?AM Matthew Knepley >> wrote: >> >>> On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla >>> wrote: >>> >>>> Ok, I added both, but it still hangs. Here, is bt from all three tasks: >>>> >>> >>> It looks like two processes are calling AllReduce, but one is not. Are >>> all procs not calling MatZeroRows? 
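For readers without the branch checked out, a rough sketch of what steps 1-3 above amount to. The names local_rows, nlocal, counts, displs and global_rows are illustrative, the usual mpi.h, petscmat.h and <vector> includes are assumed, and mat and diag_value stand for the assembled matrix and the Dirichlet diagonal value from the surrounding code; the authoritative version is in the linked IBAMR files. Strictly speaking the gather is only needed if a listed row could be owned by another rank (the MatZeroRowsMapLocal_Private frame in the earlier backtraces suggests PETSc can route off-process rows on its own), but gathering first also works:

    // 1. Each rank collects its Dirichlet rows while walking its patches.
    std::vector<int> local_rows;
    // ... the boundary loop pushes locally owned row indices into local_rows ...

    // 2. Share the per-rank counts, then the row indices themselves.
    int nlocal = static_cast<int>(local_rows.size());
    int nprocs;
    MPI_Comm_size(PETSC_COMM_WORLD, &nprocs);
    std::vector<int> counts(nprocs), displs(nprocs, 0);
    MPI_Allgather(&nlocal, 1, MPI_INT, counts.data(), 1, MPI_INT, PETSC_COMM_WORLD);
    for (int r = 1; r < nprocs; ++r) displs[r] = displs[r - 1] + counts[r - 1];
    std::vector<int> global_rows(displs[nprocs - 1] + counts[nprocs - 1]);
    MPI_Allgatherv(local_rows.data(), nlocal, MPI_INT,
                   global_rows.data(), counts.data(), displs.data(), MPI_INT,
                   PETSC_COMM_WORLD);

    // 3. Every rank then makes the same collective MatZeroRows() call.
    std::vector<PetscInt> rows(global_rows.begin(), global_rows.end());
    int ierr = MatZeroRows(mat, static_cast<PetscInt>(rows.size()),
                           rows.empty() ? NULL : rows.data(),
                           diag_value, NULL, NULL);
    IBTK_CHKERRQ(ierr);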
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Task 1: >>>> >>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >>>> (lldb) process attach --pid 44691 >>>> Process 44691 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x000000018a2d750c >>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>> libsystem_kernel.dylib`: >>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>> 0x18a2d7510 <+12>: pacibsp >>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>> 0x18a2d7518 <+20>: mov x29, sp >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> Executable module set to >>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>> Architecture set to: arm64-apple-macosx-. >>>> (lldb) cont >>>> Process 44691 resuming >>>> Process 44691 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x000000010ba40b60 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >>>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >>>> 0x10ba40b64 <+756>: ldr w9, [x22] >>>> 0x10ba40b68 <+760>: cmp w8, w9 >>>> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> (lldb) bt >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> * frame #0: 0x000000010ba40b60 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>> frame #1: 0x000000010ba48528 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >>>> frame #2: 0x000000010ba47964 >>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>>> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>> frame #5: 0x0000000106d67650 >>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, >>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>> frame #6: 0x0000000106aadfac >>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, >>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>> b=0x0000000000000000) at matrix.c:5935:3 >>>> frame #7: 0x00000001023952d0 >>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>> patch_level=Pointer > @ 0x000000016dbfcec0, >>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>> frame #8: 0x00000001023acb8c >>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >>>> x=0x000000016dc05778, (null)=0x000000016dc05680) at >>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>> frame #9: 0x000000010254a2dc >>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >>>> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340 >>>> :5 >>>> frame #10: 0x0000000102202e5c >>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at >>>> fo_acoustic_streaming_solver.cpp:400:22 >>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>> (lldb) >>>> >>>> >>>> 
Task 2: >>>> >>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >>>> (lldb) process attach --pid 44692 >>>> Process 44692 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x000000018a2d750c >>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>> libsystem_kernel.dylib`: >>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>> 0x18a2d7510 <+12>: pacibsp >>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>> 0x18a2d7518 <+20>: mov x29, sp >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> Executable module set to >>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>> Architecture set to: arm64-apple-macosx-. >>>> (lldb) cont >>>> Process 44692 resuming >>>> Process 44692 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x000000010e5a022c >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >>>> 0x10e5a0230 <+520>: cmp x9, x10 >>>> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >>>> 0x10e5a0238 <+528>: add w8, w8, #0x1 >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> (lldb) bt >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> * frame #0: 0x000000010e5a022c >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>> frame #1: 0x000000010e59fd14 >>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>> frame #2: 0x000000010e59fb60 >>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >>>> frame #5: 0x0000000108e62638 >>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >>>> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c: >>>> 235:5 >>>> frame #6: 0x0000000108e6a910 >>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >>>> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >>>> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >>>> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3 >>>> frame #7: 0x000000010aa28010 >>>> libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at >>>> itcreate.c:679:3 >>>> frame #8: 0x00000001050aa2f4 >>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >>>> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344 >>>> :12 >>>> frame #9: 0x0000000104d62e5c >>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at >>>> fo_acoustic_streaming_solver.cpp:400:22 >>>> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >>>> (lldb) >>>> >>>> >>>> Task 3: >>>> >>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >>>> (lldb) process attach --pid 44693 >>>> Process 44693 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x000000018a2d750c >>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>> libsystem_kernel.dylib`: >>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>> 0x18a2d7510 <+12>: pacibsp >>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>> 0x18a2d7518 <+20>: mov x29, sp >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>> Executable module set to >>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>> Architecture set to: arm64-apple-macosx-. >>>> (lldb) cont >>>> Process 44693 resuming >>>> Process 44693 stopped >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> frame #0: 0x000000010e59c68c >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >>>> -> 0x10e59c68c <+952>: ldr w9, [x21] >>>> 0x10e59c690 <+956>: cmp w8, w9 >>>> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >>>> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >>>> MPID_Progress_test >>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>> (lldb) bt >>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>> SIGSTOP >>>> * frame #0: 0x000000010e59c68c >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>> frame #1: 0x000000010e5a44bc >>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >>>> frame #2: 0x000000010e5a3964 >>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588 >>>> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>> frame #5: 0x00000001098c3650 >>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, >>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>> frame #6: 0x0000000109609fac >>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, >>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>> b=0x0000000000000000) at matrix.c:5935:3 >>>> frame #7: 0x0000000104ef12d0 >>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>> patch_level=Pointer > @ 0x000000016b0a0ec0, >>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>> frame #8: 0x0000000104f08b8c >>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >>>> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>> frame #9: 0x00000001050a62dc >>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >>>> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340 >>>> :5 >>>> frame #10: 0x0000000104d5ee5c >>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at >>>> fo_acoustic_streaming_solver.cpp:400:22 >>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>> (lldb) >>>> >>>> >>>> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: >>>> >>>>> >>>>> >>>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>>>> wrote: >>>>> >>>>> BTW, I think you meant using MatSetOption(mat, >>>>> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>>>> >>>>> >>>>> Yes >>>>> >>>>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ?? >>>>> >>>>> >>>>> Please try setting both flags. >>>>> >>>>> However, that also did not help to overcome the MPI Barrier issue. 
>>>>> >>>>> >>>>> If there is still a problem please trap all the MPI processes when >>>>> they hang in the debugger and send the output from using bt on all of them. >>>>> This way >>>>> we can see the different places the different MPI processes are stuck >>>>> at. >>>>> >>>>> >>>>> >>>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >>>>> wrote: >>>>> >>>>>> I added that option but the code still gets stuck at the same call >>>>>> MatZeroRows with 3 processors. >>>>>> >>>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>>> { >>>>>>>> ....... >>>>>>>> for (Box::Iterator bc(bc_coef_box); bc; >>>>>>>> bc++) >>>>>>>> { >>>>>>>> ...... >>>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>>> { >>>>>>>> const double diag_value = a; >>>>>>>> ierr = MatZeroRows(mat, 1, >>>>>>>> &u_dof_index, diag_value, NULL, NULL); >>>>>>>> IBTK_CHKERRQ(ierr); >>>>>>>> } >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> In general, this code will not work because each process calls >>>>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>>>> processes. >>>>>>>> >>>>>>>> If u_dof_index is always local to the current process, you can call >>>>>>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop >>>>>>>> and >>>>>>>> the MatZeroRows will not synchronize across the MPI processes >>>>>>>> (since it does not need to and you told it that). >>>>>>>> >>>>>>> >>>>>>> Yes, u_dof_index is going to be local and I put a check on it a few >>>>>>> lines before calling MatZeroRows. >>>>>>> >>>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>>> >>>>>>> >>>>>>>> If the u_dof_index will not always be local, then you need, on each >>>>>>>> process, to list all the u_dof_index for each process in an array and then >>>>>>>> call MatZeroRows() >>>>>>>> once after the loop so it can exchange the needed information with >>>>>>>> the other MPI processes to get the row indices to the right place. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hi Folks, >>>>>>>> >>>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This >>>>>>>> works fine for the serial run and the solver produces correct results >>>>>>>> (verified through analytical solution). However, when I run the case in >>>>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>>>> been called, and should be called by all processors. Here is that bit of >>>>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>>>> >>>>>>>> >>>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>>> >>>>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>>>> (-start_in_debugger). 
Below is the call stack from the processor that gets >>>>>>>> stuck >>>>>>>> >>>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>>> (lldb) process attach --pid 4307 >>>>>>>> Process 4307 stopped >>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>> SIGSTOP >>>>>>>> frame #0: 0x000000018a2d750c >>>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>>> libsystem_kernel.dylib`: >>>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>> Executable module set to >>>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>>> Architecture set to: arm64-apple-macosx-. >>>>>>>> (lldb) cont >>>>>>>> Process 4307 resuming >>>>>>>> Process 4307 stopped >>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>> SIGSTOP >>>>>>>> frame #0: 0x0000000109d281b8 >>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>>>> MPID_Progress_test >>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>> (lldb) bt >>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>> SIGSTOP >>>>>>>> * frame #0: 0x0000000109d281b8 >>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>> frame #1: 0x0000000109d27d14 >>>>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>>> frame #2: 0x0000000109d27b60 >>>>>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + >>>>>>>> 900 >>>>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>>>> frame #5: 0x00000001045ea638 >>>>>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at >>>>>>>> tagm.c:235:5 >>>>>>>> frame #6: 0x00000001045f2910 >>>>>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >>>>>>>> classid=1211227, class_name="PetscSF", descr="Star Forest", >>>>>>>> mansec="PetscSF", comm=-2080374782, >>>>>>>> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >>>>>>>> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62 >>>>>>>> :3 >>>>>>>> frame #7: 0x00000001049cf820 >>>>>>>> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >>>>>>>> at sf.c:62:3 >>>>>>>> frame #8: 0x0000000104cd3024 >>>>>>>> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >>>>>>>> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >>>>>>>> at zerorows.c:36:5 >>>>>>>> frame #9: 0x000000010504ea50 >>>>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >>>>>>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>>>>>> b=0x0000000000000000) at mpiaij.c:768:3 >>>>>>>> >>>>>>>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Wed Nov 29 18:26:44 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 29 Nov 2023 16:26:44 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> <5062E7EF-A0CE-465E-94F6-F43C761A95C1@petsc.dev> Message-ID: Ah, I also tried without step 2 (i.e., manually doing MPI_allgatherv for Dirichlet rows), and that also works. So it seems that each processor needs to send in their own Dirichlet rows, and not a union of them. Is that correct? On Wed, Nov 29, 2023 at 3:48?PM Amneet Bhalla wrote: > Thanks Barry! I tried that and it seems to be working. This is what I did. > It would be great if you could take a look at it and let me know if this is > what you had in mind. > > 1. Collected Dirichlet rows locally > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L731 > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L797 > > > 2. MPI_allgatherv Dirichlet rows > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L805-L810 > > 3. Called the MatZeroRows function > > https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L812-L814 > > > > > On Wed, Nov 29, 2023 at 11:32?AM Barry Smith wrote: > >> >> >> On Nov 29, 2023, at 2:11?PM, Matthew Knepley wrote: >> >> On Wed, Nov 29, 2023 at 1:55?PM Amneet Bhalla >> wrote: >> >>> So the code logic is after the matrix is assembled, I iterate over all >>> distributed patches in the domain to see which of the patch is abutting a >>> Dirichlet boundary. Depending upon which patch abuts a physical and >>> Dirichlet boundary, a processor will call this routine. However, that same >>> processor is ?owning? that DoF, which would be on its diagonal. >>> >>> I think Barry already mentioned this is not going to work unless I use >>> the flag to not communicate explicitly. However, that flag is not working >>> as it should over here for some reason. >>> >> >> Oh, I do not think that is right. >> >> Barry, when I look at the code, MPIU_Allreduce is always going to be >> called to fix up the nonzero_state. Am I wrong about that? >> >> >> No, you are correct. I missed that in my earlier look. Setting those >> flags reduce the number of MPI reductions but does not eliminate them >> completely. >> >> MatZeroRows is collective (as its manual page indicates) so you have to >> do the second thing I suggested. Inside your for loop construct an array >> containing all the local >> rows being zeroed and then make a single call by all MPI processes to >> MatZeroRows(). Note this is a small change of just a handful of lines of >> code. >> >> Barry >> >> >> Thanks, >> >> Matt >> >> >>> I can always change the matrix coefficients for Dirichlet rows during >>> MatSetValues. However, that would lengthen my code and I was trying to >>> avoid that. >>> >>> On Wed, Nov 29, 2023 at 10:02?AM Matthew Knepley >>> wrote: >>> >>>> On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla >>>> wrote: >>>> >>>>> Ok, I added both, but it still hangs. 
Here, is bt from all three tasks: >>>>> >>>> >>>> It looks like two processes are calling AllReduce, but one is not. Are >>>> all procs not calling MatZeroRows? >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Task 1: >>>>> >>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >>>>> (lldb) process attach --pid 44691 >>>>> Process 44691 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x000000018a2d750c >>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>> libsystem_kernel.dylib`: >>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>> 0x18a2d7510 <+12>: pacibsp >>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> Executable module set to >>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>> Architecture set to: arm64-apple-macosx-. >>>>> (lldb) cont >>>>> Process 44691 resuming >>>>> Process 44691 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x000000010ba40b60 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >>>>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >>>>> 0x10ba40b64 <+756>: ldr w9, [x22] >>>>> 0x10ba40b68 <+760>: cmp w8, w9 >>>>> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> (lldb) bt >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> * frame #0: 0x000000010ba40b60 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>>> frame #1: 0x000000010ba48528 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >>>>> frame #2: 0x000000010ba47964 >>>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>>> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + >>>>> 1588 >>>>> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>>> frame #5: 0x0000000106d67650 >>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, >>>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>>> frame #6: 0x0000000106aadfac >>>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, >>>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>>> b=0x0000000000000000) at matrix.c:5935:3 >>>>> frame #7: 0x00000001023952d0 >>>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >>>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>>> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >>>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>>> patch_level=Pointer > @ 0x000000016dbfcec0, >>>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>>> frame #8: 0x00000001023acb8c >>>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >>>>> x=0x000000016dc05778, (null)=0x000000016dc05680) at >>>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>> frame #9: 0x000000010254a2dc >>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >>>>> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp: >>>>> 340:5 
>>>>> frame #10: 0x0000000102202e5c >>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at >>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>>> (lldb) >>>>> >>>>> >>>>> Task 2: >>>>> >>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >>>>> (lldb) process attach --pid 44692 >>>>> Process 44692 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x000000018a2d750c >>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>> libsystem_kernel.dylib`: >>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>> 0x18a2d7510 <+12>: pacibsp >>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> Executable module set to >>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>> Architecture set to: arm64-apple-macosx-. >>>>> (lldb) cont >>>>> Process 44692 resuming >>>>> Process 44692 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x000000010e5a022c >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >>>>> 0x10e5a0230 <+520>: cmp x9, x10 >>>>> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >>>>> 0x10e5a0238 <+528>: add w8, w8, #0x1 >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> (lldb) bt >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> * frame #0: 0x000000010e5a022c >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>>> frame #1: 0x000000010e59fd14 >>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>> frame #2: 0x000000010e59fb60 >>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >>>>> frame #5: 0x0000000108e62638 >>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >>>>> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c: >>>>> 235:5 >>>>> frame #6: 0x0000000108e6a910 >>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >>>>> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >>>>> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >>>>> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62: >>>>> 3 >>>>> frame #7: 0x000000010aa28010 >>>>> libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at >>>>> itcreate.c:679:3 >>>>> frame #8: 0x00000001050aa2f4 >>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >>>>> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp: >>>>> 344:12 >>>>> frame #9: 0x0000000104d62e5c >>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at >>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >>>>> (lldb) >>>>> >>>>> >>>>> Task 3: >>>>> >>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >>>>> (lldb) process attach --pid 44693 >>>>> Process 44693 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x000000018a2d750c >>>>> 
libsystem_kernel.dylib`__semwait_signal + 8 >>>>> libsystem_kernel.dylib`: >>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>> 0x18a2d7510 <+12>: pacibsp >>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> Executable module set to >>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>> Architecture set to: arm64-apple-macosx-. >>>>> (lldb) cont >>>>> Process 44693 resuming >>>>> Process 44693 stopped >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> frame #0: 0x000000010e59c68c >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >>>>> -> 0x10e59c68c <+952>: ldr w9, [x21] >>>>> 0x10e59c690 <+956>: cmp w8, w9 >>>>> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >>>>> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >>>>> MPID_Progress_test >>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>> (lldb) bt >>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>> SIGSTOP >>>>> * frame #0: 0x000000010e59c68c >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>>> frame #1: 0x000000010e5a44bc >>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >>>>> frame #2: 0x000000010e5a3964 >>>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>>> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + >>>>> 1588 >>>>> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>>> frame #5: 0x00000001098c3650 >>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, >>>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>>> frame #6: 0x0000000109609fac >>>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, >>>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>>> b=0x0000000000000000) at matrix.c:5935:3 >>>>> frame #7: 0x0000000104ef12d0 >>>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >>>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>>> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >>>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>>> patch_level=Pointer > @ 0x000000016b0a0ec0, >>>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>>> frame #8: 0x0000000104f08b8c >>>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >>>>> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >>>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>> frame #9: 0x00000001050a62dc >>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >>>>> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp: >>>>> 340:5 >>>>> frame #10: 0x0000000104d5ee5c >>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at >>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>>> (lldb) >>>>> >>>>> >>>>> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: >>>>> >>>>>> >>>>>> >>>>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>>>>> wrote: >>>>>> >>>>>> BTW, I think you meant using MatSetOption(mat, >>>>>> 
*MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>>>>> >>>>>> >>>>>> Yes >>>>>> >>>>>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) >>>>>> ?? >>>>>> >>>>>> >>>>>> Please try setting both flags. >>>>>> >>>>>> However, that also did not help to overcome the MPI Barrier issue. >>>>>> >>>>>> >>>>>> If there is still a problem please trap all the MPI processes when >>>>>> they hang in the debugger and send the output from using bt on all of them. >>>>>> This way >>>>>> we can see the different places the different MPI processes are stuck >>>>>> at. >>>>>> >>>>>> >>>>>> >>>>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >>>>>> wrote: >>>>>> >>>>>>> I added that option but the code still gets stuck at the same call >>>>>>> MatZeroRows with 3 processors. >>>>>>> >>>>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>>>> { >>>>>>>>> ....... >>>>>>>>> for (Box::Iterator bc(bc_coef_box); bc; >>>>>>>>> bc++) >>>>>>>>> { >>>>>>>>> ...... >>>>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>>>> { >>>>>>>>> const double diag_value = a; >>>>>>>>> ierr = MatZeroRows(mat, 1, >>>>>>>>> &u_dof_index, diag_value, NULL, NULL); >>>>>>>>> IBTK_CHKERRQ(ierr); >>>>>>>>> } >>>>>>>>> } >>>>>>>>> } >>>>>>>>> >>>>>>>>> In general, this code will not work because each process calls >>>>>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>>>>> processes. >>>>>>>>> >>>>>>>>> If u_dof_index is always local to the current process, you can >>>>>>>>> call MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for >>>>>>>>> loop and >>>>>>>>> the MatZeroRows will not synchronize across the MPI processes >>>>>>>>> (since it does not need to and you told it that). >>>>>>>>> >>>>>>>> >>>>>>>> Yes, u_dof_index is going to be local and I put a check on it a few >>>>>>>> lines before calling MatZeroRows. >>>>>>>> >>>>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>>>> >>>>>>>> >>>>>>>>> If the u_dof_index will not always be local, then you need, on >>>>>>>>> each process, to list all the u_dof_index for each process in an array and >>>>>>>>> then call MatZeroRows() >>>>>>>>> once after the loop so it can exchange the needed information with >>>>>>>>> the other MPI processes to get the row indices to the right place. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi Folks, >>>>>>>>> >>>>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. >>>>>>>>> This works fine for the serial run and the solver produces correct results >>>>>>>>> (verified through analytical solution). However, when I run the case in >>>>>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>>>>> been called, and should be called by all processors. Here is that bit of >>>>>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>>>>> >>>>>>>>> >>>>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>>>> >>>>>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>>>>> (-start_in_debugger). 
Below is the call stack from the processor that gets >>>>>>>>> stuck >>>>>>>>> >>>>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>>>> (lldb) process attach --pid 4307 >>>>>>>>> Process 4307 stopped >>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>> SIGSTOP >>>>>>>>> frame #0: 0x000000018a2d750c >>>>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>>>> libsystem_kernel.dylib`: >>>>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>>> Executable module set to >>>>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>>>> Architecture set to: arm64-apple-macosx-. >>>>>>>>> (lldb) cont >>>>>>>>> Process 4307 resuming >>>>>>>>> Process 4307 stopped >>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>> SIGSTOP >>>>>>>>> frame #0: 0x0000000109d281b8 >>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>>>>> MPID_Progress_test >>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>>> (lldb) bt >>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>> SIGSTOP >>>>>>>>> * frame #0: 0x0000000109d281b8 >>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>> frame #1: 0x0000000109d27d14 >>>>>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>>>> frame #2: 0x0000000109d27b60 >>>>>>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + >>>>>>>>> 900 >>>>>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>>>>> frame #5: 0x00000001045ea638 >>>>>>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>>>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at >>>>>>>>> tagm.c:235:5 >>>>>>>>> frame #6: 0x00000001045f2910 >>>>>>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >>>>>>>>> classid=1211227, class_name="PetscSF", descr="Star Forest", >>>>>>>>> mansec="PetscSF", comm=-2080374782, >>>>>>>>> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >>>>>>>>> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c: >>>>>>>>> 62:3 >>>>>>>>> frame #7: 0x00000001049cf820 >>>>>>>>> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >>>>>>>>> at sf.c:62:3 >>>>>>>>> frame #8: 0x0000000104cd3024 >>>>>>>>> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >>>>>>>>> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >>>>>>>>> at zerorows.c:36:5 >>>>>>>>> frame #9: 0x000000010504ea50 >>>>>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >>>>>>>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>>>>>>> b=0x0000000000000000) at mpiaij.c:768:3 >>>>>>>>> >>>>>>>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> > > -- > --Amneet > > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Nov 29 19:14:39 2023 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Nov 2023 20:14:39 -0500 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> <5062E7EF-A0CE-465E-94F6-F43C761A95C1@petsc.dev> Message-ID: On Wed, Nov 29, 2023 at 7:27?PM Amneet Bhalla wrote: > Ah, I also tried without step 2 (i.e., manually doing MPI_allgatherv for > Dirichlet rows), and that also works. So it seems that each processor needs > to send in their own Dirichlet rows, and not a union of them. Is that > correct? > Yes, that is correct. Thanks, Matt > On Wed, Nov 29, 2023 at 3:48?PM Amneet Bhalla > wrote: > >> Thanks Barry! I tried that and it seems to be working. This is what I >> did. It would be great if you could take a look at it and let me know if >> this is what you had in mind. >> >> 1. Collected Dirichlet rows locally >> >> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L731 >> >> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L797 >> >> >> 2. MPI_allgatherv Dirichlet rows >> >> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L805-L810 >> >> 3. Called the MatZeroRows function >> >> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L812-L814 >> >> >> >> >> On Wed, Nov 29, 2023 at 11:32?AM Barry Smith wrote: >> >>> >>> >>> On Nov 29, 2023, at 2:11?PM, Matthew Knepley wrote: >>> >>> On Wed, Nov 29, 2023 at 1:55?PM Amneet Bhalla >>> wrote: >>> >>>> So the code logic is after the matrix is assembled, I iterate over all >>>> distributed patches in the domain to see which of the patch is abutting a >>>> Dirichlet boundary. Depending upon which patch abuts a physical and >>>> Dirichlet boundary, a processor will call this routine. However, that same >>>> processor is ?owning? that DoF, which would be on its diagonal. >>>> >>>> I think Barry already mentioned this is not going to work unless I use >>>> the flag to not communicate explicitly. However, that flag is not working >>>> as it should over here for some reason. >>>> >>> >>> Oh, I do not think that is right. >>> >>> Barry, when I look at the code, MPIU_Allreduce is always going to be >>> called to fix up the nonzero_state. Am I wrong about that? >>> >>> >>> No, you are correct. I missed that in my earlier look. Setting those >>> flags reduce the number of MPI reductions but does not eliminate them >>> completely. >>> >>> MatZeroRows is collective (as its manual page indicates) so you have >>> to do the second thing I suggested. Inside your for loop construct an array >>> containing all the local >>> rows being zeroed and then make a single call by all MPI processes to >>> MatZeroRows(). Note this is a small change of just a handful of lines of >>> code. >>> >>> Barry >>> >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> I can always change the matrix coefficients for Dirichlet rows during >>>> MatSetValues. However, that would lengthen my code and I was trying to >>>> avoid that. 
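A minimal sketch of the pattern described just above (Barry's suggestion of a single collective call, which Matt confirms): every rank collects only the Dirichlet rows it owns and then makes one MatZeroRows() call, even when its list is empty. The helper name ZeroLocalDirichletRows and the variable rows are hypothetical; mat and diag_value are assumed to come from the caller, as in the quoted IBAMR code.

    #include <petscmat.h>
    #include <vector>

    // Hedged sketch: each rank passes only its locally owned Dirichlet rows.
    // Ranks with no such rows still call MatZeroRows() with a zero count,
    // because the routine is collective on the matrix communicator.
    PetscErrorCode ZeroLocalDirichletRows(Mat mat,
                                          const std::vector<PetscInt>& rows,
                                          PetscScalar diag_value)
    {
        PetscErrorCode ierr;
        ierr = MatZeroRows(mat, (PetscInt)rows.size(),
                           rows.empty() ? NULL : rows.data(),
                           diag_value, NULL, NULL);
        CHKERRQ(ierr);
        return 0;
    }

In the quoted loop, each rank would push its u_dof_index values into rows inside the for loops and call this helper once after them, so the collective is matched on all ranks.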
>>>> >>>> On Wed, Nov 29, 2023 at 10:02?AM Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla >>>>> wrote: >>>>> >>>>>> Ok, I added both, but it still hangs. Here, is bt from all three >>>>>> tasks: >>>>>> >>>>> >>>>> It looks like two processes are calling AllReduce, but one is not. Are >>>>> all procs not calling MatZeroRows? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Task 1: >>>>>> >>>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >>>>>> (lldb) process attach --pid 44691 >>>>>> Process 44691 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000018a2d750c >>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>> libsystem_kernel.dylib`: >>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> Executable module set to >>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>> Architecture set to: arm64-apple-macosx-. >>>>>> (lldb) cont >>>>>> Process 44691 resuming >>>>>> Process 44691 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000010ba40b60 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >>>>>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >>>>>> 0x10ba40b64 <+756>: ldr w9, [x22] >>>>>> 0x10ba40b68 <+760>: cmp w8, w9 >>>>>> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> (lldb) bt >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> * frame #0: 0x000000010ba40b60 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>>>> frame #1: 0x000000010ba48528 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >>>>>> frame #2: 0x000000010ba47964 >>>>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>>>> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + >>>>>> 1588 >>>>>> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>>>> frame #5: 0x0000000106d67650 >>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, >>>>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>>>> frame #6: 0x0000000106aadfac >>>>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, >>>>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>>>> b=0x0000000000000000) at matrix.c:5935:3 >>>>>> frame #7: 0x00000001023952d0 >>>>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >>>>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>>>> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >>>>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>>>> patch_level=Pointer > @ 0x000000016dbfcec0, >>>>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>>>> frame #8: 0x00000001023acb8c >>>>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >>>>>> x=0x000000016dc05778, (null)=0x000000016dc05680) at >>>>>> 
FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>>> frame #9: 0x000000010254a2dc >>>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >>>>>> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp: >>>>>> 340:5 >>>>>> frame #10: 0x0000000102202e5c >>>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at >>>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>>>> (lldb) >>>>>> >>>>>> >>>>>> Task 2: >>>>>> >>>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >>>>>> (lldb) process attach --pid 44692 >>>>>> Process 44692 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000018a2d750c >>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>> libsystem_kernel.dylib`: >>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> Executable module set to >>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>> Architecture set to: arm64-apple-macosx-. >>>>>> (lldb) cont >>>>>> Process 44692 resuming >>>>>> Process 44692 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000010e5a022c >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >>>>>> 0x10e5a0230 <+520>: cmp x9, x10 >>>>>> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >>>>>> 0x10e5a0238 <+528>: add w8, w8, #0x1 >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>>>> (lldb) bt >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> * frame #0: 0x000000010e5a022c >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>>>> frame #1: 0x000000010e59fd14 >>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>> frame #2: 0x000000010e59fb60 >>>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>> frame #5: 0x0000000108e62638 >>>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >>>>>> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c: >>>>>> 235:5 >>>>>> frame #6: 0x0000000108e6a910 >>>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >>>>>> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >>>>>> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >>>>>> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62 >>>>>> :3 >>>>>> frame #7: 0x000000010aa28010 >>>>>> libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at >>>>>> itcreate.c:679:3 >>>>>> frame #8: 0x00000001050aa2f4 >>>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >>>>>> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp: >>>>>> 344:12 >>>>>> frame #9: 0x0000000104d62e5c >>>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at >>>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>>> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >>>>>> (lldb) >>>>>> >>>>>> >>>>>> Task 3: >>>>>> >>>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >>>>>> (lldb) process attach --pid 44693 >>>>>> Process 44693 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000018a2d750c >>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>> libsystem_kernel.dylib`: >>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>> Executable module set to >>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>> Architecture set to: arm64-apple-macosx-. >>>>>> (lldb) cont >>>>>> Process 44693 resuming >>>>>> Process 44693 stopped >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> frame #0: 0x000000010e59c68c >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >>>>>> -> 0x10e59c68c <+952>: ldr w9, [x21] >>>>>> 0x10e59c690 <+956>: cmp w8, w9 >>>>>> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >>>>>> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >>>>>> MPID_Progress_test >>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>>>> (lldb) bt >>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>> SIGSTOP >>>>>> * frame #0: 0x000000010e59c68c >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>>>> frame #1: 0x000000010e5a44bc >>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >>>>>> frame #2: 0x000000010e5a3964 >>>>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>>>> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + >>>>>> 1588 >>>>>> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280 >>>>>> frame #5: 0x00000001098c3650 >>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, >>>>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>>>> frame #6: 0x0000000109609fac >>>>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, >>>>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>>>> b=0x0000000000000000) at matrix.c:5935:3 >>>>>> frame #7: 0x0000000104ef12d0 >>>>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >>>>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>>>> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >>>>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>>>> patch_level=Pointer > @ 0x000000016b0a0ec0, >>>>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>>>> frame #8: 0x0000000104f08b8c >>>>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >>>>>> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >>>>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>>> frame #9: 0x00000001050a62dc >>>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >>>>>> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp: >>>>>> 340:5 >>>>>> frame #10: 0x0000000104d5ee5c >>>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at >>>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>>>> (lldb) >>>>>> >>>>>> >>>>>> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>>>>>> wrote: >>>>>>> >>>>>>> BTW, I think you meant using MatSetOption(mat, >>>>>>> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>>>>>> >>>>>>> >>>>>>> Yes >>>>>>> >>>>>>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) >>>>>>> ?? >>>>>>> >>>>>>> >>>>>>> Please try setting both flags. >>>>>>> >>>>>>> However, that also did not help to overcome the MPI Barrier issue. >>>>>>> >>>>>>> >>>>>>> If there is still a problem please trap all the MPI processes when >>>>>>> they hang in the debugger and send the output from using bt on all of them. >>>>>>> This way >>>>>>> we can see the different places the different MPI processes are >>>>>>> stuck at. >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla >>>>>>> wrote: >>>>>>> >>>>>>>> I added that option but the code still gets stuck at the same call >>>>>>>> MatZeroRows with 3 processors. 
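For context, a small sketch of the locality check and the option flags discussed in this part of the thread (variable names are hypothetical). As noted elsewhere in the thread, these hints reduce the number of MPI reductions inside MatZeroRows() but do not remove them, so they cannot by themselves prevent the hang:

    // Sketch: confirm the row is owned by this rank before zeroing it.
    PetscInt rstart, rend;
    PetscErrorCode ierr;
    ierr = MatGetOwnershipRange(mat, &rstart, &rend); CHKERRQ(ierr);            // rows owned by this rank
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE); CHKERRQ(ierr);
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE); CHKERRQ(ierr);
    // u_dof_index should satisfy rstart <= u_dof_index < rend on this rank, yet
    // the MPIU_Allreduce inside MatZeroRows_MPIAIJ still requires every rank to
    // take part in each MatZeroRows() call.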
>>>>>>>> >>>>>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla < >>>>>>>> mail2amneet at gmail.com> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>>>>> { >>>>>>>>>> ....... >>>>>>>>>> for (Box::Iterator bc(bc_coef_box); bc; >>>>>>>>>> bc++) >>>>>>>>>> { >>>>>>>>>> ...... >>>>>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>>>>> { >>>>>>>>>> const double diag_value = a; >>>>>>>>>> ierr = MatZeroRows(mat, 1, >>>>>>>>>> &u_dof_index, diag_value, NULL, NULL); >>>>>>>>>> IBTK_CHKERRQ(ierr); >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> In general, this code will not work because each process calls >>>>>>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>>>>>> processes. >>>>>>>>>> >>>>>>>>>> If u_dof_index is always local to the current process, you can >>>>>>>>>> call MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for >>>>>>>>>> loop and >>>>>>>>>> the MatZeroRows will not synchronize across the MPI processes >>>>>>>>>> (since it does not need to and you told it that). >>>>>>>>>> >>>>>>>>> >>>>>>>>> Yes, u_dof_index is going to be local and I put a check on it a >>>>>>>>> few lines before calling MatZeroRows. >>>>>>>>> >>>>>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>>>>> >>>>>>>>> >>>>>>>>>> If the u_dof_index will not always be local, then you need, on >>>>>>>>>> each process, to list all the u_dof_index for each process in an array and >>>>>>>>>> then call MatZeroRows() >>>>>>>>>> once after the loop so it can exchange the needed information >>>>>>>>>> with the other MPI processes to get the row indices to the right place. >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Hi Folks, >>>>>>>>>> >>>>>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. >>>>>>>>>> This works fine for the serial run and the solver produces correct results >>>>>>>>>> (verified through analytical solution). However, when I run the case in >>>>>>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>>>>>> been called, and should be called by all processors. Here is that bit of >>>>>>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>>>>> >>>>>>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>>>>>> (-start_in_debugger). Below is the call stack from the processor that gets >>>>>>>>>> stuck >>>>>>>>>> >>>>>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>>>>> (lldb) process attach --pid 4307 >>>>>>>>>> Process 4307 stopped >>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>>> SIGSTOP >>>>>>>>>> frame #0: 0x000000018a2d750c >>>>>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>>>>> libsystem_kernel.dylib`: >>>>>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>>>>>>>> Executable module set to >>>>>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>>>>> Architecture set to: arm64-apple-macosx-. >>>>>>>>>> (lldb) cont >>>>>>>>>> Process 4307 resuming >>>>>>>>>> Process 4307 stopped >>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>>> SIGSTOP >>>>>>>>>> frame #0: 0x0000000109d281b8 >>>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376> >>>>>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>>>>>> MPID_Progress_test >>>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>>>> (lldb) bt >>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>>> SIGSTOP >>>>>>>>>> * frame #0: 0x0000000109d281b8 >>>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>>> frame #1: 0x0000000109d27d14 >>>>>>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>>>>> frame #2: 0x0000000109d27b60 >>>>>>>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>>>>>> frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + >>>>>>>>>> 900 >>>>>>>>>> frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + >>>>>>>>>> 684 >>>>>>>>>> frame #5: 0x00000001045ea638 >>>>>>>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, >>>>>>>>>> comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at >>>>>>>>>> tagm.c:235:5 >>>>>>>>>> frame #6: 0x00000001045f2910 >>>>>>>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, >>>>>>>>>> classid=1211227, class_name="PetscSF", descr="Star Forest", >>>>>>>>>> mansec="PetscSF", comm=-2080374782, >>>>>>>>>> destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), >>>>>>>>>> view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c: >>>>>>>>>> 62:3 >>>>>>>>>> frame #7: 0x00000001049cf820 >>>>>>>>>> libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) >>>>>>>>>> at sf.c:62:3 >>>>>>>>>> frame #8: 0x0000000104cd3024 >>>>>>>>>> libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, >>>>>>>>>> rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) >>>>>>>>>> at zerorows.c:36:5 >>>>>>>>>> frame #9: 0x000000010504ea50 >>>>>>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, >>>>>>>>>> rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, >>>>>>>>>> b=0x0000000000000000) at mpiaij.c:768:3 >>>>>>>>>> >>>>>>>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> >> >> -- >> --Amneet >> >> >> >> > > -- > --Amneet > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
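A compressed sketch of the mismatch visible in the backtraces quoted above (the control flow is hypothetical; the rank labels follow the Task numbers in the traces): two ranks enter the collective MatZeroRows() and block in its MPI_Allreduce, while the third rank, having found no Dirichlet row on its patches, has already moved on to KSPCreate() and blocks in the MPI_Barrier inside PetscCommDuplicate().

    // Hypothetical control flow reproducing the hang seen in the traces:
    PetscErrorCode ierr;
    if (rank_owns_a_dirichlet_row)                        // true on Tasks 1 and 3 only
    {
        // collective: blocks in MPIR_Allreduce until every rank joins
        ierr = MatZeroRows(mat, 1, &u_dof_index, diag_value, NULL, NULL); CHKERRQ(ierr);
    }
    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr); // Task 2 waits here, in MPI_Barrier
    // The two groups wait on different collectives, so neither can make progress.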
URL: From mail2amneet at gmail.com Wed Nov 29 21:11:03 2023 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Wed, 29 Nov 2023 19:11:03 -0800 Subject: [petsc-users] MPI barrier issue using MatZeroRows In-Reply-To: References: <44C0BE3F-6228-4F7C-B90E-5A81D6030849@petsc.dev> <5062E7EF-A0CE-465E-94F6-F43C761A95C1@petsc.dev> Message-ID: Awesome! MatZeroRows is very useful and simplified the code logic. On Wed, Nov 29, 2023 at 5:14?PM Matthew Knepley wrote: > On Wed, Nov 29, 2023 at 7:27?PM Amneet Bhalla > wrote: > >> Ah, I also tried without step 2 (i.e., manually doing MPI_allgatherv for >> Dirichlet rows), and that also works. So it seems that each processor needs >> to send in their own Dirichlet rows, and not a union of them. Is that >> correct? >> > > Yes, that is correct. > > Thanks, > > Matt > > >> On Wed, Nov 29, 2023 at 3:48?PM Amneet Bhalla >> wrote: >> >>> Thanks Barry! I tried that and it seems to be working. This is what I >>> did. It would be great if you could take a look at it and let me know if >>> this is what you had in mind. >>> >>> 1. Collected Dirichlet rows locally >>> >>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L731 >>> >>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L797 >>> >>> >>> 2. MPI_allgatherv Dirichlet rows >>> >>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L805-L810 >>> >>> 3. Called the MatZeroRows function >>> >>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L812-L814 >>> >>> >>> >>> >>> On Wed, Nov 29, 2023 at 11:32?AM Barry Smith wrote: >>> >>>> >>>> >>>> On Nov 29, 2023, at 2:11?PM, Matthew Knepley wrote: >>>> >>>> On Wed, Nov 29, 2023 at 1:55?PM Amneet Bhalla >>>> wrote: >>>> >>>>> So the code logic is after the matrix is assembled, I iterate over all >>>>> distributed patches in the domain to see which of the patch is abutting a >>>>> Dirichlet boundary. Depending upon which patch abuts a physical and >>>>> Dirichlet boundary, a processor will call this routine. However, that same >>>>> processor is ?owning? that DoF, which would be on its diagonal. >>>>> >>>>> I think Barry already mentioned this is not going to work unless I use >>>>> the flag to not communicate explicitly. However, that flag is not working >>>>> as it should over here for some reason. >>>>> >>>> >>>> Oh, I do not think that is right. >>>> >>>> Barry, when I look at the code, MPIU_Allreduce is always going to be >>>> called to fix up the nonzero_state. Am I wrong about that? >>>> >>>> >>>> No, you are correct. I missed that in my earlier look. Setting those >>>> flags reduce the number of MPI reductions but does not eliminate them >>>> completely. >>>> >>>> MatZeroRows is collective (as its manual page indicates) so you have >>>> to do the second thing I suggested. Inside your for loop construct an array >>>> containing all the local >>>> rows being zeroed and then make a single call by all MPI processes to >>>> MatZeroRows(). Note this is a small change of just a handful of lines of >>>> code. >>>> >>>> Barry >>>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> I can always change the matrix coefficients for Dirichlet rows during >>>>> MatSetValues. However, that would lengthen my code and I was trying to >>>>> avoid that. 
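The quoted message above mentions, as an alternative it chose not to pursue, imposing the Dirichlet rows directly while assembling the matrix instead of zeroing them afterwards. A minimal sketch of that option (hypothetical names; it assumes the whole assembly uses INSERT_VALUES, since PETSc does not allow mixing ADD_VALUES and INSERT_VALUES without an intermediate flush assembly):

    // For a locally owned Dirichlet row, insert only the diagonal entry and
    // simply never add that row's stencil couplings during assembly.
    const PetscInt    row  = u_dof_index;
    const PetscScalar diag = a;                          // Dirichlet coefficient
    PetscErrorCode ierr;
    ierr = MatSetValues(mat, 1, &row, 1, &row, &diag, INSERT_VALUES); CHKERRQ(ierr);
    // ... set the remaining (non-Dirichlet) rows as usual ...
    ierr = MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    // No post-assembly MatZeroRows() call is then needed for these rows.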
>>>>> >>>>> On Wed, Nov 29, 2023 at 10:02?AM Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Wed, Nov 29, 2023 at 12:30?PM Amneet Bhalla >>>>>> wrote: >>>>>> >>>>>>> Ok, I added both, but it still hangs. Here, is bt from all three >>>>>>> tasks: >>>>>>> >>>>>> >>>>>> It looks like two processes are calling AllReduce, but one is not. >>>>>> Are all procs not calling MatZeroRows? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Task 1: >>>>>>> >>>>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44691 >>>>>>> (lldb) process attach --pid 44691 >>>>>>> Process 44691 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000018a2d750c >>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>> libsystem_kernel.dylib`: >>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> Executable module set to >>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>> Architecture set to: arm64-apple-macosx-. >>>>>>> (lldb) cont >>>>>>> Process 44691 resuming >>>>>>> Process 44691 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000010ba40b60 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release: >>>>>>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1 >>>>>>> 0x10ba40b64 <+756>: ldr w9, [x22] >>>>>>> 0x10ba40b68 <+760>: cmp w8, w9 >>>>>>> 0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732> >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. 
>>>>>>> (lldb) bt >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> * frame #0: 0x000000010ba40b60 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752 >>>>>>> frame #1: 0x000000010ba48528 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088 >>>>>>> frame #2: 0x000000010ba47964 >>>>>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>>>>> frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + >>>>>>> 1588 >>>>>>> frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + >>>>>>> 2280 >>>>>>> frame #5: 0x0000000106d67650 >>>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, >>>>>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>>>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>>>>> frame #6: 0x0000000106aadfac >>>>>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, >>>>>>> rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, >>>>>>> b=0x0000000000000000) at matrix.c:5935:3 >>>>>>> frame #7: 0x00000001023952d0 >>>>>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, >>>>>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>>>>> u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, >>>>>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>>>>> patch_level=Pointer > @ 0x000000016dbfcec0, >>>>>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>>>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>>>>> frame #8: 0x00000001023acb8c >>>>>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, >>>>>>> x=0x000000016dc05778, (null)=0x000000016dc05680) at >>>>>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>>>> frame #9: 0x000000010254a2dc >>>>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, >>>>>>> x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp: >>>>>>> 340:5 >>>>>>> frame #10: 0x0000000102202e5c >>>>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at >>>>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>>>>> (lldb) >>>>>>> >>>>>>> >>>>>>> Task 2: >>>>>>> >>>>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44692 >>>>>>> (lldb) process attach --pid 44692 >>>>>>> Process 44692 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000018a2d750c >>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>> libsystem_kernel.dylib`: >>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> Executable module set to >>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>> Architecture set to: arm64-apple-macosx-. 
>>>>>>> (lldb) cont >>>>>>> Process 44692 resuming >>>>>>> Process 44692 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000010e5a022c >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8] >>>>>>> 0x10e5a0230 <+520>: cmp x9, x10 >>>>>>> 0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556> >>>>>>> 0x10e5a0238 <+528>: add w8, w8, #0x1 >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> (lldb) bt >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> * frame #0: 0x000000010e5a022c >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516 >>>>>>> frame #1: 0x000000010e59fd14 >>>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>> frame #2: 0x000000010e59fb60 >>>>>>> libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44 >>>>>>> frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900 >>>>>>> frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684 >>>>>>> frame #5: 0x0000000108e62638 >>>>>>> libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, >>>>>>> comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c >>>>>>> :235:5 >>>>>>> frame #6: 0x0000000108e6a910 >>>>>>> libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, >>>>>>> classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", >>>>>>> comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), >>>>>>> view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c: >>>>>>> 62:3 >>>>>>> frame #7: 0x000000010aa28010 >>>>>>> libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at >>>>>>> itcreate.c:679:3 >>>>>>> frame #8: 0x00000001050aa2f4 >>>>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, >>>>>>> x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp: >>>>>>> 344:12 >>>>>>> frame #9: 0x0000000104d62e5c >>>>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at >>>>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>>>> frame #10: 0x0000000189fbbf28 dyld`start + 2236 >>>>>>> (lldb) >>>>>>> >>>>>>> >>>>>>> Task 3: >>>>>>> >>>>>>> amneetb at APSB-MacBook-Pro-16:~$ lldb -p 44693 >>>>>>> (lldb) process attach --pid 44693 >>>>>>> Process 44693 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000018a2d750c >>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>> libsystem_kernel.dylib`: >>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> Executable module set to >>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>> Architecture set to: arm64-apple-macosx-. 
>>>>>>> (lldb) cont >>>>>>> Process 44693 resuming >>>>>>> Process 44693 stopped >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> frame #0: 0x000000010e59c68c >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather: >>>>>>> -> 0x10e59c68c <+952>: ldr w9, [x21] >>>>>>> 0x10e59c690 <+956>: cmp w8, w9 >>>>>>> 0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924> >>>>>>> 0x10e59c698 <+964>: bl 0x10e59ce64 ; >>>>>>> MPID_Progress_test >>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>> (lldb) bt >>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>> SIGSTOP >>>>>>> * frame #0: 0x000000010e59c68c >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952 >>>>>>> frame #1: 0x000000010e5a44bc >>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980 >>>>>>> frame #2: 0x000000010e5a3964 >>>>>>> libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368 >>>>>>> frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + >>>>>>> 1588 >>>>>>> frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + >>>>>>> 2280 >>>>>>> frame #5: 0x00000001098c3650 >>>>>>> libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, >>>>>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>>>>> b=0x0000000000000000) at mpiaij.c:827:3 >>>>>>> frame #6: 0x0000000109609fac >>>>>>> libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, >>>>>>> rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, >>>>>>> b=0x0000000000000000) at matrix.c:5935:3 >>>>>>> frame #7: 0x0000000104ef12d0 >>>>>>> fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, >>>>>>> omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, >>>>>>> u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, >>>>>>> u_dof_index_idx=27, p_dof_index_idx=28, >>>>>>> patch_level=Pointer > @ 0x000000016b0a0ec0, >>>>>>> mu_interp_type=VC_HARMONIC_INTERP) at >>>>>>> AcousticStreamingPETScMatUtilities.cpp:799:36 >>>>>>> frame #8: 0x0000000104f08b8c >>>>>>> fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, >>>>>>> x=0x000000016b0a9778, (null)=0x000000016b0a9680) at >>>>>>> FOAcousticStreamingPETScLevelSolver.cpp:149:5 >>>>>>> frame #9: 0x00000001050a62dc >>>>>>> fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, >>>>>>> x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp: >>>>>>> 340:5 >>>>>>> frame #10: 0x0000000104d5ee5c >>>>>>> fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at >>>>>>> fo_acoustic_streaming_solver.cpp:400:22 >>>>>>> frame #11: 0x0000000189fbbf28 dyld`start + 2236 >>>>>>> (lldb) >>>>>>> >>>>>>> >>>>>>> On Wed, Nov 29, 2023 at 7:22?AM Barry Smith >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Nov 29, 2023, at 1:16?AM, Amneet Bhalla >>>>>>>> wrote: >>>>>>>> >>>>>>>> BTW, I think you meant using MatSetOption(mat, >>>>>>>> *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE) >>>>>>>> >>>>>>>> >>>>>>>> Yes >>>>>>>> >>>>>>>> instead ofMatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, >>>>>>>> PETSC_TRUE) ?? >>>>>>>> >>>>>>>> >>>>>>>> Please try setting both flags. >>>>>>>> >>>>>>>> However, that also did not help to overcome the MPI Barrier issue. 
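For context, a sketch of how the two options under discussion would be set, placed before the loop that calls MatZeroRows(). As noted earlier in the thread, these flags reduce the number of internal reductions but do not remove them entirely, so by themselves they are not a fix for the hang.

    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);   /* this rank sets no entries in rows owned by other ranks */
    ierr = MatSetOption(mat, MAT_NO_OFF_PROC_ZERO_ROWS, PETSC_TRUE);CHKERRQ(ierr); /* this rank asks to zero only rows it owns */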
>>>>>>>> >>>>>>>> >>>>>>>> If there is still a problem please trap all the MPI processes >>>>>>>> when they hang in the debugger and send the output from using bt on all of >>>>>>>> them. This way >>>>>>>> we can see the different places the different MPI processes are >>>>>>>> stuck at. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Nov 28, 2023 at 9:57?PM Amneet Bhalla < >>>>>>>> mail2amneet at gmail.com> wrote: >>>>>>>> >>>>>>>>> I added that option but the code still gets stuck at the same call >>>>>>>>> MatZeroRows with 3 processors. >>>>>>>>> >>>>>>>>> On Tue, Nov 28, 2023 at 7:23?PM Amneet Bhalla < >>>>>>>>> mail2amneet at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Nov 28, 2023 at 6:42?PM Barry Smith >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> for (int comp = 0; comp < 2; ++comp) >>>>>>>>>>> { >>>>>>>>>>> ....... >>>>>>>>>>> for (Box::Iterator bc(bc_coef_box); >>>>>>>>>>> bc; bc++) >>>>>>>>>>> { >>>>>>>>>>> ...... >>>>>>>>>>> if (IBTK::abs_equal_eps(b, 0.0)) >>>>>>>>>>> { >>>>>>>>>>> const double diag_value = a; >>>>>>>>>>> ierr = MatZeroRows(mat, 1, >>>>>>>>>>> &u_dof_index, diag_value, NULL, NULL); >>>>>>>>>>> IBTK_CHKERRQ(ierr); >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> In general, this code will not work because each process calls >>>>>>>>>>> MatZeroRows a different number of times, so it cannot match up with all the >>>>>>>>>>> processes. >>>>>>>>>>> >>>>>>>>>>> If u_dof_index is always local to the current process, you can >>>>>>>>>>> call MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for >>>>>>>>>>> loop and >>>>>>>>>>> the MatZeroRows will not synchronize across the MPI processes >>>>>>>>>>> (since it does not need to and you told it that). >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Yes, u_dof_index is going to be local and I put a check on it a >>>>>>>>>> few lines before calling MatZeroRows. >>>>>>>>>> >>>>>>>>>> Can MatSetOption() be called after the matrix has been assembled? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> If the u_dof_index will not always be local, then you need, on >>>>>>>>>>> each process, to list all the u_dof_index for each process in an array and >>>>>>>>>>> then call MatZeroRows() >>>>>>>>>>> once after the loop so it can exchange the needed information >>>>>>>>>>> with the other MPI processes to get the row indices to the right place. >>>>>>>>>>> >>>>>>>>>>> Barry >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Nov 28, 2023, at 6:44?PM, Amneet Bhalla < >>>>>>>>>>> mail2amneet at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hi Folks, >>>>>>>>>>> >>>>>>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. >>>>>>>>>>> This works fine for the serial run and the solver produces correct results >>>>>>>>>>> (verified through analytical solution). However, when I run the case in >>>>>>>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is >>>>>>>>>>> that this function needs to be called after the MatAssemblyBegin{End}() has >>>>>>>>>>> been called, and should be called by all processors. Here is that bit of >>>>>>>>>>> the code which calls MatZeroRows() after the matrix has been assembled >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801 >>>>>>>>>>> >>>>>>>>>>> I ran the parallel code (on 3 processors) in the debugger >>>>>>>>>>> (-start_in_debugger). 
Below is the call stack from the processor that gets >>>>>>>>>>> stuck >>>>>>>>>>> >>>>>>>>>>> amneetb at APSB-MBP-16:~$ lldb -p 4307 >>>>>>>>>>> (lldb) process attach --pid 4307 >>>>>>>>>>> Process 4307 stopped >>>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>>>> SIGSTOP >>>>>>>>>>> frame #0: 0x000000018a2d750c >>>>>>>>>>> libsystem_kernel.dylib`__semwait_signal + 8 >>>>>>>>>>> libsystem_kernel.dylib`: >>>>>>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40> >>>>>>>>>>> 0x18a2d7510 <+12>: pacibsp >>>>>>>>>>> 0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]! >>>>>>>>>>> 0x18a2d7518 <+20>: mov x29, sp >>>>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>>>>> Executable module set to >>>>>>>>>>> "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d". >>>>>>>>>>> Architecture set to: arm64-apple-macosx-. >>>>>>>>>>> (lldb) cont >>>>>>>>>>> Process 4307 resuming >>>>>>>>>>> Process 4307 stopped >>>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>>>> SIGSTOP >>>>>>>>>>> frame #0: 0x0000000109d281b8 >>>>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather: >>>>>>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24] >>>>>>>>>>> 0x109d281bc <+404>: cmp w8, w9 >>>>>>>>>>> 0x109d281c0 <+408>: b.lt 0x109d281a0 ; >>>>>>>>>>> <+376> >>>>>>>>>>> 0x109d281c4 <+412>: bl 0x109d28e64 ; >>>>>>>>>>> MPID_Progress_test >>>>>>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped. >>>>>>>>>>> (lldb) bt >>>>>>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal >>>>>>>>>>> SIGSTOP >>>>>>>>>>> * frame #0: 0x0000000109d281b8 >>>>>>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400 >>>>>>>>>>> frame #1: 0x0000000109d27d14 >>>>>>>>>>> libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224 >>>>>>>>>>> >>>>>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Nov 29 23:59:15 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 29 Nov 2023 22:59:15 -0700 Subject: [petsc-users] [Xolotl-psi-development] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: <8734wpe81v.fsf@jedbrown.org> Message-ID: <87v89j6ato.fsf@jedbrown.org> "Blondel, Sophie" writes: > Hi Jed, > > I'm not sure I'm going to reply to your question correctly because I don't really understand how the split is done. Is it related to on diagonal and off diagonal? If so, the off-diagonal part is usually pretty small (less than 20 DOFs) and related to diffusion, the diagonal part involves thousands of DOFs for the reaction term. >From the run-time option, it'll be a default (additive) split and we're interested in the two diagonal blocks. One currently has a cheap solver that would only be efficient with a well-conditioned positive definite matrix and the other is using a direct solver ('redundant'). If you were to run with -ksp_view and share the output, it would be informative. Either way, I'd like to understand what physics are beind the equation currently being solved with 'redundant'. If it's diffusive, then algebraic multigrid would be a good place to start. > Let us know what we can do to answer this question more accurately. 
> > Cheers, > > Sophie > ________________________________ > From: Jed Brown > Sent: Tuesday, November 28, 2023 19:07 > To: Fackler, Philip ; Junchao Zhang > Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface > > [Some people who received this message don't often get email from jed at jedbrown.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > "Fackler, Philip via petsc-users" writes: > >> That makes sense. Here are the arguments that I think are relevant: >> >> -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit -pc_fieldsplit_detect_coupling? > > What sort of physics are in splits 0 and 1? > > SOR is not a good GPU algorithm, so we'll want to change that one way or another. Are the splits of similar size or very different? > >> What would you suggest to make this better? >> >> Also, note that the cases marked "serial" are running on CPU only, that is, using only the SERIAL backend for kokkos. >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang >> Sent: Tuesday, November 28, 2023 15:51 >> To: Fackler, Philip >> Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net >> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Hi, Philip, >> I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? >> >> [Screenshot 2023-11-28 at 2.43.03?PM.png] >> --Junchao Zhang >> >> >> On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: >> I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: >> >> https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link >> >> Have a happy holiday weekend! >> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang > >> Sent: Monday, October 16, 2023 15:24 >> To: Fackler, Philip > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > >> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Hi, Philip, >> That branch was merged to petsc/main today. Let me know once you have new profiling results. >> >> Thanks. >> --Junchao Zhang >> >> >> On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: >> Junchao, >> >> I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. 
>> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Fackler, Philip via Xolotl-psi-development > >> Sent: Wednesday, October 11, 2023 11:31 >> To: Junchao Zhang > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > >> Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> I'm on it. >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang > >> Sent: Wednesday, October 11, 2023 10:14 >> To: Fackler, Philip > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > >> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Hi, Philip, >> Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? >> >> Thanks. >> --Junchao Zhang >> >> >> On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: >> Aha! That makes sense. Thank you. >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang > >> Sent: Thursday, October 5, 2023 17:29 >> To: Fackler, Philip > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > >> Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... >> Let me see how to add it. >> --Junchao Zhang >> >> >> On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: >> Hi, Philip, >> I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. >> Do you have profiling results with COO enabled? >> >> [Screenshot 2023-10-05 at 10.55.29?AM.png] >> >> >> --Junchao Zhang >> >> >> On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: >> Hi, Philip, >> I will look into the tarballs and get back to you. >> Thanks. >> --Junchao Zhang >> >> >> On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: >> We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). >> >> Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. 
>> >> The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. We've still got more optimization to do on this. >> >> The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. Any help looking into this would be appreciated. >> >> The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. >> >> Thanks for your help, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory > > > _______________________________________________ > Xolotl-psi-development mailing list > Xolotl-psi-development at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/xolotl-psi-development From jed at jedbrown.org Thu Nov 30 00:02:48 2023 From: jed at jedbrown.org (Jed Brown) Date: Wed, 29 Nov 2023 23:02:48 -0700 Subject: [petsc-users] Reading VTK files in PETSc In-Reply-To: References: Message-ID: <87sf4n6anr.fsf@jedbrown.org> Is it necessary that it be VTK format or can it be PETSc's binary format or a different mesh format? VTK (be it legacy .vtk or the XML-based .vtu, etc.) is a bad format for parallel reading, no matter how much effort might go into an implementation. "Kevin G. Wang" writes: > Good morning everyone. > > I use the following functions to output parallel vectors --- "globalVec" in > this example --- to VTK files. It works well, and is quite convenient. > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > PetscViewer viewer; > PetscViewerVTKOpen(PetscObjectComm((PetscObject)*dm), filename, > FILE_MODE_WRITE, &viewer); > VecView(globalVec, viewer); > PetscViewerDestroy(&viewer); > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Now, I am trying to do the opposite. I would like to read the VTK files > generated by PETSc back into memory, and assign each one to a Vec. Could > someone let me know how this can be done? > > Thanks! > Kevin > > > -- > Kevin G. Wang, Ph.D. > Associate Professor > Kevin T. Crofton Department of Aerospace and Ocean Engineering > Virginia Tech > 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 > Office: (540) 231-7547 | Mobile: (650) 862-2663 > URL: https://www.aoe.vt.edu/people/faculty/wang.html > Codes: https://github.com/kevinwgy From sblondel at utk.edu Thu Nov 30 08:59:37 2023 From: sblondel at utk.edu (Blondel, Sophie) Date: Thu, 30 Nov 2023 14:59:37 +0000 Subject: [petsc-users] [Xolotl-psi-development] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: <87v89j6ato.fsf@jedbrown.org> References: <8734wpe81v.fsf@jedbrown.org> <87v89j6ato.fsf@jedbrown.org> Message-ID: Attached is the output with -ksp_view, it is not exactly what Philip has been running because this was done on my laptop instead of Ascent. Looking at the options we've been using, "redundant" is in charge of the diffusive part here (finite difference cell centered), in 1D simulation. 
When we do 2D or 3D simulations we switch it to "-fieldsplit_1_pc_type gamg -fieldsplit_1_ksp_type gmres -ksp_type fgmres". Cheers, Sophie ________________________________ From: Jed Brown Sent: Thursday, November 30, 2023 00:59 To: Blondel, Sophie ; Fackler, Philip ; Junchao Zhang Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface [You don't often get email from jed at jedbrown.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] "Blondel, Sophie" writes: > Hi Jed, > > I'm not sure I'm going to reply to your question correctly because I don't really understand how the split is done. Is it related to on diagonal and off diagonal? If so, the off-diagonal part is usually pretty small (less than 20 DOFs) and related to diffusion, the diagonal part involves thousands of DOFs for the reaction term. From the run-time option, it'll be a default (additive) split and we're interested in the two diagonal blocks. One currently has a cheap solver that would only be efficient with a well-conditioned positive definite matrix and the other is using a direct solver ('redundant'). If you were to run with -ksp_view and share the output, it would be informative. Either way, I'd like to understand what physics are beind the equation currently being solved with 'redundant'. If it's diffusive, then algebraic multigrid would be a good place to start. > Let us know what we can do to answer this question more accurately. > > Cheers, > > Sophie > ________________________________ > From: Jed Brown > Sent: Tuesday, November 28, 2023 19:07 > To: Fackler, Philip ; Junchao Zhang > Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface > > [Some people who received this message don't often get email from jed at jedbrown.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > "Fackler, Philip via petsc-users" writes: > >> That makes sense. Here are the arguments that I think are relevant: >> >> -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit -pc_fieldsplit_detect_coupling? > > What sort of physics are in splits 0 and 1? > > SOR is not a good GPU algorithm, so we'll want to change that one way or another. Are the splits of similar size or very different? > >> What would you suggest to make this better? >> >> Also, note that the cases marked "serial" are running on CPU only, that is, using only the SERIAL backend for kokkos. >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang >> Sent: Tuesday, November 28, 2023 15:51 >> To: Fackler, Philip >> Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net >> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Hi, Philip, >> I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? 
>> >> [Screenshot 2023-11-28 at 2.43.03?PM.png] >> --Junchao Zhang >> >> >> On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: >> I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: >> >> https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link >> >> Have a happy holiday weekend! >> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang > >> Sent: Monday, October 16, 2023 15:24 >> To: Fackler, Philip > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > >> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Hi, Philip, >> That branch was merged to petsc/main today. Let me know once you have new profiling results. >> >> Thanks. >> --Junchao Zhang >> >> >> On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: >> Junchao, >> >> I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. >> >> Thanks, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Fackler, Philip via Xolotl-psi-development > >> Sent: Wednesday, October 11, 2023 11:31 >> To: Junchao Zhang > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > >> Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> I'm on it. >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang > >> Sent: Wednesday, October 11, 2023 10:14 >> To: Fackler, Philip > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > >> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Hi, Philip, >> Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? >> >> Thanks. >> --Junchao Zhang >> >> >> On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: >> Aha! That makes sense. Thank you. 
>> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory >> ________________________________ >> From: Junchao Zhang > >> Sent: Thursday, October 5, 2023 17:29 >> To: Fackler, Philip > >> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > >> Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >> >> Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... >> Let me see how to add it. >> --Junchao Zhang >> >> >> On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: >> Hi, Philip, >> I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. >> Do you have profiling results with COO enabled? >> >> [Screenshot 2023-10-05 at 10.55.29?AM.png] >> >> >> --Junchao Zhang >> >> >> On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: >> Hi, Philip, >> I will look into the tarballs and get back to you. >> Thanks. >> --Junchao Zhang >> >> >> On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: >> We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). >> >> Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. >> >> The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. We've still got more optimization to do on this. >> >> The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. Any help looking into this would be appreciated. >> >> The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. >> >> Thanks for your help, >> >> Philip Fackler >> Research Software Engineer, Application Engineering Group >> Advanced Computing Systems Research Section >> Computer Science and Mathematics Division >> Oak Ridge National Laboratory > > > _______________________________________________ > Xolotl-psi-development mailing list > Xolotl-psi-development at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/xolotl-psi-development -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ksp.txt URL: From jed at jedbrown.org Thu Nov 30 10:51:32 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 30 Nov 2023 09:51:32 -0700 Subject: [petsc-users] [Xolotl-psi-development] [EXTERNAL] Re: Unexpected performance losses switching to COO interface In-Reply-To: References: <8734wpe81v.fsf@jedbrown.org> <87v89j6ato.fsf@jedbrown.org> Message-ID: <87cyvr5gmj.fsf@jedbrown.org> Looks like both splits are of the same size and same number of nonzeros. GAMG is probably a good choice there, and likely doesn't need an inner Krylov (i.e., you could use -fieldsplit_1_ksp_type preonly). "Blondel, Sophie" writes: > Attached is the output with -ksp_view, it is not exactly what Philip has been running because this was done on my laptop instead of Ascent. > > Looking at the options we've been using, "redundant" is in charge of the diffusive part here (finite difference cell centered), in 1D simulation. When we do 2D or 3D simulations we switch it to "-fieldsplit_1_pc_type gamg -fieldsplit_1_ksp_type gmres -ksp_type fgmres". > > Cheers, > > Sophie > ________________________________ > From: Jed Brown > Sent: Thursday, November 30, 2023 00:59 > To: Blondel, Sophie ; Fackler, Philip ; Junchao Zhang > Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net > Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface > > [You don't often get email from jed at jedbrown.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > "Blondel, Sophie" writes: > >> Hi Jed, >> >> I'm not sure I'm going to reply to your question correctly because I don't really understand how the split is done. Is it related to on diagonal and off diagonal? If so, the off-diagonal part is usually pretty small (less than 20 DOFs) and related to diffusion, the diagonal part involves thousands of DOFs for the reaction term. > > From the run-time option, it'll be a default (additive) split and we're interested in the two diagonal blocks. One currently has a cheap solver that would only be efficient with a well-conditioned positive definite matrix and the other is using a direct solver ('redundant'). If you were to run with -ksp_view and share the output, it would be informative. > > Either way, I'd like to understand what physics are beind the equation currently being solved with 'redundant'. If it's diffusive, then algebraic multigrid would be a good place to start. > >> Let us know what we can do to answer this question more accurately. >> >> Cheers, >> >> Sophie >> ________________________________ >> From: Jed Brown >> Sent: Tuesday, November 28, 2023 19:07 >> To: Fackler, Philip ; Junchao Zhang >> Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net >> Subject: Re: [Xolotl-psi-development] [petsc-users] [EXTERNAL] Re: Unexpected performance losses switching to COO interface >> >> [Some people who received this message don't often get email from jed at jedbrown.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] >> >> "Fackler, Philip via petsc-users" writes: >> >>> That makes sense. Here are the arguments that I think are relevant: >>> >>> -fieldsplit_1_pc_type redundant -fieldsplit_0_pc_type sor -pc_type fieldsplit -pc_fieldsplit_detect_coupling? >> >> What sort of physics are in splits 0 and 1? 
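A sketch of the runtime option set Jed is pointing toward for the 1D runs, assuming the same fieldsplit layout shown in the attached ksp_view: keep Jacobi on split 0 and replace the redundant LU on split 1 with GAMG and no inner Krylov. Whether GAMG's defaults need further tuning for this system is not established here.

    -pc_type fieldsplit -pc_fieldsplit_detect_coupling \
    -fieldsplit_0_pc_type jacobi \
    -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type gamg \
    -ksp_view -ksp_converged_reason

This mirrors the 2D/3D options quoted above by Sophie, with the inner gmres dropped per the preonly suggestion.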
>> >> SOR is not a good GPU algorithm, so we'll want to change that one way or another. Are the splits of similar size or very different? >> >>> What would you suggest to make this better? >>> >>> Also, note that the cases marked "serial" are running on CPU only, that is, using only the SERIAL backend for kokkos. >>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory >>> ________________________________ >>> From: Junchao Zhang >>> Sent: Tuesday, November 28, 2023 15:51 >>> To: Fackler, Philip >>> Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net >>> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >>> >>> Hi, Philip, >>> I opened hpcdb-PSI_9-serial and it seems you used PCLU. Since Kokkos does not have a GPU LU implementation, we do it on CPU via MatLUFactorNumeric_SeqAIJ(). Perhaps you can try other PC types? >>> >>> [Screenshot 2023-11-28 at 2.43.03?PM.png] >>> --Junchao Zhang >>> >>> >>> On Wed, Nov 22, 2023 at 10:43?AM Fackler, Philip > wrote: >>> I definitely dropped the ball on this. I'm sorry for that. I have new profiling data using the latest (as of yesterday) of petsc/main. I've put them in a single google drive folder linked here: >>> >>> https://drive.google.com/drive/folders/14ScvyfxOzc4OzXs9HZVeQDO-g6FdIVAI?usp=drive_link >>> >>> Have a happy holiday weekend! >>> >>> Thanks, >>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory >>> ________________________________ >>> From: Junchao Zhang > >>> Sent: Monday, October 16, 2023 15:24 >>> To: Fackler, Philip > >>> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > >>> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >>> >>> Hi, Philip, >>> That branch was merged to petsc/main today. Let me know once you have new profiling results. >>> >>> Thanks. >>> --Junchao Zhang >>> >>> >>> On Mon, Oct 16, 2023 at 9:33?AM Fackler, Philip > wrote: >>> Junchao, >>> >>> I've attached updated timing plots (red and blue are swapped from before; yellow is the new one). There is an improvement for the NE_3 case only with CUDA. Serial stays the same, and the PSI cases stay the same. In the PSI cases, MatShift doesn't show up (I assume because we're using different preconditioner arguments). So, there must be some other primary culprit. I'll try to get updated profiling data to you soon. >>> >>> Thanks, >>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory >>> ________________________________ >>> From: Fackler, Philip via Xolotl-psi-development > >>> Sent: Wednesday, October 11, 2023 11:31 >>> To: Junchao Zhang > >>> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net > >>> Subject: Re: [Xolotl-psi-development] [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >>> >>> I'm on it. 
>>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory >>> ________________________________ >>> From: Junchao Zhang > >>> Sent: Wednesday, October 11, 2023 10:14 >>> To: Fackler, Philip > >>> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > >>> Subject: Re: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >>> >>> Hi, Philip, >>> Could you try this branch jczhang/2023-10-05/feature-support-matshift-aijkokkos ? >>> >>> Thanks. >>> --Junchao Zhang >>> >>> >>> On Thu, Oct 5, 2023 at 4:52?PM Fackler, Philip > wrote: >>> Aha! That makes sense. Thank you. >>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory >>> ________________________________ >>> From: Junchao Zhang > >>> Sent: Thursday, October 5, 2023 17:29 >>> To: Fackler, Philip > >>> Cc: petsc-users at mcs.anl.gov >; xolotl-psi-development at lists.sourceforge.net >; Blondel, Sophie > >>> Subject: [EXTERNAL] Re: [petsc-users] Unexpected performance losses switching to COO interface >>> >>> Wait a moment, it seems it was because we do not have a GPU implementation of MatShift... >>> Let me see how to add it. >>> --Junchao Zhang >>> >>> >>> On Thu, Oct 5, 2023 at 10:58?AM Junchao Zhang > wrote: >>> Hi, Philip, >>> I looked at the hpcdb-NE_3-cuda file. It seems you used MatSetValues() instead of the COO interface? MatSetValues() needs to copy the data from device to host and thus is expensive. >>> Do you have profiling results with COO enabled? >>> >>> [Screenshot 2023-10-05 at 10.55.29?AM.png] >>> >>> >>> --Junchao Zhang >>> >>> >>> On Mon, Oct 2, 2023 at 9:52?AM Junchao Zhang > wrote: >>> Hi, Philip, >>> I will look into the tarballs and get back to you. >>> Thanks. >>> --Junchao Zhang >>> >>> >>> On Mon, Oct 2, 2023 at 9:41?AM Fackler, Philip via petsc-users > wrote: >>> We finally have xolotl ported to use the new COO interface and the aijkokkos implementation for Mat (and kokkos for Vec). Comparing this port to our previous version (using MatSetValuesStencil and the default Mat and Vec implementations), we expected to see an improvement in performance for both the "serial" and "cuda" builds (here I'm referring to the kokkos configuration). >>> >>> Attached are two plots that show timings for three different cases. All of these were run on Ascent (the Summit-like training system) with 6 MPI tasks (on a single node). The CUDA cases were given one GPU per task (and used CUDA-aware MPI). The labels on the blue bars indicate speedup. In all cases we used "-fieldsplit_0_pc_type jacobi" to keep the comparison as consistent as possible. >>> >>> The performance of RHSJacobian (where the bulk of computation happens in xolotl) behaved basically as expected (better than expected in the serial build). NE_3 case in CUDA was the only one that performed worse, but not surprisingly, since its workload for the GPUs is much smaller. We've still got more optimization to do on this. >>> >>> The real surprise was how much worse the overall solve times were. This seems to be due simply to switching to the kokkos-based implementation. I'm wondering if there are any changes we can make in configuration or runtime arguments to help with PETSc's performance here. 
Any help looking into this would be appreciated. >>> >>> The tarballs linked here and here are profiling databases which, once extracted, can be viewed with hpcviewer. I don't know how helpful that will be, but hopefully it can give you some direction. >>> >>> Thanks for your help, >>> >>> Philip Fackler >>> Research Software Engineer, Application Engineering Group >>> Advanced Computing Systems Research Section >>> Computer Science and Mathematics Division >>> Oak Ridge National Laboratory >> >> >> _______________________________________________ >> Xolotl-psi-development mailing list >> Xolotl-psi-development at lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/xolotl-psi-development > Starting Xolotl (2.4.0-83-e84edee7) > Thu Nov 30 09:48:13 2023 > NetworkHandler: Loaded network of 312 DOF with: Helium Vacancy Interstitial > MaterialHandler: The selected material is: W111 with the following processes: advec attenuation diff modifiedTM movingSurface reaction ; a custom fit flux handler is used reading: tridyn_benchmark_PSI_9.dat > TemperatureHandler: Using the time profile defined in: temp_benchmark_PSI_9.dat > SolverHandler: 1D simulation with surface BC: free surface and bulk BC: free surface, initial concentration for Id: 0 of: 1e-18 nm-3, grid (nm): 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.7999999999999999 0.8999999999999999 0.9999999999999999 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.100000000000001 2.200000000000001 2.300000000000001 2.400000000000001 2.500000000000001 2.750000000000001 3.000000000000001 3.250000000000001 3.500000000000001 3.750000000000001 4.000000000000002 4.250000000000002 4.500000000000002 4.750000000000002 5.000000000000002 > 0 TS dt 1e-10 time 0. > > Time: 0 > Helium content = 0 > Vacancy content = 4.750000000000002e-18 > Interstitial content = 0 > Fluence = 0 > > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 1 TS dt 1e-09 time 1e-10
>
> Time: 1e-10
> Helium content = 2.424952199503062e-06
> Vacancy content = 6.71808169858663e-15
> Interstitial content = 3.133080177694656e-09
> Fluence = 5.4e-06
> 2 TS dt 1e-08 time 1.1e-09
>
> Time: 1.1e-09
> Helium content = 2.296916600451812e-05
> Vacancy content = 2.31356670859338e-12
> Interstitial content = 3.19440449167048e-09
> Fluence = 5.940000000000001e-05
> 3 TS dt 6.80732e-08 time 1.11e-08
>
> Time: 1.11e-08
> Helium content = 0.0001055007230322544
> Vacancy content = 3.823043552405469e-09
> Interstitial content = 3.218807071399915e-09
> Fluence = 0.0005994
> 4 TS dt 5.05362e-07 time 7.91732e-08
>
> Time: 7.917317101260537e-08
> Helium content = 0.0001189610844555452
> Vacancy content = 3.906703873501566e-07
> Interstitial content = 3.444351103043243e-09
> Fluence = 0.00427535123468069
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
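The -ksp_view output above describes an outer GMRES(30) solve preconditioned with a multiplicative PCFIELDSPLIT: split 0 uses point Jacobi and split 1 is solved redundantly with a direct LU factorization. As a rough options sketch (standard PETSc option names only; the IS definitions for the two splits and the Kokkos matrix types are assumed to be set up in the application code, so this is not the exact command line used for this run), an equivalent layout could be requested with:

  # hypothetical options-file sketch matching the solver layout shown above
  -ksp_type gmres
  -ksp_gmres_restart 30
  # multiplicative fieldsplit over the two IS-defined splits
  -pc_type fieldsplit
  -pc_fieldsplit_type multiplicative
  # split 0: point Jacobi
  -fieldsplit_0_ksp_type preonly
  -fieldsplit_0_pc_type jacobi
  # split 1: redundant direct solve with LU
  -fieldsplit_1_ksp_type preonly
  -fieldsplit_1_pc_type redundant
  -fieldsplit_1_redundant_pc_type lu
  -ksp_view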
> 5 TS dt 5.05362e-06 time 5.84535e-07
>
> Time: 5.845350088239711e-07
> Helium content = 0.0001258488093923175
> Vacancy content = 5.658565060233405e-06
> Interstitial content = 3.489696739356304e-09
> Fluence = 0.03156489047649444
> 6 TS dt 1e-05 time 5.63815e-06
>
> Time: 5.638153386937629e-06
> Helium content = 0.0002010574732121331
> Vacancy content = 5.316793387390591e-05
> Interstitial content = 3.364559413993797e-09
> Fluence = 0.304460282894632
> 7 TS dt 1e-05 time 1.56382e-05
>
> Time: 1.563815338693763e-05
> Helium content = 0.0003315813442163533
> Vacancy content = 0.0001231337003572065
> Interstitial content = 3.19428926503291e-09
> Fluence = 0.8444602828946322
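For what it is worth, the per-step header lines such as "7 TS dt 1e-05 time 1.56382e-05" are the standard PETSc TS monitor output, while the Helium/Vacancy/Interstitial content and Fluence totals appear to be printed by the application's own monitor. If only that step summary is of interest, a run with

  -ts_monitor

and without -ksp_view keeps the log to a few lines per step instead of repeating the full solver description at every linear solve; the original command line is not shown in this excerpt, so that is only a guess at the options that produced this output.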
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 8 TS dt 1e-05 time 2.56382e-05
>
>  Time: 2.563815338693763e-05
>  Helium content = 0.0004476286196065412
>  Vacancy content = 0.0001729974611806802
>  Interstitial content = 3.07534661688012e-09
>  Fluence = 1.384460282894632
> 9 TS dt 1e-05 time 3.56382e-05
>
>  Time: 3.563815338693763e-05
>  Helium content = 0.0005560483217551426
>  Vacancy content = 0.0002106012850672489
>  Interstitial content = 2.986545711612676e-09
>  Fluence = 1.924460282894632
> 10 TS dt 1e-05 time 4.56382e-05
>
>  Time: 4.563815338693763e-05
>  Helium content = 0.0006606877895992233
>  Vacancy content = 0.0002402069186172472
>  Interstitial content = 2.917129573419367e-09
>  Fluence = 2.464460282894632
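With -ksp_view enabled, the full solver report shown earlier is re-emitted at every solve, which quickly buries the -ts_monitor step lines and the content/fluence diagnostics. One possible way to keep the step summaries readable, assuming PETSc's standard option-viewer syntax, is to send the view to its own file, for example:

  -ksp_view ascii:ksp_view.txt

The file name here is only an illustration.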
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 11 TS dt 1e-05 time 5.56382e-05
>
> Time: 5.563815338693763e-05
> Helium content = 0.0007637911332658899
> Vacancy content = 0.0002643132393436003
> Interstitial content = 2.861037068682926e-09
> Fluence = 3.004460282894633
> 12 TS dt 1e-05 time 6.56382e-05
>
> Time: 6.563815338693763e-05
> Helium content = 0.0008667085407284311
> Vacancy content = 0.0002844785068171919
> Interstitial content = 2.814578315379641e-09
> Fluence = 3.544460282894633
> 13 TS dt 1e-05 time 7.56382e-05
>
> Time: 7.563815338693763e-05
> Helium content = 0.0009702704188514846
> Vacancy content = 0.000301727522122802
> Interstitial content = 2.775369688878707e-09
> Fluence = 4.084460282894632
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 14 TS dt 1e-05 time 8.56382e-05
>
> Time: 8.563815338693763e-05
> Helium content = 0.001074993517662594
> Vacancy content = 0.0003167689818082796
> Interstitial content = 2.741800128508006e-09
> Fluence = 4.624460282894632
>
> [...]
> 15 TS dt 1e-05 time 9.56382e-05
>
> Time: 9.563815338693763e-05
> Helium content = 0.001181198912218485
> Vacancy content = 0.0003301189984602513
> Interstitial content = 2.712742195940176e-09
> Fluence = 5.164460282894632
>
> [...]
> 16 TS dt 1e-05 time 0.000105638
>
> Time: 0.0001056381533869376
> Helium content = 0.001289081649536771
> Vacancy content = 0.0003421747902000013
> Interstitial content = 2.687384476721404e-09
> Fluence = 5.704460282894632
>
> [...]
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 17 TS dt 1e-05 time 0.000115638
>
> Time: 0.0001156381533869376
> Helium content = 0.001398752812275666
> Vacancy content = 0.0003532599222239556
> Interstitial content = 2.66512810034976e-09
> Fluence = 6.244460282894632
>
> [...]
> 18 TS dt 1e-05 time 0.000125638
>
> Time: 0.0001256381533869376
> Helium content = 0.001510265353240789
> Vacancy content = 0.0003636520972540441
> Interstitial content = 2.645519412600572e-09
> Fluence = 6.784460282894633
>
> [...]
> 19 TS dt 1e-05 time 0.000135638
>
> Time: 0.0001356381533869376
> Helium content = 0.001623630275977358
> Vacancy content = 0.0003735995944935821
> Interstitial content = 2.628204479217124e-09
> Fluence = 7.324460282894633
>
> [...]
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 20 TS dt 1e-05 time 0.000145638
>
> Time: 0.0001456381533869376
> Helium content = 0.001738827119278414
> Vacancy content = 0.0003833301014514786
> Interstitial content = 2.612897862716965e-09
> Fluence = 7.864460282894633
>
> 21 TS dt 1e-05 time 0.000155638
>
> Time: 0.0001556381533869376
> Helium content = 0.001855811114887512
> Vacancy content = 0.0003930545495852893
> Interstitial content = 2.599361401953524e-09
> Fluence = 8.404460282894632
>
> 22 TS dt 1e-05 time 0.000165638
>
> Time: 0.0001656381533869376
> Helium content = 0.00197451801008536
> Vacancy content = 0.0004029679290481984
> Interstitial content = 2.587390239560163e-09
> Fluence = 8.944460282894632
>
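The "22 TS dt 1e-05 time ..." lines interleaved with the solver views are the standard -ts_monitor output, while the Time/Helium/Vacancy/Interstitial/Fluence lines are printed by the application's own per-step monitor. A rough, self-contained sketch of how such a monitor is attached with TSMonitorSet() follows; the toy ODE du/dt = -u, the vector size, and the VecSum() stand-in for the real content integrals are illustrative assumptions, not the code that produced this log.

#include <petscts.h>

/* RHS of the toy ODE du/dt = -u (a placeholder, not the real reaction network). */
static PetscErrorCode RHSFunction(TS ts, PetscReal t, Vec u, Vec f, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(VecCopy(u, f));
  PetscCall(VecScale(f, -1.0));
  PetscFunctionReturn(0);
}

/* User monitor called once per accepted step; VecSum() stands in for the
   application's content/fluence integrals. */
static PetscErrorCode ContentMonitor(TS ts, PetscInt step, PetscReal time, Vec u, void *ctx)
{
  PetscScalar content;
  PetscFunctionBeginUser;
  PetscCall(VecSum(u, &content));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Time: %.16g\nContent = %g\n", (double)time, (double)PetscRealPart(content)));
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  TS  ts;
  Vec u;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(VecCreate(PETSC_COMM_WORLD, &u));
  PetscCall(VecSetSizes(u, PETSC_DECIDE, 4));
  PetscCall(VecSetFromOptions(u));
  PetscCall(VecSet(u, 1.0));

  PetscCall(TSCreate(PETSC_COMM_WORLD, &ts));
  PetscCall(TSSetRHSFunction(ts, NULL, RHSFunction, NULL));
  PetscCall(TSSetSolution(ts, u));
  PetscCall(TSSetTimeStep(ts, 1e-5));
  PetscCall(TSSetMaxSteps(ts, 5));
  PetscCall(TSSetExactFinalTime(ts, TS_EXACTFINALTIME_STEPOVER));
  PetscCall(TSMonitorSet(ts, ContentMonitor, NULL, NULL));
  PetscCall(TSSetFromOptions(ts)); /* -ts_monitor adds the "N TS dt ... time ..." lines */
  PetscCall(TSSolve(ts, u));

  PetscCall(TSDestroy(&ts));
  PetscCall(VecDestroy(&u));
  PetscCall(PetscFinalize());
  return 0;
}

Running this with -ts_monitor reproduces the "N TS dt ... time ..." lines interleaved with the user monitor's own output, which is the pattern seen in the log above.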
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 23 TS dt 1e-05 time 0.000175638
>
> Time: 0.0001756381533869376
> Helium content = 0.002094865846606066
> Vacancy content = 0.0004132485666126602
> Interstitial content = 2.576803945033726e-09
> Fluence = 9.484460282894631
> 24 TS dt 1e-05 time 0.000185638
>
> Time: 0.0001856381533869376
> Helium content = 0.00221675055459619
> Vacancy content = 0.0004240568281110305
> Interstitial content = 2.567440755652846e-09
> Fluence = 10.02446028289463
> 25 TS dt 1e-05 time 0.000195638
>
> Time: 0.0001956381533869376
> Helium content = 0.002339757010735212
> Vacancy content = 0.0004353587957753541
> Interstitial content = 2.560356349853614e-09
> Fluence = 10.56446028289463
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
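The quoted view describes a GMRES solve preconditioned with a two-split multiplicative PCFIELDSPLIT (a Jacobi split and a redundant LU split) on aijkokkos matrices, printed again at every time step. As a rough sketch only, and assuming the index sets for the two splits are supplied by the application code (the view reports "Defined by IS", e.g. via PCFieldSplitSetIS), a runtime option set along the following lines would select this kind of solver; the actual options used for this run are not quoted in the thread:

    -ts_monitor -ksp_view
    -ksp_type gmres
    -pc_type fieldsplit -pc_fieldsplit_type multiplicative
    -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type jacobi
    -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type redundant
    -fieldsplit_1_redundant_pc_type lu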
> 26 TS dt 1e-05 time 0.000205638
>
> Time: 0.0002056381533869376
> Helium content = 0.002465509064965822
> Vacancy content = 0.0004484855911997163
> Interstitial content = 2.559284490854252e-09
> Fluence = 11.10446028289463
> 27 TS dt 1e-05 time 0.000215638
>
> Time: 0.0002156381533869376
> Helium content = 0.002593525609007556
> Vacancy content = 0.0004632835379792119
> Interstitial content = 2.563163403775028e-09
> Fluence = 11.64446028289463
> 28 TS dt 1e-05 time 0.000225638
>
> Time: 0.0002256381533869376
> Helium content = 0.002724034355062431
> Vacancy content = 0.0004799838561594826
> Interstitial content = 2.567424698324675e-09
> Fluence = 12.18446028289463
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 29 TS dt 1e-05 time 0.000235638
>
> Time: 0.0002356381533869376
> Helium content = 0.002857459707245525
> Vacancy content = 0.0004988266047559642
> Interstitial content = 2.572168368229206e-09
> Fluence = 12.72446028289463
> 30 TS dt 1e-05 time 0.000245638
>
> Time: 0.0002456381533869377
> Helium content = 0.002994370187094124
> Vacancy content = 0.0005201303348919841
> Interstitial content = 2.577560955910943e-09
> Fluence = 13.26446028289463
> 31 TS dt 1e-05 time 0.000255638
>
> Time: 0.0002556381533869377
> Helium content = 0.0031355364822675
> Vacancy content = 0.000544334096574835
> Interstitial content = 2.583857821748516e-09
> Fluence = 13.80446028289463
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
 32 TS dt 1e-05 time 0.000265638

Time: 0.0002656381533869377
Helium content = 0.003282012986122711
Vacancy content = 0.0005720544117433223
Interstitial content = 2.591434559232261e-09
Fluence = 14.34446028289463
 33 TS dt 1e-05 time 0.000275638

Time: 0.0002756381533869377
Helium content = 0.003435251906943356
Vacancy content = 0.0006041631318809235
Interstitial content = 2.600830756904739e-09
Fluence = 14.88446028289463
 34 TS dt 1e-05 time 0.000285638

Time: 0.0002856381533869378
Helium content = 0.003597262513369005
Vacancy content = 0.0006418942534852459
Interstitial content = 2.612810634859524e-09
Fluence = 15.42446028289463
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> 35 TS dt 1e-05 time 0.000295638
> Time: 0.0002956381533869378
> Helium content = 0.003770232089351205
> Vacancy content = 0.0006864766983993774
> Interstitial content = 2.628167104194366e-09
> Fluence = 15.96446028289463
> 36 TS dt 1e-05 time 0.000305638
> Time: 0.0003056381533869378
> Helium content = 0.003958238589176528
> Vacancy content = 0.0007404855168383525
> Interstitial content = 2.654075089372328e-09
> Fluence = 16.50446028289463
> 37 TS dt 1e-05 time 0.000315638
> Time: 0.0003156381533869378
> Helium content = 0.004139532180905154
> Vacancy content = 0.0007827827581327699
> Interstitial content = 2.628571414623529e-09
> Fluence = 17.04446028289464
> 38 TS dt 1e-05 time 0.000325638
>
> Time: 0.0003256381533869379
> Helium content = 0.004322983495814651
> Vacancy content = 0.0008236313750226637
> Interstitial content = 2.645245463163054e-09
> Fluence = 17.58446028289464
> 39 TS dt 1e-05 time 0.000335638
>
> Time: 0.0003356381533869379
> Helium content = 0.00451257705221086
> Vacancy content = 0.0008671785596303175
> Interstitial content = 2.663027510894321e-09
> Fluence = 18.12446028289464
> 40 TS dt 1e-05 time 0.000345638
>
> Time: 0.0003456381533869379
> Helium content = 0.00470870580030648
> Vacancy content = 0.0009135015211718177
> Interstitial content = 2.682059197538296e-09
> Fluence = 18.66446028289464
> 41 TS dt 1e-05 time 0.000355638
>
> Time: 0.0003556381533869379
> Helium content = 0.004911798713845195
> Vacancy content = 0.0009626929139013768
> Interstitial content = 2.702531605968983e-09
> Fluence = 19.20446028289465
> 42 TS dt 1e-05 time 0.000365638
>
> Time: 0.000365638153386938
> Helium content = 0.005122321788583731
> Vacancy content = 0.001014864404341224
> Interstitial content = 2.724674628689288e-09
> Fluence = 19.74446028289465
> 43 TS dt 1e-05 time 0.000375638
>
> Time: 0.000375638153386938
> Helium content = 0.005340786761217188
> Vacancy content = 0.001070150451130229
> Interstitial content = 2.74876559889709e-09
> Fluence = 20.28446028289465
>
> KSP Object: 4 MPI processes
>   type: gmres
>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>     happy breakdown tolerance 1e-30
>   maximum iterations=10000, initial guess is zero
>   tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 4 MPI processes
>   type: fieldsplit
>     FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313
>     Solver info for each split is in the following KSP objects:
>   Split number 0 Defined by IS
>   KSP Object: (fieldsplit_0_) 4 MPI processes
>     type: preonly
>     maximum iterations=10000, initial guess is zero
>     tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>     left preconditioning
>     using NONE norm type for convergence test
>   PC Object: (fieldsplit_0_) 4 MPI processes
>     type: jacobi
>       type DIAGONAL
>     linear system matrix = precond matrix:
>     Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes
>       type: mpiaijkokkos
>       rows=11268, cols=11268, bs=313
>       total: nonzeros=399318, allocated nonzeros=0
>       total number of mallocs used during MatSetValues calls=0
>   Split number 1 Defined by IS
>   KSP Object: (fieldsplit_1_) 4 MPI processes
>     type: preonly
>     maximum iterations=10000, initial guess is zero
>     tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> 44 TS dt 1e-05 time 0.000385638
>
> Time: 0.000385638153386938
> Helium content = 0.005567765323462502
> Vacancy content = 0.001128712631895018
> Interstitial content = 2.775141488393701e-09
> Fluence = 20.82446028289466
>
> 45 TS dt 1e-05 time 0.000395638
>
> Time: 0.000395638153386938
> Helium content = 0.00580390937317407
> Vacancy content = 0.001190745259918844
> Interstitial content = 2.804216993218947e-09
> Fluence = 21.36446028289466
>
> 46 TS dt 1e-05 time 0.000405638
>
> Time: 0.0004056381533869381
> Helium content = 0.006049980429867046
> Vacancy content = 0.001256483679967215
> Interstitial content = 2.836511925052035e-09
> Fluence = 21.90446028289466
>
> 47 TS dt 1e-05 time 0.000415638
>
> Time: 0.0004156381533869381
> Helium content = 0.006306894494287616
> Vacancy content = 0.001326217583032197
> Interstitial content = 2.872693213518214e-09
> Fluence = 22.44446028289466
>
> 48 TS dt 1e-05 time 0.000425638
>
> Time: 0.0004256381533869381
> Helium content = 0.006575793505168974
> Vacancy content = 0.00140031323246831
> Interstitial content = 2.913640511641545e-09
> Fluence = 22.98446028289467
>
> 49 TS dt 1e-05 time 0.000435638
>
> Time: 0.0004356381533869382
> Helium content = 0.006858163417956545
> Vacancy content = 0.001479251371861391
> Interstitial content = 2.960551627823275e-09
> Fluence = 23.52446028289467
>
> 50 TS dt 1e-05 time 0.000445638
>
> Time: 0.0004456381533869382
> Helium content = 0.00715603606123959
> Vacancy content = 0.001563693095010153
> Interstitial content = 3.015118615075782e-09
> Fluence = 24.06446028289467
> 51 TS dt 1e-05 time 0.000455638
>
> Time: 0.0004556381533869382
> Helium content = 0.007472347434810338
> Vacancy content = 0.00165459723792622
> Interstitial content = 3.079836752663914e-09
> Fluence = 24.60446028289467
> 52 TS dt 1e-05 time 0.000465638
>
> Time: 0.0004656381533869382
> Helium content = 0.007811603076341338
> Vacancy content = 0.001753437069540143
> Interstitial content = 3.158580279676169e-09
> Fluence = 25.14446028289468
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > 53 TS dt 1e-05 time 0.000475638 > > Time: 0.0004756381533869383 > Helium content = 0.008181184798735803 > Vacancy content = 0.001862619430467793 > Interstitial content = 3.257755641637142e-09 > Fluence = 25.68446028289468 > > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > 54 TS dt 7.18092e-06 time 0.000485638 > > Time: 0.0004856381533869383 > Helium content = 0.008594102836158505 > Vacancy content = 0.001986345473465493 > Interstitial content = 3.388819398835117e-09 > Fluence = 26.22446028289468 > > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > 55 TS dt 7.18092e-06 time 0.000492819 > > Time: 0.0004928190766934691 > Helium content = 0.00893047196645343 > Vacancy content = 0.002088298687205072 > Interstitial content = 3.514456458882748e-09 > Fluence = 26.61223014144734 > > KSP Object: 4 MPI processes > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 4 MPI processes > type: fieldsplit > > FieldSplit with MULTIPLICATIVE composition: total splits = 2, blocksize = 313 > > Solver info for each split is in the following KSP objects: > > Split number 0 Defined by IS > > KSP Object: (fieldsplit_0_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_0_) 4 MPI processes > > type: jacobi > > > type DIAGONAL > > linear system matrix = precond matrix: > > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268, bs=313 > > > > total: nonzeros=399318, allocated nonzeros=0 > > > > total number of mallocs used during MatSetValues calls=0 > > Split number 1 Defined by IS > > KSP Object: (fieldsplit_1_) 4 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > left preconditioning > > using NONE norm type for convergence test > > PC Object: (fieldsplit_1_) 4 MPI processes > > type: redundant > > > First (color=0) of 4 PCs follows > KSP Object: (fieldsplit_1_redundant_) 1 MPI process > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_1_redundant_) 1 MPI process > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 3.17245 > Factored matrix follows: > Mat Object: (fieldsplit_1_redundant_) 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > package used to perform factorization: kokkos > total: nonzeros=1266817, allocated nonzeros=1266817 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: seqaijkokkos > rows=11268, cols=11268 > total: nonzeros=399318, allocated nonzeros=399318 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > > linear system matrix = precond matrix: > > > Mat Object: (fieldsplit_1_) 4 MPI processes > > > type: mpiaijkokkos > > > > rows=11268, cols=11268 > > > > total: nonzeros=399318, allocated nonzeros=399318 > > > > total number of mallocs used during MatSetValues calls=0 > > > > > not using I-node (on process 0) routines > linear system matrix = precond matrix: > > Mat Object: Mat_0x556cd9a44ae0_0 (fieldsplit_0_) 4 MPI processes > > type: mpiaijkokkos > > > rows=11268, cols=11268, bs=313 > > > total: nonzeros=399318, allocated nonzeros=0 > > > total number of mallocs used during MatSetValues calls=0 > 56 TS dt 1e-05 time 0.0005 > > Time: 0.0005 > Helium content = 0.009318381421541287 > Vacancy content = 0.002206840625499091 > Interstitial content = 3.683255457749294e-09 > Fluence = 27.00000000000001 > > > --- > Timers: > Flux: > process_count: 4 > min: 2.57282805500001 > max: 2.673946429000008 > average: 2.625496042750001 > stdev: 0.04738531765081894 > > Partial Derivatives: > process_count: 4 > min: 2.023499907999995 > max: 2.288700306 > average: 2.136961279749999 > stdev: 0.1090961292067129 > > monitor1D:checkNeg: > process_count: 4 > min: 0 > max: 0 > average: 0 > stdev: 0 > > monitor1D:event: > process_count: 4 > min: 0.003643629999999999 > max: 0.003649098 > average: 0.0036460355 > stdev: 1.961620439678616e-06 > > monitor1D:heRet: > process_count: 4 > min: 0.007963680999999998 > max: 0.008958236 > average: 0.008215529499999999 > stdev: 0.0004288129133203095 > > monitor1D:init: > process_count: 4 > min: 1.9974e-05 > max: 7.3905e-05 > average: 3.35945e-05 > stdev: 2.327379309975064e-05 > > monitor1D:postEvent: > process_count: 4 > min: 0 > max: 0 > average: 0 > stdev: 0 > > monitor1D:scatter: > process_count: 4 > min: 0 > max: 0 > average: 0 > stdev: 0 > > monitor1D:series: > process_count: 4 > min: 0 > max: 0 > average: 0 > stdev: 0 > > monitor1D:startStop: > process_count: 4 > min: 0 > max: 0 > average: 0 > stdev: 0 > > monitor1D:tridyn: > process_count: 4 > min: 0 > max: 0 > average: 0 > stdev: 0 > > monitor1D:xeRet: > process_count: 4 > min: 0 > max: 0 > average: 0 > stdev: 0 > > rhsFunctionTimer: > process_count: 4 > min: 2.745660812 > max: 2.883327671 > average: 2.81164295975 > stdev: 0.06020129453595313 > > rhsJacobianTimer: > process_count: 4 > min: 2.348997845 > max: 2.554881655 > average: 2.4384714175 > stdev: 0.08492036181598517 > > solveTimer: > 
process_count: 4 > min: 31.559086476 > max: 31.559115041 > average: 31.55910728175 > stdev: 1.203010159639632e-05 > > > Counters: > Flux: > process_count: 4 > min: 4448 > max: 5004 > average: 4726 > stdev: 278 > > Partial Derivatives: > process_count: 4 > min: 2656 > max: 2988 > average: 2822 > stdev: 166 From jed at jedbrown.org Thu Nov 30 11:48:03 2023 From: jed at jedbrown.org (Jed Brown) Date: Thu, 30 Nov 2023 10:48:03 -0700 Subject: [petsc-users] Reading VTK files in PETSc In-Reply-To: References: <87sf4n6anr.fsf@jedbrown.org> Message-ID: <87a5qv5e0c.fsf@jedbrown.org> I assume you're working with a DA, in which case you can write in HDF5 format and add an Xdmf header so Paraview can read it. The utility lib/petsc/bin/petsc_gen_xdmf.py should be able to handle it. I haven't written support for it (my problems are unstructured so I've focused on DMPlex), but the CGNS format supports structured meshes and that would be an efficient parallel format that doesn't need a header. "Kevin G. Wang" writes: > Hi Jed, > > Thanks for your help! It does not have to be VTK (.vtr in my case). But I > am trying to have a "seamless" workflow between reading, writing, and > visualization, without the need of format conversions. I opted for VTK > simply because it is easy to write, and can be directly visualized (using > Paraview). > > Could you please advise as to what is the best practice in my scenario? My > meshes are Cartesian, but non-uniform. > > Thanks, > Kevin > > On Thu, Nov 30, 2023 at 1:02?AM Jed Brown wrote: > >> Is it necessary that it be VTK format or can it be PETSc's binary format >> or a different mesh format? VTK (be it legacy .vtk or the XML-based .vtu, >> etc.) is a bad format for parallel reading, no matter how much effort might >> go into an implementation. >> >> "Kevin G. Wang" writes: >> >> > Good morning everyone. >> > >> > I use the following functions to output parallel vectors --- "globalVec" >> in >> > this example --- to VTK files. It works well, and is quite convenient. >> > >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> > PetscViewer viewer; >> > PetscViewerVTKOpen(PetscObjectComm((PetscObject)*dm), filename, >> > FILE_MODE_WRITE, &viewer); >> > VecView(globalVec, viewer); >> > PetscViewerDestroy(&viewer); >> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> > >> > Now, I am trying to do the opposite. I would like to read the VTK files >> > generated by PETSc back into memory, and assign each one to a Vec. Could >> > someone let me know how this can be done? >> > >> > Thanks! >> > Kevin >> > >> > >> > -- >> > Kevin G. Wang, Ph.D. >> > Associate Professor >> > Kevin T. Crofton Department of Aerospace and Ocean Engineering >> > Virginia Tech >> > 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 >> > Office: (540) 231-7547 | Mobile: (650) 862-2663 >> > URL: https://www.aoe.vt.edu/people/faculty/wang.html >> > Codes: https://github.com/kevinwgy >> > > > -- > Kevin G. Wang, Ph.D. > Associate Professor > Kevin T. 
Crofton Department of Aerospace and Ocean Engineering > Virginia Tech > 1600 Innovation Dr., VTSS Rm 224H, Blacksburg, VA 24061 > Office: (540) 231-7547 | Mobile: (650) 862-2663 > URL: https://www.aoe.vt.edu/people/faculty/wang.html > Codes: https://github.com/kevinwgy From alexlindsay239 at gmail.com Thu Nov 30 14:25:20 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Thu, 30 Nov 2023 12:25:20 -0800 Subject: [petsc-users] Pre-check before each line search function evaluation Message-ID: Hi I'm looking at the sources, and I believe the answer is no, but is there a dedicated callback that is akin to SNESLineSearchPrecheck but is called before *each* function evaluation in a line search method? I am using a Hybridized Discontinuous Galerkin method in which most of the degrees of freedom are eliminated from the global system. However, an accurate function evaluation requires that an update to the "global" dofs also trigger an update to the eliminated dofs. I can almost certainly do this manually but I believe it would be more prone to error than a dedicated callback. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 30 14:32:13 2023 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 30 Nov 2023 15:32:13 -0500 Subject: [petsc-users] Pre-check before each line search function evaluation In-Reply-To: References: Message-ID: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev> Why is this all not part of the function evaluation? > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay wrote: > > Hi I'm looking at the sources, and I believe the answer is no, but is there a dedicated callback that is akin to SNESLineSearchPrecheck but is called before *each* function evaluation in a line search method? I am using a Hybridized Discontinuous Galerkin method in which most of the degrees of freedom are eliminated from the global system. However, an accurate function evaluation requires that an update to the "global" dofs also trigger an update to the eliminated dofs. > > I can almost certainly do this manually but I believe it would be more prone to error than a dedicated callback. From alexlindsay239 at gmail.com Thu Nov 30 15:22:44 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Thu, 30 Nov 2023 13:22:44 -0800 Subject: [petsc-users] Pre-check before each line search function evaluation In-Reply-To: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev> References: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev> Message-ID: If someone passes me just L, where L represents the "global" degrees of freedom, in this case they represent unknowns on the trace of the mesh, this is insufficient information for me to evaluate my function. Because in truth my degrees of freedom are the sum of the trace unknowns (the unknowns in the global solution vector) and the eliminated unknowns which are entirely local to each element. So I will say my dofs are L + U. I start with some initial guess L0 and U0. I perform a finite element assembly procedure on each element which gives me things like K_LL, K_UL, K_LU, K_UU, F_U, and F_L. I can do some math: K_LL = -K_LU * K_UU^-1 * K_UL + K_LL F_L = -K_LU * K_UU^-1 * F_U + F_L And then I feed K_LL and F_L into the global system matrix and vector respectively. I do something (like a linear solve) which gives me an increment to L, I'll call it dL. 
I loop back through and do a finite element assembly again using **L0 and U0** (or one could in theory save off the previous assemblies) to once again obtain the same K_LL, K_UL, K_LU, K_UU, F_U, F_L. And now I can compute the increment for U, dU, according to dU = K_UU^-1 * (-F_U - K_UL * dL) Armed now with both dL and dU, I am ready to perform a new residual evaluation with (L0 + dL, U0 + dU) = (L1, U1). The key part is that I cannot get U1 (or more generally an arbitrary U) just given L1 (or more generally an arbitrary L). In order to get U1, I must know both L0 and dL (and U0 of course). This is because at its core U is not some auxiliary vector; it represents true degrees of freedom. On Thu, Nov 30, 2023 at 12:32?PM Barry Smith wrote: > > Why is this all not part of the function evaluation? > > > > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay > wrote: > > > > Hi I'm looking at the sources, and I believe the answer is no, but is > there a dedicated callback that is akin to SNESLineSearchPrecheck but is > called before *each* function evaluation in a line search method? I am > using a Hybridized Discontinuous Galerkin method in which most of the > degrees of freedom are eliminated from the global system. However, an > accurate function evaluation requires that an update to the "global" dofs > also trigger an update to the eliminated dofs. > > > > I can almost certainly do this manually but I believe it would be more > prone to error than a dedicated callback. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlindsay239 at gmail.com Thu Nov 30 15:27:50 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Thu, 30 Nov 2023 13:27:50 -0800 Subject: [petsc-users] Pre-check before each line search function evaluation In-Reply-To: References: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev> Message-ID: I can do exactly what I want using SNESLineSearchPrecheck and -snes_linesearch_type basic ... I just can't use any more exotic line searches On Thu, Nov 30, 2023 at 1:22?PM Alexander Lindsay wrote: > If someone passes me just L, where L represents the "global" degrees of > freedom, in this case they represent unknowns on the trace of the mesh, > this is insufficient information for me to evaluate my function. Because in > truth my degrees of freedom are the sum of the trace unknowns (the unknowns > in the global solution vector) and the eliminated unknowns which are > entirely local to each element. So I will say my dofs are L + U. > > I start with some initial guess L0 and U0. I perform a finite element > assembly procedure on each element which gives me things like K_LL, K_UL, > K_LU, K_UU, F_U, and F_L. I can do some math: > > K_LL = -K_LU * K_UU^-1 * K_UL + K_LL > F_L = -K_LU * K_UU^-1 * F_U + F_L > > And then I feed K_LL and F_L into the global system matrix and vector > respectively. I do something (like a linear solve) which gives me an > increment to L, I'll call it dL. I loop back through and do a finite > element assembly again using **L0 and U0** (or one could in theory save off > the previous assemblies) to once again obtain the same K_LL, K_UL, K_LU, > K_UU, F_U, F_L. And now I can compute the increment for U, dU, according to > > dU = K_UU^-1 * (-F_U - K_UL * dL) > > Armed now with both dL and dU, I am ready to perform a new residual > evaluation with (L0 + dL, U0 + dU) = (L1, U1). > > The key part is that I cannot get U1 (or more generally an arbitrary U) > just given L1 (or more generally an arbitrary L). 
In order to get U1, I > must know both L0 and dL (and U0 of course). This is because at its core U > is not some auxiliary vector; it represents true degrees of freedom. > > On Thu, Nov 30, 2023 at 12:32?PM Barry Smith wrote: > >> >> Why is this all not part of the function evaluation? >> >> >> > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay < >> alexlindsay239 at gmail.com> wrote: >> > >> > Hi I'm looking at the sources, and I believe the answer is no, but is >> there a dedicated callback that is akin to SNESLineSearchPrecheck but is >> called before *each* function evaluation in a line search method? I am >> using a Hybridized Discontinuous Galerkin method in which most of the >> degrees of freedom are eliminated from the global system. However, an >> accurate function evaluation requires that an update to the "global" dofs >> also trigger an update to the eliminated dofs. >> > >> > I can almost certainly do this manually but I believe it would be more >> prone to error than a dedicated callback. >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Nov 30 15:47:23 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 30 Nov 2023 16:47:23 -0500 Subject: [petsc-users] Pre-check before each line search function evaluation In-Reply-To: References: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev> Message-ID: On Thu, Nov 30, 2023 at 4:23?PM Alexander Lindsay wrote: > If someone passes me just L, where L represents the "global" degrees of > freedom, in this case they represent unknowns on the trace of the mesh, > this is insufficient information for me to evaluate my function. Because in > truth my degrees of freedom are the sum of the trace unknowns (the unknowns > in the global solution vector) and the eliminated unknowns which are > entirely local to each element. So I will say my dofs are L + U. > I want to try and reduce this to the simplest possible thing so that I can understand. We have some system which has two parts to the solution, L and U. If this problem is linear, we have / A B \ / U \ = / f \ \ C D / \ L / \ g / and we assume that A is easily invertible, so that U + A^{-1} B L = f U = f - A^{-1} B L C U + D L = g C (f - A^{-1} B L) + D L = g (D - C A^{-1} B) L = g - C f where I have reproduced the Schur complement derivation. Here, given any L, I can construct the corresponding U by inverting A. I know your system may be different, but if you are only solving for L, it should have this property I think. Thus, if the line search generates a new L, say L_1, I should be able to get U_1 by just plugging in. If this is not so, can you write out the equations so we can see why this is not true? Thanks, Matt > I start with some initial guess L0 and U0. I perform a finite element > assembly procedure on each element which gives me things like K_LL, K_UL, > K_LU, K_UU, F_U, and F_L. I can do some math: > > K_LL = -K_LU * K_UU^-1 * K_UL + K_LL > F_L = -K_LU * K_UU^-1 * F_U + F_L > > And then I feed K_LL and F_L into the global system matrix and vector > respectively. I do something (like a linear solve) which gives me an > increment to L, I'll call it dL. I loop back through and do a finite > element assembly again using **L0 and U0** (or one could in theory save off > the previous assemblies) to once again obtain the same K_LL, K_UL, K_LU, > K_UU, F_U, F_L. 
And now I can compute the increment for U, dU, according to > > dU = K_UU^-1 * (-F_U - K_UL * dL) > > Armed now with both dL and dU, I am ready to perform a new residual > evaluation with (L0 + dL, U0 + dU) = (L1, U1). > > The key part is that I cannot get U1 (or more generally an arbitrary U) > just given L1 (or more generally an arbitrary L). In order to get U1, I > must know both L0 and dL (and U0 of course). This is because at its core U > is not some auxiliary vector; it represents true degrees of freedom. > > On Thu, Nov 30, 2023 at 12:32?PM Barry Smith wrote: > >> >> Why is this all not part of the function evaluation? >> >> >> > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay < >> alexlindsay239 at gmail.com> wrote: >> > >> > Hi I'm looking at the sources, and I believe the answer is no, but is >> there a dedicated callback that is akin to SNESLineSearchPrecheck but is >> called before *each* function evaluation in a line search method? I am >> using a Hybridized Discontinuous Galerkin method in which most of the >> degrees of freedom are eliminated from the global system. However, an >> accurate function evaluation requires that an update to the "global" dofs >> also trigger an update to the eliminated dofs. >> > >> > I can almost certainly do this manually but I believe it would be more >> prone to error than a dedicated callback. >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexlindsay239 at gmail.com Thu Nov 30 16:07:59 2023 From: alexlindsay239 at gmail.com (Alexander Lindsay) Date: Thu, 30 Nov 2023 14:07:59 -0800 Subject: [petsc-users] Pre-check before each line search function evaluation In-Reply-To: References: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev> Message-ID: Hi Matt, your derivation is spot on. However, the problem is not linear, which is why I am using SNES. So you need to replace U = A^{-1} f - A^{-1} B L with dU = A^{-1} f - A^{-1} B dL On Thu, Nov 30, 2023 at 1:47?PM Matthew Knepley wrote: > On Thu, Nov 30, 2023 at 4:23?PM Alexander Lindsay < > alexlindsay239 at gmail.com> wrote: > >> If someone passes me just L, where L represents the "global" degrees of >> freedom, in this case they represent unknowns on the trace of the mesh, >> this is insufficient information for me to evaluate my function. Because in >> truth my degrees of freedom are the sum of the trace unknowns (the unknowns >> in the global solution vector) and the eliminated unknowns which are >> entirely local to each element. So I will say my dofs are L + U. >> > > I want to try and reduce this to the simplest possible thing so that I can > understand. We have some system which has two parts to the solution, L and > U. If this problem is linear, we have > > / A B \ / U \ = / f \ > \ C D / \ L / \ g / > > and we assume that A is easily invertible, so that > > U + A^{-1} B L = f > U = f - A^{-1} B L > > C U + D L = g > C (f - A^{-1} B L) + D L = g > (D - C A^{-1} B) L = g - C f > > where I have reproduced the Schur complement derivation. Here, given any > L, I can construct the corresponding U by inverting A. I know your system > may be different, but if you are only solving for L, > it should have this property I think. > > Thus, if the line search generates a new L, say L_1, I should be able to > get U_1 by just plugging in. 
If this is not so, can you write out the > equations so we can see why this is not true? > > Thanks, > > Matt > > >> I start with some initial guess L0 and U0. I perform a finite element >> assembly procedure on each element which gives me things like K_LL, K_UL, >> K_LU, K_UU, F_U, and F_L. I can do some math: >> >> K_LL = -K_LU * K_UU^-1 * K_UL + K_LL >> F_L = -K_LU * K_UU^-1 * F_U + F_L >> >> And then I feed K_LL and F_L into the global system matrix and vector >> respectively. I do something (like a linear solve) which gives me an >> increment to L, I'll call it dL. I loop back through and do a finite >> element assembly again using **L0 and U0** (or one could in theory save off >> the previous assemblies) to once again obtain the same K_LL, K_UL, K_LU, >> K_UU, F_U, F_L. And now I can compute the increment for U, dU, according to >> >> dU = K_UU^-1 * (-F_U - K_UL * dL) >> >> Armed now with both dL and dU, I am ready to perform a new residual >> evaluation with (L0 + dL, U0 + dU) = (L1, U1). >> >> The key part is that I cannot get U1 (or more generally an arbitrary U) >> just given L1 (or more generally an arbitrary L). In order to get U1, I >> must know both L0 and dL (and U0 of course). This is because at its core U >> is not some auxiliary vector; it represents true degrees of freedom. >> >> On Thu, Nov 30, 2023 at 12:32?PM Barry Smith wrote: >> >>> >>> Why is this all not part of the function evaluation? >>> >>> >>> > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay < >>> alexlindsay239 at gmail.com> wrote: >>> > >>> > Hi I'm looking at the sources, and I believe the answer is no, but is >>> there a dedicated callback that is akin to SNESLineSearchPrecheck but is >>> called before *each* function evaluation in a line search method? I am >>> using a Hybridized Discontinuous Galerkin method in which most of the >>> degrees of freedom are eliminated from the global system. However, an >>> accurate function evaluation requires that an update to the "global" dofs >>> also trigger an update to the eliminated dofs. >>> > >>> > I can almost certainly do this manually but I believe it would be more >>> prone to error than a dedicated callback. >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Nov 30 16:27:10 2023 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 30 Nov 2023 17:27:10 -0500 Subject: [petsc-users] Pre-check before each line search function evaluation In-Reply-To: References: <80C2752C-1202-4F86-B3A8-FEA0EBC3833B@petsc.dev> Message-ID: On Thu, Nov 30, 2023 at 5:08?PM Alexander Lindsay wrote: > Hi Matt, your derivation is spot on. However, the problem is not linear, > which is why I am using SNES. So you need to replace > > U = A^{-1} f - A^{-1} B L > > with > > dU = A^{-1} f - A^{-1} B dL > I see two cases: 1) There is an easy nonlinear elimination for U. In this case, you do this to get U_1. 2) There is only a linear elimination. In this case, I don't think the nonlinear system should be phrased only on L, but rather on (U, L) itself. The linear elimination can be used as an excellent preconditioner for the Newton system. 
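One way option 2) is commonly realized in PETSc is with a Schur-complement fieldsplit, so that the block elimination of U is applied (exactly or approximately) as the preconditioner while Newton iterates on the full (U, L) vector. The following is only a sketch under that assumption, not something prescribed in this thread; isU and isL are index sets over the element-interior and trace unknowns, which the application is assumed to provide.

#include <petscsnes.h>

/* Sketch only: precondition the Newton system posed on the full (U, L) space with a
   Schur-complement fieldsplit, i.e. the linear elimination of the U block.
   isU and isL are assumed index sets over the interior (U) and trace (L) dofs. */
static PetscErrorCode ConfigureCondensationPC(SNES snes, IS isU, IS isL)
{
  KSP ksp;
  PC  pc;

  PetscFunctionBeginUser;
  PetscCall(SNESGetKSP(snes, &ksp));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCFIELDSPLIT));
  PetscCall(PCFieldSplitSetIS(pc, "u", isU));
  PetscCall(PCFieldSplitSetIS(pc, "l", isL));
  PetscCall(PCFieldSplitSetType(pc, PC_COMPOSITE_SCHUR));
  PetscCall(PCFieldSplitSetSchurFactType(pc, PC_FIELDSPLIT_SCHUR_FACT_FULL));
  PetscFunctionReturn(PETSC_SUCCESS);
}

The same configuration can be reached at run time with -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_fact_type full, provided the two index sets have been attached to the preconditioner.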
Thanks, Matt > On Thu, Nov 30, 2023 at 1:47?PM Matthew Knepley wrote: > >> On Thu, Nov 30, 2023 at 4:23?PM Alexander Lindsay < >> alexlindsay239 at gmail.com> wrote: >> >>> If someone passes me just L, where L represents the "global" degrees of >>> freedom, in this case they represent unknowns on the trace of the mesh, >>> this is insufficient information for me to evaluate my function. Because in >>> truth my degrees of freedom are the sum of the trace unknowns (the unknowns >>> in the global solution vector) and the eliminated unknowns which are >>> entirely local to each element. So I will say my dofs are L + U. >>> >> >> I want to try and reduce this to the simplest possible thing so that I >> can understand. We have some system which has two parts to the solution, L >> and U. If this problem is linear, we have >> >> / A B \ / U \ = / f \ >> \ C D / \ L / \ g / >> >> and we assume that A is easily invertible, so that >> >> U + A^{-1} B L = f >> U = f - A^{-1} B L >> >> C U + D L = g >> C (f - A^{-1} B L) + D L = g >> (D - C A^{-1} B) L = g - C f >> >> where I have reproduced the Schur complement derivation. Here, given any >> L, I can construct the corresponding U by inverting A. I know your system >> may be different, but if you are only solving for L, >> it should have this property I think. >> >> Thus, if the line search generates a new L, say L_1, I should be able to >> get U_1 by just plugging in. If this is not so, can you write out the >> equations so we can see why this is not true? >> >> Thanks, >> >> Matt >> >> >>> I start with some initial guess L0 and U0. I perform a finite element >>> assembly procedure on each element which gives me things like K_LL, K_UL, >>> K_LU, K_UU, F_U, and F_L. I can do some math: >>> >>> K_LL = -K_LU * K_UU^-1 * K_UL + K_LL >>> F_L = -K_LU * K_UU^-1 * F_U + F_L >>> >>> And then I feed K_LL and F_L into the global system matrix and vector >>> respectively. I do something (like a linear solve) which gives me an >>> increment to L, I'll call it dL. I loop back through and do a finite >>> element assembly again using **L0 and U0** (or one could in theory save off >>> the previous assemblies) to once again obtain the same K_LL, K_UL, K_LU, >>> K_UU, F_U, F_L. And now I can compute the increment for U, dU, according to >>> >>> dU = K_UU^-1 * (-F_U - K_UL * dL) >>> >>> Armed now with both dL and dU, I am ready to perform a new residual >>> evaluation with (L0 + dL, U0 + dU) = (L1, U1). >>> >>> The key part is that I cannot get U1 (or more generally an arbitrary U) >>> just given L1 (or more generally an arbitrary L). In order to get U1, I >>> must know both L0 and dL (and U0 of course). This is because at its core U >>> is not some auxiliary vector; it represents true degrees of freedom. >>> >>> On Thu, Nov 30, 2023 at 12:32?PM Barry Smith wrote: >>> >>>> >>>> Why is this all not part of the function evaluation? >>>> >>>> >>>> > On Nov 30, 2023, at 3:25?PM, Alexander Lindsay < >>>> alexlindsay239 at gmail.com> wrote: >>>> > >>>> > Hi I'm looking at the sources, and I believe the answer is no, but is >>>> there a dedicated callback that is akin to SNESLineSearchPrecheck but is >>>> called before *each* function evaluation in a line search method? I am >>>> using a Hybridized Discontinuous Galerkin method in which most of the >>>> degrees of freedom are eliminated from the global system. However, an >>>> accurate function evaluation requires that an update to the "global" dofs >>>> also trigger an update to the eliminated dofs. 
>>>> > >>>> > I can almost certainly do this manually but I believe it would be >>>> more prone to error than a dedicated callback. >>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL:
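To make the precheck route discussed in this thread concrete, here is a minimal sketch of wiring SNESLineSearchSetPreCheck together with the basic line search (the equivalent of -snes_linesearch_type basic). The context structure and the routine RecoverEliminatedIncrement() are hypothetical placeholders for the element-local back-substitution that produces dU from the current trace iterate and the proposed step; they are not part of any code posted above.

#include <petscsnes.h>

/* Hypothetical application context holding the eliminated (element-interior)
   unknowns U alongside whatever else the local back-substitution needs. */
typedef struct {
  Vec U; /* current eliminated unknowns, U0 */
} HDGCtx;

/* Hypothetical placeholder: given the current trace iterate X = L0, the Newton
   direction Y = dL, and U0 stored in the context, compute dU element by element
   and update ctx->U accordingly. */
extern PetscErrorCode RecoverEliminatedIncrement(HDGCtx *ctx, Vec X, Vec Y);

/* Called once per Newton step, before the line search applies the step, with the
   current solution X and the search direction Y. The trace step itself is left
   unmodified. */
static PetscErrorCode HDGPreCheck(SNESLineSearch ls, Vec X, Vec Y, PetscBool *changed, void *ctx)
{
  HDGCtx *hdg = (HDGCtx *)ctx;

  PetscFunctionBeginUser;
  PetscCall(RecoverEliminatedIncrement(hdg, X, Y));
  *changed = PETSC_FALSE;
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* Registration, typically after SNESSetFromOptions(). */
static PetscErrorCode RegisterHDGPreCheck(SNES snes, HDGCtx *ctx)
{
  SNESLineSearch ls;

  PetscFunctionBeginUser;
  PetscCall(SNESGetLineSearch(snes, &ls));
  PetscCall(SNESLineSearchSetType(ls, SNESLINESEARCHBASIC));
  PetscCall(SNESLineSearchSetPreCheck(ls, HDGPreCheck, ctx));
  PetscFunctionReturn(PETSC_SUCCESS);
}

With the basic line search there is exactly one residual evaluation per Newton step, so updating the eliminated unknowns in the precheck keeps them consistent with the trace unknowns. A more exotic line search evaluates the residual at several trial steps, which is why a per-evaluation hook, rather than a per-step precheck, would be needed in that case.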