From patrick.sanan at gmail.com Tue Feb 1 09:06:17 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 1 Feb 2022 16:06:17 +0100 Subject: [petsc-users] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: References: Message-ID:

Sorry about the delay on this. I can reproduce.

This regression appears to be a result of this optimization: https://gitlab.com/petsc/petsc/-/merge_requests/4273

The changes there include having MatPreallocator destroy its internal hash structure within MatPreallocatorPreallocate(), which allows for a lower overall memory footprint but prevents usage of the same MatPreallocator object for two Mats. The error you see is because this hash structure was destroyed during the first preallocation. We didn't catch this because our test suite doesn't test that usage.

cc'ing PETSc dev because I'm not sure how best to resolve this - enforce that a MatPreallocator is only "good once", remove the PetscHSetIJDestroy() calls and accept the bigger memory footprint, or something else more clever?

My C test to reproduce, which can be included with our fix in src/mat/tests, is attached.

Am Mo., 24. Jan. 2022 um 10:33 Uhr schrieb Marius Buerkle :

>
> Hi,
>
> I try to use MatPreallocatorPreallocate to allocate a MATMPIAIJ matrix A.
> I define the MATPREALLOCATOR preM with MatSetValues and then call
> MatPreallocatorPreallocate to get A. This works on the first call to
> MatPreallocatorPreallocate, but if I call MatPreallocatorPreallocate again
> with the same preM to get another matrix B then I get a segfault, although
> the program continues to run (see below). It worked with PETSC 3.15 but
> with 3.16 it stopped working.
> When I check mat_info_nz_allocated and mat_info_nz_used for the allocated
> matrix it looks correct for the first call, but on the second call
> mat_info_nz_used is 0. I also attached a minimal example.
>
>
> [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [0]PETSC ERROR: Null argument, when expecting valid pointer
> [1]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [0]PETSC ERROR: Null Pointer: Parameter # 1
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: [1]PETSC ERROR: Null argument, when expecting valid pointer
> [1]PETSC ERROR: Petsc Development GIT revision: v3.16.3-686-g5e81a90 GIT
> Date: 2022-01-23 05:13:26 +0000
> [0]PETSC ERROR: ./prem_test on a named cd001 by cdfmat_marius Mon Jan 24
> 18:21:17 2022
> [0]PETSC ERROR: Null Pointer: Parameter # 1
> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [1]PETSC ERROR: Configure options > --prefix=/home/cdfmat_marius/prog/petsc/petsc_main_dbg > --with-scalar-type=complex --with-fortran-kernels=1 --with-64-bit-indices=0 > --CC=mpiicc --COPTFLAGS="-g -traceback" --CXX=mpiicpc --CXXOPTFLAGS="-g > -traceback" --FC=mpiifort --FOPTFLAGS="-g -traceback" --with-mpi=1 > --with-x=0 --with-cuda=0 > --download-parmetis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.parmetis.tar.gz > --download-parmetis-commit=HEAD > --download-metis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.metis.tar.gz > --download-metis-commit=HEAD > --download-slepc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.slepc_main.tar.gz > --download-slepc-commit=HEAD > --download-superlu_dist=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.superlu_dist.tar.gz > --download-superlu_dist-commit=HEAD > --download-mumps=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.mumps.tar.gz > --download-mumps-commit=HEAD > --download-hypre=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.hypre.tar.gz > --download-hypre-commit=HEAD > --download-hwloc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/hwloc-2.5.0.tar.gz > --download-sowing=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.sowing.tar.gz > --download-elemental=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.elemental.tar.gz > --download-elemental-commit=HEAD > --download-make=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/make-4.2.1-6.fc28.tar.gz > --download-ptscotch=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.ptscotch.tar.gz > --download-ptscotch-commit=HEAD --with-openmp=0 --with-pthread=0 > --with-cxx-dialect=c++11 --with-debugging=1 --with-cuda=0 --with-cudac=0 > --with-valgrind=0 --with-blaslapack-lib="-mkl=sequential > -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" > --with-scalapack-lib="-mkl=sequential -lmkl_scalapack_lp64 > -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" > --with-mkl_pardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > --with-mkl_cpardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > --with-mkl_sparse-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > --with-mkl_sparse_optimize-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-686-g5e81a90 GIT > Date: 2022-01-23 05:13:26 +0000 > [1]PETSC ERROR: ./prem_test on a named cd001 by cdfmat_marius Mon Jan 24 > 18:21:17 2022 > [1]PETSC ERROR: #1 PetscHSetIJGetSize() at > /home/cdfmat_marius/prog/petsc/git/petsc_main/include/petsc/private/hashsetij.h:13 > [0]PETSC ERROR: #2 MatPreallocatorPreallocate_Preallocator() at > /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:165 > [0]PETSC ERROR: #3 MatPreallocatorPreallocate() at > /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:234 > Configure options --prefix=/home/cdfmat_marius/prog/petsc/petsc_main_dbg > --with-scalar-type=complex --with-fortran-kernels=1 --with-64-bit-indices=0 > --CC=mpiicc --COPTFLAGS="-g -traceback" --CXX=mpiicpc --CXXOPTFLAGS="-g > -traceback" --FC=mpiifort --FOPTFLAGS="-g -traceback" --with-mpi=1 > --with-x=0 --with-cuda=0 > --download-parmetis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.parmetis.tar.gz > --download-parmetis-commit=HEAD 
> --download-metis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.metis.tar.gz > --download-metis-commit=HEAD > --download-slepc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.slepc_main.tar.gz > --download-slepc-commit=HEAD > --download-superlu_dist=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.superlu_dist.tar.gz > --download-superlu_dist-commit=HEAD > --download-mumps=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.mumps.tar.gz > --download-mumps-commit=HEAD > --download-hypre=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.hypre.tar.gz > --download-hypre-commit=HEAD > --download-hwloc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/hwloc-2.5.0.tar.gz > --download-sowing=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.sowing.tar.gz > --download-elemental=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.elemental.tar.gz > --download-elemental-commit=HEAD > --download-make=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/make-4.2.1-6.fc28.tar.gz > --download-ptscotch=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.ptscotch.tar.gz > --download-ptscotch-commit=HEAD --with-openmp=0 --with-pthread=0 > --with-cxx-dialect=c++11 --with-debugging=1 --with-cuda=0 --with-cudac=0 > --with-valgrind=0 --with-blaslapack-lib="-mkl=sequential > -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" > --with-scalapack-lib="-mkl=sequential -lmkl_scalapack_lp64 > -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl" > --with-mkl_pardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > --with-mkl_cpardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > --with-mkl_sparse-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > --with-mkl_sparse_optimize-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl > [1]PETSC ERROR: #1 PetscHSetIJGetSize() at > /home/cdfmat_marius/prog/petsc/git/petsc_main/include/petsc/private/hashsetij.h:13 > [1]PETSC ERROR: #2 MatPreallocatorPreallocate_Preallocator() at > /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:165 > [1]PETSC ERROR: #3 MatPreallocatorPreallocate() at > /home/cdfmat_marius/prog/petsc/git/petsc_main/src/mat/impls/preallocator/matpreallocator.c:234 > > Best and Thanks, > Marius > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex250.c Type: application/octet-stream Size: 2825 bytes Desc: not available URL: From jed at jedbrown.org Tue Feb 1 09:20:16 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Feb 2022 08:20:16 -0700 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: References: Message-ID: <87tudi62bz.fsf@jedbrown.org> Patrick Sanan writes: > Sorry about the delay on this. I can reproduce. > > This regression appears to be a result of this optimization: > https://gitlab.com/petsc/petsc/-/merge_requests/4273 Thanks for tracking this down. Is there a reason to prefer preallocating twice ierr = MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr); ierr = MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr); versus using MatDuplicate() or MatConvert()? 
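For context, a minimal sketch of the usage pattern Jed is asking about: a single MATPREALLOCATOR reused to preallocate two matrices, which is what worked in PETSc 3.15 and now fails in 3.16 because the first MatPreallocatorPreallocate() call destroys the internal hash set. The function name, communicator, and sizes below are invented for illustration, and the MatSetValues() calls that record the nonzero pattern are elided.

```
#include <petscmat.h>

/* Sketch only: preallocate two AIJ matrices from one MATPREALLOCATOR. */
static PetscErrorCode PreallocateTwoMats(MPI_Comm comm, PetscInt m, PetscInt n, Mat *A, Mat *B)
{
  Mat            preM;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* Record the nonzero pattern once in a MATPREALLOCATOR */
  ierr = MatCreate(comm, &preM);CHKERRQ(ierr);
  ierr = MatSetSizes(preM, PETSC_DECIDE, PETSC_DECIDE, m, n);CHKERRQ(ierr);
  ierr = MatSetType(preM, MATPREALLOCATOR);CHKERRQ(ierr);
  ierr = MatSetUp(preM);CHKERRQ(ierr);
  /* ... MatSetValues(preM, ...) calls describing the pattern go here ... */
  ierr = MatAssemblyBegin(preM, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(preM, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* First use of the preallocator: fine in both 3.15 and 3.16 */
  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, m, n);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATAIJ);CHKERRQ(ierr);
  ierr = MatPreallocatorPreallocate(preM, PETSC_TRUE, *A);CHKERRQ(ierr);

  /* Second use of the same preallocator: worked in 3.15, errors in 3.16
     because the internal hash set was already freed by the first call */
  ierr = MatCreate(comm, B);CHKERRQ(ierr);
  ierr = MatSetSizes(*B, PETSC_DECIDE, PETSC_DECIDE, m, n);CHKERRQ(ierr);
  ierr = MatSetType(*B, MATAIJ);CHKERRQ(ierr);
  ierr = MatPreallocatorPreallocate(preM, PETSC_TRUE, *B);CHKERRQ(ierr);

  ierr = MatDestroy(&preM);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```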
From patrick.sanan at gmail.com Tue Feb 1 09:28:50 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 1 Feb 2022 16:28:50 +0100 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: <87tudi62bz.fsf@jedbrown.org> References: <87tudi62bz.fsf@jedbrown.org> Message-ID: Am Di., 1. Feb. 2022 um 16:20 Uhr schrieb Jed Brown : > Patrick Sanan writes: > > > Sorry about the delay on this. I can reproduce. > > > > This regression appears to be a result of this optimization: > > https://gitlab.com/petsc/petsc/-/merge_requests/4273 > > Thanks for tracking this down. Is there a reason to prefer preallocating > twice > > ierr = > MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr); > ierr = > MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr); > > versus using MatDuplicate() or MatConvert()? > Maybe if your preallocation is an overestimate for each of two different post-assembly non-zero structures in A and A_duplicate? -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Feb 1 09:33:36 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Feb 2022 08:33:36 -0700 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: References: <87tudi62bz.fsf@jedbrown.org> Message-ID: <87r18m61pr.fsf@jedbrown.org> Patrick Sanan writes: > Am Di., 1. Feb. 2022 um 16:20 Uhr schrieb Jed Brown : > >> Patrick Sanan writes: >> >> > Sorry about the delay on this. I can reproduce. >> > >> > This regression appears to be a result of this optimization: >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273 >> >> Thanks for tracking this down. Is there a reason to prefer preallocating >> twice >> >> ierr = >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr); >> ierr = >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr); >> >> versus using MatDuplicate() or MatConvert()? >> > > Maybe if your preallocation is an overestimate for each of two different > post-assembly non-zero structures in A and A_duplicate? Even then, why not preallocate A and duplicate immediately, before compressing out zeros? From patrick.sanan at gmail.com Tue Feb 1 09:45:34 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 1 Feb 2022 16:45:34 +0100 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: <87r18m61pr.fsf@jedbrown.org> References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> Message-ID: That works, as in the attached example - Marius, would that work for your case? Am Di., 1. Feb. 2022 um 16:33 Uhr schrieb Jed Brown : > Patrick Sanan writes: > > > Am Di., 1. Feb. 2022 um 16:20 Uhr schrieb Jed Brown : > > > >> Patrick Sanan writes: > >> > >> > Sorry about the delay on this. I can reproduce. > >> > > >> > This regression appears to be a result of this optimization: > >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273 > >> > >> Thanks for tracking this down. Is there a reason to prefer preallocating > >> twice > >> > >> ierr = > >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr); > >> ierr = > >> > MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr); > >> > >> versus using MatDuplicate() or MatConvert()? > >> > > > > Maybe if your preallocation is an overestimate for each of two different > > post-assembly non-zero structures in A and A_duplicate? 
> > Even then, why not preallocate A and duplicate immediately, before > compressing out zeros? > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex251.c Type: application/octet-stream Size: 2442 bytes Desc: not available URL: From stefano.zampini at gmail.com Tue Feb 1 09:59:54 2022 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Tue, 1 Feb 2022 18:59:54 +0300 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: <87r18m61pr.fsf@jedbrown.org> References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> Message-ID: Il giorno mar 1 feb 2022 alle ore 18:34 Jed Brown ha scritto: > Patrick Sanan writes: > > > Am Di., 1. Feb. 2022 um 16:20 Uhr schrieb Jed Brown : > > > >> Patrick Sanan writes: > >> > >> > Sorry about the delay on this. I can reproduce. > >> > > >> > This regression appears to be a result of this optimization: > >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273 > >> > >> Thanks for tracking this down. Is there a reason to prefer preallocating > >> twice > >> > >> ierr = > >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr); > >> ierr = > >> > MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr); > >> > >> versus using MatDuplicate() or MatConvert()? > >> > Jed this is not the point. Suppose you pass around only a preallocator, but do not pass around the matrices. Reusing the preallocator should be allowed. > > > Maybe if your preallocation is an overestimate for each of two different > > post-assembly non-zero structures in A and A_duplicate? > > Even then, why not preallocate A and duplicate immediately, before > compressing out zeros? > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Feb 1 10:07:39 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 01 Feb 2022 09:07:39 -0700 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> Message-ID: <87o83q6050.fsf@jedbrown.org> Stefano Zampini writes: > Il giorno mar 1 feb 2022 alle ore 18:34 Jed Brown ha > scritto: > >> Patrick Sanan writes: >> >> > Am Di., 1. Feb. 2022 um 16:20 Uhr schrieb Jed Brown : >> > >> >> Patrick Sanan writes: >> >> >> >> > Sorry about the delay on this. I can reproduce. >> >> > >> >> > This regression appears to be a result of this optimization: >> >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273 >> >> >> >> Thanks for tracking this down. Is there a reason to prefer preallocating >> >> twice >> >> >> >> ierr = >> >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr); >> >> ierr = >> >> >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr); >> >> >> >> versus using MatDuplicate() or MatConvert()? >> >> >> > > Jed > > this is not the point. Suppose you pass around only a preallocator, but do > not pass around the matrices. Reusing the preallocator should be allowed. The current code is not okay (crashing is not okay), but we should decide whether to consume the preallocator or to retain the data structure. Peak memory use is the main reason hash-based allocation hasn't been default and wasn't adopted sooner. Retaining the hash until the preallocator is destroyed increases that peak. 
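A minimal sketch of the duplicate-immediately alternative discussed above (presumably close to what the attached ex251.c does, though its contents are not reproduced here). It continues the earlier sketch: preM is an assembled MATPREALLOCATOR, A and B are Mat handles with A already created with its sizes and type set, and the same ierr/CHKERRQ error handling is assumed.

```
/* Preallocate A once from the MATPREALLOCATOR */
ierr = MatPreallocatorPreallocate(preM, PETSC_TRUE, A);CHKERRQ(ierr);

/* Rather than reusing preM, duplicate A right away, while it still holds the
   full preallocated pattern of explicit zeros (before later assembly can
   compress unused entries out), to get B with the same nonzero structure */
ierr = MatDuplicate(A, MAT_DO_NOT_COPY_VALUES, &B);CHKERRQ(ierr);

/* The preallocator is no longer needed, so it can be destroyed here to keep
   peak memory down */
ierr = MatDestroy(&preM);CHKERRQ(ierr);
```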
From patrick.sanan at gmail.com Wed Feb 2 03:31:43 2022 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 2 Feb 2022 10:31:43 +0100 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: <87o83q6050.fsf@jedbrown.org> References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> <87o83q6050.fsf@jedbrown.org> Message-ID: There is also the hedge of adding a parameter and API function to control which of these two behaviors is used, and if trying to preallocate twice, throwing an error that instructs the user how to change the behavior, noting that it will increase peak memory usage. Am Di., 1. Feb. 2022 um 17:07 Uhr schrieb Jed Brown : > Stefano Zampini writes: > > > Il giorno mar 1 feb 2022 alle ore 18:34 Jed Brown ha > > scritto: > > > >> Patrick Sanan writes: > >> > >> > Am Di., 1. Feb. 2022 um 16:20 Uhr schrieb Jed Brown >: > >> > > >> >> Patrick Sanan writes: > >> >> > >> >> > Sorry about the delay on this. I can reproduce. > >> >> > > >> >> > This regression appears to be a result of this optimization: > >> >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273 > >> >> > >> >> Thanks for tracking this down. Is there a reason to prefer > preallocating > >> >> twice > >> >> > >> >> ierr = > >> >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr); > >> >> ierr = > >> >> > >> > MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr); > >> >> > >> >> versus using MatDuplicate() or MatConvert()? > >> >> > >> > > > > Jed > > > > this is not the point. Suppose you pass around only a preallocator, but > do > > not pass around the matrices. Reusing the preallocator should be allowed. > > The current code is not okay (crashing is not okay), but we should decide > whether to consume the preallocator or to retain the data structure. Peak > memory use is the main reason hash-based allocation hasn't been default and > wasn't adopted sooner. Retaining the hash until the preallocator is > destroyed increases that peak. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Feb 2 11:49:19 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 2 Feb 2022 11:49:19 -0600 (CST) Subject: [petsc-users] petsc-3.16.4 now available Message-ID: <8448463e-2f3c-a2c2-be4d-dcf0f2ecb465@mcs.anl.gov> Dear PETSc users, The patch release petsc-3.16.4 is now available for download. https://petsc.org/release/download/ Satish From jpilpo77 at gmail.com Wed Feb 2 11:51:54 2022 From: jpilpo77 at gmail.com (Pilhwa Lee) Date: Wed, 2 Feb 2022 12:51:54 -0500 Subject: [petsc-users] [petsc-dev] petsc-3.16.4 now available In-Reply-To: <8448463e-2f3c-a2c2-be4d-dcf0f2ecb465@mcs.anl.gov> References: <8448463e-2f3c-a2c2-be4d-dcf0f2ecb465@mcs.anl.gov> Message-ID: <10727709-BCDE-4010-B310-FF4E773E01F6@gmail.com> Dear Satish, Thanks. Nowadays, I?m embarking on the integration of PETSc and MFEM as well as parallelizing the immersed boundary method. Happy New Year. With best regards, Pilhwa From EPrudencio at slb.com Wed Feb 2 12:55:09 2022 From: EPrudencio at slb.com (Ernesto Prudencio) Date: Wed, 2 Feb 2022 18:55:09 +0000 Subject: [petsc-users] Error when trying to configure PETSc 3.16.3 Message-ID: Hi. I get the attached configure.log when trying to configure PETSc 3.16.3 in a docker environment. Any hints? The configure runs fine in my local environment, but we have to try via docker for other purposes. Thank you in advance, Ernesto. 
Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 52888 bytes Desc: configure.log URL: From balay at mcs.anl.gov Wed Feb 2 12:59:34 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 2 Feb 2022 12:59:34 -0600 (CST) Subject: [petsc-users] Error when trying to configure PETSc 3.16.3 In-Reply-To: References: Message-ID: >>> Executing: /opt/mpitomo-third-parties/openmpi-4.1.2/bin/mpif90 -c -o /tmp/petsc-cbvxg1o9/config.setCompilers/conftest.o -I/tmp/petsc-cbvxg1o9/config.setCompilers /tmp/petsc-cbvxg1o9/config.setCompilers/conftest.F90 Possible ERROR while running compiler: exit code 1 stderr: -------------------------------------------------------------------------- No underlying compiler was specified in the wrapper compiler data file (e.g., mpicc-wrapper-data.txt) -------------------------------------------------------------------------- <<<< Likely gfortran was not installed in the docker environment Satish On Wed, 2 Feb 2022, Ernesto Prudencio via petsc-users wrote: > Hi. > > I get the attached configure.log when trying to configure PETSc 3.16.3 in a docker environment. Any hints? > > The configure runs fine in my local environment, but we have to try via docker for other purposes. > > Thank you in advance, > > Ernesto. > > > Schlumberger-Private > From zjorti at lanl.gov Wed Feb 2 17:07:41 2022 From: zjorti at lanl.gov (Jorti, Zakariae) Date: Wed, 2 Feb 2022 23:07:41 +0000 Subject: [petsc-users] Using Finite Difference Jacobian with TS Message-ID: Hello, I am using a TS to solve a differential algebraic equation (DAE). I do not provide the Jacobian matrix but instead set the TS to use a finite difference jacobian with coloring. For debugging, I only solve this DAE on one time step. Is there a way to output this finite difference jacobian in a matlab format (.m file)? I have already tried this flag : -mat_view ascii:ColFDJac.m:ascii_matlab , but when I use it the test takes too long. My guess is that this flag is trying to save all the matrices involved in the computations. I am only interested in the jacobian though. Thank you. Best, Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 2 17:15:57 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 2 Feb 2022 18:15:57 -0500 Subject: [petsc-users] Using Finite Difference Jacobian with TS In-Reply-To: References: Message-ID: On Wed, Feb 2, 2022 at 6:08 PM Jorti, Zakariae via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I am using a TS to solve a differential algebraic equation (DAE). > I do not provide the Jacobian matrix but instead set the TS to use a > finite difference jacobian with coloring. > For debugging, I only solve this DAE on one time step. > Is there a way to output this finite difference jacobian in a matlab > format (.m file)? > I have already tried this flag : -mat_view ascii:ColFDJac.m:ascii_matlab , > but when I use it the test takes too long. My guess is that this flag is > trying to save all the matrices involved in the computations. I am only > interested in the jacobian though. > I tend to use -ksp_mat_view ascii:ColFDJac.m:ascii_matlab since that only outputs the KSP system matrix. Thanks, Matt > Thank you. 
> > Best, > > Zakariae > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Feb 2 17:17:34 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 2 Feb 2022 18:17:34 -0500 Subject: [petsc-users] Using Finite Difference Jacobian with TS In-Reply-To: References: Message-ID: <97D20DA4-8670-4C84-8FBA-E0A3123DAEB5@petsc.dev> For a big matrix the asii output is slow (especially parallel) Yes, this view will be called automatically on all matrix when they are assembled so if you are assembling a bunch of matrices it will output them all. You can attach a unique prefix to the matrix PetscObjectSetPrefix() and then use -uniqueprefix_mat.... You can use the binary format (faster) with -mat_view binary:ColFDJac.mat and use PetscBinaryRead() in Matlab share/petsc/matlab You can use MatView() directly in your code with a binary PetscViewer on exactly the matrix you wish to save. > On Feb 2, 2022, at 6:07 PM, Jorti, Zakariae via petsc-users wrote: > > Hello, > > I am using a TS to solve a differential algebraic equation (DAE). > I do not provide the Jacobian matrix but instead set the TS to use a finite difference jacobian with coloring. > For debugging, I only solve this DAE on one time step. > Is there a way to output this finite difference jacobian in a matlab format (.m file)? > I have already tried this flag : -mat_view ascii:ColFDJac.m:ascii_matlab , but when I use it the test takes too long. My guess is that this flag is trying to save all the matrices involved in the computations. I am only interested in the jacobian though. > Thank you. > > Best, > > Zakariae -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Feb 2 17:23:31 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 02 Feb 2022 16:23:31 -0700 Subject: [petsc-users] Using Finite Difference Jacobian with TS In-Reply-To: References: Message-ID: <87czk4ont8.fsf@jedbrown.org> Matthew Knepley writes: > On Wed, Feb 2, 2022 at 6:08 PM Jorti, Zakariae via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I am using a TS to solve a differential algebraic equation (DAE). >> I do not provide the Jacobian matrix but instead set the TS to use a >> finite difference jacobian with coloring. >> For debugging, I only solve this DAE on one time step. >> Is there a way to output this finite difference jacobian in a matlab >> format (.m file)? >> I have already tried this flag : -mat_view ascii:ColFDJac.m:ascii_matlab , >> but when I use it the test takes too long. My guess is that this flag is >> trying to save all the matrices involved in the computations. I am only >> interested in the jacobian though. >> > > I tend to use -ksp_mat_view ascii:ColFDJac.m:ascii_matlab since that only > outputs the KSP system matrix. 
And it'll be more efficient to do binary output -ksp_mat_view binary:ColFDJac.petsc and in MATLAB, use PetscBinaryRead('ColFDJac.petsc')

https://petsc.org/release/docs/manual/matlab/?highlight=matlab#dumping-binary-data-for-matlab

From mbuerkle at web.de Wed Feb 2 18:10:42 2022 From: mbuerkle at web.de (Marius Buerkle) Date: Thu, 3 Feb 2022 01:10:42 +0100 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> Message-ID:

Thanks for the reply. Yes, the example works; this is how I was doing it before. But the matrix is rather big and I need a matrix with the same structure at various points in my code. So it was convenient to create the matrix with the preallocator, destroy it after using it to free the memory, and create it again later with the same preallocator. Anyway, it works with MatDuplicate for now.

> Gesendet: Dienstag, den 01.02.2022 um 16:45 Uhr
> Von: "Patrick Sanan"
> An: "Jed Brown"
> Cc: "Marius Buerkle" , "PETSc users list" , petsc-dev
> Betreff: Re: [petsc-dev] [petsc-users] MatPreallocatorPreallocate segfault with PETSC 3.16
>
> That works, as in the attached example - Marius, would that work for your
> case?
>
> Am Di., 1. Feb. 2022 um 16:33 Uhr schrieb Jed Brown :
>
> > Patrick Sanan writes:
> >
> > > Am Di., 1. Feb. 2022 um 16:20 Uhr schrieb Jed Brown :
> > >
> > >> Patrick Sanan writes:
> > >>
> > >> > Sorry about the delay on this. I can reproduce.
> > >> >
> > >> > This regression appears to be a result of this optimization:
> > >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273
> > >>
> > >> Thanks for tracking this down. Is there a reason to prefer preallocating
> > >> twice
> > >>
> > >> ierr =
> > >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr);
> > >> ierr =
> > >>
> > MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr);
> > >>
> > >> versus using MatDuplicate() or MatConvert()?
> > >>
> > >
> > > Maybe if your preallocation is an overestimate for each of two different
> > > post-assembly non-zero structures in A and A_duplicate?
> >
> > Even then, why not preallocate A and duplicate immediately, before
> > compressing out zeros?
> >

From yuanxi at advancesoft.jp Wed Feb 2 18:32:09 2022 From: yuanxi at advancesoft.jp (袁煕) Date: Thu, 3 Feb 2022 09:32:09 +0900 Subject: [petsc-users] Is it possible to enforce some mesh entities owned by the same CPU? Message-ID:

Hello everyone

I need to enforce some specific nodes, for example, two nodes i,j in my finite element mesh, to be owned by the same CPU when doing DMPlex partition. Are there any means to implement it?

Thanks in advance.

Yuan
-------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 2 18:50:19 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 2 Feb 2022 19:50:19 -0500 Subject: [petsc-users] Is it possible to enforce some mesh entities owned by the same CPU? In-Reply-To: References: Message-ID: On Wed, Feb 2, 2022 at 7:32 PM 袁煕 wrote:

> Hello everyone
>
> I need to enforce some specific nodes, for example, two nodes i,j in my
> finite element mesh, to be owned by the same CPU when doing DMPlex
> partition. Are there any means to implement it?
>

It might be possible using edge weights in the partitioner. However, we have no automatic support for that.
To do it manually, you would probably have to make the CSR matrix for the mesh, wrap that in a Mat, add values for weights, call MatPartition, and then feed that partition to Plex. It is doable, but it would be some amount of work. Thanks, Matt > Thanks in advance. > > Yuan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Feb 2 20:09:33 2022 From: jed at jedbrown.org (Jed Brown) Date: Wed, 02 Feb 2022 19:09:33 -0700 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> Message-ID: <87a6f8og4i.fsf@jedbrown.org> Marius Buerkle writes: > Thanks for they reply. Yes the example works, this is how I was doing it before. But the matrix is rather big and i need a matrix with the same structure at various points in my code. So it was convenient to create the matrix with preallocate, destroy it after using it to free the memory and creating it again later with the same preallocate. > Anyway it works with MatDuplicate for now. I think it should take *less* memory to destroy the preallocator and duplicate the actual matrix than to destroy the matrix and persist the preallocator. If that is not the case (or close enough), we can make it so. From rohany at alumni.cmu.edu Wed Feb 2 23:11:33 2022 From: rohany at alumni.cmu.edu (Rohan Yadav) Date: Wed, 2 Feb 2022 21:11:33 -0800 Subject: [petsc-users] PETSc GPU MatMatMult performance question Message-ID: Hi All, I'm trying to use the MatMatMult function with 1 sparse matrix B and two dense matrices A, C. I'm computing A = B * C. My code is below: ``` void spmm(Mat B, int warmup, int niter) { Mat A, C; PetscInt i, j = 32, k; MatGetSize(B, &i, &k); MatCreateDenseCUDA(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, i, j, NULL, &A); MatCreateDenseCUDA(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, k, j, NULL, &C); // Initialize entries in the output. MatZeroEntries(A); setMatToConstant(C, 1.0); // Finally, do the computation. auto avgTime = benchmarkWithWarmup(warmup, niter, [&]() { MatMatMult(B, C, MAT_REUSE_MATRIX, PETSC_DEFAULT, &A); }); PetscPrintf(PETSC_COMM_WORLD, "Average time: %lf ms.\n", avgTime * 1000); } ``` where benchmarkWithWarmup is a simple wrapper function that runs a lambda several times. I'm running this function with arguments `-vec_type cuda -mat_type aijcusparse`, and see that the performance is relatively slow. I'm wondering if I'm using the API incorrectly, or the computation is executing as expected. 
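To make the multiply itself easy to read off in the -log_view output, the timed loop can be wrapped in its own log stage; this is what the "MyComputation" stage further down in the thread does. A rough sketch against the code above, with the benchmark helper replaced by an explicit loop and error checking omitted as in the original:

```
PetscLogStage stage;
PetscLogStageRegister("MyComputation", &stage);

// ... warmup iterations stay outside the stage ...

PetscLogStagePush(stage);
for (int it = 0; it < niter; it++) {
  MatMatMult(B, C, MAT_REUSE_MATRIX, PETSC_DEFAULT, &A);
}
PetscLogStagePop();
```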
`nvprof` shows that much of the time is spent in a device to host memcpys: ``` Type Time(%) Time Calls Avg Min Max Name GPU activities: 87.32% 11.9978s 33 363.57ms 1.5040us 388.26ms [CUDA memcpy DtoH] 8.71% 1.19611s 30 39.870ms 37.421ms 39.976ms void cusparse::csrmm_kernel, int, double, double, double>(bool=0, cusparse::csrmm_kernel, int, double, double, double>, cusparse::csrmm_kernel, int, double, double, double>, bool=0 const *, bool=0 const , bool=0, bool=0, int, cusparseOperation_t, cusparse::csrmm_kernel, int, double, double, double> const *, cusparse::csrmm_kernel, int, double, double, double> const , unsigned int=8 const *, unsigned int=8 const , cusparse::csrmm_kernel, int, double, double, double>, unsigned int=16*, cusparse::csrmm_kernel, int, double, double, double>) 3.87% 531.56ms 14 37.968ms 1.0240us 227.29ms [CUDA memcpy HtoD] 0.07% 9.7452ms 6 1.6242ms 1.0880us 3.2481ms [CUDA memset] 0.02% 2.8727ms 1 2.8727ms 2.8727ms 2.8727ms void thrust::cuda_cub::core::_kernel_agent, double>, unsigned long>, thrust::cuda_cub::__uninitialized_fill::functor, double>, unsigned long>(thrust::device_ptr, double) 0.01% 1.4953ms 2 747.67us 56.188us 1.4392ms void thrust::cuda_cub::core::_kernel_agent, int>, unsigned long>, thrust::cuda_cub::__uninitialized_fill::functor, int>, unsigned long>(thrust::device_ptr, int)``` The logview output is: ``` ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 Using Petsc Release Version 3.16.3, unknown Max Max/Min Avg Total Time (sec): 1.163e+02 1.000 1.163e+02 Objects: 4.800e+01 1.000 4.800e+01 Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 MPI Reductions: 8.100e+01 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 37 30 2867511840 0. Viewer 2 0 0 0. Vector 4 1 1792 0. Index Set 2 2 1495248 0. Star Forest Graph 3 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 3.83e-08 Average time for MPI_Barrier(): 7.874e-07 Average time for zero size MPI_Send(): 3.4035e-06 #PETSc Option Table entries: -bench spmm -enable_gpu -log_view -mat_type aijcusparse -matload_block_size 1 -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc -n 20 -vec_type cuda -warmup 10 ``` Thanks, Rohan Yadav -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Feb 3 00:27:31 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 3 Feb 2022 01:27:31 -0500 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: References: Message-ID: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> > On Feb 3, 2022, at 12:11 AM, Rohan Yadav wrote: > > Hi All, > > I'm trying to use the MatMatMult function with 1 sparse matrix B and two dense matrices A, C. I'm computing A = B * C. > > My code is below: > > ``` > void spmm(Mat B, int warmup, int niter) { > Mat A, C; > PetscInt i, j = 32, k; > MatGetSize(B, &i, &k); > MatCreateDenseCUDA(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, i, j, NULL, &A); > MatCreateDenseCUDA(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, k, j, NULL, &C); > > // Initialize entries in the output. > MatZeroEntries(A); > setMatToConstant(C, 1.0); > > // Finally, do the computation. > auto avgTime = benchmarkWithWarmup(warmup, niter, [&]() { > MatMatMult(B, C, MAT_REUSE_MATRIX, PETSC_DEFAULT, &A); > }); > PetscPrintf(PETSC_COMM_WORLD, "Average time: %lf ms.\n", avgTime * 1000); > } > ``` > where benchmarkWithWarmup is a simple wrapper function that runs a lambda several times. > > I'm running this function with arguments `-vec_type cuda -mat_type aijcusparse`, These arguments are not appropriate; they are only for certain examples, you shouldn't rely on them. > and see that the performance is relatively slow. I'm wondering if I'm using the API incorrectly, or the computation is executing as expected. `nvprof` shows that much of the time is spent in a device to host memcpys: Please send the code that builds the sparse B matrix and the setMatToConstant() routine. 
> ``` > Type Time(%) Time Calls Avg Min Max Name > GPU activities: 87.32% 11.9978s 33 363.57ms 1.5040us 388.26ms [CUDA memcpy DtoH] > 8.71% 1.19611s 30 39.870ms 37.421ms 39.976ms void cusparse::csrmm_kernel, int, double, double, double>(bool=0, cusparse::csrmm_kernel, int, double, double, double>, cusparse::csrmm_kernel, int, double, double, double>, bool=0 const *, bool=0 const , bool=0, bool=0, int, cusparseOperation_t, cusparse::csrmm_kernel, int, double, double, double> const *, cusparse::csrmm_kernel, int, double, double, double> const , unsigned int=8 const *, unsigned int=8 const , cusparse::csrmm_kernel, int, double, double, double>, unsigned int=16*, cusparse::csrmm_kernel, int, double, double, double>) > 3.87% 531.56ms 14 37.968ms 1.0240us 227.29ms [CUDA memcpy HtoD] > 0.07% 9.7452ms 6 1.6242ms 1.0880us 3.2481ms [CUDA memset] > 0.02% 2.8727ms 1 2.8727ms 2.8727ms 2.8727ms void thrust::cuda_cub::core::_kernel_agent, double>, unsigned long>, thrust::cuda_cub::__uninitialized_fill::functor, double>, unsigned long>(thrust::device_ptr, double) > 0.01% 1.4953ms 2 747.67us 56.188us 1.4392ms void thrust::cuda_cub::core::_kernel_agent, int>, unsigned long>, thrust::cuda_cub::__uninitialized_fill::functor, int>, unsigned long>(thrust::device_ptr, int) > ``` > The logview output is: > ```----------------------------------------------------------------------------------------------------------------------- Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 From the third line we see the sparse matrix is copied to the GPU once, this is good. From line 4 a dense matrix is copied to the GPU once, this is good. But from line 5 we see a dense matrix is copied from the GPU to the CPU 31 times! Looking at line 2 we see 30 copies from GPU to the CPU. The flop rate on the GPU is 920,026 which is fine, but the flop rate for the entire multiply time is a terrible 28,598, this is because this time includes all the copies between the GPU and CPU and CPU and GPU. So let's see if we can figure out why all these copies are taking place from the GPU to the CPU. But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? I think the parallel multiply is done with MatMatMultNumeric_MPIAIJ_MPIDense(). This code has two problems 1) It uses MatMPIDenseScatter() to move to the other ranks their needed rows of the C matrix. That function has the call MatDenseGetArrayRead() normally would trigger a copy of C up to the CPU each time. 
But since C is not changing in your test run I guess it only triggers one copy. 2) If uses MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); to do the off diagonal part of the product but this triggers for each multiply a copy of the result matrix from the CPU to the GPU (hugely expensive) For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() that is smarter about the needed MPI communication so it only moves exactly what it needs to the other ranks and it does the off-diagonal part of the product on the GPU so it does not need to copy the result up to the CPU. Barry > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 > Using Petsc Release Version 3.16.3, unknown > > Max Max/Min Avg Total > Time (sec): 1.163e+02 1.000 1.163e+02 > Objects: 4.800e+01 1.000 4.800e+01 > Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 > Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 > MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 > MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 > MPI Reductions: 8.100e+01 1.000 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flop > and VecAXPY() for complex vectors of length N --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count %Total Avg %Total Count %Total > 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) > GPU %F: percent flops on GPU in this event > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 > BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 > MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 > MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 > MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 > MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 > VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 37 30 2867511840 0. > Viewer 2 0 0 0. > Vector 4 1 1792 0. > Index Set 2 2 1495248 0. > Star Forest Graph 3 0 0 0. 
> ======================================================================================================================== > Average time to get PetscTime(): 3.83e-08 > Average time for MPI_Barrier(): 7.874e-07 > Average time for zero size MPI_Send(): 3.4035e-06 > #PETSc Option Table entries: > -bench spmm > -enable_gpu > -log_view > -mat_type aijcusparse > -matload_block_size 1 > -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc > -n 20 > -vec_type cuda > -warmup 10 > ``` > > Thanks, > > Rohan Yadav > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Thu Feb 3 01:59:40 2022 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 3 Feb 2022 10:59:40 +0300 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> Message-ID: 1) It uses MatMPIDenseScatter() to move to the other ranks their needed > rows of the C matrix. That function has the call MatDenseGetArrayRead() > normally would trigger a copy of C up to the CPU each time. But since C is > not changing in your test run I guess it only triggers one copy. > > 2) If uses > MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); > to do the off diagonal part of the product but this triggers for each > multiply a copy of the result matrix from the CPU to the GPU (hugely > expensive) > > For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() > that is smarter about the needed MPI communication so it only moves exactly > what it needs to the other ranks and it does the off-diagonal part of the > product on the GPU so it does not need to copy the result up to the CPU. > > MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND Rohan I would suggest to add PetscLogStage around your performance loop (do a warmup outside of it) and send the relevant portion of the log > Barry > > > > > > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 > Using Petsc Release Version 3.16.3, unknown > > Max Max/Min Avg Total > Time (sec): 1.163e+02 1.000 1.163e+02 > Objects: 4.800e+01 1.000 4.800e+01 > Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 > Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 > MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 > MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 > MPI Reductions: 8.100e+01 1.000 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flop > and VecAXPY() for complex vectors of length N --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count %Total Avg %Total Count %Total > 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. 
> Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). > %T - percent time in this phase %F - percent flop in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) > GPU %F: percent flops on GPU in this event > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 > BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 > MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 > MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 > MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 > MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 > VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > 
--------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 37 30 2867511840 0. > Viewer 2 0 0 0. > Vector 4 1 1792 0. > Index Set 2 2 1495248 0. > Star Forest Graph 3 0 0 0. > ======================================================================================================================== > Average time to get PetscTime(): 3.83e-08 > Average time for MPI_Barrier(): 7.874e-07 > Average time for zero size MPI_Send(): 3.4035e-06 > #PETSc Option Table entries: > -bench spmm > -enable_gpu > -log_view > -mat_type aijcusparse > -matload_block_size 1 > -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc > -n 20 > -vec_type cuda > -warmup 10 > ``` > > > Thanks, > > > Rohan Yadav > > > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From rohany at alumni.cmu.edu Thu Feb 3 11:29:06 2022 From: rohany at alumni.cmu.edu (Rohan Yadav) Date: Thu, 3 Feb 2022 09:29:06 -0800 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> Message-ID: > Please send the code that builds the sparse B matrix and the setMatToConstant() routine. Setting to a constant: ``` void setMatToConstant(Mat mat, PetscScalar c) { PetscInt rStart, rEnd, m, n; MatGetSize(mat, &m, &n); MatGetOwnershipRange(mat, &rStart, &rEnd); for (int i = rStart; i < rEnd; i++) { for (int j = 0; j < n; j++) { MatSetValue(mat, i, j, c, INSERT_VALUES); } } MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); } ``` Loading sparse matrix from disk: ``` int loadMatrixFromFile(Mat* A, char* filename) { auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); MatSetFromOptions(*A); PetscViewer viewer; PetscViewerCreate(PETSC_COMM_WORLD, &viewer); PetscViewerSetType(viewer, PETSCVIEWERBINARY); PetscViewerFileSetMode(viewer, FILE_MODE_READ); PetscViewerFileSetName(viewer, filename); MatLoad(*A, viewer); return 0; } ``` These are only called once and should not affect the computation in a loop though. > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I move to 2 ranks I see this behavior. Here are updated logs with a new stage for 2 ranks. I've staged the logs into "MyComputation". 
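For reference, the staging itself is just PetscLogStageRegister() plus a Push/Pop around the measured loop. Roughly (an illustrative sketch only; the full benchmark driver also handles option parsing, matrix loading, etc., and the loop counts here just mirror the -warmup 10 / -n 20 options):

```
#include <petsc.h>

/* Sketch: time only the measured products under a named log stage.
   Warmup products stay in the default "Main Stage". */
PetscErrorCode benchSpMM(Mat A, Mat B, PetscInt nwarmup, PetscInt niter) {
  Mat            C;
  PetscLogStage  stage;
  PetscInt       i;
  PetscErrorCode ierr;

  ierr = PetscLogStageRegister("MyComputation", &stage); CHKERRQ(ierr);
  for (i = 0; i < nwarmup; i++) {
    ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C); CHKERRQ(ierr);
    ierr = MatDestroy(&C); CHKERRQ(ierr);
  }
  /* Everything between Push and Pop is reported under "MyComputation". */
  ierr = PetscLogStagePush(stage); CHKERRQ(ierr);
  for (i = 0; i < niter; i++) {
    ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C); CHKERRQ(ierr);
    ierr = MatDestroy(&C); CHKERRQ(ierr);
  }
  ierr = PetscLogStagePop(); CHKERRQ(ierr);
  return 0;
}
```

With that in place, -log_view breaks the counters out per stage, so the log below reports the warmup multiplies under "Main Stage" and the 20 timed multiplies under "MyComputation".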
``` ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 processors, by yadav2 Thu Feb 3 09:27:30 2022 Using Petsc Release Version 3.16.3, unknown Max Max/Min Avg Total Time (sec): 2.091e+02 1.001 2.090e+02 Objects: 4.800e+01 1.000 4.800e+01 Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 MPI Reductions: 8.100e+01 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 42.9% 1.466e+09 34.8% 4.300e+01 53.1% 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 57.1% 2.058e+09 65.2% 2.000e+01 24.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) CpuToGpu Count: total number of CPU to GPU copies per processor CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) GpuToCpu Count: total number of GPU to CPU copies per processor GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) GPU %F: percent flops on GPU in this event ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F --------------------------------------------------------------------------------------------------------------------------------------------------------------- --- Event Stage 0: Main Stage BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 0.00e+00 0 BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 0.00e+00 0 MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 0.00e+00 0 MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 0.00e+00 0 MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 2.08e+04 5 MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 0.00e+00 0 MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 0.00e+00 0 MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 2.29e+04 0 VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 0.00e+00 0 --- Event Stage 1: MyComputation MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 MatMatMultSym 40 1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 3 0 0 0 25 7 0 0 0100 0 0 0 0.00e+00 0 0.00e+00 0 MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 4.0e+01 2.1e+09 0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 4.16e+04 5 MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 
0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 4.16e+04 0 --------------------------------------------------------------------------------------------------------------------------------------------------------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 17 10 20381695840 0. Viewer 2 0 0 0. Vector 4 1 1792 0. Index Set 2 2 31848152 0. Star Forest Graph 3 0 0 0. --- Event Stage 1: MyComputation Matrix 20 20 40763391680 0. ======================================================================================================================== Average time to get PetscTime(): 3.96e-08 Average time for MPI_Barrier(): 8.184e-07 Average time for zero size MPI_Send(): 2.8165e-06 #PETSc Option Table entries: -bench spmm -enable_gpu -log_view -mat_type aijcusparse -matload_block_size 1 -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc -n 20 -vec_type cuda -warmup 10 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --download-c2html=0 --download-hwloc=0 --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc --with-clanguage=C --with-cxx-dialect=C++17 --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ --with-cuda=1 --with-debugging=0 --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran --with-fftw=0 --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= PETSC_ARCH=arch-linux-c-opt ----------------------------------------- Libraries compiled on 2022-01-21 06:41:50 on lassen111 Machine characteristics: Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install Using PETSc arch: ----------------------------------------- Using C compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc -g -DNoChange -fPIC "-O3" Using Fortran compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran -g -fPIC ----------------------------------------- Using include paths: -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include -I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include -I/usr/include -I/usr/tce/packages/cuda/cuda-11.1.0/include ----------------------------------------- Using C linker: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc Using Fortran linker: 
/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran Using libraries: -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl ----------------------------------------- ``` On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini wrote: > > > 1) It uses MatMPIDenseScatter() to move to the other ranks their needed >> rows of the C matrix. That function has the call MatDenseGetArrayRead() >> normally would trigger a copy of C up to the CPU each time. But since C is >> not changing in your test run I guess it only triggers one copy. >> >> 2) If uses >> MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); >> to do the off diagonal part of the product but this triggers for each >> multiply a copy of the result matrix from the CPU to the GPU (hugely >> expensive) >> >> For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() >> that is smarter about the needed MPI communication so it only moves exactly >> what it needs to the other ranks and it does the off-diagonal part of the >> product on the GPU so it does not need to copy the result up to the CPU. 
>> >> > MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND > > Rohan > I would suggest to add PetscLogStage around your performance loop (do a > warmup outside of it) and send the relevant portion of the log > > >> Barry >> >> >> >> >> >> >> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >> >> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >> Using Petsc Release Version 3.16.3, unknown >> >> Max Max/Min Avg Total >> Time (sec): 1.163e+02 1.000 1.163e+02 >> Objects: 4.800e+01 1.000 4.800e+01 >> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >> MPI Reductions: 8.100e+01 1.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flop >> and VecAXPY() for complex vectors of length N --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >> CpuToGpu Count: total number of CPU to GPU copies per processor >> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >> GpuToCpu Count: total number of GPU to CPU copies per processor >> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >> GPU %F: percent flops on GPU in this event >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 37 30 2867511840 0. >> Viewer 2 0 0 0. >> Vector 4 1 1792 0. >> Index Set 2 2 1495248 0. >> Star Forest Graph 3 0 0 0. 
>> ======================================================================================================================== >> Average time to get PetscTime(): 3.83e-08 >> Average time for MPI_Barrier(): 7.874e-07 >> Average time for zero size MPI_Send(): 3.4035e-06 >> #PETSc Option Table entries: >> -bench spmm >> -enable_gpu >> -log_view >> -mat_type aijcusparse >> -matload_block_size 1 >> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >> -n 20 >> -vec_type cuda >> -warmup 10 >> ``` >> >> >> Thanks, >> >> >> Rohan Yadav >> >> >> >> > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Feb 3 11:42:40 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 3 Feb 2022 12:42:40 -0500 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> Message-ID: Oly 5% on the GPU: MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 4.0e+01 2.1e+09 0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 4.16e+04 *5* A bug in MPI-MatMatMultNum? Maybe a device method is not implemented? On Thu, Feb 3, 2022 at 12:29 PM Rohan Yadav wrote: > > Please send the code that builds the sparse B matrix and the setMatToConstant() > routine. > > Setting to a constant: > ``` > void setMatToConstant(Mat mat, PetscScalar c) { > > PetscInt rStart, rEnd, m, n; > MatGetSize(mat, &m, &n); > MatGetOwnershipRange(mat, &rStart, &rEnd); > for (int i = rStart; i < rEnd; i++) { > for (int j = 0; j < n; j++) { > MatSetValue(mat, i, j, c, INSERT_VALUES); > } > } > MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); > } > ``` > > Loading sparse matrix from disk: > > ``` > > int loadMatrixFromFile(Mat* A, char* filename) { > auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); > MatSetFromOptions(*A); > PetscViewer viewer; > PetscViewerCreate(PETSC_COMM_WORLD, &viewer); > PetscViewerSetType(viewer, PETSCVIEWERBINARY); > PetscViewerFileSetMode(viewer, FILE_MODE_READ); > PetscViewerFileSetName(viewer, filename); > MatLoad(*A, viewer); > return 0; > } > > ``` > > These are only called once and should not affect the computation in a loop though. > > > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? > > > Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I > move to 2 ranks I see this behavior. > > Here are updated logs with a new stage for 2 ranks. I've staged the logs > into "MyComputation". 
> > ``` > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 > processors, by yadav2 Thu Feb 3 09:27:30 2022 > Using Petsc Release Version 3.16.3, unknown > > Max Max/Min Avg Total > Time (sec): 2.091e+02 1.001 2.090e+02 > Objects: 4.800e+01 1.000 4.800e+01 > Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 > Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 > MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 > MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 > MPI Reductions: 8.100e+01 1.000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flop > and VecAXPY() for complex vectors of length N > --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 > 42.9% 1.466e+09 34.8% 4.300e+01 53.1% > 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 > 57.1% 2.058e+09 65.2% 2.000e+01 24.7% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total GPU - CpuToGpu - - > GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size > Count Size %F > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 > 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 > 0.00e+00 0 > BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 > 0.00e+00 0 0.00e+00 0 > MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 > 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 > 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 > 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 > 2.08e+04 5 > MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 > 0.00e+00 0 > MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 > 0.00e+00 0 > MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 > 2.29e+04 0 > VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 > 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 > 0.00e+00 0 > > --- Event Stage 1: MyComputation > > MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultSym 40 1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+01 3 0 0 0 25 7 0 0 0100 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 
4.0e+01 2.1e+09 > 0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 > 4.16e+04 5 > MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 3 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 > 4.16e+04 0 > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 17 10 20381695840 0. > Viewer 2 0 0 0. > Vector 4 1 1792 0. > Index Set 2 2 31848152 0. > Star Forest Graph 3 0 0 0. > > --- Event Stage 1: MyComputation > > Matrix 20 20 40763391680 0. > > ======================================================================================================================== > Average time to get PetscTime(): 3.96e-08 > Average time for MPI_Barrier(): 8.184e-07 > Average time for zero size MPI_Send(): 2.8165e-06 > #PETSc Option Table entries: > -bench spmm > -enable_gpu > -log_view > -mat_type aijcusparse > -matload_block_size 1 > -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc > -n 20 > -vec_type cuda > -warmup 10 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --download-c2html=0 --download-hwloc=0 > --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 > --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so > /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" > --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc > --with-clanguage=C --with-cxx-dialect=C++17 > --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ > --with-cuda=1 --with-debugging=0 > --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran > --with-fftw=0 > --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 > --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 > --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 > --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 > --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so > --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" > CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= > PETSC_ARCH=arch-linux-c-opt > ----------------------------------------- > Libraries compiled on 2022-01-21 06:41:50 on lassen111 > Machine characteristics: > Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo > Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install > Using PETSc arch: > ----------------------------------------- > > Using C compiler: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc > -g -DNoChange -fPIC "-O3" > Using Fortran compiler: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran > -g -fPIC > ----------------------------------------- > > Using include paths: > -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include > 
-I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include > -I/usr/include -I/usr/tce/packages/cuda/cuda-11.1.0/include > ----------------------------------------- > > Using C linker: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc > Using Fortran linker: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran > Using libraries: > -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib > -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc > -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib > -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib > -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib > -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib > -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 > -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 > -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib > -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl > -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse > -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi > -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath > -lpthread -lquadmath -lstdc++ -ldl > ----------------------------------------- > ``` > > On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini > wrote: > >> >> >> 1) It uses MatMPIDenseScatter() to move to the other ranks their needed >>> rows of the C matrix. That function has the call MatDenseGetArrayRead() >>> normally would trigger a copy of C up to the CPU each time. But since C is >>> not changing in your test run I guess it only triggers one copy. >>> >>> 2) If uses >>> MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); >>> to do the off diagonal part of the product but this triggers for each >>> multiply a copy of the result matrix from the CPU to the GPU (hugely >>> expensive) >>> >>> For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() >>> that is smarter about the needed MPI communication so it only moves exactly >>> what it needs to the other ranks and it does the off-diagonal part of the >>> product on the GPU so it does not need to copy the result up to the CPU. 
>>> >>> >> MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND >> >> Rohan >> I would suggest to add PetscLogStage around your performance loop (do a >> warmup outside of it) and send the relevant portion of the log >> >> >>> Barry >>> >>> >>> >>> >>> >>> >>> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >>> >>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >>> Using Petsc Release Version 3.16.3, unknown >>> >>> Max Max/Min Avg Total >>> Time (sec): 1.163e+02 1.000 1.163e+02 >>> Objects: 4.800e+01 1.000 4.800e+01 >>> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >>> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >>> MPI Reductions: 8.100e+01 1.000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of length N --> 2N flop >>> and VecAXPY() for complex vectors of length N --> 8N flop >>> >>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flop: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all processors >>> Mess: number of messages sent >>> AvgLen: average message length (bytes) >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>>> %T - percent time in this phase %F - percent flop in this phase >>> %M - percent messages in this phase %L - percent message lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >>> CpuToGpu Count: total number of CPU to GPU copies per processor >>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >>> GpuToCpu Count: total number of GPU to CPU copies per processor >>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >>> GPU %F: percent flops on GPU in this event >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> --- Event Stage 0: Main Stage >>> >>> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >>> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >>> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >>> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >>> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >>> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 37 30 2867511840 0. >>> Viewer 2 0 0 0. >>> Vector 4 1 1792 0. >>> Index Set 2 2 1495248 0. 
>>> Star Forest Graph 3 0 0 0. >>> ======================================================================================================================== >>> Average time to get PetscTime(): 3.83e-08 >>> Average time for MPI_Barrier(): 7.874e-07 >>> Average time for zero size MPI_Send(): 3.4035e-06 >>> #PETSc Option Table entries: >>> -bench spmm >>> -enable_gpu >>> -log_view >>> -mat_type aijcusparse >>> -matload_block_size 1 >>> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >>> -n 20 >>> -vec_type cuda >>> -warmup 10 >>> ``` >>> >>> >>> Thanks, >>> >>> >>> Rohan Yadav >>> >>> >>> >>> >> >> -- >> Stefano >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From te307 at cam.ac.uk Thu Feb 3 11:22:16 2022 From: te307 at cam.ac.uk (Evstafyeva,Tamara) Date: Thu, 3 Feb 2022 17:22:16 +0000 Subject: [petsc-users] cannot open source file "petsc.h" Message-ID: To whom it may concern, I am using a code that utilizes some of the PETSC capabilities. After configuring and installing PETSC on a cluster, I have set my environment variables $PETSC_DIR and $PETSC_ARCH. The code using PETSC compiles using GNUMakefile and so using instructions on the website and in Makefile.user I added the following lines to the make file: include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules include ${PETSC_DIR}/lib/petsc/conf/test petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc When compiling the code I get that the head files of PETSC are not recognized: catastrophic error: cannot open source file "petsc.h" This seems like a trivial problem, however I cannot seem to figure out what exactly went wrong. I?d expect this cannot be installation problem, and most likely the linking? I would really appreciate some direction on this problem. Thank you, Tamara -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Feb 3 12:51:31 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 03 Feb 2022 11:51:31 -0700 Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: References: Message-ID: <871r0jokb0.fsf@jedbrown.org> Hmm, usually we don't use BOTH the makefile includes and pkgconfig (as in Makefile.user). You can use either. If you share the whole file and the command line that executes, I think it'll be easy enough to fix. "Evstafyeva,Tamara" writes: > To whom it may concern, > > I am using a code that utilizes some of the PETSC capabilities. After configuring and installing PETSC on a cluster, I have set my environment variables $PETSC_DIR and $PETSC_ARCH. The code using PETSC compiles using GNUMakefile and so using instructions on the website and in Makefile.user I added the following lines to the make file: > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > include ${PETSC_DIR}/lib/petsc/conf/test > > > petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc > > When compiling the code I get that the head files of PETSC are not recognized: > > catastrophic error: cannot open source file "petsc.h" > > This seems like a trivial problem, however I cannot seem to figure out what exactly went wrong. I?d expect this cannot be installation problem, and most likely the linking? I would really appreciate some direction on this problem. 
> > Thank you, > > Tamara From balay at mcs.anl.gov Thu Feb 3 12:52:47 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 3 Feb 2022 12:52:47 -0600 (CST) Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: References: Message-ID: Can you send your current makefile? Also send compile log [including errors] when building with this makefile. Also send us compile log from: cd src/snes/tutorials make ex19 Satish On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > To whom it may concern, > > I am using a code that utilizes some of the PETSC capabilities. After configuring and installing PETSC on a cluster, I have set my environment variables $PETSC_DIR and $PETSC_ARCH. The code using PETSC compiles using GNUMakefile and so using instructions on the website and in Makefile.user I added the following lines to the make file: > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > include ${PETSC_DIR}/lib/petsc/conf/test > > > petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc > > When compiling the code I get that the head files of PETSC are not recognized: > > catastrophic error: cannot open source file "petsc.h" > > This seems like a trivial problem, however I cannot seem to figure out what exactly went wrong. I?d expect this cannot be installation problem, and most likely the linking? I would really appreciate some direction on this problem. > > Thank you, > > Tamara > > > From te307 at cam.ac.uk Thu Feb 3 13:25:30 2022 From: te307 at cam.ac.uk (Evstafyeva,Tamara) Date: Thu, 3 Feb 2022 19:25:30 +0000 Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: References: Message-ID: Dear Satish, Thank you for getting back! I am attaching mu current GNUmakefile and the logs for it and the ex19. Let me know if anything else is needed and thanks in advance, Tamara From: Satish Balay Date: Thursday, 3 February 2022, 18:53 To: Evstafyeva,Tamara Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] cannot open source file "petsc.h" Can you send your current makefile? Also send compile log [including errors] when building with this makefile. Also send us compile log from: cd src/snes/tutorials make ex19 Satish On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > To whom it may concern, > > I am using a code that utilizes some of the PETSC capabilities. After configuring and installing PETSC on a cluster, I have set my environment variables $PETSC_DIR and $PETSC_ARCH. The code using PETSC compiles using GNUMakefile and so using instructions on the website and in Makefile.user I added the following lines to the make file: > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > include ${PETSC_DIR}/lib/petsc/conf/test > > > petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc > > When compiling the code I get that the head files of PETSC are not recognized: > > catastrophic error: cannot open source file "petsc.h" > > This seems like a trivial problem, however I cannot seem to figure out what exactly went wrong. I?d expect this cannot be installation problem, and most likely the linking? I would really appreciate some direction on this problem. > > Thank you, > > Tamara > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: GNUmakefile Type: application/octet-stream Size: 2064 bytes Desc: GNUmakefile URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_ex19 Type: application/octet-stream Size: 6597 bytes Desc: log_ex19 URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: log_main Type: application/octet-stream Size: 7574 bytes Desc: log_main URL: From balay at mcs.anl.gov Thu Feb 3 13:39:37 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 3 Feb 2022 13:39:37 -0600 (CST) Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: References: Message-ID: <909ae29-e85-75e1-5fd-7c5c21a2ec63@mcs.anl.gov> You an try the following: - remove PETSC_INCLUDE=/cosma/home/dp092/dc-evst1/petsc/include PETSC_LIB=/cosma/home/dp092/dc-evst1/petsc/lib - remove include ${PETSC_DIR}/lib/petsc/conf/rules include ${PETSC_DIR}/lib/petsc/conf/test - remove petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc i.e only keep: include ${PETSC_DIR}/lib/petsc/conf/variables You have: LINK += $(PETSC_LIB) Similarly you need: cxxcppflags += $(PETSC_CXXCPPFLAGS) Satish On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > Dear Satish, > > Thank you for getting back! > > I am attaching mu current GNUmakefile and the logs for it and the ex19. > > Let me know if anything else is needed and thanks in advance, > > Tamara > > From: Satish Balay > Date: Thursday, 3 February 2022, 18:53 > To: Evstafyeva,Tamara > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] cannot open source file "petsc.h" > Can you send your current makefile? > > Also send compile log [including errors] when building with this makefile. > > Also send us compile log from: > > cd src/snes/tutorials > make ex19 > > Satish > > On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > > > To whom it may concern, > > > > I am using a code that utilizes some of the PETSC capabilities. After configuring and installing PETSC on a cluster, I have set my environment variables $PETSC_DIR and $PETSC_ARCH. The code using PETSC compiles using GNUMakefile and so using instructions on the website and in Makefile.user I added the following lines to the make file: > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > include ${PETSC_DIR}/lib/petsc/conf/test > > > > > > petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc > > > > When compiling the code I get that the head files of PETSC are not recognized: > > > > catastrophic error: cannot open source file "petsc.h" > > > > This seems like a trivial problem, however I cannot seem to figure out what exactly went wrong. I?d expect this cannot be installation problem, and most likely the linking? I would really appreciate some direction on this problem. > > > > Thank you, > > > > Tamara > > > > > > > From bsmith at petsc.dev Thu Feb 3 13:50:15 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 3 Feb 2022 14:50:15 -0500 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> Message-ID: <138725A0-A19A-4CE0-B279-C509FB459379@petsc.dev> Mark, Good eye. Something is definitely very different between this run and the previous (options, code change?). In the previously sent runs it was about 98% on GPU. Barry > On Feb 3, 2022, at 12:29 PM, Rohan Yadav wrote: > > > Please send the code that builds the sparse B matrix and the setMatToConstant() routine. 
> > Setting to a constant: > ``` > void setMatToConstant(Mat mat, PetscScalar c) { > PetscInt rStart, rEnd, m, n; > MatGetSize(mat, &m, &n); > MatGetOwnershipRange(mat, &rStart, &rEnd); > for (int i = rStart; i < rEnd; i++) { > for (int j = 0; j < n; j++) { > MatSetValue(mat, i, j, c, INSERT_VALUES); > } > } > MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); > } > ``` > > Loading sparse matrix from disk: > ``` > int loadMatrixFromFile(Mat* A, char* filename) { > auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); > MatSetFromOptions(*A); > PetscViewer viewer; > PetscViewerCreate(PETSC_COMM_WORLD, &viewer); > PetscViewerSetType(viewer, PETSCVIEWERBINARY); > PetscViewerFileSetMode(viewer, FILE_MODE_READ); > PetscViewerFileSetName(viewer, filename); > MatLoad(*A, viewer); > return 0; > } > ``` > These are only called once and should not affect the computation in a loop though. > > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? > > Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I move to 2 ranks I see this behavior. > > Here are updated logs with a new stage for 2 ranks. I've staged the logs into "MyComputation". > > ``` > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 processors, by yadav2 Thu Feb 3 09:27:30 2022 > Using Petsc Release Version 3.16.3, unknown > > Max Max/Min Avg Total > Time (sec): 2.091e+02 1.001 2.090e+02 > Objects: 4.800e+01 1.000 4.800e+01 > Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 > Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 > MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 > MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 > MPI Reductions: 8.100e+01 1.000 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flop > and VecAXPY() for complex vectors of length N --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count %Total Avg %Total Count %Total > 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 42.9% 1.466e+09 34.8% 4.300e+01 53.1% > 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 57.1% 2.058e+09 65.2% 2.000e+01 24.7% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) > GPU %F: percent flops on GPU in this event > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 0.00e+00 0 > BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 0.00e+00 0 > MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 2.08e+04 5 > MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 0.00e+00 0 > MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 0.00e+00 0 > MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 2.29e+04 0 > VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 0.00e+00 0 > > --- Event Stage 1: MyComputation > > MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultSym 40 1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 3 0 0 0 25 7 0 0 0100 0 0 0 0.00e+00 0 0.00e+00 0 > MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 4.0e+01 2.1e+09 0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 4.16e+04 5 > 
MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 4.16e+04 0 > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 17 10 20381695840 0. > Viewer 2 0 0 0. > Vector 4 1 1792 0. > Index Set 2 2 31848152 0. > Star Forest Graph 3 0 0 0. > > --- Event Stage 1: MyComputation > > Matrix 20 20 40763391680 0. > ======================================================================================================================== > Average time to get PetscTime(): 3.96e-08 > Average time for MPI_Barrier(): 8.184e-07 > Average time for zero size MPI_Send(): 2.8165e-06 > #PETSc Option Table entries: > -bench spmm > -enable_gpu > -log_view > -mat_type aijcusparse > -matload_block_size 1 > -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc > -n 20 > -vec_type cuda > -warmup 10 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --download-c2html=0 --download-hwloc=0 --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc --with-clanguage=C --with-cxx-dialect=C++17 --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ --with-cuda=1 --with-debugging=0 --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran --with-fftw=0 --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= PETSC_ARCH=arch-linux-c-opt > ----------------------------------------- > Libraries compiled on 2022-01-21 06:41:50 on lassen111 > Machine characteristics: Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo > Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install > Using PETSc arch: > ----------------------------------------- > > Using C compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc -g -DNoChange -fPIC "-O3" > Using Fortran compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran -g -fPIC > ----------------------------------------- > > Using include paths: -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include -I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include -I/usr/include -I/usr/tce/packages/cuda/cuda-11.1.0/include > ----------------------------------------- > > 
Using C linker: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc > Using Fortran linker: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran > Using libraries: -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl > ----------------------------------------- > ``` > > On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini > wrote: > > > 1) It uses MatMPIDenseScatter() to move to the other ranks their needed rows of the C matrix. That function has the call MatDenseGetArrayRead() normally would trigger a copy of C up to the CPU each time. But since C is not changing in your test run I guess it only triggers one copy. > > 2) If uses MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); to do the off diagonal part of the product but this triggers for each multiply a copy of the result matrix from the CPU to the GPU (hugely expensive) > > For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() that is smarter about the needed MPI communication so it only moves exactly what it needs to the other ranks and it does the off-diagonal part of the product on the GPU so it does not need to copy the result up to the CPU. 
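For reference, a minimal sketch of the kind of staged, warmed-up benchmark loop being discussed in this thread -- this is not code from the thread; it assumes a sparse A (e.g. MPIAIJCUSPARSE) and a dense B that are already assembled, and the names (BenchmarkSpMM, "MyComputation") are illustrative only. Reusing the product matrix C with MAT_REUSE_MATRIX keeps the symbolic setup out of the timed loop:

```
/* Sketch only -- not code from this thread.  A staged, warmed-up SpMM
   benchmark loop, assuming A and a dense B are already assembled. */
#include <petscmat.h>

PetscErrorCode BenchmarkSpMM(Mat A, Mat B, PetscInt warmup, PetscInt niter)
{
  Mat            C = NULL;
  PetscLogStage  stage;
  PetscInt       i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* first product creates C (symbolic + numeric); counts as warmup iteration 0 */
  ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  for (i = 1; i < warmup; i++) {
    ierr = MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  }
  /* events inside the stage show up as a separate section of -log_view */
  ierr = PetscLogStageRegister("MyComputation", &stage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  for (i = 0; i < niter; i++) {
    ierr = MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  }
  ierr = PetscLogStagePop();CHKERRQ(ierr);
  ierr = MatDestroy(&C);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```

Running with -log_view then reports the repeated multiplies under the "MyComputation" stage, separated from setup and warmup, which is the layout seen in the logs below.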
> > > MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND > > Rohan > I would suggest to add PetscLogStage around your performance loop (do a warmup outside of it) and send the relevant portion of the log > > Barry > > > > > > >> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >> >> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >> Using Petsc Release Version 3.16.3, unknown >> >> Max Max/Min Avg Total >> Time (sec): 1.163e+02 1.000 1.163e+02 >> Objects: 4.800e+01 1.000 4.800e+01 >> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >> MPI Reductions: 8.100e+01 1.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flop >> and VecAXPY() for complex vectors of length N --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >> CpuToGpu Count: total number of CPU to GPU copies per processor >> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >> GpuToCpu Count: total number of GPU to CPU copies per processor >> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >> GPU %F: percent flops on GPU in this event >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 37 30 2867511840 0. >> Viewer 2 0 0 0. >> Vector 4 1 1792 0. >> Index Set 2 2 1495248 0. >> Star Forest Graph 3 0 0 0. 
>> ======================================================================================================================== >> Average time to get PetscTime(): 3.83e-08 >> Average time for MPI_Barrier(): 7.874e-07 >> Average time for zero size MPI_Send(): 3.4035e-06 >> #PETSc Option Table entries: >> -bench spmm >> -enable_gpu >> -log_view >> -mat_type aijcusparse >> -matload_block_size 1 >> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >> -n 20 >> -vec_type cuda >> -warmup 10 >> ``` >> >> Thanks, >> >> Rohan Yadav >> > > > > -- > Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From te307 at cam.ac.uk Thu Feb 3 13:52:39 2022 From: te307 at cam.ac.uk (Evstafyeva,Tamara) Date: Thu, 3 Feb 2022 19:52:39 +0000 Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: <909ae29-e85-75e1-5fd-7c5c21a2ec63@mcs.anl.gov> References: <909ae29-e85-75e1-5fd-7c5c21a2ec63@mcs.anl.gov> Message-ID: Hi Satish, Thanks for the reply, I am afraid it produces the same error still. From: Satish Balay Date: Thursday, 3 February 2022, 19:39 To: Evstafyeva,Tamara Cc: petsc-users Subject: Re: [petsc-users] cannot open source file "petsc.h" You an try the following: - remove PETSC_INCLUDE=/cosma/home/dp092/dc-evst1/petsc/include PETSC_LIB=/cosma/home/dp092/dc-evst1/petsc/lib - remove include ${PETSC_DIR}/lib/petsc/conf/rules include ${PETSC_DIR}/lib/petsc/conf/test - remove petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc i.e only keep: include ${PETSC_DIR}/lib/petsc/conf/variables You have: LINK += $(PETSC_LIB) Similarly you need: cxxcppflags += $(PETSC_CXXCPPFLAGS) Satish On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > Dear Satish, > > Thank you for getting back! > > I am attaching mu current GNUmakefile and the logs for it and the ex19. > > Let me know if anything else is needed and thanks in advance, > > Tamara > > From: Satish Balay > Date: Thursday, 3 February 2022, 18:53 > To: Evstafyeva,Tamara > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] cannot open source file "petsc.h" > Can you send your current makefile? > > Also send compile log [including errors] when building with this makefile. > > Also send us compile log from: > > cd src/snes/tutorials > make ex19 > > Satish > > On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > > > To whom it may concern, > > > > I am using a code that utilizes some of the PETSC capabilities. After configuring and installing PETSC on a cluster, I have set my environment variables $PETSC_DIR and $PETSC_ARCH. The code using PETSC compiles using GNUMakefile and so using instructions on the website and in Makefile.user I added the following lines to the make file: > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > include ${PETSC_DIR}/lib/petsc/conf/test > > > > > > petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc > > > > When compiling the code I get that the head files of PETSC are not recognized: > > > > catastrophic error: cannot open source file "petsc.h" > > > > This seems like a trivial problem, however I cannot seem to figure out what exactly went wrong. I?d expect this cannot be installation problem, and most likely the linking? I would really appreciate some direction on this problem. > > > > Thank you, > > > > Tamara > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
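For the GNUmakefile question above, a minimal sketch of the PETSc-related portion of such a makefile, assuming only conf/variables is included as suggested. The left-hand names (cxxcppflags, LINK) are the Chombo-style variables mentioned in the thread and must match whatever the application's own build system actually uses for preprocessor and link flags; PETSC_DIR and PETSC_ARCH are assumed to be set in the environment:

```
# Sketch only: splice PETSc's compile/link variables into an existing GNUmakefile.
# Requires PETSC_DIR (and, for in-place builds, PETSC_ARCH) to be set.
include ${PETSC_DIR}/lib/petsc/conf/variables

# PETSc preprocessor/include flags, so headers such as petsc.h are found
cxxcppflags += $(PETSC_CXXCPPFLAGS)

# PETSc libraries on the link line
LINK += $(PETSC_LIB)
```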
URL: From rohany at alumni.cmu.edu Thu Feb 3 13:59:15 2022 From: rohany at alumni.cmu.edu (Rohan Yadav) Date: Thu, 3 Feb 2022 11:59:15 -0800 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: <138725A0-A19A-4CE0-B279-C509FB459379@petsc.dev> References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> <138725A0-A19A-4CE0-B279-C509FB459379@petsc.dev> Message-ID: I'm sorry, I did a little switch here. The original log view I sent for 2 runs was on a different input matrix. Based on Barry's request I switched to a different matrix as the original one did not fit on 1 GPU. > In the previously sent runs it was about 98% on GPU. Re 98% on the GPU though, my first email had a similar ratio in the log though: ``` MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 ``` The follow up log might be slightly different as well because I pushed a new log stage as requested by Stefano. Rohan On Thu, Feb 3, 2022 at 11:50 AM Barry Smith wrote: > > Mark, > > Good eye. Something is definitely very different between this run and > the previous (options, code change?). In the previously sent runs it was > about 98% on GPU. > > Barry > > > On Feb 3, 2022, at 12:29 PM, Rohan Yadav wrote: > > > Please send the code that builds the sparse B matrix and the setMatToConstant() > routine. > > Setting to a constant: > ``` > void setMatToConstant(Mat mat, PetscScalar c) { > > PetscInt rStart, rEnd, m, n; > MatGetSize(mat, &m, &n); > MatGetOwnershipRange(mat, &rStart, &rEnd); > for (int i = rStart; i < rEnd; i++) { > for (int j = 0; j < n; j++) { > MatSetValue(mat, i, j, c, INSERT_VALUES); > } > } > MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); > } > ``` > > Loading sparse matrix from disk: > > ``` > > int loadMatrixFromFile(Mat* A, char* filename) { > auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); > MatSetFromOptions(*A); > PetscViewer viewer; > PetscViewerCreate(PETSC_COMM_WORLD, &viewer); > PetscViewerSetType(viewer, PETSCVIEWERBINARY); > PetscViewerFileSetMode(viewer, FILE_MODE_READ); > PetscViewerFileSetName(viewer, filename); > MatLoad(*A, viewer); > return 0; > } > > ``` > > These are only called once and should not affect the computation in a loop though. > > > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? > > > Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I > move to 2 ranks I see this behavior. > > Here are updated logs with a new stage for 2 ranks. I've staged the logs > into "MyComputation". 
> > ``` > ---------------------------------------------- PETSc Performance Summary: > ---------------------------------------------- > > /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 > processors, by yadav2 Thu Feb 3 09:27:30 2022 > Using Petsc Release Version 3.16.3, unknown > > Max Max/Min Avg Total > Time (sec): 2.091e+02 1.001 2.090e+02 > Objects: 4.800e+01 1.000 4.800e+01 > Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 > Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 > MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 > MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 > MPI Reductions: 8.100e+01 1.000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N > --> 2N flop > and VecAXPY() for complex vectors of length N > --> 8N flop > > Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages > --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total Count > %Total Avg %Total Count %Total > 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 > 42.9% 1.466e+09 34.8% 4.300e+01 53.1% > 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 > 57.1% 2.058e+09 65.2% 2.000e+01 24.7% > > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flop: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > AvgLen: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flop in this > phase > %M - percent messages in this phase %L - percent message lengths > in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over > all processors) > GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU > time over all processors) > CpuToGpu Count: total number of CPU to GPU copies per processor > CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per > processor) > GpuToCpu Count: total number of GPU to CPU copies per processor > GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per > processor) > GPU %F: percent flops on GPU in this event > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total GPU - CpuToGpu - - > GpuToCpu - GPU > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size > Count Size %F > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > --- Event Stage 0: Main Stage > > BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 > 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 > 0.00e+00 0 > BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 > 0.00e+00 0 0.00e+00 0 > MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 > 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 > 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 > 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 > 2.08e+04 5 > MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 > 0.00e+00 0 > MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 > 0.00e+00 0 > MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 > 2.29e+04 0 > VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 > 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 > 0.00e+00 0 > > --- Event Stage 1: MyComputation > > MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultSym 40 1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+01 3 0 0 0 25 7 0 0 0100 0 0 0 0.00e+00 0 > 0.00e+00 0 > MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 
4.0e+01 2.1e+09 > 0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 > 4.16e+04 5 > MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 3 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 > 4.16e+04 0 > > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 17 10 20381695840 0. > Viewer 2 0 0 0. > Vector 4 1 1792 0. > Index Set 2 2 31848152 0. > Star Forest Graph 3 0 0 0. > > --- Event Stage 1: MyComputation > > Matrix 20 20 40763391680 0. > > ======================================================================================================================== > Average time to get PetscTime(): 3.96e-08 > Average time for MPI_Barrier(): 8.184e-07 > Average time for zero size MPI_Send(): 2.8165e-06 > #PETSc Option Table entries: > -bench spmm > -enable_gpu > -log_view > -mat_type aijcusparse > -matload_block_size 1 > -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc > -n 20 > -vec_type cuda > -warmup 10 > #End of PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 > sizeof(PetscScalar) 8 sizeof(PetscInt) 4 > Configure options: --download-c2html=0 --download-hwloc=0 > --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 > --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so > /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" > --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc > --with-clanguage=C --with-cxx-dialect=C++17 > --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ > --with-cuda=1 --with-debugging=0 > --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran > --with-fftw=0 > --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 > --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 > --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 > --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 > --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so > --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" > CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= > PETSC_ARCH=arch-linux-c-opt > ----------------------------------------- > Libraries compiled on 2022-01-21 06:41:50 on lassen111 > Machine characteristics: > Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo > Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install > Using PETSc arch: > ----------------------------------------- > > Using C compiler: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc > -g -DNoChange -fPIC "-O3" > Using Fortran compiler: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran > -g -fPIC > ----------------------------------------- > > Using include paths: > -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include > 
-I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include > -I/usr/include -I/usr/tce/packages/cuda/cuda-11.1.0/include > ----------------------------------------- > > Using C linker: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc > Using Fortran linker: > /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran > Using libraries: > -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib > -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc > -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib > -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib > -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib > -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib > -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 > -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 > -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib > -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 > -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib > -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl > -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse > -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi > -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath > -lpthread -lquadmath -lstdc++ -ldl > ----------------------------------------- > ``` > > On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini > wrote: > >> >> >> 1) It uses MatMPIDenseScatter() to move to the other ranks their needed >>> rows of the C matrix. That function has the call MatDenseGetArrayRead() >>> normally would trigger a copy of C up to the CPU each time. But since C is >>> not changing in your test run I guess it only triggers one copy. >>> >>> 2) If uses >>> MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); >>> to do the off diagonal part of the product but this triggers for each >>> multiply a copy of the result matrix from the CPU to the GPU (hugely >>> expensive) >>> >>> For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() >>> that is smarter about the needed MPI communication so it only moves exactly >>> what it needs to the other ranks and it does the off-diagonal part of the >>> product on the GPU so it does not need to copy the result up to the CPU. 
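As an aside on the explanation quoted just above: the per-rank diagonal/off-diagonal split it refers to can be inspected directly. A small sketch, illustrative only, assuming the matrix has MPIAIJ as its base type (which MPIAIJCUSPARSE does); the function name is made up here:

```
/* Sketch only: report each rank's split of an MPIAIJ matrix into its
   "diagonal" block Ad (columns owned by this rank) and "off-diagonal"
   block Ao (columns owned by other ranks). */
#include <petscmat.h>

PetscErrorCode ReportDiagOffdiagSplit(Mat A)
{
  Mat             Ad, Ao;          /* borrowed references, do not destroy */
  const PetscInt *garray;          /* global column indices of Ao's columns */
  MatInfo         infod, infoo;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = MatMPIAIJGetSeqAIJ(A, &Ad, &Ao, &garray);CHKERRQ(ierr);
  ierr = MatGetInfo(Ad, MAT_LOCAL, &infod);CHKERRQ(ierr);
  ierr = MatGetInfo(Ao, MAT_LOCAL, &infoo);CHKERRQ(ierr);
  /* each rank prints its own counts */
  ierr = PetscPrintf(PETSC_COMM_SELF, "diag nnz %.0f  offdiag nnz %.0f\n",
                     infod.nz_used, infoo.nz_used);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```

The nonzeros in Ao are the ones whose columns live on other ranks, i.e. the part of the product that needs the gather of dense rows described in the quoted explanation; if Ao dominates Ad, most of the multiply goes through that slower, copy-heavy path.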
>>> >>> >> MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND >> >> Rohan >> I would suggest to add PetscLogStage around your performance loop (do a >> warmup outside of it) and send the relevant portion of the log >> >> >>> Barry >>> >>> >>> >>> >>> >>> >>> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >>> >>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >>> Using Petsc Release Version 3.16.3, unknown >>> >>> Max Max/Min Avg Total >>> Time (sec): 1.163e+02 1.000 1.163e+02 >>> Objects: 4.800e+01 1.000 4.800e+01 >>> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >>> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >>> MPI Reductions: 8.100e+01 1.000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of length N --> 2N flop >>> and VecAXPY() for complex vectors of length N --> 8N flop >>> >>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flop: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all processors >>> Mess: number of messages sent >>> AvgLen: average message length (bytes) >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>>> %T - percent time in this phase %F - percent flop in this phase >>> %M - percent messages in this phase %L - percent message lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >>> CpuToGpu Count: total number of CPU to GPU copies per processor >>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >>> GpuToCpu Count: total number of GPU to CPU copies per processor >>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >>> GPU %F: percent flops on GPU in this event >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> --- Event Stage 0: Main Stage >>> >>> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >>> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >>> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >>> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >>> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >>> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 37 30 2867511840 0. >>> Viewer 2 0 0 0. >>> Vector 4 1 1792 0. >>> Index Set 2 2 1495248 0. 
>>> Star Forest Graph 3 0 0 0. >>> ======================================================================================================================== >>> Average time to get PetscTime(): 3.83e-08 >>> Average time for MPI_Barrier(): 7.874e-07 >>> Average time for zero size MPI_Send(): 3.4035e-06 >>> #PETSc Option Table entries: >>> -bench spmm >>> -enable_gpu >>> -log_view >>> -mat_type aijcusparse >>> -matload_block_size 1 >>> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >>> -n 20 >>> -vec_type cuda >>> -warmup 10 >>> ``` >>> >>> >>> Thanks, >>> >>> >>> Rohan Yadav >>> >>> >>> >>> >> >> -- >> Stefano >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Feb 3 14:01:17 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 3 Feb 2022 14:01:17 -0600 (CST) Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: References: <909ae29-e85-75e1-5fd-7c5c21a2ec63@mcs.anl.gov> Message-ID: Perhaps cxxcppflags is not the correct variable. you need to look at the compile targets in "$(CHOMBO_HOME)/mk/Make.test" And perhaps you need to move these 2 lines after this include line the makefile Satish On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > Hi Satish, > > Thanks for the reply, I am afraid it produces the same error still. > > From: Satish Balay > Date: Thursday, 3 February 2022, 19:39 > To: Evstafyeva,Tamara > Cc: petsc-users > Subject: Re: [petsc-users] cannot open source file "petsc.h" > You an try the following: > > - remove > PETSC_INCLUDE=/cosma/home/dp092/dc-evst1/petsc/include > PETSC_LIB=/cosma/home/dp092/dc-evst1/petsc/lib > > - remove > include ${PETSC_DIR}/lib/petsc/conf/rules > include ${PETSC_DIR}/lib/petsc/conf/test > > - remove > petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc > > i.e only keep: > include ${PETSC_DIR}/lib/petsc/conf/variables > > You have: > > LINK += $(PETSC_LIB) > > Similarly you need: > cxxcppflags += $(PETSC_CXXCPPFLAGS) > > Satish > > On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > > > Dear Satish, > > > > Thank you for getting back! > > > > I am attaching mu current GNUmakefile and the logs for it and the ex19. > > > > Let me know if anything else is needed and thanks in advance, > > > > Tamara > > > > From: Satish Balay > > Date: Thursday, 3 February 2022, 18:53 > > To: Evstafyeva,Tamara > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] cannot open source file "petsc.h" > > Can you send your current makefile? > > > > Also send compile log [including errors] when building with this makefile. > > > > Also send us compile log from: > > > > cd src/snes/tutorials > > make ex19 > > > > Satish > > > > On Thu, 3 Feb 2022, Evstafyeva,Tamara wrote: > > > > > To whom it may concern, > > > > > > I am using a code that utilizes some of the PETSC capabilities. After configuring and installing PETSC on a cluster, I have set my environment variables $PETSC_DIR and $PETSC_ARCH. 
The code using PETSC compiles using GNUMakefile and so using instructions on the website and in Makefile.user I added the following lines to the make file: > > > > > > > > > include ${PETSC_DIR}/lib/petsc/conf/variables > > > > > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > > > include ${PETSC_DIR}/lib/petsc/conf/test > > > > > > > > > petsc.pc := $(PETSC_DIR)/$(PETSC_ARCH)/lib/pkgconfig/petsc.pc > > > > > > When compiling the code I get that the head files of PETSC are not recognized: > > > > > > catastrophic error: cannot open source file "petsc.h" > > > > > > This seems like a trivial problem, however I cannot seem to figure out what exactly went wrong. I?d expect this cannot be installation problem, and most likely the linking? I would really appreciate some direction on this problem. > > > > > > Thank you, > > > > > > Tamara > > > > > > > > > > > > From bsmith at petsc.dev Thu Feb 3 15:25:20 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 3 Feb 2022 16:25:20 -0500 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> <138725A0-A19A-4CE0-B279-C509FB459379@petsc.dev> Message-ID: <76C1F609-FB34-42A3-9654-90FC464CED6D@petsc.dev> I suspect the new matrix has a very different parallel nonzero structure that results in MOST of the calculations taking place on the CPU (since the "off-diagonal" part of the matrix dominates the non-zero pattern). PETSc is not designed for this type of nonzero structure and will give a bad performance (CPU or GPU); it is not a "PDE-ish" type of nonzero structure. > On Feb 3, 2022, at 2:59 PM, Rohan Yadav wrote: > > I'm sorry, I did a little switch here. The original log view I sent for 2 runs was on a different input matrix. Based on Barry's request I switched to a different matrix as the original one did not fit on 1 GPU. > > > In the previously sent runs it was about 98% on GPU. > > Re 98% on the GPU though, my first email had a similar ratio in the log though: > ``` > MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 > ``` > The follow up log might be slightly different as well because I pushed a new log stage as requested by Stefano. > > Rohan > > On Thu, Feb 3, 2022 at 11:50 AM Barry Smith > wrote: > > Mark, > > Good eye. Something is definitely very different between this run and the previous (options, code change?). In the previously sent runs it was about 98% on GPU. > > Barry > > >> On Feb 3, 2022, at 12:29 PM, Rohan Yadav > wrote: >> >> > Please send the code that builds the sparse B matrix and the setMatToConstant() routine. 
>> >> Setting to a constant: >> ``` >> void setMatToConstant(Mat mat, PetscScalar c) { >> PetscInt rStart, rEnd, m, n; >> MatGetSize(mat, &m, &n); >> MatGetOwnershipRange(mat, &rStart, &rEnd); >> for (int i = rStart; i < rEnd; i++) { >> for (int j = 0; j < n; j++) { >> MatSetValue(mat, i, j, c, INSERT_VALUES); >> } >> } >> MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); >> MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); >> } >> ``` >> >> Loading sparse matrix from disk: >> ``` >> int loadMatrixFromFile(Mat* A, char* filename) { >> auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); >> MatSetFromOptions(*A); >> PetscViewer viewer; >> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >> PetscViewerSetType(viewer, PETSCVIEWERBINARY); >> PetscViewerFileSetMode(viewer, FILE_MODE_READ); >> PetscViewerFileSetName(viewer, filename); >> MatLoad(*A, viewer); >> return 0; >> } >> ``` >> These are only called once and should not affect the computation in a loop though. >> > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? >> >> Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I move to 2 ranks I see this behavior. >> >> Here are updated logs with a new stage for 2 ranks. I've staged the logs into "MyComputation". >> >> ``` >> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >> >> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 processors, by yadav2 Thu Feb 3 09:27:30 2022 >> Using Petsc Release Version 3.16.3, unknown >> >> Max Max/Min Avg Total >> Time (sec): 2.091e+02 1.001 2.090e+02 >> Objects: 4.800e+01 1.000 4.800e+01 >> Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 >> Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 >> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >> MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 >> MPI Reductions: 8.100e+01 1.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flop >> and VecAXPY() for complex vectors of length N --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >> 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 42.9% 1.466e+09 34.8% 4.300e+01 53.1% >> 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 57.1% 2.058e+09 65.2% 2.000e+01 24.7% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >> CpuToGpu Count: total number of CPU to GPU copies per processor >> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >> GpuToCpu Count: total number of GPU to CPU copies per processor >> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >> GPU %F: percent flops on GPU in this event >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 0.00e+00 0 >> BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 0.00e+00 0 >> MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 2.08e+04 5 >> MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 0.00e+00 0 >> MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 0.00e+00 0 >> MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 2.29e+04 0 >> VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >> >> --- Event Stage 1: MyComputation >> >> MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultSym 40 1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 3 0 0 0 25 7 0 0 0100 0 0 0 0.00e+00 0 0.00e+00 0 >> MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 4.0e+01 2.1e+09 0.0e+00 46 67 57 65 0 93100100100 0 
5777 182577 0 0.00e+00 20 4.16e+04 5 >> MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 4.16e+04 0 >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 17 10 20381695840 0. >> Viewer 2 0 0 0. >> Vector 4 1 1792 0. >> Index Set 2 2 31848152 0. >> Star Forest Graph 3 0 0 0. >> >> --- Event Stage 1: MyComputation >> >> Matrix 20 20 40763391680 0. >> ======================================================================================================================== >> Average time to get PetscTime(): 3.96e-08 >> Average time for MPI_Barrier(): 8.184e-07 >> Average time for zero size MPI_Send(): 2.8165e-06 >> #PETSc Option Table entries: >> -bench spmm >> -enable_gpu >> -log_view >> -mat_type aijcusparse >> -matload_block_size 1 >> -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc >> -n 20 >> -vec_type cuda >> -warmup 10 >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >> Configure options: --download-c2html=0 --download-hwloc=0 --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc --with-clanguage=C --with-cxx-dialect=C++17 --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ --with-cuda=1 --with-debugging=0 --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran --with-fftw=0 --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= PETSC_ARCH=arch-linux-c-opt >> ----------------------------------------- >> Libraries compiled on 2022-01-21 06:41:50 on lassen111 >> Machine characteristics: Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo >> Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install >> Using PETSc arch: >> ----------------------------------------- >> >> Using C compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc -g -DNoChange -fPIC "-O3" >> Using Fortran compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran -g -fPIC >> ----------------------------------------- >> >> Using include paths: -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include -I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include -I/usr/include 
-I/usr/tce/packages/cuda/cuda-11.1.0/include >> ----------------------------------------- >> >> Using C linker: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >> Using Fortran linker: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >> Using libraries: -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl >> ----------------------------------------- >> ``` >> >> On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini > wrote: >> >> >> 1) It uses MatMPIDenseScatter() to move to the other ranks their needed rows of the C matrix. That function has the call MatDenseGetArrayRead() normally would trigger a copy of C up to the CPU each time. But since C is not changing in your test run I guess it only triggers one copy. >> >> 2) If uses MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); to do the off diagonal part of the product but this triggers for each multiply a copy of the result matrix from the CPU to the GPU (hugely expensive) >> >> For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() that is smarter about the needed MPI communication so it only moves exactly what it needs to the other ranks and it does the off-diagonal part of the product on the GPU so it does not need to copy the result up to the CPU. 
>> >> >> MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND >> >> Rohan >> I would suggest to add PetscLogStage around your performance loop (do a warmup outside of it) and send the relevant portion of the log >> >> Barry >> >> >> >> >> >> >>> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >>> >>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >>> Using Petsc Release Version 3.16.3, unknown >>> >>> Max Max/Min Avg Total >>> Time (sec): 1.163e+02 1.000 1.163e+02 >>> Objects: 4.800e+01 1.000 4.800e+01 >>> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >>> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >>> MPI Reductions: 8.100e+01 1.000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of length N --> 2N flop >>> and VecAXPY() for complex vectors of length N --> 8N flop >>> >>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flop: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all processors >>> Mess: number of messages sent >>> AvgLen: average message length (bytes) >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>>> %T - percent time in this phase %F - percent flop in this phase >>> %M - percent messages in this phase %L - percent message lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >>> CpuToGpu Count: total number of CPU to GPU copies per processor >>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >>> GpuToCpu Count: total number of GPU to CPU copies per processor >>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >>> GPU %F: percent flops on GPU in this event >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> --- Event Stage 0: Main Stage >>> >>> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >>> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >>> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >>> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >>> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >>> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 37 30 2867511840 0. >>> Viewer 2 0 0 0. >>> Vector 4 1 1792 0. >>> Index Set 2 2 1495248 0. 
>>> Star Forest Graph 3 0 0 0. >>> ======================================================================================================================== >>> Average time to get PetscTime(): 3.83e-08 >>> Average time for MPI_Barrier(): 7.874e-07 >>> Average time for zero size MPI_Send(): 3.4035e-06 >>> #PETSc Option Table entries: >>> -bench spmm >>> -enable_gpu >>> -log_view >>> -mat_type aijcusparse >>> -matload_block_size 1 >>> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >>> -n 20 >>> -vec_type cuda >>> -warmup 10 >>> ``` >>> >>> Thanks, >>> >>> Rohan Yadav >>> >> >> >> >> -- >> Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rohany at alumni.cmu.edu Thu Feb 3 15:28:52 2022 From: rohany at alumni.cmu.edu (Rohan Yadav) Date: Thu, 3 Feb 2022 13:28:52 -0800 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: <76C1F609-FB34-42A3-9654-90FC464CED6D@petsc.dev> References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> <138725A0-A19A-4CE0-B279-C509FB459379@petsc.dev> <76C1F609-FB34-42A3-9654-90FC464CED6D@petsc.dev> Message-ID: To be concrete, the first matrix was https://sparse.tamu.edu/LAW/arabic-2005 and the second was https://sparse.tamu.edu/Schenk/nlpkkt200 (which looks like it does come from the PDE domain?). Regardless of the non-zero structure, there is still a significant hit when moving from 1 gpu to multiple GPUs that causes a large number of device to host copies to be performed. If this is a result of the PETSc implementation thats fine -- but if there's something I can do to work around that it would be great. Rohan On Thu, Feb 3, 2022 at 1:25 PM Barry Smith wrote: > > I suspect the new matrix has a very different parallel nonzero structure > that results in MOST of the calculations taking place on the CPU (since the > "off-diagonal" part of the matrix dominates the non-zero pattern). PETSc is > not designed for this type of nonzero structure and will give a bad > performance (CPU or GPU); it is not a "PDE-ish" type of nonzero structure. > > > > On Feb 3, 2022, at 2:59 PM, Rohan Yadav wrote: > > I'm sorry, I did a little switch here. The original log view I sent for 2 > runs was on a different input matrix. Based on Barry's request I switched > to a different matrix as the original one did not fit on 1 GPU. > > > In the previously sent runs it was about 98% on GPU. > > Re 98% on the GPU though, my first email had a similar ratio in the log > though: > ``` > > MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 > > ``` > > The follow up log might be slightly different as well because I pushed a new log stage as requested by Stefano. > > > Rohan > > > On Thu, Feb 3, 2022 at 11:50 AM Barry Smith wrote: > >> >> Mark, >> >> Good eye. Something is definitely very different between this run and >> the previous (options, code change?). In the previously sent runs it was >> about 98% on GPU. >> >> Barry >> >> >> On Feb 3, 2022, at 12:29 PM, Rohan Yadav wrote: >> >> > Please send the code that builds the sparse B matrix and the setMatToConstant() >> routine. 
>> >> Setting to a constant: >> ``` >> void setMatToConstant(Mat mat, PetscScalar c) { >> >> PetscInt rStart, rEnd, m, n; >> MatGetSize(mat, &m, &n); >> MatGetOwnershipRange(mat, &rStart, &rEnd); >> for (int i = rStart; i < rEnd; i++) { >> for (int j = 0; j < n; j++) { >> MatSetValue(mat, i, j, c, INSERT_VALUES); >> } >> } >> MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); >> MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); >> } >> ``` >> >> Loading sparse matrix from disk: >> >> ``` >> >> int loadMatrixFromFile(Mat* A, char* filename) { >> auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); >> MatSetFromOptions(*A); >> PetscViewer viewer; >> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >> PetscViewerSetType(viewer, PETSCVIEWERBINARY); >> PetscViewerFileSetMode(viewer, FILE_MODE_READ); >> PetscViewerFileSetName(viewer, filename); >> MatLoad(*A, viewer); >> return 0; >> } >> >> ``` >> >> These are only called once and should not affect the computation in a loop though. >> >> > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? >> >> >> Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I >> move to 2 ranks I see this behavior. >> >> Here are updated logs with a new stage for 2 ranks. I've staged the logs >> into "MyComputation". >> >> ``` >> ---------------------------------------------- PETSc Performance Summary: >> ---------------------------------------------- >> >> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 >> processors, by yadav2 Thu Feb 3 09:27:30 2022 >> Using Petsc Release Version 3.16.3, unknown >> >> Max Max/Min Avg Total >> Time (sec): 2.091e+02 1.001 2.090e+02 >> Objects: 4.800e+01 1.000 4.800e+01 >> Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 >> Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 >> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >> MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 >> MPI Reductions: 8.100e+01 1.000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N >> --> 2N flop >> and VecAXPY() for complex vectors of length N >> --> 8N flop >> >> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages >> --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total Count >> %Total Avg %Total Count %Total >> 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 >> 42.9% 1.466e+09 34.8% 4.300e+01 53.1% >> 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 >> 57.1% 2.058e+09 65.2% 2.000e+01 24.7% >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flop: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> AvgLen: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flop in this >> phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >> over all processors) >> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU >> time over all processors) >> CpuToGpu Count: total number of CPU to GPU copies per processor >> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per >> processor) >> GpuToCpu Count: total number of GPU to CPU copies per processor >> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per >> processor) >> GPU %F: percent flops on GPU in this event >> >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop >> --- Global --- --- Stage ---- Total GPU - CpuToGpu - - >> GpuToCpu - GPU >> Max Ratio Max Ratio Max Ratio Mess AvgLen >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size >> Count Size %F >> >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 >> 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 >> 0.00e+00 0 0.00e+00 0 >> MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 >> 0.00e+00 0 0.00e+00 0 >> MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 >> 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 >> 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 >> 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 >> 2.08e+04 5 >> MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 >> 0.00e+00 0 >> MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 >> 0.00e+00 0 >> MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 >> 2.29e+04 0 >> VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 >> 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> >> --- Event Stage 1: MyComputation >> >> MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >> 0.00e+00 0 >> MatMatMultSym 40 1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 2.0e+01 3 0 0 0 25 7 0 0 0100 
0 0 0 0.00e+00 0 >> 0.00e+00 0 >> MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 4.0e+01 2.1e+09 >> 0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 >> 4.16e+04 5 >> MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 3 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 >> 4.16e+04 0 >> >> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' >> Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 17 10 20381695840 0. >> Viewer 2 0 0 0. >> Vector 4 1 1792 0. >> Index Set 2 2 31848152 0. >> Star Forest Graph 3 0 0 0. >> >> --- Event Stage 1: MyComputation >> >> Matrix 20 20 40763391680 0. >> >> ======================================================================================================================== >> Average time to get PetscTime(): 3.96e-08 >> Average time for MPI_Barrier(): 8.184e-07 >> Average time for zero size MPI_Send(): 2.8165e-06 >> #PETSc Option Table entries: >> -bench spmm >> -enable_gpu >> -log_view >> -mat_type aijcusparse >> -matload_block_size 1 >> -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc >> -n 20 >> -vec_type cuda >> -warmup 10 >> #End of PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >> sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >> Configure options: --download-c2html=0 --download-hwloc=0 >> --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 >> --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so >> /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" >> --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >> --with-clanguage=C --with-cxx-dialect=C++17 >> --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ >> --with-cuda=1 --with-debugging=0 >> --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >> --with-fftw=0 >> --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 >> --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 >> --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 >> --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 >> --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so >> --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" >> CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= >> PETSC_ARCH=arch-linux-c-opt >> ----------------------------------------- >> Libraries compiled on 2022-01-21 06:41:50 on lassen111 >> Machine characteristics: >> Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo >> Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install >> Using PETSc arch: >> ----------------------------------------- >> >> Using C compiler: >> /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >> -g -DNoChange -fPIC "-O3" >> Using Fortran compiler: >> /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >> -g -fPIC >> ----------------------------------------- >> >> Using include 
paths: >> -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include >> -I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include >> -I/usr/include -I/usr/tce/packages/cuda/cuda-11.1.0/include >> ----------------------------------------- >> >> Using C linker: >> /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >> Using Fortran linker: >> /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >> Using libraries: >> -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib >> -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc >> -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib >> -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib >> -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib >> -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib >> -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 >> -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 >> -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib >> -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib >> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 >> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 >> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc >> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc >> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 >> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 >> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib >> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl >> -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse >> -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi >> -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath >> -lpthread -lquadmath -lstdc++ -ldl >> ----------------------------------------- >> ``` >> >> On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini < >> stefano.zampini at gmail.com> wrote: >> >>> >>> >>> 1) It uses MatMPIDenseScatter() to move to the other ranks their needed >>>> rows of the C matrix. That function has the call MatDenseGetArrayRead() >>>> normally would trigger a copy of C up to the CPU each time. But since C is >>>> not changing in your test run I guess it only triggers one copy. >>>> >>>> 2) If uses >>>> MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); >>>> to do the off diagonal part of the product but this triggers for each >>>> multiply a copy of the result matrix from the CPU to the GPU (hugely >>>> expensive) >>>> >>>> For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() >>>> that is smarter about the needed MPI communication so it only moves exactly >>>> what it needs to the other ranks and it does the off-diagonal part of the >>>> product on the GPU so it does not need to copy the result up to the CPU. 
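As an illustration of the staging being discussed in this thread (warmup multiplies kept outside the stage, the timed loop inside, which is what produces the "MyComputation" stage in the log above), a minimal sketch might look like the function below. The function name, the matrices A and B, and the iteration counts are placeholders, not the actual benchmark code from this thread; only the PetscLogStage calls and the MAT_REUSE_MATRIX pattern are the point.

```
#include <petscmat.h>

/* Sketch only: wrap the repeated sparse-times-dense product in a named
   log stage, with the warmup multiplies outside the stage.  The stage
   name matches the "MyComputation" stage in the logs; everything else
   (names, counts) is a stand-in, not the benchmark code from this thread. */
static PetscErrorCode BenchmarkSpMM(Mat A, Mat B, PetscInt nwarmup, PetscInt niter)
{
  PetscLogStage  stage;
  Mat            C = NULL;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscLogStageRegister("MyComputation", &stage);CHKERRQ(ierr);

  /* First product creates C (symbolic + numeric); treat it as warmup. */
  ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  for (PetscInt i = 1; i < nwarmup; i++) {
    ierr = MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  }

  /* Everything between push and pop is reported under the
     "MyComputation" stage in -log_view output. */
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  for (PetscInt i = 0; i < niter; i++) {
    ierr = MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  }
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = MatDestroy(&C);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```

With this in place, -log_view prints a separate event table for the "MyComputation" stage, which is the per-stage breakdown shown in the logs of this thread.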
>>>> >>>> >>> MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND >>> >>> Rohan >>> I would suggest to add PetscLogStage around your performance loop (do a >>> warmup outside of it) and send the relevant portion of the log >>> >>> >>>> Barry >>>> >>>> >>>> >>>> >>>> >>>> >>>> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >>>> >>>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >>>> Using Petsc Release Version 3.16.3, unknown >>>> >>>> Max Max/Min Avg Total >>>> Time (sec): 1.163e+02 1.000 1.163e+02 >>>> Objects: 4.800e+01 1.000 4.800e+01 >>>> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >>>> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >>>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>>> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >>>> MPI Reductions: 8.100e+01 1.000 >>>> >>>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >>>> e.g., VecAXPY() for real vectors of length N --> 2N flop >>>> and VecAXPY() for complex vectors of length N --> 8N flop >>>> >>>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>>> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >>>> >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> See the 'Profiling' chapter of the users' manual for details on interpreting output. >>>> Phase summary info: >>>> Count: number of times phase was executed >>>> Time and Flop: Max - maximum over all processors >>>> Ratio - ratio of maximum to minimum over all processors >>>> Mess: number of messages sent >>>> AvgLen: average message length (bytes) >>>> Reduct: number of global reductions >>>> Global: entire computation >>>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>>>> %T - percent time in this phase %F - percent flop in this phase >>>> %M - percent messages in this phase %L - percent message lengths in this phase >>>> %R - percent reductions in this phase >>>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >>>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >>>> CpuToGpu Count: total number of CPU to GPU copies per processor >>>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >>>> GpuToCpu Count: total number of GPU to CPU copies per processor >>>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >>>> GPU %F: percent flops on GPU in this event >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>> >>>> --- Event Stage 0: Main Stage >>>> >>>> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >>>> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >>>> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >>>> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >>>> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >>>> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>> >>>> Memory usage is given in bytes: >>>> >>>> Object Type Creations Destructions Memory Descendants' Mem. >>>> Reports information only for process 0. >>>> >>>> --- Event Stage 0: Main Stage >>>> >>>> Matrix 37 30 2867511840 0. >>>> Viewer 2 0 0 0. >>>> Vector 4 1 1792 0. 
>>>> Index Set 2 2 1495248 0. >>>> Star Forest Graph 3 0 0 0. >>>> ======================================================================================================================== >>>> Average time to get PetscTime(): 3.83e-08 >>>> Average time for MPI_Barrier(): 7.874e-07 >>>> Average time for zero size MPI_Send(): 3.4035e-06 >>>> #PETSc Option Table entries: >>>> -bench spmm >>>> -enable_gpu >>>> -log_view >>>> -mat_type aijcusparse >>>> -matload_block_size 1 >>>> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >>>> -n 20 >>>> -vec_type cuda >>>> -warmup 10 >>>> ``` >>>> >>>> >>>> Thanks, >>>> >>>> >>>> Rohan Yadav >>>> >>>> >>>> >>>> >>> >>> -- >>> Stefano >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Feb 3 16:01:48 2022 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 3 Feb 2022 17:01:48 -0500 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> <138725A0-A19A-4CE0-B279-C509FB459379@petsc.dev> <76C1F609-FB34-42A3-9654-90FC464CED6D@petsc.dev> Message-ID: > On Feb 3, 2022, at 4:28 PM, Rohan Yadav wrote: > > To be concrete, the first matrix was https://sparse.tamu.edu/LAW/arabic-2005 and the second was https://sparse.tamu.edu/Schenk/nlpkkt200 (which looks like it does come from the PDE domain?). You are correct; but the matrix bandwidth is so huge (see how far that off diagonal that second band of nonzeros is is) for two ranks this means the each MPI rank ends up needing the entire right hand side matrix to do the computation. Plus the 0 in the lower diagonal block means that on the second rank there is essentially no work to be done on the GPU at all. > > Regardless of the non-zero structure, there is still a significant hit when moving from 1 gpu to multiple GPUs that causes a large number of device to host copies to be performed. If this is a result of the PETSc implementation thats fine -- but if there's something I can do to work around that it would be great. I don't understand enough about the code that Stefeno pointed to know how easy the performance problem would be to fix in PETSc for sparse times dense matrix product It would still be a problem for the nlpkkt200 as partitioned on two ranks no matter what, but in theory the problem can be fixed by improving the PETSc code. I recommend finding a different sparse matrix test case which has an appropriate nonzero data structuring and partitioning that one can expect good performance on. > > Rohan > > On Thu, Feb 3, 2022 at 1:25 PM Barry Smith > wrote: > > I suspect the new matrix has a very different parallel nonzero structure that results in MOST of the calculations taking place on the CPU (since the "off-diagonal" part of the matrix dominates the non-zero pattern). PETSc is not designed for this type of nonzero structure and will give a bad performance (CPU or GPU); it is not a "PDE-ish" type of nonzero structure. > > > >> On Feb 3, 2022, at 2:59 PM, Rohan Yadav > wrote: >> >> I'm sorry, I did a little switch here. The original log view I sent for 2 runs was on a different input matrix. Based on Barry's request I switched to a different matrix as the original one did not fit on 1 GPU. >> >> > In the previously sent runs it was about 98% on GPU. 
>> >> Re 98% on the GPU though, my first email had a similar ratio in the log though: >> ``` >> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >> ``` >> The follow up log might be slightly different as well because I pushed a new log stage as requested by Stefano. >> >> Rohan >> >> On Thu, Feb 3, 2022 at 11:50 AM Barry Smith > wrote: >> >> Mark, >> >> Good eye. Something is definitely very different between this run and the previous (options, code change?). In the previously sent runs it was about 98% on GPU. >> >> Barry >> >> >>> On Feb 3, 2022, at 12:29 PM, Rohan Yadav > wrote: >>> >>> > Please send the code that builds the sparse B matrix and the setMatToConstant() routine. >>> >>> Setting to a constant: >>> ``` >>> void setMatToConstant(Mat mat, PetscScalar c) { >>> PetscInt rStart, rEnd, m, n; >>> MatGetSize(mat, &m, &n); >>> MatGetOwnershipRange(mat, &rStart, &rEnd); >>> for (int i = rStart; i < rEnd; i++) { >>> for (int j = 0; j < n; j++) { >>> MatSetValue(mat, i, j, c, INSERT_VALUES); >>> } >>> } >>> MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); >>> MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); >>> } >>> ``` >>> >>> Loading sparse matrix from disk: >>> ``` >>> int loadMatrixFromFile(Mat* A, char* filename) { >>> auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); >>> MatSetFromOptions(*A); >>> PetscViewer viewer; >>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>> PetscViewerSetType(viewer, PETSCVIEWERBINARY); >>> PetscViewerFileSetMode(viewer, FILE_MODE_READ); >>> PetscViewerFileSetName(viewer, filename); >>> MatLoad(*A, viewer); >>> return 0; >>> } >>> ``` >>> These are only called once and should not affect the computation in a loop though. >>> > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? >>> >>> Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I move to 2 ranks I see this behavior. >>> >>> Here are updated logs with a new stage for 2 ranks. I've staged the logs into "MyComputation". 
>>> >>> ``` >>> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >>> >>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 processors, by yadav2 Thu Feb 3 09:27:30 2022 >>> Using Petsc Release Version 3.16.3, unknown >>> >>> Max Max/Min Avg Total >>> Time (sec): 2.091e+02 1.001 2.090e+02 >>> Objects: 4.800e+01 1.000 4.800e+01 >>> Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 >>> Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 >>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>> MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 >>> MPI Reductions: 8.100e+01 1.000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of length N --> 2N flop >>> and VecAXPY() for complex vectors of length N --> 8N flop >>> >>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>> 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 42.9% 1.466e+09 34.8% 4.300e+01 53.1% >>> 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 57.1% 2.058e+09 65.2% 2.000e+01 24.7% >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flop: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all processors >>> Mess: number of messages sent >>> AvgLen: average message length (bytes) >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>>> %T - percent time in this phase %F - percent flop in this phase >>> %M - percent messages in this phase %L - percent message lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >>> CpuToGpu Count: total number of CPU to GPU copies per processor >>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >>> GpuToCpu Count: total number of GPU to CPU copies per processor >>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >>> GPU %F: percent flops on GPU in this event >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> --- Event Stage 0: Main Stage >>> >>> BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 0.00e+00 0 >>> BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 2.08e+04 5 >>> MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 0.00e+00 0 >>> MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 0.00e+00 0 >>> MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 2.29e+04 0 >>> VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>> >>> --- Event Stage 1: MyComputation >>> >>> MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultSym 40 1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 3 0 0 0 25 7 0 0 0100 0 0 0 0.00e+00 0 0.00e+00 0 >>> MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 4.0e+01 2.1e+09 
0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 4.16e+04 5 >>> MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 4.16e+04 0 >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 17 10 20381695840 0. >>> Viewer 2 0 0 0. >>> Vector 4 1 1792 0. >>> Index Set 2 2 31848152 0. >>> Star Forest Graph 3 0 0 0. >>> >>> --- Event Stage 1: MyComputation >>> >>> Matrix 20 20 40763391680 0. >>> ======================================================================================================================== >>> Average time to get PetscTime(): 3.96e-08 >>> Average time for MPI_Barrier(): 8.184e-07 >>> Average time for zero size MPI_Send(): 2.8165e-06 >>> #PETSc Option Table entries: >>> -bench spmm >>> -enable_gpu >>> -log_view >>> -mat_type aijcusparse >>> -matload_block_size 1 >>> -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc >>> -n 20 >>> -vec_type cuda >>> -warmup 10 >>> #End of PETSc Option Table entries >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >>> Configure options: --download-c2html=0 --download-hwloc=0 --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc --with-clanguage=C --with-cxx-dialect=C++17 --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ --with-cuda=1 --with-debugging=0 --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran --with-fftw=0 --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= PETSC_ARCH=arch-linux-c-opt >>> ----------------------------------------- >>> Libraries compiled on 2022-01-21 06:41:50 on lassen111 >>> Machine characteristics: Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo >>> Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install >>> Using PETSc arch: >>> ----------------------------------------- >>> >>> Using C compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc -g -DNoChange -fPIC "-O3" >>> Using Fortran compiler: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran -g -fPIC >>> ----------------------------------------- >>> >>> Using include paths: -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include 
-I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include -I/usr/include -I/usr/tce/packages/cuda/cuda-11.1.0/include >>> ----------------------------------------- >>> >>> Using C linker: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >>> Using Fortran linker: /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >>> Using libraries: -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl >>> ----------------------------------------- >>> ``` >>> >>> On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini > wrote: >>> >>> >>> 1) It uses MatMPIDenseScatter() to move to the other ranks their needed rows of the C matrix. That function has the call MatDenseGetArrayRead() normally would trigger a copy of C up to the CPU each time. But since C is not changing in your test run I guess it only triggers one copy. >>> >>> 2) If uses MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); to do the off diagonal part of the product but this triggers for each multiply a copy of the result matrix from the CPU to the GPU (hugely expensive) >>> >>> For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() that is smarter about the needed MPI communication so it only moves exactly what it needs to the other ranks and it does the off-diagonal part of the product on the GPU so it does not need to copy the result up to the CPU. 
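Regarding the setMatToConstant() and loadMatrixFromFile() helpers quoted above, a slightly more defensive variant is sketched below. This is only an illustration and not the code that produced the logs in this thread: it assumes the matrix being filled is a dense type (as it appears to be in this benchmark), checks every return code, fills the dense matrix through its local array instead of one MatSetValue() call per entry, and destroys the binary viewer, which the quoted loader never does. The function names are made up for the sketch.

```
#include <petscmat.h>

/* Sketch only: fill a (possibly GPU-backed) dense matrix with a constant
   by writing directly into its local column-major array. */
static PetscErrorCode SetDenseMatToConstant(Mat mat, PetscScalar c)
{
  PetscScalar   *a;
  PetscInt       lda, m, n;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatDenseGetLDA(mat, &lda);CHKERRQ(ierr);
  ierr = MatGetLocalSize(mat, &m, NULL);CHKERRQ(ierr);  /* local rows   */
  ierr = MatGetSize(mat, NULL, &n);CHKERRQ(ierr);       /* global cols  */
  ierr = MatDenseGetArray(mat, &a);CHKERRQ(ierr);
  for (PetscInt j = 0; j < n; j++)
    for (PetscInt i = 0; i < m; i++) a[i + j*lda] = c;
  ierr = MatDenseRestoreArray(mat, &a);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Sketch only: load a matrix from a PETSc binary file and clean up the
   viewer (the version quoted above never destroys it). */
static PetscErrorCode LoadMatrixFromFile(Mat *A, const char *filename)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreate(PETSC_COMM_WORLD, A);CHKERRQ(ierr);
  ierr = MatSetFromOptions(*A);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, filename, FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatLoad(*A, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```

Filling through the array avoids an assembly pass over every entry; for a GPU-backed dense type the host values still get copied to the device when first needed, which is the MatDenseCopyTo event visible in the logs above.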
>>> >>> >>> MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND >>> >>> Rohan >>> I would suggest to add PetscLogStage around your performance loop (do a warmup outside of it) and send the relevant portion of the log >>> >>> Barry >>> >>> >>> >>> >>> >>> >>>> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >>>> >>>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >>>> Using Petsc Release Version 3.16.3, unknown >>>> >>>> Max Max/Min Avg Total >>>> Time (sec): 1.163e+02 1.000 1.163e+02 >>>> Objects: 4.800e+01 1.000 4.800e+01 >>>> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >>>> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >>>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>>> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >>>> MPI Reductions: 8.100e+01 1.000 >>>> >>>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >>>> e.g., VecAXPY() for real vectors of length N --> 2N flop >>>> and VecAXPY() for complex vectors of length N --> 8N flop >>>> >>>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>>> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >>>> >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> See the 'Profiling' chapter of the users' manual for details on interpreting output. >>>> Phase summary info: >>>> Count: number of times phase was executed >>>> Time and Flop: Max - maximum over all processors >>>> Ratio - ratio of maximum to minimum over all processors >>>> Mess: number of messages sent >>>> AvgLen: average message length (bytes) >>>> Reduct: number of global reductions >>>> Global: entire computation >>>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>>>> %T - percent time in this phase %F - percent flop in this phase >>>> %M - percent messages in this phase %L - percent message lengths in this phase >>>> %R - percent reductions in this phase >>>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >>>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >>>> CpuToGpu Count: total number of CPU to GPU copies per processor >>>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >>>> GpuToCpu Count: total number of GPU to CPU copies per processor >>>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >>>> GPU %F: percent flops on GPU in this event >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>> >>>> --- Event Stage 0: Main Stage >>>> >>>> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >>>> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >>>> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >>>> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >>>> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >>>> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >>>> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>> >>>> Memory usage is given in bytes: >>>> >>>> Object Type Creations Destructions Memory Descendants' Mem. >>>> Reports information only for process 0. >>>> >>>> --- Event Stage 0: Main Stage >>>> >>>> Matrix 37 30 2867511840 0. >>>> Viewer 2 0 0 0. >>>> Vector 4 1 1792 0. 
>>>> Index Set 2 2 1495248 0. >>>> Star Forest Graph 3 0 0 0. >>>> ======================================================================================================================== >>>> Average time to get PetscTime(): 3.83e-08 >>>> Average time for MPI_Barrier(): 7.874e-07 >>>> Average time for zero size MPI_Send(): 3.4035e-06 >>>> #PETSc Option Table entries: >>>> -bench spmm >>>> -enable_gpu >>>> -log_view >>>> -mat_type aijcusparse >>>> -matload_block_size 1 >>>> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >>>> -n 20 >>>> -vec_type cuda >>>> -warmup 10 >>>> ``` >>>> >>>> Thanks, >>>> >>>> Rohan Yadav >>>> >>> >>> >>> >>> -- >>> Stefano >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Feb 3 17:12:03 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 03 Feb 2022 16:12:03 -0700 Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: References: <871r0jokb0.fsf@jedbrown.org> Message-ID: <87wnibmtoc.fsf@jedbrown.org> "Evstafyeva,Tamara" writes: > Thanks for your prompt reply. I am attaching the makefile; the line for execution ?make all -j 4? > > I guess using both was my attempt at trying multiple things until they work ? using either one or the other produced the same error for me. petsc.pc isn't being used here. What's probably happening is that Chombo's mk/Make.test has rules (not just variables) and those rules are replacing PETSc rules. Mark Adams (Cc'd) works with Chombo and will know for sure. I'm mildly afraid of Chombo and would use Makefile.user to extract exactly the information you want from PETSc. Alternatively, include only ${PETSC_DIR}/lib/petsc/conf/variables from PETSc and append the variables as needed to Chombo. The downside of this is that some variables are not namespaced so there could be conflicts depending on Chombo's naming conventions. From rohany at alumni.cmu.edu Thu Feb 3 17:20:52 2022 From: rohany at alumni.cmu.edu (Rohan Yadav) Date: Thu, 3 Feb 2022 15:20:52 -0800 Subject: [petsc-users] PETSc GPU MatMatMult performance question In-Reply-To: References: <2E26019F-4676-432F-8B17-EFF2FEEB3000@petsc.dev> <138725A0-A19A-4CE0-B279-C509FB459379@petsc.dev> <76C1F609-FB34-42A3-9654-90FC464CED6D@petsc.dev> Message-ID: Alright, thanks for the help everyone. Rohan On Thu, Feb 3, 2022 at 2:01 PM Barry Smith wrote: > > > On Feb 3, 2022, at 4:28 PM, Rohan Yadav wrote: > > To be concrete, the first matrix was > https://sparse.tamu.edu/LAW/arabic-2005 and the second was > https://sparse.tamu.edu/Schenk/nlpkkt200 (which looks like it does come > from the PDE domain?). > > > You are correct; but the matrix bandwidth is so huge (see how far that > off diagonal that second band of nonzeros is is) for two ranks this means > the each MPI rank ends up needing the entire right hand side matrix to do > the computation. Plus the 0 in the lower diagonal block means that on the > second rank there is essentially no work to be done on the GPU at all. > > > > Regardless of the non-zero structure, there is still a significant hit > when moving from 1 gpu to multiple GPUs that causes a large number of > device to host copies to be performed. If this is a result of the PETSc > implementation thats fine -- but if there's something I can do to work > around that it would be great. 
> > > I don't understand enough about the code that Stefeno pointed to know > how easy the performance problem would be to fix in PETSc for sparse times > dense matrix product It would still be a problem for the nlpkkt200 > as partitioned on two ranks > no matter what, but in theory the problem can be fixed by improving the > PETSc code. > > I recommend finding a different sparse matrix test case which has an > appropriate nonzero data structuring and partitioning that one can expect > good performance on. > > > > > Rohan > > On Thu, Feb 3, 2022 at 1:25 PM Barry Smith wrote: > >> >> I suspect the new matrix has a very different parallel nonzero >> structure that results in MOST of the calculations taking place on the CPU >> (since the "off-diagonal" part of the matrix dominates the non-zero >> pattern). PETSc is not designed for this type of nonzero structure and will >> give a bad performance (CPU or GPU); it is not a "PDE-ish" type of nonzero >> structure. >> >> >> >> On Feb 3, 2022, at 2:59 PM, Rohan Yadav wrote: >> >> I'm sorry, I did a little switch here. The original log view I sent for 2 >> runs was on a different input matrix. Based on Barry's request I switched >> to a different matrix as the original one did not fit on 1 GPU. >> >> > In the previously sent runs it was about 98% on GPU. >> >> Re 98% on the GPU though, my first email had a similar ratio in the log >> though: >> ``` >> >> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >> >> ``` >> >> The follow up log might be slightly different as well because I pushed a new log stage as requested by Stefano. >> >> >> Rohan >> >> >> On Thu, Feb 3, 2022 at 11:50 AM Barry Smith wrote: >> >>> >>> Mark, >>> >>> Good eye. Something is definitely very different between this run >>> and the previous (options, code change?). In the previously sent runs it >>> was about 98% on GPU. >>> >>> Barry >>> >>> >>> On Feb 3, 2022, at 12:29 PM, Rohan Yadav wrote: >>> >>> > Please send the code that builds the sparse B matrix and the setMatToConstant() >>> routine. >>> >>> Setting to a constant: >>> ``` >>> void setMatToConstant(Mat mat, PetscScalar c) { >>> >>> PetscInt rStart, rEnd, m, n; >>> MatGetSize(mat, &m, &n); >>> MatGetOwnershipRange(mat, &rStart, &rEnd); >>> for (int i = rStart; i < rEnd; i++) { >>> for (int j = 0; j < n; j++) { >>> MatSetValue(mat, i, j, c, INSERT_VALUES); >>> } >>> } >>> MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY); >>> MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY); >>> } >>> ``` >>> >>> Loading sparse matrix from disk: >>> >>> ``` >>> >>> int loadMatrixFromFile(Mat* A, char* filename) { >>> auto ierr = MatCreate(PETSC_COMM_WORLD, A); CHKERRQ(ierr); >>> MatSetFromOptions(*A); >>> PetscViewer viewer; >>> PetscViewerCreate(PETSC_COMM_WORLD, &viewer); >>> PetscViewerSetType(viewer, PETSCVIEWERBINARY); >>> PetscViewerFileSetMode(viewer, FILE_MODE_READ); >>> PetscViewerFileSetName(viewer, filename); >>> MatLoad(*A, viewer); >>> return 0; >>> } >>> >>> ``` >>> >>> These are only called once and should not affect the computation in a loop though. >>> >>> > But first please verify that if you run with one MPI rank the "on GPU" and the overall flop rates for the MatMatMult() are almost the same and there is no copy from the GPU for each multiply? >>> >>> >>> Yes, with 1 mpi rank / GPU there are no extra copies done. As soon as I >>> move to 2 ranks I see this behavior. >>> >>> Here are updated logs with a new stage for 2 ranks. 
I've staged the logs >>> into "MyComputation". >>> >>> ``` >>> ---------------------------------------------- PETSc Performance >>> Summary: ---------------------------------------------- >>> >>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen572 with 2 >>> processors, by yadav2 Thu Feb 3 09:27:30 2022 >>> Using Petsc Release Version 3.16.3, unknown >>> >>> Max Max/Min Avg Total >>> Time (sec): 2.091e+02 1.001 2.090e+02 >>> Objects: 4.800e+01 1.000 4.800e+01 >>> Flop: 4.344e+11 1.019 4.303e+11 8.606e+11 >>> Flop/sec: 2.077e+09 1.018 2.059e+09 4.118e+09 >>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>> MPI Message Lengths: 6.316e+10 1.000 1.805e+09 1.263e+11 >>> MPI Reductions: 8.100e+01 1.000 >>> >>> Flop counting convention: 1 flop = 1 real number operation of type >>> (multiply/divide/add/subtract) >>> e.g., VecAXPY() for real vectors of length N >>> --> 2N flop >>> and VecAXPY() for complex vectors of length >>> N --> 8N flop >>> >>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages >>> --- -- Message Lengths -- -- Reductions -- >>> Avg %Total Avg %Total Count >>> %Total Avg %Total Count %Total >>> 0: Main Stage: 1.0555e+02 50.5% 2.8686e+11 33.3% 3.000e+01 >>> 42.9% 1.466e+09 34.8% 4.300e+01 53.1% >>> 1: MyComputation: 1.0345e+02 49.5% 5.7373e+11 66.7% 4.000e+01 >>> 57.1% 2.058e+09 65.2% 2.000e+01 24.7% >>> >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> See the 'Profiling' chapter of the users' manual for details on >>> interpreting output. >>> Phase summary info: >>> Count: number of times phase was executed >>> Time and Flop: Max - maximum over all processors >>> Ratio - ratio of maximum to minimum over all processors >>> Mess: number of messages sent >>> AvgLen: average message length (bytes) >>> Reduct: number of global reductions >>> Global: entire computation >>> Stage: stages of a computation. Set stages with PetscLogStagePush() >>> and PetscLogStagePop(). 
>>> %T - percent time in this phase %F - percent flop in this >>> phase >>> %M - percent messages in this phase %L - percent message >>> lengths in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time >>> over all processors) >>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max >>> GPU time over all processors) >>> CpuToGpu Count: total number of CPU to GPU copies per processor >>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per >>> processor) >>> GpuToCpu Count: total number of GPU to CPU copies per processor >>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per >>> processor) >>> GPU %F: percent flops on GPU in this event >>> >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop >>> --- Global --- --- Stage ---- Total GPU - CpuToGpu - - >>> GpuToCpu - GPU >>> Max Ratio Max Ratio Max Ratio Mess AvgLen >>> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size >>> Count Size %F >>> >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> --- Event Stage 0: Main Stage >>> >>> BuildTwoSided 2 1.0 4.0085e-0136.3 0.00e+00 0.0 2.0e+00 4.0e+00 >>> 2.0e+00 0 0 3 0 2 0 0 7 0 5 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> BuildTwoSidedF 1 1.0 4.0080e-0113602.0 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 >>> 0.00e+00 0 0.00e+00 0 >>> MatAssemblyBegin 12 1.0 4.0084e-017217.1 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 >>> 0.00e+00 0 0.00e+00 0 >>> MatAssemblyEnd 12 1.0 3.4970e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 6.0e+00 2 0 0 0 7 3 0 0 0 14 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> MatZeroEntries 1 1.0 2.4093e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> MatLoad 1 1.0 1.3756e+01 1.0 0.00e+00 0.0 6.0e+00 4.6e+08 >>> 2.1e+01 7 0 9 2 26 13 0 20 6 49 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> MatMatMultSym 20 1.0 4.7919e+00 2.4 0.00e+00 0.0 4.0e+00 1.6e+07 >>> 1.2e+01 2 0 6 0 15 3 0 13 0 28 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> MatMatMultNum 10 1.0 4.9853e+01 1.1 1.45e+11 1.0 2.0e+01 2.1e+09 >>> 0.0e+00 23 33 29 33 0 46100 67 94 0 5754 182686 2 2.23e+03 10 >>> 2.08e+04 5 >>> MatCUSPARSCopyTo 1 1.0 2.2646e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 1.55e+02 0 >>> 0.00e+00 0 >>> MatDenseCopyTo 1 1.0 1.6636e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.08e+03 0 >>> 0.00e+00 0 >>> MatDenseCopyFrom 11 1.0 3.0463e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 1 0 0 0 0 3 0 0 0 0 0 0 0 0.00e+00 11 >>> 2.29e+04 0 >>> VecSet 3 1.0 5.0035e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> SFSetGraph 1 1.0 4.4294e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> SFSetUp 1 1.0 1.3982e-01 1.0 0.00e+00 0.0 4.0e+00 1.6e+07 >>> 1.0e+00 0 0 6 0 1 0 0 13 0 2 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> >>> --- Event Stage 1: MyComputation >>> >>> MatAssemblyBegin 20 1.0 1.6894e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> MatAssemblyEnd 20 1.0 1.5575e-05 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> MatMatMultSym 40 
1.0 1.0096e+01 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 2.0e+01 3 0 0 0 25 7 0 0 0100 0 0 0 0.00e+00 0 >>> 0.00e+00 0 >>> MatMatMultNum 20 1.0 9.9320e+01 1.1 2.90e+11 1.0 4.0e+01 2.1e+09 >>> 0.0e+00 46 67 57 65 0 93100100100 0 5777 182577 0 0.00e+00 20 >>> 4.16e+04 5 >>> MatDenseCopyFrom 20 1.0 5.5380e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 3 0 0 0 0 5 0 0 0 0 0 0 0 0.00e+00 20 >>> 4.16e+04 0 >>> >>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' >>> Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Matrix 17 10 20381695840 0. >>> Viewer 2 0 0 0. >>> Vector 4 1 1792 0. >>> Index Set 2 2 31848152 0. >>> Star Forest Graph 3 0 0 0. >>> >>> --- Event Stage 1: MyComputation >>> >>> Matrix 20 20 40763391680 0. >>> >>> ======================================================================================================================== >>> Average time to get PetscTime(): 3.96e-08 >>> Average time for MPI_Barrier(): 8.184e-07 >>> Average time for zero size MPI_Send(): 2.8165e-06 >>> #PETSc Option Table entries: >>> -bench spmm >>> -enable_gpu >>> -log_view >>> -mat_type aijcusparse >>> -matload_block_size 1 >>> -matrix /p/gpfs1/yadav2/tensors/petsc/nlpkkt200.petsc >>> -n 20 >>> -vec_type cuda >>> -warmup 10 >>> #End of PETSc Option Table entries >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >>> Configure options: --download-c2html=0 --download-hwloc=0 >>> --download-sowing=0 --prefix=./petsc-install/ --with-64-bit-indices=0 >>> --with-blaslapack-lib="/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/liblapack.so >>> /usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib/libblas.so" >>> --with-cc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >>> --with-clanguage=C --with-cxx-dialect=C++17 >>> --with-cxx=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpig++ >>> --with-cuda=1 --with-debugging=0 >>> --with-fc=/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >>> --with-fftw=0 >>> --with-hdf5-dir=/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4 >>> --with-hdf5=1 --with-mumps=0 --with-precision=double --with-scalapack=0 >>> --with-scalar-type=real --with-shared-libraries=1 --with-ssl=0 >>> --with-suitesparse=0 --with-trilinos=0 --with-valgrind=0 --with-x=0 >>> --with-zlib-include=/usr/include --with-zlib-lib=/usr/lib64/libz.so >>> --with-zlib=1 CFLAGS="-g -DNoChange" COPTFLAGS="-O3" CXXFLAGS="-O3" >>> CXXOPTFLAGS="-O3" FFLAGS=-g CUDAFLAGS=-std=c++17 FOPTFLAGS= >>> PETSC_ARCH=arch-linux-c-opt >>> ----------------------------------------- >>> Libraries compiled on 2022-01-21 06:41:50 on lassen111 >>> Machine characteristics: >>> Linux-4.14.0-115.21.2.1chaos.ch6a.ppc64le-ppc64le-with-redhat-7.6-Maipo >>> Using PETSc directory: /g/g15/yadav2/taco/petsc/petsc/petsc-install >>> Using PETSc arch: >>> ----------------------------------------- >>> >>> Using C compiler: >>> /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >>> -g -DNoChange -fPIC "-O3" >>> Using Fortran compiler: >>> 
/usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >>> -g -fPIC >>> ----------------------------------------- >>> >>> Using include paths: >>> -I/g/g15/yadav2/taco/petsc/petsc/petsc-install/include >>> -I/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/include >>> -I/usr/include -I/usr/tce/packages/cuda/cuda-11.1.0/include >>> ----------------------------------------- >>> >>> Using C linker: >>> /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigcc >>> Using Fortran linker: >>> /usr/tce/packages/spectrum-mpi/spectrum-mpi-rolling-release-gcc-8.3.1/bin/mpigfortran >>> Using libraries: >>> -Wl,-rpath,/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib >>> -L/g/g15/yadav2/taco/petsc/petsc/petsc-install/lib -lpetsc >>> -Wl,-rpath,/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib >>> -L/usr/tcetmp/packages/lapack/lapack-3.9.0-gcc-7.3.1/lib >>> -Wl,-rpath,/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib >>> -L/usr/tcetmp/packages/petsc/build/3.13.0/spack/opt/spack/linux-rhel7-power9le/xl_r-16.1/hdf5-1.10.6-e7e7urb5k7va3ib7j4uro56grvzmcmd4/lib >>> -Wl,-rpath,/usr/tce/packages/cuda/cuda-11.1.0/lib64 >>> -L/usr/tce/packages/cuda/cuda-11.1.0/lib64 >>> -Wl,-rpath,/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib >>> -L/usr/tce/packages/spectrum-mpi/ibm/spectrum-mpi-rolling-release/lib >>> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 >>> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8 >>> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc >>> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc >>> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 >>> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib64 >>> -Wl,-rpath,/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib >>> -L/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib -llapack -lblas -lhdf5_hl >>> -lhdf5 -lm /usr/lib64/libz.so -lcuda -lcudart -lcufft -lcublas -lcusparse >>> -lcusolver -lcurand -lstdc++ -ldl -lmpiprofilesupport -lmpi_ibm_usempi >>> -lmpi_ibm_mpifh -lmpi_ibm -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath >>> -lpthread -lquadmath -lstdc++ -ldl >>> ----------------------------------------- >>> ``` >>> >>> On Wed, Feb 2, 2022 at 11:59 PM Stefano Zampini < >>> stefano.zampini at gmail.com> wrote: >>> >>>> >>>> >>>> 1) It uses MatMPIDenseScatter() to move to the other ranks their >>>>> needed rows of the C matrix. That function has the call >>>>> MatDenseGetArrayRead() normally would trigger a copy of C up to the CPU >>>>> each time. But since C is not changing in your test run I guess it only >>>>> triggers one copy. >>>>> >>>>> 2) If uses >>>>> MatMatMultNumericAdd_SeqAIJ_SeqDense(aij->B,workB,cdense->A,PETSC_TRUE);CHKERRQ(ierr); >>>>> to do the off diagonal part of the product but this triggers for each >>>>> multiply a copy of the result matrix from the CPU to the GPU (hugely >>>>> expensive) >>>>> >>>>> For performance there needs to be a new routine MatMatMultNumeric_MPIAIJCUSPRSE_MPICUDADense() >>>>> that is smarter about the needed MPI communication so it only moves exactly >>>>> what it needs to the other ranks and it does the off-diagonal part of the >>>>> product on the GPU so it does not need to copy the result up to the CPU. 
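A minimal sketch of the logging-stage pattern suggested just below (warmup outside the stage, timed loop inside it, so -log_view reports the benchmark separately from setup), assuming placeholder matrix names and loop counts; the stage name "MyComputation" is reused from the log above:

```
#include <petscmat.h>
#include <petsclog.h>

/* Sketch only: register a log stage, warm up outside it, then time
   repeated C = A*B products inside the stage so -log_view reports
   them separately. Names and counts are illustrative placeholders. */
static PetscErrorCode BenchSpMM(Mat A, Mat B, PetscInt nwarm, PetscInt niter)
{
  Mat            C = NULL;
  PetscLogStage  stage;
  PetscInt       i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscLogStageRegister("MyComputation", &stage);CHKERRQ(ierr);

  /* Warmup (also creates C symbolically) -- kept outside the stage */
  ierr = MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  for (i = 1; i < nwarm; ++i) {
    ierr = MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  }

  /* Timed region -- shows up as a separate event stage in -log_view */
  ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
  for (i = 0; i < niter; ++i) {
    ierr = MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);CHKERRQ(ierr);
  }
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = MatDestroy(&C);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```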
>>>>> >>>>> >>>> MPIAIJCUSPARSE uses MatProductSetFromOptions_MPIAIJBACKEND >>>> >>>> Rohan >>>> I would suggest to add PetscLogStage around your performance loop (do a >>>> warmup outside of it) and send the relevant portion of the log >>>> >>>> >>>>> Barry >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- >>>>> >>>>> /g/g15/yadav2/taco/petsc/bin/benchmark on a named lassen457 with 2 processors, by yadav2 Wed Feb 2 17:23:19 2022 >>>>> Using Petsc Release Version 3.16.3, unknown >>>>> >>>>> Max Max/Min Avg Total >>>>> Time (sec): 1.163e+02 1.000 1.163e+02 >>>>> Objects: 4.800e+01 1.000 4.800e+01 >>>>> Flop: 6.338e+11 1.065 6.144e+11 1.229e+12 >>>>> Flop/sec: 5.451e+09 1.065 5.284e+09 1.057e+10 >>>>> MPI Messages: 3.500e+01 1.000 3.500e+01 7.000e+01 >>>>> MPI Message Lengths: 2.544e+09 1.000 7.267e+07 5.087e+09 >>>>> MPI Reductions: 8.100e+01 1.000 >>>>> >>>>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >>>>> e.g., VecAXPY() for real vectors of length N --> 2N flop >>>>> and VecAXPY() for complex vectors of length N --> 8N flop >>>>> >>>>> Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- >>>>> Avg %Total Avg %Total Count %Total Avg %Total Count %Total >>>>> 0: Main Stage: 1.1628e+02 100.0% 1.2288e+12 100.0% 7.000e+01 100.0% 7.267e+07 100.0% 6.300e+01 77.8% >>>>> >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> See the 'Profiling' chapter of the users' manual for details on interpreting output. >>>>> Phase summary info: >>>>> Count: number of times phase was executed >>>>> Time and Flop: Max - maximum over all processors >>>>> Ratio - ratio of maximum to minimum over all processors >>>>> Mess: number of messages sent >>>>> AvgLen: average message length (bytes) >>>>> Reduct: number of global reductions >>>>> Global: entire computation >>>>> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>>>>> %T - percent time in this phase %F - percent flop in this phase >>>>> %M - percent messages in this phase %L - percent message lengths in this phase >>>>> %R - percent reductions in this phase >>>>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) >>>>> GPU Mflop/s: 10e-6 * (sum of flop on GPU over all processors)/(max GPU time over all processors) >>>>> CpuToGpu Count: total number of CPU to GPU copies per processor >>>>> CpuToGpu Size (Mbytes): 10e-6 * (total size of CPU to GPU copies per processor) >>>>> GpuToCpu Count: total number of GPU to CPU copies per processor >>>>> GpuToCpu Size (Mbytes): 10e-6 * (total size of GPU to CPU copies per processor) >>>>> GPU %F: percent flops on GPU in this event >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total GPU - CpuToGpu - - GpuToCpu - GPU >>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s Mflop/s Count Size Count Size %F >>>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>>> >>>>> --- Event Stage 0: Main Stage >>>>> >>>>> BuildTwoSided 2 1.0 4.4400e-01567.5 0.00e+00 0.0 2.0e+00 4.0e+00 2.0e+00 0 0 3 0 2 0 0 3 0 3 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> BuildTwoSidedF 1 1.0 4.4395e-0115659.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> MatAssemblyBegin 32 1.0 4.4400e-017378.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 1 0 0 0 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> MatAssemblyEnd 32 1.0 1.8511e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 7 1 0 0 0 10 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> MatZeroEntries 1 1.0 3.3306e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> MatLoad 1 1.0 1.7220e+01 1.0 0.00e+00 0.0 6.0e+00 -8.8e+07 2.1e+01 15 0 9-10 26 15 0 9-10 33 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> MatMatMultSym 60 1.0 9.2215e-01 2.6 0.00e+00 0.0 4.0e+00 7.3e+05 3.2e+01 1 0 6 0 40 1 0 6 0 51 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> MatMatMultNum 30 1.0 4.2967e+01 1.0 6.34e+11 1.1 6.0e+01 9.4e+07 0.0e+00 37100 86110 0 37100 86110 0 28598 920026 2 6.71e+03 30 8.73e+04 98 >>>>> MatCUSPARSCopyTo 1 1.0 4.4761e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 3.80e+03 0 0.00e+00 0 >>>>> MatDenseCopyTo 1 1.0 2.2742e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 1 2.91e+03 0 0.00e+00 0 >>>>> MatDenseCopyFrom 31 1.0 1.2006e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0 0 0 0.00e+00 31 9.02e+04 0 >>>>> VecSet 3 1.0 4.1917e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> SFSetGraph 1 1.0 1.9180e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> SFSetUp 1 1.0 1.3672e-02 1.1 0.00e+00 0.0 4.0e+00 7.3e+05 1.0e+00 0 0 6 0 1 0 0 6 0 2 0 0 0 0.00e+00 0 0.00e+00 0 >>>>> --------------------------------------------------------------------------------------------------------------------------------------------------------------- >>>>> >>>>> Memory usage is given in bytes: >>>>> >>>>> Object Type Creations Destructions Memory Descendants' Mem. >>>>> Reports information only for process 0. >>>>> >>>>> --- Event Stage 0: Main Stage >>>>> >>>>> Matrix 37 30 2867511840 0. 
>>>>> Viewer 2 0 0 0. >>>>> Vector 4 1 1792 0. >>>>> Index Set 2 2 1495248 0. >>>>> Star Forest Graph 3 0 0 0. >>>>> ======================================================================================================================== >>>>> Average time to get PetscTime(): 3.83e-08 >>>>> Average time for MPI_Barrier(): 7.874e-07 >>>>> Average time for zero size MPI_Send(): 3.4035e-06 >>>>> #PETSc Option Table entries: >>>>> -bench spmm >>>>> -enable_gpu >>>>> -log_view >>>>> -mat_type aijcusparse >>>>> -matload_block_size 1 >>>>> -matrix /p/gpfs1/yadav2/tensors/petsc/arabic-2005.petsc >>>>> -n 20 >>>>> -vec_type cuda >>>>> -warmup 10 >>>>> ``` >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> >>>>> Rohan Yadav >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> Stefano >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbuerkle at web.de Thu Feb 3 18:43:22 2022 From: mbuerkle at web.de (Marius Buerkle) Date: Fri, 4 Feb 2022 01:43:22 +0100 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: <87a6f8og4i.fsf@jedbrown.org> References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> <87a6f8og4i.fsf@jedbrown.org> Message-ID: Ok. I did not know that. I was under the impression that MatPreallocator does actually not allocate the nonzeros and just stores the nonzero structure. But if this is not the case then of course I just duplicate the matrix. Thanks for the feedback. > Gesendet: Donnerstag, den 03.02.2022 um 03:09 Uhr > Von: "Jed Brown" > An: "Marius Buerkle" , "Patrick Sanan" > Cc: "PETSc users list" , petsc-dev > Betreff: Re: Aw: Re: [petsc-dev] [petsc-users] MatPreallocatorPreallocate segfault with PETSC 3.16 > > Marius Buerkle writes: > > > Thanks for they reply. Yes the example works, this is how I was doing it before. But the matrix is rather big and i need a matrix with the same structure at various points in my code. So it was convenient to create the matrix with preallocate, destroy it after using it to free the memory and creating it again later with the same preallocate. > > Anyway it works with MatDuplicate for now. > > I think it should take *less* memory to destroy the preallocator and duplicate the actual matrix than to destroy the matrix and persist the preallocator. If that is not the case (or close enough), we can make it so. From jed at jedbrown.org Thu Feb 3 19:02:18 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 03 Feb 2022 18:02:18 -0700 Subject: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16 In-Reply-To: References: <87tudi62bz.fsf@jedbrown.org> <87r18m61pr.fsf@jedbrown.org> <87a6f8og4i.fsf@jedbrown.org> Message-ID: <87o83nmokl.fsf@jedbrown.org> MatPreallocator stores "the nonzero structure" in a hash table so it can be easily updated. A normal Mat stores it in a compressed (CSR) format that is expensive to update. Marius Buerkle writes: > Ok. I did not know that. I was under the impression that MatPreallocator does actually not allocate the nonzeros and just stores the nonzero structure. But if this is not the case then of course I just duplicate the matrix. > > Thanks for the feedback. > >> Gesendet: Donnerstag, den 03.02.2022 um 03:09 Uhr >> Von: "Jed Brown" >> An: "Marius Buerkle" , "Patrick Sanan" >> Cc: "PETSc users list" , petsc-dev >> Betreff: Re: Aw: Re: [petsc-dev] [petsc-users] MatPreallocatorPreallocate segfault with PETSC 3.16 >> >> Marius Buerkle writes: >> >> > Thanks for they reply. 
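A minimal sketch of the workflow discussed in this thread -- record the pattern in a MATPREALLOCATOR, preallocate the first matrix from it once, destroy the preallocator, and obtain further matrices with the same structure via MatDuplicate() -- assuming placeholder sizes and a made-up two-entries-per-row stencil:

```
#include <petscmat.h>

/* Sketch only: preallocate A from a MATPREALLOCATOR, release the
   preallocator (its hash structure is consumed by the preallocation
   call, per this thread), then duplicate A to get B with the same
   nonzero structure. Sizes and the stencil are illustrative. */
PetscErrorCode BuildTwoMatrices(MPI_Comm comm, PetscInt n, Mat *A, Mat *B)
{
  Mat            preM;
  PetscInt       rstart, rend, i;
  PetscScalar    v = 0.0;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* Record the nonzero pattern in the preallocator's hash table */
  ierr = MatCreate(comm, &preM);CHKERRQ(ierr);
  ierr = MatSetSizes(preM, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(preM, MATPREALLOCATOR);CHKERRQ(ierr);
  ierr = MatSetUp(preM);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(preM, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; ++i) {
    ierr = MatSetValue(preM, i, i, v, INSERT_VALUES);CHKERRQ(ierr);
    if (i + 1 < n) { ierr = MatSetValue(preM, i, i + 1, v, INSERT_VALUES);CHKERRQ(ierr); }
  }
  ierr = MatAssemblyBegin(preM, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(preM, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Preallocate A once, then release the preallocator */
  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATAIJ);CHKERRQ(ierr);
  ierr = MatPreallocatorPreallocate(preM, PETSC_TRUE, *A);CHKERRQ(ierr);
  ierr = MatDestroy(&preM);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* B gets the same structure without reusing the preallocator */
  ierr = MatDuplicate(*A, MAT_DO_NOT_COPY_VALUES, B);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```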
Yes the example works, this is how I was doing it before. But the matrix is rather big and i need a matrix with the same structure at various points in my code. So it was convenient to create the matrix with preallocate, destroy it after using it to free the memory and creating it again later with the same preallocate. >> > Anyway it works with MatDuplicate for now. >> >> I think it should take *less* memory to destroy the preallocator and duplicate the actual matrix than to destroy the matrix and persist the preallocator. If that is not the case (or close enough), we can make it so. From mfadams at lbl.gov Thu Feb 3 21:35:23 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 3 Feb 2022 22:35:23 -0500 Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: <87wnibmtoc.fsf@jedbrown.org> References: <871r0jokb0.fsf@jedbrown.org> <87wnibmtoc.fsf@jedbrown.org> Message-ID: On Thu, Feb 3, 2022 at 6:12 PM Jed Brown wrote: > "Evstafyeva,Tamara" writes: > > > Thanks for your prompt reply. I am attaching the makefile; the line for > execution ?make all -j 4? > > > > I guess using both was my attempt at trying multiple things until they > work ? using either one or the other produced the same error for me. > > petsc.pc isn't being used here. What's probably happening is that Chombo's > mk/Make.test has rules (not just variables) and those rules are replacing > PETSc rules. Mark Adams (Cc'd) works with Chombo and will know for sure. It seems like you are trying to build a makefile system for Chombo, with PETSc, from scratch. I think you want to go through Chombo. They have worked out makefiles with PETSc and we don't want to try to recreate that. It appears that you did not find any instructions on building Chombo with PETSc, including example makefiles. They exist, but I don't know anything about the Chombo distribution and support. I would contact Chombo and see if they can help you. I wish I could be of more help, Good luck, Mark > I'm mildly afraid of Chombo and would use Makefile.user to extract exactly > the information you want from PETSc. > > Alternatively, include only ${PETSC_DIR}/lib/petsc/conf/variables from > PETSc and append the variables as needed to Chombo. The downside of this is > that some variables are not namespaced so there could be conflicts > depending on Chombo's naming conventions. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yuanxi at advancesoft.jp Fri Feb 4 01:58:58 2022 From: yuanxi at advancesoft.jp (=?UTF-8?B?6KKB54WV?=) Date: Fri, 4 Feb 2022 16:58:58 +0900 Subject: [petsc-users] Is it possible to enforce some mesh entities owned by the same CPU? In-Reply-To: References: Message-ID: Well, it seems like there is a long way to go. Thanks for your answer.. Yuan 2022?2?3?(?) 9:50 Matthew Knepley : > On Wed, Feb 2, 2022 at 7:32 PM ?? wrote: > >> Hello everyone >> >> I need to enforce some specific nodes, for example, two nodes i,j in my >> finite element mesh, to be owned by the same CPU when doing DMPlex >> partition. Are there any means to implement it? >> > > It might be possible using edge weights in the partitioner. However, we > have no automatic support for that. To do it manually, you would probably > have to > make the CSR matrix for the mesh, wrap that in a Mat, add values for > weights, call MatPartition, and then feed that partition to Plex. It is > doable, but it would > be some amount of work. > > Thanks, > > Matt > > >> Thanks in advance. 
>> >> Yuan >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From te307 at cam.ac.uk Fri Feb 4 05:11:37 2022 From: te307 at cam.ac.uk (Evstafyeva,Tamara) Date: Fri, 4 Feb 2022 11:11:37 +0000 Subject: [petsc-users] cannot open source file "petsc.h" In-Reply-To: References: <871r0jokb0.fsf@jedbrown.org> <87wnibmtoc.fsf@jedbrown.org> Message-ID: Hi all again, Thanks again for your help and very quick replies. I managed to fix it at the end ? unfortunately I cannot exactly pin-point what went wrong as I was trying different things until they worked ? not very scientific of me xD. But the steps were as follows, I am not sure if this will be useful at all, but here it is anyway, 1. Based on our emails, I decided to specify the same cxx, cc, fc flags for PETSC when configuring it as in the Chombo?s Make.defs.local file (the file we use to compile the Chombo library). For example, in my case I had to use COPTFLAGS="-O3 -xCOMMON-AVX512" CXXOPTFLAGS="-O3 -xCOMMON-AVX512" as per the flags I use to compile Chombo. And of course, one has to use -lpetsc in syslibflags. 1. Then I created a module-file to set my environment variables, like this #%Module set PETSC_DIR "$::env(HOME)/petsc" set PETSC_ARCH "arch-linux-c-debug" prepend-path LD_LIBRARY_PATH "$PETSC_DIR/$PETSC_ARCH/lib" prepend-path LIBRARY_PATH "$PETSC_DIR/$PETSC_ARCH/lib" prepend-path CPATH "$PETSC_DIR/include:$PETSC_DIR/$PETSC_ARCH/include" prepend-path PKG_CONFIG_PATH "$PETSC_DIR/$PETSC_ARCH/lib/pkgconfig" prepend-path CMAKE_PREFIX_PATH "$PETSC_DIR/$PETSC_ARCH/" setenv PETSC_DIR $PETSC_DIR setenv PETSC_ARCH $PETSC_ARCH 1. Loading module, and compiling everything magically got rid off the error. Best, Tamara From: Mark Adams Date: Friday, 4 February 2022, 03:35 To: Jed Brown Cc: Evstafyeva,Tamara , petsc-users Subject: Re: [petsc-users] cannot open source file "petsc.h" On Thu, Feb 3, 2022 at 6:12 PM Jed Brown > wrote: "Evstafyeva,Tamara" > writes: > Thanks for your prompt reply. I am attaching the makefile; the line for execution ?make all -j 4? > > I guess using both was my attempt at trying multiple things until they work ? using either one or the other produced the same error for me. petsc.pc isn't being used here. What's probably happening is that Chombo's mk/Make.test has rules (not just variables) and those rules are replacing PETSc rules. Mark Adams (Cc'd) works with Chombo and will know for sure. It seems like you are trying to build a makefile system for Chombo, with PETSc, from scratch. I think you want to go through Chombo. They have worked out makefiles with PETSc and we don't want to try to recreate that. It appears that you did not find any instructions on building Chombo with PETSc, including example makefiles. They exist, but I don't know anything about the Chombo distribution and support. I would contact Chombo and see if they can help you. I wish I could be of more help, Good luck, Mark I'm mildly afraid of Chombo and would use Makefile.user to extract exactly the information you want from PETSc. Alternatively, include only ${PETSC_DIR}/lib/petsc/conf/variables from PETSc and append the variables as needed to Chombo. The downside of this is that some variables are not namespaced so there could be conflicts depending on Chombo's naming conventions. 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Fri Feb 4 10:35:10 2022 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 4 Feb 2022 17:35:10 +0100 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem Message-ID: Hi Satish, I just discovered that PETSc 3.16.4 fails to link against the latest AMD BLIS and LibFLAME libraries on a Linux box. ------------------------------------------------------------------------------- You set a value for --with-blas-lib= and --with-lapack-lib=, but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used ******************************************************************************* My previous experience with 3.9.4 on the same system was fully successful. Looking in the configure logs (attached) reveals small difference in the linking compared to 3.9.4 Could you please make a guess what went wrong? Best regards, Anton -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 1593273 bytes Desc: not available URL: From knepley at gmail.com Fri Feb 4 10:39:36 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 4 Feb 2022 11:39:36 -0500 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: Message-ID: On Fri, Feb 4, 2022 at 11:35 AM Anton Popov wrote: > Hi Satish, > > I just discovered that PETSc 3.16.4 fails to link against the latest AMD > BLIS and LibFLAME libraries on a Linux box. > > > ------------------------------------------------------------------------------- > You set a value for --with-blas-lib= and --with-lapack-lib=, > but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and > ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used > > ******************************************************************************* > > My previous experience with 3.9.4 on the same system was fully > successful. Looking in the configure logs (attached) reveals small > difference in the linking compared to 3.9.4 > > Could you please make a guess what went wrong? > Down in the log I see: /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): Did the gfortran library move or get upgraded? Thanks, Matt > Best regards, > > Anton > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Fri Feb 4 11:00:09 2022 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 4 Feb 2022 18:00:09 +0100 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: Message-ID: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> On 04.02.22 17:39, Matthew Knepley wrote: > On Fri, Feb 4, 2022 at 11:35 AM Anton Popov wrote: > > Hi Satish, > > I just discovered that PETSc 3.16.4 fails to link against the > latest AMD > BLIS and LibFLAME libraries on a Linux box. 
> > ------------------------------------------------------------------------------- > You set a value for --with-blas-lib= and > --with-lapack-lib=, > but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and > ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used > ******************************************************************************* > > My previous experience with 3.9.4 on the same system was fully > successful. Looking in the configure logs (attached) reveals small > difference in the linking compared to 3.9.4 > > Could you please make a guess what went wrong? > > > Down in the log I see: > > /usr/bin/ld: warning: libgfortran.so.5, needed by > /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using > -rpath or -rpath-link): Thanks Matt, I'll try. > > Did the gfortran library move or get upgraded? Not at all. I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. Best, Anton > > ? Thanks, > > ? ? ?Matt > > Best regards, > > Anton > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sasyed at fnal.gov Fri Feb 4 11:09:01 2022 From: sasyed at fnal.gov (Sajid Ali Syed) Date: Fri, 4 Feb 2022 17:09:01 +0000 Subject: [petsc-users] Sparse solvers for distributed GPU matrices/vectors arising from 3D poisson eq Message-ID: Hi PETSc-developers, Could the linear solver table (at https://petsc.org/main/overview/linear_solve_table/) be updated with information regarding direct solvers that work on mpiaijkokkos/kokkos (or mpiaijcusparse/cuda) matrix/vector types? The use case for this solver would be to repeatedly invert the same matrix so any solver that is able to perform the SpTRSV phase entirely using GPU matrices/vectors would be helpful (even if the initial factorization is performed using CPU matrices/vectors with GPU offload), this functionality of course being the corresponding distributed memory counterpart to the current device-solve capabilities of the seqaijkokkos matrix type (provided by the kokkos-kernel SpTRSV routines). The system arises from a (7-pt) finite difference discretization of the 3D Poisson equation with a mesh of 256x256x1024 (likely necessitate using multiple GPUs) with dirichlet boundary conditions. The recent article on PETScSF (arXiv:2102.13018) describes an asynchronous CG solver that works well on communication bound multi-GPU systems. Is this solver available now and can it be combined with GAMG/hypre preconditioning ? Summary of Sparse Linear Solvers Available In PETSc ? PETSc v3.16.2-540-g1213a6437a documentation Last updated on 2022-01-01T03:38:46-0600 (v3.16.2-540-g1213a6437a). petsc.org Thank You, Sajid Ali (he/him) | Research Associate Scientific Computing Division Fermi National Accelerator Laboratory s-sajid-ali.github.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 4 11:13:12 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 4 Feb 2022 12:13:12 -0500 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> Message-ID: Can you send the 3.9 log? 
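A minimal sketch related to the GPU solver question above (not to the log request here): the GAMG-preconditioned CG setup mentioned there is normally driven entirely by runtime options, e.g. -ksp_type cg -pc_type gamg together with -mat_type aijcusparse -vec_type cuda (or aijkokkos/kokkos); this does not address the distributed triangular-solve part of the question. A and b are assumed to be assembled elsewhere by the application.

```
#include <petscksp.h>

/* Sketch only: let the Krylov method and preconditioner be picked up
   from the options database, e.g.
     -mat_type aijcusparse -vec_type cuda   -ksp_type cg -pc_type gamg
     -mat_type aijkokkos   -vec_type kokkos -ksp_type cg -pc_type gamg
   assuming the caller created A/b with MatSetFromOptions/VecSetFromOptions. */
PetscErrorCode SolvePoisson(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(PetscObjectComm((PetscObject)A), &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* -ksp_type, -pc_type, ... */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```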
Thanks, Matt On Fri, Feb 4, 2022 at 12:00 PM Anton Popov wrote: > > On 04.02.22 17:39, Matthew Knepley wrote: > > On Fri, Feb 4, 2022 at 11:35 AM Anton Popov wrote: > >> Hi Satish, >> >> I just discovered that PETSc 3.16.4 fails to link against the latest AMD >> BLIS and LibFLAME libraries on a Linux box. >> >> >> ------------------------------------------------------------------------------- >> You set a value for --with-blas-lib= and --with-lapack-lib=, >> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and >> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used >> >> ******************************************************************************* >> >> My previous experience with 3.9.4 on the same system was fully >> successful. Looking in the configure logs (attached) reveals small >> difference in the linking compared to 3.9.4 >> >> Could you please make a guess what went wrong? >> > > Down in the log I see: > > /usr/bin/ld: warning: libgfortran.so.5, needed by > /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using > -rpath or -rpath-link): > > Thanks Matt, I'll try. > > > Did the gfortran library move or get upgraded? > > Not at all. I have configured 3.9.4 just now to make a test, and it > perfectly finds all the libraries. So there must be something that 3.16.4 > does differently. > > Best, > > Anton > > > Thanks, > > Matt > > >> Best regards, >> >> Anton >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Feb 4 11:18:33 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 4 Feb 2022 12:18:33 -0500 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> Message-ID: Please do ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so You may need to list the gfortran library directory of libgfortran.so.5 it needs to use in LDFLAGS passed to PETSc configure Barry Note: Even though you explicitly listed a static library of libflame to use our configure is goofy and loses that information and wants to link with the shared version > On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: > > > > On 04.02.22 17:39, Matthew Knepley wrote: >> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov > wrote: >> Hi Satish, >> >> I just discovered that PETSc 3.16.4 fails to link against the latest AMD >> BLIS and LibFLAME libraries on a Linux box. >> >> ------------------------------------------------------------------------------- >> You set a value for --with-blas-lib= and --with-lapack-lib=, >> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and >> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used >> ******************************************************************************* >> >> My previous experience with 3.9.4 on the same system was fully >> successful. Looking in the configure logs (attached) reveals small >> difference in the linking compared to 3.9.4 >> >> Could you please make a guess what went wrong? 
>> >> Down in the log I see: >> >> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): > Thanks Matt, I'll try. > >> >> Did the gfortran library move or get upgraded? > Not at all. I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. > > Best, > > Anton > >> >> Thanks, >> >> Matt >> >> Best regards, >> >> Anton >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Feb 4 11:27:21 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 4 Feb 2022 11:27:21 -0600 (CST) Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> Message-ID: Probably best if you can use the same version of gfortran to build both petsc and libflame/blis > /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): This is probably an ignore-able warning - but configure defaults to -Werror mode here. Wrt forcing link with static libraries - you can try: LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" [instead of --with-blas-lib= --with-lapack-lib= options]. Satish On Fri, 4 Feb 2022, Barry Smith wrote: > > Please do > > ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so > > You may need to list the gfortran library directory of libgfortran.so.5 it needs to use in LDFLAGS passed to PETSc configure > > Barry > > Note: Even though you explicitly listed a static library of libflame to use our configure is goofy and loses that information and wants to link with the shared version > > > > On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: > > > > > > > > On 04.02.22 17:39, Matthew Knepley wrote: > >> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov > wrote: > >> Hi Satish, > >> > >> I just discovered that PETSc 3.16.4 fails to link against the latest AMD > >> BLIS and LibFLAME libraries on a Linux box. > >> > >> ------------------------------------------------------------------------------- > >> You set a value for --with-blas-lib= and --with-lapack-lib=, > >> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and > >> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used > >> ******************************************************************************* > >> > >> My previous experience with 3.9.4 on the same system was fully > >> successful. Looking in the configure logs (attached) reveals small > >> difference in the linking compared to 3.9.4 > >> > >> Could you please make a guess what went wrong? > >> > >> Down in the log I see: > >> > >> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): > > Thanks Matt, I'll try. > > > >> > >> Did the gfortran library move or get upgraded? > > Not at all. I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. 
> > > > Best, > > > > Anton > > > >> > >> Thanks, > >> > >> Matt > >> > >> Best regards, > >> > >> Anton > >> > >> > >> -- > >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > > From bsmith at petsc.dev Fri Feb 4 11:33:19 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 4 Feb 2022 12:33:19 -0500 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> Message-ID: > On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: > > Probably best if you can use the same version of gfortran to build both petsc and libflame/blis > >> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): > > This is probably an ignore-able warning - but configure defaults to -Werror mode here. Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be ignored since a link cannot succeed. Flame presumably contains a lot of old Fortran code from Lapack so would normally need the fortran libraries. > > Wrt forcing link with static libraries - you can try: > > LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" > > [instead of --with-blas-lib= --with-lapack-lib= options]. > > Satish > > > On Fri, 4 Feb 2022, Barry Smith wrote: > >> >> Please do >> >> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so >> >> You may need to list the gfortran library directory of libgfortran.so.5 it needs to use in LDFLAGS passed to PETSc configure >> >> Barry >> >> Note: Even though you explicitly listed a static library of libflame to use our configure is goofy and loses that information and wants to link with the shared version >> >> >>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: >>> >>> >>> >>> On 04.02.22 17:39, Matthew Knepley wrote: >>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov > wrote: >>>> Hi Satish, >>>> >>>> I just discovered that PETSc 3.16.4 fails to link against the latest AMD >>>> BLIS and LibFLAME libraries on a Linux box. >>>> >>>> ------------------------------------------------------------------------------- >>>> You set a value for --with-blas-lib= and --with-lapack-lib=, >>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and >>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used >>>> ******************************************************************************* >>>> >>>> My previous experience with 3.9.4 on the same system was fully >>>> successful. Looking in the configure logs (attached) reveals small >>>> difference in the linking compared to 3.9.4 >>>> >>>> Could you please make a guess what went wrong? >>>> >>>> Down in the log I see: >>>> >>>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): >>> Thanks Matt, I'll try. >>> >>>> >>>> Did the gfortran library move or get upgraded? >>> Not at all. I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. 
>>> >>> Best, >>> >>> Anton >>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Best regards, >>>> >>>> Anton >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >> >> > From balay at mcs.anl.gov Fri Feb 4 11:38:41 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 4 Feb 2022 11:38:41 -0600 (CST) Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> Message-ID: <28e250ff-3234-445-17d6-c78c6a89c613@mcs.anl.gov> On Fri, 4 Feb 2022, Barry Smith wrote: > > > > On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: > > > > Probably best if you can use the same version of gfortran to build both petsc and libflame/blis > > > >> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): > > > > This is probably an ignore-able warning - but configure defaults to -Werror mode here. > > Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be ignored since a link cannot succeed. Flame presumably contains a lot of old Fortran code from Lapack so would normally need the fortran libraries. Its a warning not an error. And we already have a list of excludes (of such warnings) to ignore in configure Satish > > > > > Wrt forcing link with static libraries - you can try: > > > > LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" > > > > [instead of --with-blas-lib= --with-lapack-lib= options]. > > > > Satish > > > > > > On Fri, 4 Feb 2022, Barry Smith wrote: > > > >> > >> Please do > >> > >> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so > >> > >> You may need to list the gfortran library directory of libgfortran.so.5 it needs to use in LDFLAGS passed to PETSc configure > >> > >> Barry > >> > >> Note: Even though you explicitly listed a static library of libflame to use our configure is goofy and loses that information and wants to link with the shared version > >> > >> > >>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: > >>> > >>> > >>> > >>> On 04.02.22 17:39, Matthew Knepley wrote: > >>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov > wrote: > >>>> Hi Satish, > >>>> > >>>> I just discovered that PETSc 3.16.4 fails to link against the latest AMD > >>>> BLIS and LibFLAME libraries on a Linux box. > >>>> > >>>> ------------------------------------------------------------------------------- > >>>> You set a value for --with-blas-lib= and --with-lapack-lib=, > >>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and > >>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used > >>>> ******************************************************************************* > >>>> > >>>> My previous experience with 3.9.4 on the same system was fully > >>>> successful. Looking in the configure logs (attached) reveals small > >>>> difference in the linking compared to 3.9.4 > >>>> > >>>> Could you please make a guess what went wrong? > >>>> > >>>> Down in the log I see: > >>>> > >>>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): > >>> Thanks Matt, I'll try. > >>> > >>>> > >>>> Did the gfortran library move or get upgraded? > >>> Not at all. 
I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. > >>> > >>> Best, > >>> > >>> Anton > >>> > >>>> > >>>> Thanks, > >>>> > >>>> Matt > >>>> > >>>> Best regards, > >>>> > >>>> Anton > >>>> > >>>> > >>>> -- > >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>> -- Norbert Wiener > >>>> > >>>> https://www.cse.buffalo.edu/~knepley/ > >> > >> > > > From bsmith at petsc.dev Fri Feb 4 11:40:33 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 4 Feb 2022 12:40:33 -0500 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: <28e250ff-3234-445-17d6-c78c6a89c613@mcs.anl.gov> References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> <28e250ff-3234-445-17d6-c78c6a89c613@mcs.anl.gov> Message-ID: <97378FA9-CFBB-471F-A3EB-4D998EEBF7FB@petsc.dev> > On Feb 4, 2022, at 12:38 PM, Satish Balay wrote: > > On Fri, 4 Feb 2022, Barry Smith wrote: > >> >> >>> On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: >>> >>> Probably best if you can use the same version of gfortran to build both petsc and libflame/blis >>> >>>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): >>> >>> This is probably an ignore-able warning - but configure defaults to -Werror mode here. >> >> Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be ignored since a link cannot succeed. Flame presumably contains a lot of old Fortran code from Lapack so would normally need the fortran libraries. > > Its a warning not an error. > > And we already have a list of excludes (of such warnings) to ignore in configure I understand it is a warning. But I am questioning how one could actually use this libflame library if one of its dependencies cannot be found. > > Satish > >> >>> >>> Wrt forcing link with static libraries - you can try: >>> >>> LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" >>> >>> [instead of --with-blas-lib= --with-lapack-lib= options]. >>> >>> Satish >>> >>> >>> On Fri, 4 Feb 2022, Barry Smith wrote: >>> >>>> >>>> Please do >>>> >>>> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so >>>> >>>> You may need to list the gfortran library directory of libgfortran.so.5 it needs to use in LDFLAGS passed to PETSc configure >>>> >>>> Barry >>>> >>>> Note: Even though you explicitly listed a static library of libflame to use our configure is goofy and loses that information and wants to link with the shared version >>>> >>>> >>>>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: >>>>> >>>>> >>>>> >>>>> On 04.02.22 17:39, Matthew Knepley wrote: >>>>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov > wrote: >>>>>> Hi Satish, >>>>>> >>>>>> I just discovered that PETSc 3.16.4 fails to link against the latest AMD >>>>>> BLIS and LibFLAME libraries on a Linux box. >>>>>> >>>>>> ------------------------------------------------------------------------------- >>>>>> You set a value for --with-blas-lib= and --with-lapack-lib=, >>>>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and >>>>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used >>>>>> ******************************************************************************* >>>>>> >>>>>> My previous experience with 3.9.4 on the same system was fully >>>>>> successful. 
Looking in the configure logs (attached) reveals small >>>>>> difference in the linking compared to 3.9.4 >>>>>> >>>>>> Could you please make a guess what went wrong? >>>>>> >>>>>> Down in the log I see: >>>>>> >>>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): >>>>> Thanks Matt, I'll try. >>>>> >>>>>> >>>>>> Did the gfortran library move or get upgraded? >>>>> Not at all. I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. >>>>> >>>>> Best, >>>>> >>>>> Anton >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Anton >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> > From balay at mcs.anl.gov Fri Feb 4 11:44:47 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 4 Feb 2022 11:44:47 -0600 (CST) Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: <97378FA9-CFBB-471F-A3EB-4D998EEBF7FB@petsc.dev> References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> <28e250ff-3234-445-17d6-c78c6a89c613@mcs.anl.gov> <97378FA9-CFBB-471F-A3EB-4D998EEBF7FB@petsc.dev> Message-ID: <62699aa-8b67-cb37-9b50-f78efd9ad2aa@mcs.anl.gov> On Fri, 4 Feb 2022, Barry Smith wrote: > > > > On Feb 4, 2022, at 12:38 PM, Satish Balay wrote: > > > > On Fri, 4 Feb 2022, Barry Smith wrote: > > > >> > >> > >>> On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: > >>> > >>> Probably best if you can use the same version of gfortran to build both petsc and libflame/blis > >>> > >>>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): > >>> > >>> This is probably an ignore-able warning - but configure defaults to -Werror mode here. > >> > >> Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be ignored since a link cannot succeed. Flame presumably contains a lot of old Fortran code from Lapack so would normally need the fortran libraries. > > > > Its a warning not an error. > > > > And we already have a list of excludes (of such warnings) to ignore in configure > > I understand it is a warning. But I am questioning how one could actually use this libflame library if one of its dependencies cannot be found. libgfortran.so.4 [or .6?] might be found and used. Yeah - ideally that should be an error - but the compiler flags it as warning not an error. [and I don't think configure should convert these warnings to errors - even thought things might break for the user - at a later stage - due to this discrepancy..] Satish > > > > > > Satish > > > >> > >>> > >>> Wrt forcing link with static libraries - you can try: > >>> > >>> LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" > >>> > >>> [instead of --with-blas-lib= --with-lapack-lib= options]. 
> >>> > >>> Satish > >>> > >>> > >>> On Fri, 4 Feb 2022, Barry Smith wrote: > >>> > >>>> > >>>> Please do > >>>> > >>>> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so > >>>> > >>>> You may need to list the gfortran library directory of libgfortran.so.5 it needs to use in LDFLAGS passed to PETSc configure > >>>> > >>>> Barry > >>>> > >>>> Note: Even though you explicitly listed a static library of libflame to use our configure is goofy and loses that information and wants to link with the shared version > >>>> > >>>> > >>>>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: > >>>>> > >>>>> > >>>>> > >>>>> On 04.02.22 17:39, Matthew Knepley wrote: > >>>>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov > wrote: > >>>>>> Hi Satish, > >>>>>> > >>>>>> I just discovered that PETSc 3.16.4 fails to link against the latest AMD > >>>>>> BLIS and LibFLAME libraries on a Linux box. > >>>>>> > >>>>>> ------------------------------------------------------------------------------- > >>>>>> You set a value for --with-blas-lib= and --with-lapack-lib=, > >>>>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and > >>>>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used > >>>>>> ******************************************************************************* > >>>>>> > >>>>>> My previous experience with 3.9.4 on the same system was fully > >>>>>> successful. Looking in the configure logs (attached) reveals small > >>>>>> difference in the linking compared to 3.9.4 > >>>>>> > >>>>>> Could you please make a guess what went wrong? > >>>>>> > >>>>>> Down in the log I see: > >>>>>> > >>>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): > >>>>> Thanks Matt, I'll try. > >>>>> > >>>>>> > >>>>>> Did the gfortran library move or get upgraded? > >>>>> Not at all. I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. > >>>>> > >>>>> Best, > >>>>> > >>>>> Anton > >>>>> > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> Best regards, > >>>>>> > >>>>>> Anton > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>>>> -- Norbert Wiener > >>>>>> > >>>>>> https://www.cse.buffalo.edu/~knepley/ > >>>> > >>>> > >>> > >> > > > From popov at uni-mainz.de Fri Feb 4 11:48:00 2022 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 4 Feb 2022 18:48:00 +0100 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> Message-ID: <1c564368-3a38-4403-112e-79040008764b@uni-mainz.de> Thanks Matt, Barry and Satish, for? your suggestions. The problem was indeed in mismatch between the gfortran library version (system has libgfortran.so.4 libflame needs libgfortran.so.5) 3.9.4 did not detect this during configure, but only gave error later during test. 3.16.4 detected this immediately. After installing libgfortran.so.5, both PETSc versions install just fine, however I get the warning mentioned by Satish. Maybe it is indeed worth upgrading everything to the compatible versions. 
Best, Anton On 04.02.22 18:33, Barry Smith wrote: > >> On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: >> >> Probably best if you can use the same version of gfortran to build both petsc and libflame/blis >> >>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): >> This is probably an ignore-able warning - but configure defaults to -Werror mode here. > Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be ignored since a link cannot succeed. Flame presumably contains a lot of old Fortran code from Lapack so would normally need the fortran libraries. > >> Wrt forcing link with static libraries - you can try: >> >> LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" >> >> [instead of --with-blas-lib= --with-lapack-lib= options]. >> >> Satish >> >> >> On Fri, 4 Feb 2022, Barry Smith wrote: >> >>> Please do >>> >>> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so >>> >>> You may need to list the gfortran library directory of libgfortran.so.5 it needs to use in LDFLAGS passed to PETSc configure >>> >>> Barry >>> >>> Note: Even though you explicitly listed a static library of libflame to use our configure is goofy and loses that information and wants to link with the shared version >>> >>> >>>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: >>>> >>>> >>>> >>>> On 04.02.22 17:39, Matthew Knepley wrote: >>>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov > wrote: >>>>> Hi Satish, >>>>> >>>>> I just discovered that PETSc 3.16.4 fails to link against the latest AMD >>>>> BLIS and LibFLAME libraries on a Linux box. >>>>> >>>>> ------------------------------------------------------------------------------- >>>>> You set a value for --with-blas-lib= and --with-lapack-lib=, >>>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and >>>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used >>>>> ******************************************************************************* >>>>> >>>>> My previous experience with 3.9.4 on the same system was fully >>>>> successful. Looking in the configure logs (attached) reveals small >>>>> difference in the linking compared to 3.9.4 >>>>> >>>>> Could you please make a guess what went wrong? >>>>> >>>>> Down in the log I see: >>>>> >>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using -rpath or -rpath-link): >>>> Thanks Matt, I'll try. >>>> >>>>> Did the gfortran library move or get upgraded? >>>> Not at all. I have configured 3.9.4 just now to make a test, and it perfectly finds all the libraries. So there must be something that 3.16.4 does differently. >>>> >>>> Best, >>>> >>>> Anton >>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Best regards, >>>>> >>>>> Anton >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>> From balay at mcs.anl.gov Fri Feb 4 11:54:51 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 4 Feb 2022 11:54:51 -0600 (CST) Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: <1c564368-3a38-4403-112e-79040008764b@uni-mainz.de> References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> <1c564368-3a38-4403-112e-79040008764b@uni-mainz.de> Message-ID: On Fri, 4 Feb 2022, Anton Popov wrote: > Thanks Matt, Barry and Satish, for? your suggestions. > > The problem was indeed in mismatch between the gfortran library version > (system has libgfortran.so.4 libflame needs libgfortran.so.5) > > 3.9.4 did not detect this during configure, but only gave error later during > test. > > 3.16.4 detected this immediately. > > After installing libgfortran.so.5, both PETSc versions install just fine, > however I get the warning mentioned by Satish. Hm - you could install libgfortran.so.5 without the corresponding compiler? >> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so Ah - this is a binary download - not something compiled locally? I wonder if the compiler gave both an error [return code] and a warning previously [with the missing libgfortran.so.5] - as Barry was suggesting. As you say - you still get that warning [which is expected - as you are using a different version of gfortran than what libflame.so was built with] Satish > > Maybe it is indeed worth upgrading everything to the compatible versions. > > Best, > > Anton > > > On 04.02.22 18:33, Barry Smith wrote: > > > >> On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: > >> > >> Probably best if you can use the same version of gfortran to build both > >> petsc and libflame/blis > >> > >>> /usr/bin/ld: warning: libgfortran.so.5, needed by > >>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using > >>> -rpath or -rpath-link): > >> This is probably an ignore-able warning - but configure defaults to -Werror > >> mode here. > > Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be > > ignored since a link cannot succeed. Flame presumably contains a lot of > > old Fortran code from Lapack so would normally need the fortran > > libraries. > > > >> Wrt forcing link with static libraries - you can try: > >> > >> LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a > >> /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" > >> > >> [instead of --with-blas-lib= --with-lapack-lib= options]. > >> > >> Satish > >> > >> > >> On Fri, 4 Feb 2022, Barry Smith wrote: > >> > >>> Please do > >>> > >>> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so > >>> > >>> You may need to list the gfortran library directory of libgfortran.so.5 it > >>> needs to use in LDFLAGS passed to PETSc configure > >>> > >>> Barry > >>> > >>> Note: Even though you explicitly listed a static library of libflame to > >>> use our configure is goofy and loses that information and wants to link > >>> with the shared version > >>> > >>> > >>>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: > >>>> > >>>> > >>>> > >>>> On 04.02.22 17:39, Matthew Knepley wrote: > >>>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov >>>>> > wrote: > >>>>> Hi Satish, > >>>>> > >>>>> I just discovered that PETSc 3.16.4 fails to link against the latest AMD > >>>>> BLIS and LibFLAME libraries on a Linux box. 
> >>>>> > >>>>> ------------------------------------------------------------------------------- > >>>>> You set a value for --with-blas-lib= and --with-lapack-lib=, > >>>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and > >>>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used > >>>>> ******************************************************************************* > >>>>> > >>>>> My previous experience with 3.9.4 on the same system was fully > >>>>> successful. Looking in the configure logs (attached) reveals small > >>>>> difference in the linking compared to 3.9.4 > >>>>> > >>>>> Could you please make a guess what went wrong? > >>>>> > >>>>> Down in the log I see: > >>>>> > >>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by > >>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using > >>>>> -rpath or -rpath-link): > >>>> Thanks Matt, I'll try. > >>>> > >>>>> Did the gfortran library move or get upgraded? > >>>> Not at all. I have configured 3.9.4 just now to make a test, and it > >>>> perfectly finds all the libraries. So there must be something that 3.16.4 > >>>> does differently. > >>>> > >>>> Best, > >>>> > >>>> Anton > >>>> > >>>>> Thanks, > >>>>> > >>>>> Matt > >>>>> > >>>>> Best regards, > >>>>> > >>>>> Anton > >>>>> > >>>>> > >>>>> -- > >>>>> What most experimenters take for granted before they begin their > >>>>> experiments is infinitely more interesting than any results to which > >>>>> their experiments lead. > >>>>> -- Norbert Wiener > >>>>> > >>>>> https://www.cse.buffalo.edu/~knepley/ > >>>>> > >>> > > From popov at uni-mainz.de Fri Feb 4 14:17:35 2022 From: popov at uni-mainz.de (Anton Popov) Date: Fri, 4 Feb 2022 21:17:35 +0100 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> <1c564368-3a38-4403-112e-79040008764b@uni-mainz.de> Message-ID: Am 04.02.2022 um 18:54 schrieb Satish Balay: > On Fri, 4 Feb 2022, Anton Popov wrote: > >> Thanks Matt, Barry and Satish, for? your suggestions. >> >> The problem was indeed in mismatch between the gfortran library version >> (system has libgfortran.so.4 libflame needs libgfortran.so.5) >> >> 3.9.4 did not detect this during configure, but only gave error later during >> test. >> >> 3.16.4 detected this immediately. >> >> After installing libgfortran.so.5, both PETSc versions install just fine, >> however I get the warning mentioned by Satish. > Hm - you could install libgfortran.so.5 without the corresponding compiler? Yes, apparently it is possible on Ubuntu 18.04. > >>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so > Ah - this is a binary download - not something compiled locally? Yes it is the latest binary from AMD website. > > I wonder if the compiler gave both an error [return code] and a warning previously [with the missing libgfortran.so.5] - as Barry was suggesting. It was an error at runtime, configure was just fine. > > > As you say - you still get that warning [which is expected - as you are using a different version of gfortran than what libflame.so was built with] Now since libgfortran.so.5 coexists with libgfortran.so.4 it links against the proper one, but gives a warning that both versions are available. I think I will just compile BLIS and LibFLAME from sources on my system to avoid these problems altogether. Best, Anton > > Satish > >> Maybe it is indeed worth upgrading everything to the compatible versions. 
>> >> Best, >> >> Anton >> >> >> On 04.02.22 18:33, Barry Smith wrote: >>>> On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: >>>> >>>> Probably best if you can use the same version of gfortran to build both >>>> petsc and libflame/blis >>>> >>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by >>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using >>>>> -rpath or -rpath-link): >>>> This is probably an ignore-able warning - but configure defaults to -Werror >>>> mode here. >>> Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be >>> ignored since a link cannot succeed. Flame presumably contains a lot of >>> old Fortran code from Lapack so would normally need the fortran >>> libraries. >>> >>>> Wrt forcing link with static libraries - you can try: >>>> >>>> LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a >>>> /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" >>>> >>>> [instead of --with-blas-lib= --with-lapack-lib= options]. >>>> >>>> Satish >>>> >>>> >>>> On Fri, 4 Feb 2022, Barry Smith wrote: >>>> >>>>> Please do >>>>> >>>>> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so >>>>> >>>>> You may need to list the gfortran library directory of libgfortran.so.5 it >>>>> needs to use in LDFLAGS passed to PETSc configure >>>>> >>>>> Barry >>>>> >>>>> Note: Even though you explicitly listed a static library of libflame to >>>>> use our configure is goofy and loses that information and wants to link >>>>> with the shared version >>>>> >>>>> >>>>>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 04.02.22 17:39, Matthew Knepley wrote: >>>>>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov >>>>>> > wrote: >>>>>>> Hi Satish, >>>>>>> >>>>>>> I just discovered that PETSc 3.16.4 fails to link against the latest AMD >>>>>>> BLIS and LibFLAME libraries on a Linux box. >>>>>>> >>>>>>> ------------------------------------------------------------------------------- >>>>>>> You set a value for --with-blas-lib= and --with-lapack-lib=, >>>>>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and >>>>>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used >>>>>>> ******************************************************************************* >>>>>>> >>>>>>> My previous experience with 3.9.4 on the same system was fully >>>>>>> successful. Looking in the configure logs (attached) reveals small >>>>>>> difference in the linking compared to 3.9.4 >>>>>>> >>>>>>> Could you please make a guess what went wrong? >>>>>>> >>>>>>> Down in the log I see: >>>>>>> >>>>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by >>>>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using >>>>>>> -rpath or -rpath-link): >>>>>> Thanks Matt, I'll try. >>>>>> >>>>>>> Did the gfortran library move or get upgraded? >>>>>> Not at all. I have configured 3.9.4 just now to make a test, and it >>>>>> perfectly finds all the libraries. So there must be something that 3.16.4 >>>>>> does differently. >>>>>> >>>>>> Best, >>>>>> >>>>>> Anton >>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> Best regards, >>>>>>> >>>>>>> Anton >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which >>>>>>> their experiments lead. 
>>>>>>> -- Norbert Wiener >>>>>>> >>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>> >> From balay at mcs.anl.gov Fri Feb 4 15:18:56 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 4 Feb 2022 15:18:56 -0600 (CST) Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> <1c564368-3a38-4403-112e-79040008764b@uni-mainz.de> Message-ID: On Fri, 4 Feb 2022, Anton Popov wrote: > > Am 04.02.2022 um 18:54 schrieb Satish Balay: > > On Fri, 4 Feb 2022, Anton Popov wrote: > > > >> Thanks Matt, Barry and Satish, for? your suggestions. > >> > >> The problem was indeed in mismatch between the gfortran library version > >> (system has libgfortran.so.4 libflame needs libgfortran.so.5) > >> > >> 3.9.4 did not detect this during configure, but only gave error later > >> during > >> test. > >> > >> 3.16.4 detected this immediately. > >> > >> After installing libgfortran.so.5, both PETSc versions install just fine, > >> however I get the warning mentioned by Satish. > > Hm - you could install libgfortran.so.5 without the corresponding compiler? > Yes, apparently it is possible on Ubuntu 18.04. Presumably it also comes with the corresponding gcc/gfortran [gcc-8, gfortran-8?]. Perhaps these compilers would avoid this issue. > > > >>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so > > Ah - this is a binary download - not something compiled locally? > Yes it is the latest binary from AMD website. > > > > I wonder if the compiler gave both an error [return code] and a warning > > previously [with the missing libgfortran.so.5] - as Barry was suggesting. > It was an error at runtime, configure was just fine. > > > > > > As you say - you still get that warning [which is expected - as you are > > using a different version of gfortran than what libflame.so was built with] > > Now since libgfortran.so.5 coexists with libgfortran.so.4 it links against the > proper one, but gives a warning that both versions are available. > > I think I will just compile BLIS and LibFLAME from sources on my system to > avoid these problems altogether. Yeah - generally building everything with the same version of the compiler avoids these issues. Satish > > Best, > > Anton > > > > > > Satish > > > >> Maybe it is indeed worth upgrading everything to the compatible versions. > >> > >> Best, > >> > >> Anton > >> > >> > >> On 04.02.22 18:33, Barry Smith wrote: > >>>> On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: > >>>> > >>>> Probably best if you can use the same version of gfortran to build both > >>>> petsc and libflame/blis > >>>> > >>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by > >>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using > >>>>> -rpath or -rpath-link): > >>>> This is probably an ignore-able warning - but configure defaults to > >>>> -Werror > >>>> mode here. > >>> Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be > >>> ignored since a link cannot succeed. Flame presumably contains a lot > >>> of > >>> old Fortran code from Lapack so would normally need the fortran > >>> libraries. > >>> > >>>> Wrt forcing link with static libraries - you can try: > >>>> > >>>> LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a > >>>> /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" > >>>> > >>>> [instead of --with-blas-lib= --with-lapack-lib= options]. 
> >>>> > >>>> Satish > >>>> > >>>> > >>>> On Fri, 4 Feb 2022, Barry Smith wrote: > >>>> > >>>>> Please do > >>>>> > >>>>> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so > >>>>> > >>>>> You may need to list the gfortran library directory of libgfortran.so.5 > >>>>> it > >>>>> needs to use in LDFLAGS passed to PETSc configure > >>>>> > >>>>> Barry > >>>>> > >>>>> Note: Even though you explicitly listed a static library of libflame to > >>>>> use our configure is goofy and loses that information and wants to link > >>>>> with the shared version > >>>>> > >>>>> > >>>>>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>> On 04.02.22 17:39, Matthew Knepley wrote: > >>>>>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov >>>>>>> > wrote: > >>>>>>> Hi Satish, > >>>>>>> > >>>>>>> I just discovered that PETSc 3.16.4 fails to link against the latest > >>>>>>> AMD > >>>>>>> BLIS and LibFLAME libraries on a Linux box. > >>>>>>> > >>>>>>> ------------------------------------------------------------------------------- > >>>>>>> You set a value for --with-blas-lib= and --with-lapack-lib=, > >>>>>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and > >>>>>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used > >>>>>>> ******************************************************************************* > >>>>>>> > >>>>>>> My previous experience with 3.9.4 on the same system was fully > >>>>>>> successful. Looking in the configure logs (attached) reveals small > >>>>>>> difference in the linking compared to 3.9.4 > >>>>>>> > >>>>>>> Could you please make a guess what went wrong? > >>>>>>> > >>>>>>> Down in the log I see: > >>>>>>> > >>>>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by > >>>>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using > >>>>>>> -rpath or -rpath-link): > >>>>>> Thanks Matt, I'll try. > >>>>>> > >>>>>>> Did the gfortran library move or get upgraded? > >>>>>> Not at all. I have configured 3.9.4 just now to make a test, and it > >>>>>> perfectly finds all the libraries. So there must be something that > >>>>>> 3.16.4 > >>>>>> does differently. > >>>>>> > >>>>>> Best, > >>>>>> > >>>>>> Anton > >>>>>> > >>>>>>> Thanks, > >>>>>>> > >>>>>>> Matt > >>>>>>> > >>>>>>> Best regards, > >>>>>>> > >>>>>>> Anton > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> What most experimenters take for granted before they begin their > >>>>>>> experiments is infinitely more interesting than any results to which > >>>>>>> their experiments lead. > >>>>>>> -- Norbert Wiener > >>>>>>> > >>>>>>> https://www.cse.buffalo.edu/~knepley/ > >>>>>>> > >> > From junchao.zhang at gmail.com Fri Feb 4 16:21:41 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 4 Feb 2022 16:21:41 -0600 Subject: [petsc-users] Sparse solvers for distributed GPU matrices/vectors arising from 3D poisson eq In-Reply-To: References: Message-ID: On Fri, Feb 4, 2022 at 11:09 AM Sajid Ali Syed wrote: > Hi PETSc-developers, > > Could the linear solver table (at > https://petsc.org/main/overview/linear_solve_table/) be updated with > information regarding direct solvers that work on mpiaijkokkos/kokkos (or > mpiaijcusparse/cuda) matrix/vector types? 
> > The use case for this solver would be to repeatedly invert the same matrix > so any solver that is able to perform the SpTRSV phase entirely using GPU > matrices/vectors would be helpful (even if the initial factorization is > performed using CPU matrices/vectors with GPU offload), this functionality > of course being the corresponding distributed memory counterpart to the > current device-solve capabilities of the seqaijkokkos matrix type (provided > by the kokkos-kernel SpTRSV routines). The system arises from a (7-pt) > finite difference discretization of the 3D Poisson equation with a mesh of > 256x256x1024 (likely necessitate using multiple GPUs) with dirichlet > boundary conditions. > We do not have parallel SpTRSV on GPU. I think you need superlu_dist for that. > The recent article on PETScSF (arXiv:2102.13018) describes an asynchronous > CG solver that works well on communication bound multi-GPU systems. Is this > solver available now and can it be combined with GAMG/hypre preconditioning > ? > > The asynchronous CG solver is experimental. It requires a lot of things not in petsc/main. It is currently not in a state for general use. > Summary of Sparse Linear Solvers Available In PETSc ? PETSc > v3.16.2-540-g1213a6437a documentation > > Last updated on 2022-01-01T03:38:46-0600 (v3.16.2-540-g1213a6437a). > petsc.org > > Thank You, > Sajid Ali (he/him) | Research Associate > Scientific Computing Division > Fermi National Accelerator Laboratory > s-sajid-ali.github.io > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuelestes91 at gmail.com Fri Feb 4 22:47:36 2022 From: samuelestes91 at gmail.com (Samuel Estes) Date: Fri, 4 Feb 2022 22:47:36 -0600 Subject: [petsc-users] Matrix preallocation Message-ID: Hi, I have a very basic question about matrix preallocation. I am trying to use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() paradigm. I thought that I should use the MatXAIJSetPreallocation() routine since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not understand all of the inputs required for the MatXAIJSetPreallocation routine. In particular, the dnnzu and onnzu variables don't quite make sense to me. Can these be NULL? I was basically just hoping for a routine that would preallocate for either a sequential or parallel matrix depending on what was given at runtime. This routine seems to be what I want but I don't understand it very well and the documentation hasn't helped me to figure it out. A related followup question: Is it good practice to use this function or should I just use the other routines like MatSeqAIJSetPreallocation() and MatMPIAIJSetPreallocation()? And finally my last question: if I were to use the MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for preallocating memory, is it common to just call MatGetType() then call the appropriate routine depending on whether or not the matrix is parallel or not? I ask because when I have tested these routines out, it seems that MatSeqAIJSetPreallocation() works even for parallel matrices which is a bit confusing. I'm assuming that it just sets the diagonal part of the matrix? I hope that my questions were clear. Let me know if they need clarification and thanks in advance for the help! Sam -------------- next part -------------- An HTML attachment was scrubbed... 
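For reference while reading the replies below, the routine in question is declared in petscmat.h as

  PetscErrorCode MatXAIJSetPreallocation(Mat A, PetscInt bs, const PetscInt dnnz[], const PetscInt onnz[], const PetscInt dnnzu[], const PetscInt onnzu[])

where bs is the block size (1 for plain AIJ), dnnz/onnz are the per-row nonzero counts for the diagonal and off-diagonal blocks of the locally owned rows, and dnnzu/onnzu are the corresponding counts for the upper-triangular (SBAIJ-style) storage, which may be passed as NULL when no symmetric format is used.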
URL: From mfadams at lbl.gov Sat Feb 5 05:20:07 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 5 Feb 2022 06:20:07 -0500 Subject: [petsc-users] Matrix preallocation In-Reply-To: References: Message-ID: On Fri, Feb 4, 2022 at 11:47 PM Samuel Estes wrote: > Hi, > > I have a very basic question about matrix preallocation. I am trying to > use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() > paradigm. I thought that I should use the MatXAIJSetPreallocation() routine > since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not > understand all of the inputs required for the > MatXAIJSetPreallocation routine. In particular, the dnnzu and > onnzu variables don't quite make sense to me. Can these be NULL? > Yes. The integer is ignored if you provide the array. Just make sure that the integer is an upper bound on the number of non-zeros in any row. If not, MatSetValues can be very slow. Otherwise, the performance hit is not bad. > I was basically just hoping for a routine that would preallocate for > either a sequential or parallel matrix depending on what was given at > runtime. > That is what the "X" is for. It is basically syntactic sugar to call the right version of preallocate (seq, mpi, blocked versions, hypre, etc.) > This routine seems to be what I want but I don't understand it very well > and the documentation hasn't helped me to figure it out. > Sorry, > > A related followup question: Is it good practice to use this function or > should I just use the other routines like MatSeqAIJSetPreallocation() and > MatMPIAIJSetPreallocation()? > Yes this is the way to go. TL;DR Before we had a sparse matrix type explosion with GPUs, you could call both the MPI and Seq versions and be fine. You could call the blocked versions I suppose to be safe. But now there are several GPU/device enabled matrix types and "X" is convenient. > > And finally my last question: if I were to use the > MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for > preallocating memory, is it common to just call MatGetType() then call the > appropriate routine depending on whether or not the matrix is parallel or > not? > You could do that but no need. As I said, the old way was to call both. > I ask because when I have tested these routines out, it seems > that MatSeqAIJSetPreallocation() works even for parallel matrices which is > a bit confusing. I'm assuming that it just sets the diagonal part of the > matrix? > There is probably a default. It should work w/o any of these calls, but performance would suffer if the default (10 maybe) is too small for you. Mark > I hope that my questions were clear. Let me know if they need > clarification and thanks in advance for the help! > > Sam > -------------- next part -------------- An HTML attachment was scrubbed... URL: From popov at uni-mainz.de Sat Feb 5 05:57:34 2022 From: popov at uni-mainz.de (Anton Popov) Date: Sat, 5 Feb 2022 12:57:34 +0100 Subject: [petsc-users] 3.16.4 amd blis libflame linking problem In-Reply-To: References: <2284ab7a-fbcc-786b-4a5e-be4b1fba10f1@uni-mainz.de> <1c564368-3a38-4403-112e-79040008764b@uni-mainz.de> Message-ID: <4490063f-a1c5-2f91-a5ef-c45b927be221@uni-mainz.de> On 04.02.22 22:18, Satish Balay wrote: > On Fri, 4 Feb 2022, Anton Popov wrote: > >> Am 04.02.2022 um 18:54 schrieb Satish Balay: >>> On Fri, 4 Feb 2022, Anton Popov wrote: >>> >>>> Thanks Matt, Barry and Satish, for? your suggestions. 
>>>> >>>> The problem was indeed in mismatch between the gfortran library version >>>> (system has libgfortran.so.4 libflame needs libgfortran.so.5) >>>> >>>> 3.9.4 did not detect this during configure, but only gave error later >>>> during >>>> test. >>>> >>>> 3.16.4 detected this immediately. >>>> >>>> After installing libgfortran.so.5, both PETSc versions install just fine, >>>> however I get the warning mentioned by Satish. >>> Hm - you could install libgfortran.so.5 without the corresponding compiler? >> Yes, apparently it is possible on Ubuntu 18.04. > Presumably it also comes with the corresponding gcc/gfortran [gcc-8, gfortran-8?]. Perhaps these compilers would avoid this issue. > >>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so >>> Ah - this is a binary download - not something compiled locally? >> Yes it is the latest binary from AMD website. >>> I wonder if the compiler gave both an error [return code] and a warning >>> previously [with the missing libgfortran.so.5] - as Barry was suggesting. >> It was an error at runtime, configure was just fine. >>> >>> As you say - you still get that warning [which is expected - as you are >>> using a different version of gfortran than what libflame.so was built with] >> Now since libgfortran.so.5 coexists with libgfortran.so.4 it links against the >> proper one, but gives a warning that both versions are available. >> >> I think I will just compile BLIS and LibFLAME from sources on my system to >> avoid these problems altogether. > Yeah - generally building everything with the same version of the compiler avoids these issues. Actually this is the fastest way to go and it works: --download-f2cblaslapack --download-blis Are there plans to also support? --download-libflame? I know that most of performance boost comes from BLIS in this case. But maybe LibFLAME would also run a bit faster than a reference LAPACK. Anton > > Satish > >> Best, >> >> Anton >> >> >>> Satish >>> >>>> Maybe it is indeed worth upgrading everything to the compatible versions. >>>> >>>> Best, >>>> >>>> Anton >>>> >>>> >>>> On 04.02.22 18:33, Barry Smith wrote: >>>>>> On Feb 4, 2022, at 12:27 PM, Satish Balay wrote: >>>>>> >>>>>> Probably best if you can use the same version of gfortran to build both >>>>>> petsc and libflame/blis >>>>>> >>>>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by >>>>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using >>>>>>> -rpath or -rpath-link): >>>>>> This is probably an ignore-able warning - but configure defaults to >>>>>> -Werror >>>>>> mode here. >>>>> Hmm, if it needs libgfortran.so.5 then it needs it and it cannot be >>>>> ignored since a link cannot succeed. Flame presumably contains a lot >>>>> of >>>>> old Fortran code from Lapack so would normally need the fortran >>>>> libraries. >>>>> >>>>>> Wrt forcing link with static libraries - you can try: >>>>>> >>>>>> LIBS="/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a >>>>>> /opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a" >>>>>> >>>>>> [instead of --with-blas-lib= --with-lapack-lib= options]. 
>>>>>> >>>>>> Satish >>>>>> >>>>>> >>>>>> On Fri, 4 Feb 2022, Barry Smith wrote: >>>>>> >>>>>>> Please do >>>>>>> >>>>>>> ldd -O /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so >>>>>>> >>>>>>> You may need to list the gfortran library directory of libgfortran.so.5 >>>>>>> it >>>>>>> needs to use in LDFLAGS passed to PETSc configure >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> Note: Even though you explicitly listed a static library of libflame to >>>>>>> use our configure is goofy and loses that information and wants to link >>>>>>> with the shared version >>>>>>> >>>>>>> >>>>>>>> On Feb 4, 2022, at 12:00 PM, Anton Popov wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 04.02.22 17:39, Matthew Knepley wrote: >>>>>>>>> On Fri, Feb 4, 2022 at 11:35 AM Anton Popov >>>>>>>> > wrote: >>>>>>>>> Hi Satish, >>>>>>>>> >>>>>>>>> I just discovered that PETSc 3.16.4 fails to link against the latest >>>>>>>>> AMD >>>>>>>>> BLIS and LibFLAME libraries on a Linux box. >>>>>>>>> >>>>>>>>> ------------------------------------------------------------------------------- >>>>>>>>> You set a value for --with-blas-lib= and --with-lapack-lib=, >>>>>>>>> but ['/opt/amd/amd-blis-3.1.0/lib/lp64/libblis.a'] and >>>>>>>>> ['/opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.a'] cannot be used >>>>>>>>> ******************************************************************************* >>>>>>>>> >>>>>>>>> My previous experience with 3.9.4 on the same system was fully >>>>>>>>> successful. Looking in the configure logs (attached) reveals small >>>>>>>>> difference in the linking compared to 3.9.4 >>>>>>>>> >>>>>>>>> Could you please make a guess what went wrong? >>>>>>>>> >>>>>>>>> Down in the log I see: >>>>>>>>> >>>>>>>>> /usr/bin/ld: warning: libgfortran.so.5, needed by >>>>>>>>> /opt/amd/amd-libflame-3.1.0/lib/lp64/libflame.so, not found (try using >>>>>>>>> -rpath or -rpath-link): >>>>>>>> Thanks Matt, I'll try. >>>>>>>> >>>>>>>>> Did the gfortran library move or get upgraded? >>>>>>>> Not at all. I have configured 3.9.4 just now to make a test, and it >>>>>>>> perfectly finds all the libraries. So there must be something that >>>>>>>> 3.16.4 >>>>>>>> does differently. >>>>>>>> >>>>>>>> Best, >>>>>>>> >>>>>>>> Anton >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> >>>>>>>>> Anton >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which >>>>>>>>> their experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>> From mfadams at lbl.gov Sat Feb 5 08:01:35 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 5 Feb 2022 09:01:35 -0500 Subject: [petsc-users] Matrix preallocation In-Reply-To: References: Message-ID: Woops, I misspoke. MatXAIJSetPreallocation does not take the integer estimate. You have to give it the nnz arrays. You can use NULL for the upper triangular part. The two args. Mark On Sat, Feb 5, 2022 at 6:20 AM Mark Adams wrote: > > > On Fri, Feb 4, 2022 at 11:47 PM Samuel Estes > wrote: > >> Hi, >> >> I have a very basic question about matrix preallocation. I am trying to >> use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() >> paradigm. I thought that I should use the MatXAIJSetPreallocation() routine >> since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not >> understand all of the inputs required for the >> MatXAIJSetPreallocation routine. 
In particular, the dnnzu and >> onnzu variables don't quite make sense to me. Can these be NULL? >> > > Yes. The integer is ignored if you provide the array. Just make sure that > the integer is an upper bound on the number of non-zeros in any row. If > not, MatSetValues can be very slow. Otherwise, the performance hit is not > bad. > > >> I was basically just hoping for a routine that would preallocate for >> either a sequential or parallel matrix depending on what was given at >> runtime. >> > > That is what the "X" is for. It is basically syntactic sugar to call the > right version of preallocate (seq, mpi, blocked versions, hypre, etc.) > > >> This routine seems to be what I want but I don't understand it very well >> and the documentation hasn't helped me to figure it out. >> > > Sorry, > > >> >> A related followup question: Is it good practice to use this function or >> should I just use the other routines like MatSeqAIJSetPreallocation() and >> MatMPIAIJSetPreallocation()? >> > > Yes this is the way to go. > TL;DR > Before we had a sparse matrix type explosion with GPUs, you could call > both the MPI and Seq versions and be fine. You could call the blocked > versions I suppose to be safe. > But now there are several GPU/device enabled matrix types and "X" is > convenient. > > >> >> And finally my last question: if I were to use the >> MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for >> preallocating memory, is it common to just call MatGetType() then call the >> appropriate routine depending on whether or not the matrix is parallel or >> not? >> > > You could do that but no need. As I said, the old way was to call both. > > >> I ask because when I have tested these routines out, it seems >> that MatSeqAIJSetPreallocation() works even for parallel matrices which is >> a bit confusing. I'm assuming that it just sets the diagonal part of the >> matrix? >> > > There is probably a default. It should work w/o any of these calls, but > performance would suffer if the default (10 maybe) is too small for you. > > Mark > > >> I hope that my questions were clear. Let me know if they need >> clarification and thanks in advance for the help! >> >> Sam >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Feb 5 09:35:46 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 5 Feb 2022 10:35:46 -0500 Subject: [petsc-users] Matrix preallocation In-Reply-To: References: Message-ID: On Fri, Feb 4, 2022 at 11:47 PM Samuel Estes wrote: > Hi, > > I have a very basic question about matrix preallocation. I am trying to > use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() > paradigm. I thought that I should use the MatXAIJSetPreallocation() routine > since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not > understand all of the inputs required for the > MatXAIJSetPreallocation routine. In particular, the dnnzu and > onnzu variables don't quite make sense to me. Can these be NULL? I was > basically just hoping for a routine that would preallocate for either a > sequential or parallel matrix depending on what was given at runtime. This > routine seems to be what I want but I don't understand it very well and the > documentation hasn't helped me to figure it out. > The example for this is here https://petsc.org/main/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation Maybe we should copy it to the XAIJ page as well. Does this help explain the arguments? 
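Concretely, a minimal sketch of that call sequence for a plain AIJ matrix (the local size and the per-row counts below are purely illustrative) is:

  #include <petscmat.h>

  int main(int argc, char **argv)
  {
    Mat            A;
    PetscInt       i, nlocal = 100;          /* rows owned by this rank (illustrative) */
    PetscInt       *dnnz, *onnz;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
    ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
    ierr = MatSetSizes(A, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE);CHKERRQ(ierr);
    ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* seqaij or mpiaij chosen at runtime */
    ierr = PetscMalloc2(nlocal, &dnnz, nlocal, &onnz);CHKERRQ(ierr);
    for (i = 0; i < nlocal; i++) { dnnz[i] = 5; onnz[i] = 2; }  /* per-row estimates */
    /* bs = 1 for scalar AIJ; dnnzu/onnzu = NULL since no SBAIJ-style upper storage is used */
    ierr = MatXAIJSetPreallocation(A, 1, dnnz, onnz, NULL, NULL);CHKERRQ(ierr);
    ierr = PetscFree2(dnnz, onnz);CHKERRQ(ierr);
    /* ... MatSetValues() loop here, then MatAssemblyBegin()/MatAssemblyEnd() ... */
    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }

For a sequential AIJ matrix the onnz array is simply ignored, so the same code covers both the serial and parallel cases.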
> A related followup question: Is it good practice to use this function or > should I just use the other routines like MatSeqAIJSetPreallocation() and > MatMPIAIJSetPreallocation()? > > And finally my last question: if I were to use the > MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for > preallocating memory, is it common to just call MatGetType() then call the > appropriate routine depending on whether or not the matrix is parallel or > not? I ask because when I have tested these routines out, it seems > that MatSeqAIJSetPreallocation() works even for parallel matrices which is > a bit confusing. I'm assuming that it just sets the diagonal part of the > matrix? > No, it definitely will not preallocate in parallel, so something else is happening. Thanks, Matt > I hope that my questions were clear. Let me know if they need > clarification and thanks in advance for the help! > > Sam > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sat Feb 5 10:12:41 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sat, 5 Feb 2022 11:12:41 -0500 Subject: [petsc-users] Matrix preallocation In-Reply-To: References: Message-ID: On Sat, Feb 5, 2022 at 10:36 AM Matthew Knepley wrote: > On Fri, Feb 4, 2022 at 11:47 PM Samuel Estes > wrote: > >> Hi, >> >> I have a very basic question about matrix preallocation. I am trying to >> use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() >> paradigm. I thought that I should use the MatXAIJSetPreallocation() routine >> since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not >> understand all of the inputs required for the >> MatXAIJSetPreallocation routine. In particular, the dnnzu and >> onnzu variables don't quite make sense to me. Can these be NULL? I was >> basically just hoping for a routine that would preallocate for either a >> sequential or parallel matrix depending on what was given at runtime. This >> routine seems to be what I want but I don't understand it very well and the >> documentation hasn't helped me to figure it out. >> > > The example for this is here > > > https://petsc.org/main/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation > > Maybe we should copy it to the XAIJ page as well. Does this help explain > the arguments? > > >> A related followup question: Is it good practice to use this function or >> should I just use the other routines like MatSeqAIJSetPreallocation() and >> MatMPIAIJSetPreallocation()? >> >> And finally my last question: if I were to use the >> MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for >> preallocating memory, is it common to just call MatGetType() then call the >> appropriate routine depending on whether or not the matrix is parallel or >> not? I ask because when I have tested these routines out, it seems >> that MatSeqAIJSetPreallocation() works even for parallel matrices which is >> a bit confusing. I'm assuming that it just sets the diagonal part of the >> matrix? >> > > No, it definitely will not preallocate in parallel, so something else is > happening. > With only MatSeqAIJSetPreallocation in parallel it would not do any preallocation, in which case it would fall back to dynamic allocation. I'm guessing that is what is happening and it will be very slow. 
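One way to make that failure mode show up immediately rather than as a silent slowdown, if setting a Mat option is acceptable (the option name MAT_NEW_NONZERO_ALLOCATION_ERR is not mentioned above; it comes from MatSetOption()), is:

  ierr = MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);CHKERRQ(ierr);

Running with -info also reports the number of mallocs performed during MatSetValues(), which should be zero when the preallocation matches the assembled nonzero pattern.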
> > Thanks, > > Matt > > >> I hope that my questions were clear. Let me know if they need >> clarification and thanks in advance for the help! >> >> Sam >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuelestes91 at gmail.com Sun Feb 6 13:39:07 2022 From: samuelestes91 at gmail.com (Samuel Estes) Date: Sun, 6 Feb 2022 13:39:07 -0600 Subject: [petsc-users] Matrix preallocation In-Reply-To: References: Message-ID: First of all, thank you so much for the detailed answers! That clears up most of my confusion. Just to clarify let me make sure I understand everything: 1. So it seems that your advice is just to call the MatXAIJSetPreallocation() routine (rather than make separate calls to other preallocation routines such as MatSeqAIJPreallcation() and MatMPIAIJPreallcation())? This will preallocate for Seq and MPI matrices (among other types) and the decision will be made at runtime? 2. So the old way was just to call MatSeqAIJPreallcation() and MatMPIAIJPreallcation()? I had thought that if you called the sequential version for a parallel matrix or vice versa that the program would crash. It seems that instead nothing happens, so by calling both you were covered in either case. Is this correct? Essentially, one would always execute and the other would do nothing? 3. If the dnnzu and onnzu arrays for the upper triangular parts seem unnecessary for me I can just call the MatXAIJSetPreallocation() routine with those values set to 'NULL'? I'm really just looking for a blend of the MatSeqAIJPreallcation() and MatMPIAIJPreallcation() routines so it seems that I only need dnnz and onnz arrays. 4. If we ignore the dnnzu and onnzu upper triangular parameters, then the MatXAIJSetPreallocation() routine has the same parameters as the MatMPIAIJSetPreallocation() routine without the option to give constant integer values for diagonal and off-diagonal parts of the matrix so it seems clear to me how this routine would work for a parallel matrix when the upper triangular parameters are NULL. If the matrix is sequential then does MPIXAIJSetPreallocation() just ignore the onnz parameter and just use the dnnz parameter as the sole argument for determining preallocation? Thanks again for all the help. I think I understand well enough to use this routine now. The questions above are mostly just to clarify and double-check that I understood your previous responses correctly. Sam On Sat, Feb 5, 2022 at 10:12 AM Mark Adams wrote: > > > On Sat, Feb 5, 2022 at 10:36 AM Matthew Knepley wrote: > >> On Fri, Feb 4, 2022 at 11:47 PM Samuel Estes >> wrote: >> >>> Hi, >>> >>> I have a very basic question about matrix preallocation. I am trying to >>> use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() >>> paradigm. I thought that I should use the MatXAIJSetPreallocation() routine >>> since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not >>> understand all of the inputs required for the >>> MatXAIJSetPreallocation routine. In particular, the dnnzu and >>> onnzu variables don't quite make sense to me. Can these be NULL? I was >>> basically just hoping for a routine that would preallocate for either a >>> sequential or parallel matrix depending on what was given at runtime. 
This >>> routine seems to be what I want but I don't understand it very well and the >>> documentation hasn't helped me to figure it out. >>> >> >> The example for this is here >> >> >> https://petsc.org/main/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation >> >> Maybe we should copy it to the XAIJ page as well. Does this help explain >> the arguments? >> >> >>> A related followup question: Is it good practice to use this function or >>> should I just use the other routines like MatSeqAIJSetPreallocation() and >>> MatMPIAIJSetPreallocation()? >>> >>> And finally my last question: if I were to use the >>> MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for >>> preallocating memory, is it common to just call MatGetType() then call the >>> appropriate routine depending on whether or not the matrix is parallel or >>> not? I ask because when I have tested these routines out, it seems >>> that MatSeqAIJSetPreallocation() works even for parallel matrices which is >>> a bit confusing. I'm assuming that it just sets the diagonal part of the >>> matrix? >>> >> >> No, it definitely will not preallocate in parallel, so something else is >> happening. >> > > With only MatSeqAIJSetPreallocation in parallel it would not do any > preallocation, in which case it would fall back to dynamic allocation. I'm > guessing that is what is happening and it will be very slow. > > >> >> Thanks, >> >> Matt >> >> >>> I hope that my questions were clear. Let me know if they need >>> clarification and thanks in advance for the help! >>> >>> Sam >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Feb 6 13:48:48 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 6 Feb 2022 14:48:48 -0500 Subject: [petsc-users] Matrix preallocation In-Reply-To: References: Message-ID: On Sun, Feb 6, 2022 at 2:39 PM Samuel Estes wrote: > First of all, thank you so much for the detailed answers! > That clears up most of my confusion. Just to clarify let me make sure I > understand everything: > 1. So it seems that your advice is just to call the > MatXAIJSetPreallocation() routine (rather than make separate calls to other > preallocation routines such as MatSeqAIJPreallcation() and > MatMPIAIJPreallcation())? This will preallocate for Seq and MPI matrices > (among other types) and the decision will be made at runtime? > Yes. > 2. So the old way was just to call MatSeqAIJPreallcation() and > MatMPIAIJPreallcation()? I had thought that if you called the sequential > version for a parallel matrix or vice versa that the program would crash. > It seems that instead nothing happens, so by calling both you were covered > in either case. Is this correct? Essentially, one would always execute and > the other would do nothing? > Yes, PETSc works like Objective-C, in that if you call a routine for a subclass, and the object is of a different subclass, it is just ignored. > 3. If the dnnzu and onnzu arrays for the upper triangular parts seem > unnecessary for me I can just call the MatXAIJSetPreallocation() routine > with those values set to 'NULL'? 
I'm really just looking for a blend of the > MatSeqAIJPreallcation() and MatMPIAIJPreallcation() routines so it seems > that I only need dnnz and onnz arrays. > Yes. > 4. If we ignore the dnnzu and onnzu upper triangular parameters, then the > MatXAIJSetPreallocation() routine has the same parameters as the > MatMPIAIJSetPreallocation() routine without the option to give constant > integer values for diagonal and off-diagonal parts of the matrix so it > seems clear to me how this routine would work for a parallel matrix when > the upper triangular parameters are NULL. If the matrix is sequential then > does MPIXAIJSetPreallocation() just ignore the onnz parameter and just use > the dnnz parameter as the sole argument for determining preallocation? > Yes. Thanks, Matt > Thanks again for all the help. I think I understand well enough to use > this routine now. The questions above are mostly just to clarify and > double-check that I understood your previous responses correctly. > > Sam > > On Sat, Feb 5, 2022 at 10:12 AM Mark Adams wrote: > >> >> >> On Sat, Feb 5, 2022 at 10:36 AM Matthew Knepley >> wrote: >> >>> On Fri, Feb 4, 2022 at 11:47 PM Samuel Estes >>> wrote: >>> >>>> Hi, >>>> >>>> I have a very basic question about matrix preallocation. I am trying to >>>> use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() >>>> paradigm. I thought that I should use the MatXAIJSetPreallocation() routine >>>> since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not >>>> understand all of the inputs required for the >>>> MatXAIJSetPreallocation routine. In particular, the dnnzu and >>>> onnzu variables don't quite make sense to me. Can these be NULL? I was >>>> basically just hoping for a routine that would preallocate for either a >>>> sequential or parallel matrix depending on what was given at runtime. This >>>> routine seems to be what I want but I don't understand it very well and the >>>> documentation hasn't helped me to figure it out. >>>> >>> >>> The example for this is here >>> >>> >>> https://petsc.org/main/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation >>> >>> Maybe we should copy it to the XAIJ page as well. Does this help explain >>> the arguments? >>> >>> >>>> A related followup question: Is it good practice to use this function >>>> or should I just use the other routines like MatSeqAIJSetPreallocation() >>>> and MatMPIAIJSetPreallocation()? >>>> >>>> And finally my last question: if I were to use the >>>> MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for >>>> preallocating memory, is it common to just call MatGetType() then call the >>>> appropriate routine depending on whether or not the matrix is parallel or >>>> not? I ask because when I have tested these routines out, it seems >>>> that MatSeqAIJSetPreallocation() works even for parallel matrices which is >>>> a bit confusing. I'm assuming that it just sets the diagonal part of the >>>> matrix? >>>> >>> >>> No, it definitely will not preallocate in parallel, so something else is >>> happening. >>> >> >> With only MatSeqAIJSetPreallocation in parallel it would not do any >> preallocation, in which case it would fall back to dynamic allocation. I'm >> guessing that is what is happening and it will be very slow. >> >> >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> I hope that my questions were clear. Let me know if they need >>>> clarification and thanks in advance for the help! 
>>>> >>>> Sam >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From samuelestes91 at gmail.com Sun Feb 6 14:56:38 2022 From: samuelestes91 at gmail.com (Samuel Estes) Date: Sun, 6 Feb 2022 14:56:38 -0600 Subject: [petsc-users] Matrix preallocation In-Reply-To: References: Message-ID: Great. I think I've got it now. Thanks so much! On Sun, Feb 6, 2022 at 1:48 PM Matthew Knepley wrote: > On Sun, Feb 6, 2022 at 2:39 PM Samuel Estes > wrote: > >> First of all, thank you so much for the detailed answers! >> That clears up most of my confusion. Just to clarify let me make sure I >> understand everything: >> 1. So it seems that your advice is just to call the >> MatXAIJSetPreallocation() routine (rather than make separate calls to other >> preallocation routines such as MatSeqAIJPreallcation() and >> MatMPIAIJPreallcation())? This will preallocate for Seq and MPI matrices >> (among other types) and the decision will be made at runtime? >> > > Yes. > > >> 2. So the old way was just to call MatSeqAIJPreallcation() and >> MatMPIAIJPreallcation()? I had thought that if you called the sequential >> version for a parallel matrix or vice versa that the program would crash. >> It seems that instead nothing happens, so by calling both you were covered >> in either case. Is this correct? Essentially, one would always execute and >> the other would do nothing? >> > > Yes, PETSc works like Objective-C, in that if you call a routine for a > subclass, and the object is of a different subclass, it is just ignored. > > >> 3. If the dnnzu and onnzu arrays for the upper triangular parts seem >> unnecessary for me I can just call the MatXAIJSetPreallocation() routine >> with those values set to 'NULL'? I'm really just looking for a blend of the >> MatSeqAIJPreallcation() and MatMPIAIJPreallcation() routines so it seems >> that I only need dnnz and onnz arrays. >> > > Yes. > > >> 4. If we ignore the dnnzu and onnzu upper triangular parameters, then the >> MatXAIJSetPreallocation() routine has the same parameters as the >> MatMPIAIJSetPreallocation() routine without the option to give constant >> integer values for diagonal and off-diagonal parts of the matrix so it >> seems clear to me how this routine would work for a parallel matrix when >> the upper triangular parameters are NULL. If the matrix is sequential then >> does MPIXAIJSetPreallocation() just ignore the onnz parameter and just use >> the dnnz parameter as the sole argument for determining preallocation? >> > > Yes. > > Thanks, > > Matt > > >> Thanks again for all the help. I think I understand well enough to use >> this routine now. The questions above are mostly just to clarify and >> double-check that I understood your previous responses correctly. >> >> Sam >> >> On Sat, Feb 5, 2022 at 10:12 AM Mark Adams wrote: >> >>> >>> >>> On Sat, Feb 5, 2022 at 10:36 AM Matthew Knepley >>> wrote: >>> >>>> On Fri, Feb 4, 2022 at 11:47 PM Samuel Estes >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> I have a very basic question about matrix preallocation. 
I am trying >>>>> to use the MatCreate(), MatSetFromOptions(), MatXXXXSetPreallocation() >>>>> paradigm. I thought that I should use the MatXAIJSetPreallocation() routine >>>>> since the code may be run with a SeqAIJ or MPIAIJ matrix but I do not >>>>> understand all of the inputs required for the >>>>> MatXAIJSetPreallocation routine. In particular, the dnnzu and >>>>> onnzu variables don't quite make sense to me. Can these be NULL? I was >>>>> basically just hoping for a routine that would preallocate for either a >>>>> sequential or parallel matrix depending on what was given at runtime. This >>>>> routine seems to be what I want but I don't understand it very well and the >>>>> documentation hasn't helped me to figure it out. >>>>> >>>> >>>> The example for this is here >>>> >>>> >>>> https://petsc.org/main/docs/manualpages/Mat/MatMPIAIJSetPreallocation.html#MatMPIAIJSetPreallocation >>>> >>>> Maybe we should copy it to the XAIJ page as well. Does this help >>>> explain the arguments? >>>> >>>> >>>>> A related followup question: Is it good practice to use this function >>>>> or should I just use the other routines like MatSeqAIJSetPreallocation() >>>>> and MatMPIAIJSetPreallocation()? >>>>> >>>>> And finally my last question: if I were to use the >>>>> MatSeqAIJSetPreallocation()/MatMPIAIJSetPreallocation() routines for >>>>> preallocating memory, is it common to just call MatGetType() then call the >>>>> appropriate routine depending on whether or not the matrix is parallel or >>>>> not? I ask because when I have tested these routines out, it seems >>>>> that MatSeqAIJSetPreallocation() works even for parallel matrices which is >>>>> a bit confusing. I'm assuming that it just sets the diagonal part of the >>>>> matrix? >>>>> >>>> >>>> No, it definitely will not preallocate in parallel, so something else >>>> is happening. >>>> >>> >>> With only MatSeqAIJSetPreallocation in parallel it would not do any >>> preallocation, in which case it would fall back to dynamic allocation. I'm >>> guessing that is what is happening and it will be very slow. >>> >>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> I hope that my questions were clear. Let me know if they need >>>>> clarification and thanks in advance for the help! >>>>> >>>>> Sam >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sijietang1995 at gmail.com Sat Feb 5 22:53:56 2022 From: sijietang1995 at gmail.com (Sijie Tang) Date: Sat, 5 Feb 2022 21:53:56 -0700 Subject: [petsc-users] The Matrix and Vector Format Convert between PETSc and HYPRE Message-ID: <0F36F6C4-DC3D-4F52-A06A-AB16BA5700F0@gmail.com> Hi developer, I have many questions about he Matrix and Vector Format Convert between PETSc and HYPRE, could you give me some answers or hints? Can I convert MATHYPRE (in PETSc) to hypre_ParCSRMatrix ( HYPRE ) use function MatHYPREGetParCSR() (in PETSc) ? 
for 2, or I should use MatHYPRE_IJMatrixCreate and MatHYPRE_IJMatrixCopy to get hypre_IJMatrix, then hypre_IJMatrix convert to hypre_ParCSRMatrix ? for the vector, I don't find any function can convert vector in PETSc to hypre_ParCSRVector, Is there any function can do this work ? But I find I can use VecHYPRE_IJVectorCreate and VecHYPRE_IJVectorCopy to get hypre_IJVector, then hypre_IJVector convert to hypre_ParCSRVector? Is there any function can convert the format back? like hypre_ParCSRMatrix convert to MATHYPRE, and hypre_ParCSRVector convert to PETSc's vector? Thanks, Sijie -------------- next part -------------- An HTML attachment was scrubbed... URL: From sijietang1995 at gmail.com Sat Feb 5 23:34:35 2022 From: sijietang1995 at gmail.com (Sijie Tang) Date: Sat, 5 Feb 2022 22:34:35 -0700 Subject: [petsc-users] The Matrix and Vector Format Convert between PETSc and HYPRE In-Reply-To: <0F36F6C4-DC3D-4F52-A06A-AB16BA5700F0@gmail.com> References: <0F36F6C4-DC3D-4F52-A06A-AB16BA5700F0@gmail.com> Message-ID: <229921F1-4F00-462C-9F1F-E82032328436@gmail.com> I make a mistake there is no hypre_ParCSRVector, that should be hypre_ParVector. Sijie > On Feb 5, 2022, at 21:53, Sijie Tang wrote: > > Hi developer, > > I have many questions about he Matrix and Vector Format Convert between PETSc and HYPRE, could you give me some answers or hints? > > Can I convert MATHYPRE (in PETSc) to hypre_ParCSRMatrix ( HYPRE ) use function MatHYPREGetParCSR() (in PETSc) ? > for 2, or I should use MatHYPRE_IJMatrixCreate and MatHYPRE_IJMatrixCopy to get hypre_IJMatrix, then hypre_IJMatrix convert to hypre_ParCSRMatrix ? > for the vector, I don't find any function can convert vector in PETSc to hypre_ParCSRVector, Is there any function can do this work ? > But I find I can use VecHYPRE_IJVectorCreate and VecHYPRE_IJVectorCopy to get hypre_IJVector, then hypre_IJVector convert to hypre_ParCSRVector? > Is there any function can convert the format back? like hypre_ParCSRMatrix convert to MATHYPRE, and hypre_ParCSRVector convert to PETSc's vector? > > Thanks, > Sijie -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Mon Feb 7 09:40:51 2022 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 7 Feb 2022 10:40:51 -0500 Subject: [petsc-users] The Matrix and Vector Format Convert between PETSc and HYPRE In-Reply-To: <229921F1-4F00-462C-9F1F-E82032328436@gmail.com> References: <0F36F6C4-DC3D-4F52-A06A-AB16BA5700F0@gmail.com> <229921F1-4F00-462C-9F1F-E82032328436@gmail.com> Message-ID: On Mon, Feb 7, 2022 at 9:17 AM Sijie Tang wrote: > I make a mistake there is no hypre_ParCSRVector, that should be > hypre_ParVector. > > Sijie > > On Feb 5, 2022, at 21:53, Sijie Tang wrote: > > Hi developer, > > I have many questions about he Matrix and Vector Format Convert between > PETSc and HYPRE, could you give me some answers or hints? > > > 1. Can I convert MATHYPRE (in PETSc) to hypre_ParCSRMatrix ( HYPRE ) > use function MatHYPREGetParCSR() (in PETSc) ? > > You specify a hypre matrix with '-mat_type hypre' and then you can use: PetscErrorCode MatHYPREGetParCSR(Mat A, hypre_ParCSRMatrix **parcsr) > > 1. for 2, or I should use MatHYPRE_IJMatrixCreate and > MatHYPRE_IJMatrixCopy to get hypre_IJMatrix, then hypre_IJMatrix > convert to hypre_ParCSRMatrix ? > > Not sure. It easier to use MatSetFromOptions and '-mat_type hypre' as above > > 1. for the vector, I don't find any function can convert vector in > PETSc to hypre_ParCSRVector, Is there any function can do this work ? 
> > We don't use hypre vectors, or at least we don't expose them. You specify the vector type (CPU is the default) like -vec_type cuda Then you use a normal Vec in your code. And use MatCreateVecs to get the right type. Hypre will use the device if it is available and MatCreateVecs will create the correct type. > > 1. But I find I can use VecHYPRE_IJVectorCreate and VecHYPRE_IJVectorCopy > to get hypre_IJVector, then hypre_IJVector convert to > hypre_ParCSRVector? > 2. Is there any function can convert the format back? like hypre_ParCSRMatrix > convert to MATHYPRE, > > Maybe. I don't see docs on this (very new) but I see: include/petscmathypre.h:PETSC_EXTERN PetscErrorCode MatCreateFromParCSR(hypre_ParCSRMatrix*,MatType,PetscCopyMode,Mat*); And there is a test that you should look at that tests hypre matrices: src/mat/tests/ex115.c Mark > > 1. and hypre_ParCSRVector convert to PETSc's vector? > > > Thanks, > Sijie > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Feb 7 10:01:15 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 7 Feb 2022 11:01:15 -0500 Subject: [petsc-users] The Matrix and Vector Format Convert between PETSc and HYPRE In-Reply-To: <229921F1-4F00-462C-9F1F-E82032328436@gmail.com> References: <0F36F6C4-DC3D-4F52-A06A-AB16BA5700F0@gmail.com> <229921F1-4F00-462C-9F1F-E82032328436@gmail.com> Message-ID: Sijie, Generally to use hypre from PETSc you just use PETSc matrices and vectors in your code and set the desired hypre solver with PC; you don't need to deal with hypre matrices and vectors directly at all. Barry > On Feb 6, 2022, at 12:34 AM, Sijie Tang wrote: > > I make a mistake there is no hypre_ParCSRVector, that should be hypre_ParVector. > > Sijie > >> On Feb 5, 2022, at 21:53, Sijie Tang > wrote: >> >> Hi developer, >> >> I have many questions about he Matrix and Vector Format Convert between PETSc and HYPRE, could you give me some answers or hints? >> >> Can I convert MATHYPRE (in PETSc) to hypre_ParCSRMatrix ( HYPRE ) use function MatHYPREGetParCSR() (in PETSc) ? >> for 2, or I should use MatHYPRE_IJMatrixCreate and MatHYPRE_IJMatrixCopy to get hypre_IJMatrix, then hypre_IJMatrix convert to hypre_ParCSRMatrix ? >> for the vector, I don't find any function can convert vector in PETSc to hypre_ParCSRVector, Is there any function can do this work ? >> But I find I can use VecHYPRE_IJVectorCreate and VecHYPRE_IJVectorCopy to get hypre_IJVector, then hypre_IJVector convert to hypre_ParCSRVector? >> Is there any function can convert the format back? like hypre_ParCSRMatrix convert to MATHYPRE, and hypre_ParCSRVector convert to PETSc's vector? >> >> Thanks, >> Sijie > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amin.nadimy19 at imperial.ac.uk Tue Feb 8 08:20:19 2022 From: amin.nadimy19 at imperial.ac.uk (Nadimy, Amin) Date: Tue, 8 Feb 2022 14:20:19 +0000 Subject: [petsc-users] Development enquiry Message-ID: Dear Sir/Madam, We are developing a semi-structured code based on a triangular mesh. It has similarities to Adaptive Mesh refinement (AMR) in which from an initial mesh (in our case unstructured) a structured mesh is generated based on a refine-by-splitting strategy, ending up, like in AMR, with a semi-structured mesh. * Effectively we need a CSR type storage for the coarse initial mesh and a stencil-based storage for the internal, structured mesh. 
We have noticed that you have some routines to deal with semi-structured meshes but they specifically target AMR type meshes, which may not be useful in our case as the stencil and neighbouring are different to that of a structured grid-based mesh. Do you think these approaches could be used directly for our case or with minor modifications? * Other possibilities that we have considered are the use of block-structured solvers, however, in our case, the blocks are not dense and therefore this approach will be worse. * Another alternative would be to develop our own multigrid based on PETSc ensuring that there is communication between the different blocks during the smoothing operation, could this also easily be done or would effectively require applying the smoothers independently to the different structured sections and us performing the communication and extra-smoothing steps at the interface? Kind regards, -- Amin Nadimy Applied Modelling and Computation Group (AMCG), Imperial College London -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Tue Feb 8 10:06:50 2022 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 8 Feb 2022 11:06:50 -0500 Subject: [petsc-users] Development enquiry In-Reply-To: References: Message-ID: Let me take a shot at this, * Using my nomenclature, you want a forest (unstructured coarse grid) of, I guess tri-trees in 2D and quad-trees in 3D. Or a non-conforming AMR grid with simplices. * We have support for quad/oct trees, tensor grids, (p4est) and single level tensor, Cartesian grids, but no 60/120 degree grids. * It sounds like you do your own AMR and want to use PETSc for solvers. In that case I can only see using an unstructured (CSR) matrix with constraints added for "hanging" nodes. And then use AMG solvers. * Developing your own MG solvers a doable, but its a big project. Like a Ph.D. dissertation. Our team members have done this but I don't think any of these solvers are in the library. * We do have a p4est mesh class, Forest, that deals with the mesh management of these forest of oct-trees, but I don't believe we have mesh coarsening for this. * About 6 years ago the p4est developer told me he was working on your kind of simplex forest of trees, but I don't know how that progressed and PETSc does not have an interface to it anyway. Thanks, Mark On Tue, Feb 8, 2022 at 9:29 AM Nadimy, Amin wrote: > Dear Sir/Madam, > > > We are developing a semi-structured code based on a triangular mesh. It > has similarities to Adaptive Mesh refinement (AMR) in which from an initial > mesh (in our case unstructured) a structured mesh is generated based on a > refine-by-splitting strategy, ending up, like in AMR, with a > semi-structured mesh. > > > > - > > Effectively we need a CSR type storage for the coarse initial mesh and > a stencil-based storage for the internal, structured mesh. We have noticed > that you have some routines to deal with semi-structured meshes but they > specifically target AMR type meshes, which may not be useful in our case as > the stencil and neighbouring are different to that of a structured > grid-based mesh. Do you think these approaches could be used directly for > our case or with minor modifications? > > > - > > Other possibilities that we have considered are the use of > block-structured solvers, however, in our case, the blocks are not dense > and therefore this approach will be worse. 
> > > - > > Another alternative would be to develop our own multigrid based on > PETSc ensuring that there is communication between the different blocks > during the smoothing operation, could this also easily be done or would > effectively require applying the smoothers independently to the different > structured sections and us performing the communication and extra-smoothing > steps at the interface? > > > Kind regards, > > -- > Amin Nadimy > > Applied Modelling and Computation Group (AMCG), > > Imperial College London > -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Tue Feb 8 12:08:06 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Tue, 8 Feb 2022 19:08:06 +0100 Subject: [petsc-users] Debugging with valgrind Message-ID: Hello , I have been debugging my code with valgrind, and found many memory leakage that i removed so far. But i keep having the type of lines in my logs -------------------- ==26817== 384 bytes in 1 blocks are still reachable in loss record 1,107 of 1,151 ==26817==??? at 0x483877F: malloc (vg_replace_malloc.c:307) ==26817==??? by 0x67BC429: MPIR_T_CVAR_REGISTER_impl (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) ==26817==??? by 0x66D6B41: MPIR_T_cvar_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) ==26817==??? by 0x65D61F2: MPIR_T_cvar_env_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) ==26817==??? by 0x65D62AE: MPIR_T_env_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) ==26817==??? by 0x655D059: PMPI_Init_thread (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) ==26817==??? by 0x49BB19D: PetscInitialize (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libpetsc.so.3.14.2) ==26817==??? by 0x10B65E: main (in /home/mtchakorom/petsc-3.14.2/src/ksp/ksp/tutorials/code_multisplitting_async) ---------- and this ------------------------ ?65,536 bytes in 1 blocks are definitely lost in loss record 1,149 of 1,151 ==26817==??? at 0x48386AF: malloc (vg_replace_malloc.c:306) ==26817==??? by 0x483ADE7: realloc (vg_replace_malloc.c:834) ==26817==??? by 0x87B284F: ??? ==26817==??? by 0x87B9DF3: ??? ==26817==??? by 0x8790778: ??? ==26817==??? by 0x8796B87: ??? ==26817==??? by 0x873C3E7: ??? ==26817==??? by 0x76F66A2: ??? (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) ==26817==??? by 0x76F88CB: ??? (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) ==26817==??? by 0x72AF34E: __pthread_once_slow (pthread_once.c:116) ==26817==??? by 0x76F6C70: clGetPlatformIDs (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) ==26817==??? by 0x67F4409: hwloc_opencl_discover (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) -------- Should i consider this as normal output for valgrind on a petsc program ? Thanks From bsmith at petsc.dev Tue Feb 8 12:47:24 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 8 Feb 2022 13:47:24 -0500 Subject: [petsc-users] Debugging with valgrind In-Reply-To: References: Message-ID: <7DF9B915-0399-42BB-BF4B-7EC2DEF080BE@petsc.dev> Yes, these come from other packages or the OS so you cannot do anything about them. Barry > On Feb 8, 2022, at 1:08 PM, Medane TCHAKOROM wrote: > > Hello , > > I have been debugging my code with valgrind, and found many memory leakage that i removed so far. 
> > But i keep having the type of lines in my logs > > > -------------------- > > ==26817== 384 bytes in 1 blocks are still reachable in loss record 1,107 of 1,151 > ==26817== at 0x483877F: malloc (vg_replace_malloc.c:307) > ==26817== by 0x67BC429: MPIR_T_CVAR_REGISTER_impl (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) > ==26817== by 0x66D6B41: MPIR_T_cvar_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) > ==26817== by 0x65D61F2: MPIR_T_cvar_env_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) > ==26817== by 0x65D62AE: MPIR_T_env_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) > ==26817== by 0x655D059: PMPI_Init_thread (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) > ==26817== by 0x49BB19D: PetscInitialize (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libpetsc.so.3.14.2) > ==26817== by 0x10B65E: main (in /home/mtchakorom/petsc-3.14.2/src/ksp/ksp/tutorials/code_multisplitting_async) > ---------- > > > and this > > > ------------------------ > > 65,536 bytes in 1 blocks are definitely lost in loss record 1,149 of 1,151 > ==26817== at 0x48386AF: malloc (vg_replace_malloc.c:306) > ==26817== by 0x483ADE7: realloc (vg_replace_malloc.c:834) > ==26817== by 0x87B284F: ??? > ==26817== by 0x87B9DF3: ??? > ==26817== by 0x8790778: ??? > ==26817== by 0x8796B87: ??? > ==26817== by 0x873C3E7: ??? > ==26817== by 0x76F66A2: ??? (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) > ==26817== by 0x76F88CB: ??? (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) > ==26817== by 0x72AF34E: __pthread_once_slow (pthread_once.c:116) > ==26817== by 0x76F6C70: clGetPlatformIDs (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) > ==26817== by 0x67F4409: hwloc_opencl_discover (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) > > -------- > > > Should i consider this as normal output for valgrind on a petsc program ? > > > Thanks > > > > > > > > > > > > > > > > From knepley at gmail.com Tue Feb 8 18:01:01 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 8 Feb 2022 19:01:01 -0500 Subject: [petsc-users] Development enquiry In-Reply-To: References: Message-ID: On Tue, Feb 8, 2022 at 9:29 AM Nadimy, Amin wrote: > Dear Sir/Madam, > > > We are developing a semi-structured code based on a triangular mesh. It > has similarities to Adaptive Mesh refinement (AMR) in which from an initial > mesh (in our case unstructured) a structured mesh is generated based on a > refine-by-splitting strategy, ending up, like in AMR, with a > semi-structured mesh. > I need to ask some questions to make sure I understand. First, the above sounds like a regular refinement of a triangular mesh. We support that, in parallel, to any level of refinement. Is that correct? Second, if instead you want a different level of regular refinement in each cell, that is harder. If you obey the 2:1 balance constraint between cells, then I think our current infrastructure (written by Toby Isaac) can handle it. There is a new version of p4est that handles simplices in this way, but we have not yet integrated it. If we need that, it might take a little doing. > > - > > Effectively we need a CSR type storage for the coarse initial mesh and > a stencil-based storage for the internal, structured mesh. 
We have noticed > that you have some routines to deal with semi-structured meshes but they > specifically target AMR type meshes, which may not be useful in our case as > the stencil and neighbouring are different to that of a structured > grid-based mesh. Do you think these approaches could be used directly for > our case or with minor modifications? > > This should be automatic once we have the topology that you want. > > - > > > - > > Other possibilities that we have considered are the use of > block-structured solvers, however, in our case, the blocks are not dense > and therefore this approach will be worse. > > Yes, Rich Vuduc did a study of densifying blocks during his Phd, but my memory of the results was that it did not often win. > > - > > > - > > Another alternative would be to develop our own multigrid based on > PETSc ensuring that there is communication between the different blocks > during the smoothing operation, could this also easily be done or would > effectively require applying the smoothers independently to the different > structured sections and us performing the communication and extra-smoothing > steps at the interface? > > My plan would be to keep track of the refinements. Smooth in a structured way on the refinements, and then turn the original unstructured grid over to GAMG. What kind of physics is this for? Thanks, Matt > Kind regards, > > -- > Amin Nadimy > > Applied Modelling and Computation Group (AMCG), > > Imperial College London > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yann.jobic at univ-amu.fr Wed Feb 9 04:56:30 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Wed, 9 Feb 2022 11:56:30 +0100 Subject: [petsc-users] MatSetValue in Fortran Message-ID: <0d721688-c857-df70-f58d-50db43f917f0@univ-amu.fr> Dear All, I'm facing a strange problem. I did not succeed in putting some values in an MPI matrix. I'm using Petsc 3.16.4. The matrix is pre-allocated, with some zeroes at the right position. To explain the context, it's a finit elements code, thus in the tangent matrix creation, i've got a first loop over the elements, and feed the matrix accordingly. This part is working. I'm using ADD_VALUES. I then have to put the periodic boundary conditions on some nodes, that is to say that i've got 1 at the diagonal, and -1 on the mirror element of the designated ddl (i've got a Newton-Raphson minimisation procedure). I'm using INSERT_VALUES here. According to the documentation, i should do : Loop over Elements call MatSetValues end loop call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr);CHKERRA(ierr) call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr);CHKERRA(ierr) Loop over periodic nodes call MatSetValue end loop call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRA(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRA(ierr) But it's not working. I don't have those values from the periodic boundary condition in the global matrix. The return value of MatSetValue is 0. I tried valgrind, but memory access looks ok. I really don't know how to debug this. Do you have any idea of what could happen here ? Debug ideas ? 
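For completeness, the same two-phase pattern written as a minimal, self-contained C sketch (all sizes, indices, and preallocation counts below are placeholders, not values from the real code):

/* Minimal C sketch of the mixed ADD_VALUES / INSERT_VALUES assembly above.  */
/* All sizes, indices and preallocation counts are placeholders.             */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscInt       n = 128, row, col;   /* placeholder local size and indices  */
  PetscScalar    val;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  /* Preallocation must cover every slot touched in BOTH phases below.       */
  ierr = MatCreateAIJ(PETSC_COMM_WORLD, n, n, PETSC_DETERMINE, PETSC_DETERMINE,
                      8, NULL, 4, NULL, &A);CHKERRQ(ierr);

  /* Phase 1: element loop, contributions accumulated with ADD_VALUES, e.g.  */
  /*   ierr = MatSetValues(A, ne, rows, ne, cols, vals, ADD_VALUES);          */

  /* Flush between the ADD_VALUES phase and the INSERT_VALUES phase.         */
  ierr = MatAssemblyBegin(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);

  /* Phase 2: periodic constraint, 1 on the diagonal, -1 on the mirror dof.  */
  row = 125; col = 107; val = -1.0;   /* placeholder dof numbers             */
  ierr = MatSetValue(A, row, row, 1.0, INSERT_VALUES);CHKERRQ(ierr);
  ierr = MatSetValue(A, row, col, val, INSERT_VALUES);CHKERRQ(ierr);

  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The MAT_FLUSH_ASSEMBLY step is what makes switching from ADD_VALUES to INSERT_VALUES legal; skipping it produces an error about mixing add and insert values.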
Many thanks, Best Regards, Yann From knepley at gmail.com Wed Feb 9 05:46:15 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 9 Feb 2022 06:46:15 -0500 Subject: [petsc-users] MatSetValue in Fortran In-Reply-To: <0d721688-c857-df70-f58d-50db43f917f0@univ-amu.fr> References: <0d721688-c857-df70-f58d-50db43f917f0@univ-amu.fr> Message-ID: On Wed, Feb 9, 2022 at 5:56 AM Yann Jobic wrote: > Dear All, > > I'm facing a strange problem. I did not succeed in putting some values > in an MPI matrix. I'm using Petsc 3.16.4. The matrix is pre-allocated, > with some zeroes at the right position. > > To explain the context, it's a finit elements code, thus in the tangent > matrix creation, i've got a first loop over the elements, and feed the > matrix accordingly. This part is working. I'm using ADD_VALUES. > > I then have to put the periodic boundary conditions on some nodes, that > is to say that i've got 1 at the diagonal, and -1 on the mirror element > of the designated ddl (i've got a Newton-Raphson minimisation > procedure). I'm using INSERT_VALUES here. > > According to the documentation, i should do : > > Loop over Elements > call MatSetValues > end loop > call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr);CHKERRA(ierr) > call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr);CHKERRA(ierr) > > Loop over periodic nodes > call MatSetValue > end loop > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRA(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRA(ierr) > > But it's not working. I don't have those values from the periodic > boundary condition in the global matrix. The return value of MatSetValue > is 0. > > I tried valgrind, but memory access looks ok. I really don't know how to > debug this. Do you have any idea of what could happen here ? > Debug ideas ? > This is strange since we have tests for this kind of insertion. I would make a tiny code that adds two neighboring cells and a periodic boundary between the other side. If that fails it will be easy for us to look at the figure out what is happening. Thanks, Matt > Many thanks, > > Best Regards, > > Yann > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From brunorssouza at usp.br Wed Feb 9 06:24:13 2022 From: brunorssouza at usp.br (Bruno Rammon Silva Souza) Date: Wed, 9 Feb 2022 09:24:13 -0300 Subject: [petsc-users] Quasi newton SNES change variable Message-ID: Hello everyone, I am using the LBFGS type in the SNES solver, and it's working fine. But I want to change the number of stored updates in this method. This variable of quasi-newton methods is usually chosen by the runtime option: -snes_qn_m . However, I would like to change this variable inside my code, when calling a function, for example, but I can't find any kind of PETSc function that changes this variable directly. Is there any function like this? If not, is there some way to do that without using -snes_qn_m at runtime? Best regards, -- Bruno Souza -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Feb 9 06:58:01 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 9 Feb 2022 07:58:01 -0500 Subject: [petsc-users] Quasi newton SNES change variable In-Reply-To: References: Message-ID: On Wed, Feb 9, 2022 at 7:24 AM Bruno Rammon Silva Souza via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello everyone, > > I am using the LBFGS type in the SNES solver, and it's working fine. But I > want to change the number of stored updates in this method. This variable > of quasi-newton methods is usually chosen by the runtime option: -snes_qn_m > . However, I would like to change this variable inside my code, when > calling a function, for example, but I can't find any kind of PETSc > function that changes this variable directly. Is there any function like > this? If not, is there some way to do that without using -snes_qn_m at > runtime? > This is an oversight which we will fix. For now you can use https://petsc.org/main/docs/manualpages/Sys/PetscOptionsSetValue.html in your code. Thanks, Matt > Best regards, > -- > Bruno Souza > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From yann.jobic at univ-amu.fr Wed Feb 9 09:33:30 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Wed, 9 Feb 2022 16:33:30 +0100 Subject: [petsc-users] Quasi newton SNES change variable In-Reply-To: References: Message-ID: <98d64373-a5e7-9fd3-c033-ac7a57745a9d@univ-amu.fr> I'm struggling for a very simple error that i can not see. I'm running in a sequential program, for the test. MatView is giving me for the row 125 : row 125: (125, 0.) (107, 0.) I'm getting those values with MatGetRow : row=125 CALL MatGetRow(MATGLOB,row,nb,testcols,testvalues,IER) write(*,*)row,testcols(1),testvalues(1), & testcols(2),testvalues(2) CALL MatRestoreRow(MATGLOB,row,nb,testcols,testvalues,IER) The output is : 125 125 0.000000000000000E+000 107 0.000000000000000E+000 Which is what i want. It's ok. Then i'm doing the MatSetValue : val = -1 row = 125 col = 107 CALL MatSetValue(MATGLOB,row,col,val, & INSERT_VALUES, IER) And i've got the error : [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (125,107) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.16.4, unknown [0]PETSC ERROR: /home/jobic/projet/fe-utils/marcus/3.16/test_MatSetValue_loem_3.16p4_openmpi_intel on a named leto4.iusti-calcul.recherche by jobic Wed Feb 9 16:01:51 2022 [0]PETSC ERROR: Configure options --prefix=/local/lib/petsc/3.16/p4/17/openmpi_intel-mkl-works --with-single-library=0 --with-large-file-io=1 --with-debugging=0 --with-blacs=1 --with-blacs-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ --download-scalapack=1 --download-parmetis=1 --download-make=1 --download-mumps=1 --LIBS=" -Wl,-rpath,/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/" --with-blaslapack-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ --download-metis=1 --download-parmetis=1 --download-ptscotch=1 --download-cmake=1 --download-slepc=1 --download-hdf5=1 --with-zlib=1 --download-szlib=1 --download-suitesparse=1 --download-p4est=1 --download-netcdf=1 --download-triangle=1 --with-shared-libraries=0 --with-cxx-dialect=C++11 -CFLAGS=" -O3 -mtune=core-avx2 -mkl" --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" -CXXFLAGS=" -O3 -mtune=core-avx2 -mkl" -FFLAGS=" -O3 -mtune=core-avx2 -mkl" PETSC_ARCH=openmpi_intel-mkl-17-works [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at /home/devel/src_linux/petsc-3.16.4/src/mat/impls/aij/seq/aij.c:520 [0]PETSC ERROR: #2 MatSetValues() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:1398 [0]PETSC ERROR: #3 MatGetRow() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:558 [0]PETSC ERROR: #4 MatRestoreRow_Fortran() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/ftn-custom/zmatrixf.c:582 I'm obviously doing something wrong, but where ? Thanks, Yann Le 2/9/2022 ? 1:58 PM, Matthew Knepley a ?crit?: > On Wed, Feb 9, 2022 at 7:24 AM Bruno Rammon Silva Souza via petsc-users > > wrote: > > Hello everyone, > > I am using the LBFGS type in the SNES solver, and it's working fine. > But I want to change the number of stored updates in this method. > This variable of quasi-newton methods is usually chosen by the > runtime option: -snes_qn_m . However, I would like to change this > variable inside my code, when calling a function, for example, but I > can't find any kind of PETSc function that changes this variable > directly. Is there any function like this? If not, is there some way > to do that without using -snes_qn_m at runtime? > > > This is an oversight which we will fix. For now you can use > > https://petsc.org/main/docs/manualpages/Sys/PetscOptionsSetValue.html > > > in your code. > > ? Thanks, > > ? ? ?Matt > > Best regards, > -- > Bruno Souza > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From yann.jobic at univ-amu.fr Wed Feb 9 09:36:08 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Wed, 9 Feb 2022 16:36:08 +0100 Subject: [petsc-users] MatSetValue in Fortran In-Reply-To: References: <0d721688-c857-df70-f58d-50db43f917f0@univ-amu.fr> Message-ID: <0e0d60df-6fe7-afd4-382e-877a74191e08@univ-amu.fr> Sorry for the bad subject , i made a mistake. The correct email subject and message was : I'm struggling for a very simple error that i can not see. I'm running in a sequential program, for the test. MatView is giving me for the row 125 : row 125: (125, 0.) (107, 0.) 
I'm getting those values with MatGetRow : row=125 CALL MatGetRow(MATGLOB,row,nb,testcols,testvalues,IER) write(*,*)row,testcols(1),testvalues(1), & testcols(2),testvalues(2) CALL MatRestoreRow(MATGLOB,row,nb,testcols,testvalues,IER) The output is : 125 125 0.000000000000000E+000 107 0.000000000000000E+000 Which is what i want. It's ok. Then i'm doing the MatSetValue : val = -1 row = 125 col = 107 CALL MatSetValue(MATGLOB,row,col,val, & INSERT_VALUES, IER) And i've got the error : [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (125,107) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.16.4, unknown [0]PETSC ERROR: /home/jobic/projet/fe-utils/marcus/3.16/test_MatSetValue_loem_3.16p4_openmpi_intel on a named leto4.iusti-calcul.recherche by jobic Wed Feb 9 16:01:51 2022 [0]PETSC ERROR: Configure options --prefix=/local/lib/petsc/3.16/p4/17/openmpi_intel-mkl-works --with-single-library=0 --with-large-file-io=1 --with-debugging=0 --with-blacs=1 --with-blacs-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ --download-scalapack=1 --download-parmetis=1 --download-make=1 --download-mumps=1 --LIBS=" -Wl,-rpath,/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/" --with-blaslapack-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ --download-metis=1 --download-parmetis=1 --download-ptscotch=1 --download-cmake=1 --download-slepc=1 --download-hdf5=1 --with-zlib=1 --download-szlib=1 --download-suitesparse=1 --download-p4est=1 --download-netcdf=1 --download-triangle=1 --with-shared-libraries=0 --with-cxx-dialect=C++11 -CFLAGS=" -O3 -mtune=core-avx2 -mkl" --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" -CXXFLAGS=" -O3 -mtune=core-avx2 -mkl" -FFLAGS=" -O3 -mtune=core-avx2 -mkl" PETSC_ARCH=openmpi_intel-mkl-17-works [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at /home/devel/src_linux/petsc-3.16.4/src/mat/impls/aij/seq/aij.c:520 [0]PETSC ERROR: #2 MatSetValues() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:1398 [0]PETSC ERROR: #3 MatGetRow() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:558 [0]PETSC ERROR: #4 MatRestoreRow_Fortran() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/ftn-custom/zmatrixf.c:582 I'm obviously doing something wrong, but where ? Thanks, Yann Le 2/9/2022 ? 12:46 PM, Matthew Knepley a ?crit?: > On Wed, Feb 9, 2022 at 5:56 AM Yann Jobic > wrote: > > Dear All, > > I'm facing a strange problem. I did not succeed in putting some values > in an MPI matrix. I'm using Petsc 3.16.4. The matrix is pre-allocated, > with some zeroes at the right position. > > To explain the context, it's a finit elements code, thus in the tangent > matrix creation, i've got a first loop over the elements, and feed the > matrix accordingly. This part is working. I'm using ADD_VALUES. > > I then have to put the periodic boundary conditions on some nodes, that > is to say that i've got 1 at the diagonal, and -1 on the mirror element > of the designated ddl? (i've got a Newton-Raphson minimisation > procedure). I'm using INSERT_VALUES here. 
> > According to the documentation, i should do : > > Loop over Elements > call MatSetValues > end loop > call MatAssemblyBegin(A,MAT_FLUSH_ASSEMBLY,ierr);CHKERRA(ierr) > call MatAssemblyEnd(A,MAT_FLUSH_ASSEMBLY,ierr);CHKERRA(ierr) > > Loop over periodic nodes > call MatSetValue > end loop > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRA(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr);CHKERRA(ierr) > > But it's not working. I don't have those values from the periodic > boundary condition in the global matrix. The return value of > MatSetValue > is 0. > > I tried valgrind, but memory access looks ok. I really don't know > how to > debug this. Do you have any idea of what could happen here ? > Debug ideas ? > > > This is strange since we have tests for this kind of insertion. I would > make a tiny > code that adds two neighboring cells and a periodic boundary between the > other side. > If that fails it will be easy for us to look at the figure out what is > happening. > > ? Thanks, > > ? ? ?Matt > > Many thanks, > > Best Regards, > > Yann > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From medane.tchakorom at univ-fcomte.fr Wed Feb 9 10:11:47 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 9 Feb 2022 17:11:47 +0100 Subject: [petsc-users] Debugging with valgrind In-Reply-To: <7DF9B915-0399-42BB-BF4B-7EC2DEF080BE@petsc.dev> References: <7DF9B915-0399-42BB-BF4B-7EC2DEF080BE@petsc.dev> Message-ID: <1aabc3b5-93ad-fefe-cfb2-b32338b5005a@univ-fcomte.fr> Re: Thanks for your prompts reply On 08/02/2022 19:47, Barry Smith wrote: > Yes, these come from other packages or the OS so you cannot do anything about them. > > Barry > > >> On Feb 8, 2022, at 1:08 PM, Medane TCHAKOROM wrote: >> >> Hello , >> >> I have been debugging my code with valgrind, and found many memory leakage that i removed so far. >> >> But i keep having the type of lines in my logs >> >> >> -------------------- >> >> ==26817== 384 bytes in 1 blocks are still reachable in loss record 1,107 of 1,151 >> ==26817== at 0x483877F: malloc (vg_replace_malloc.c:307) >> ==26817== by 0x67BC429: MPIR_T_CVAR_REGISTER_impl (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) >> ==26817== by 0x66D6B41: MPIR_T_cvar_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) >> ==26817== by 0x65D61F2: MPIR_T_cvar_env_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) >> ==26817== by 0x65D62AE: MPIR_T_env_init (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) >> ==26817== by 0x655D059: PMPI_Init_thread (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) >> ==26817== by 0x49BB19D: PetscInitialize (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libpetsc.so.3.14.2) >> ==26817== by 0x10B65E: main (in /home/mtchakorom/petsc-3.14.2/src/ksp/ksp/tutorials/code_multisplitting_async) >> ---------- >> >> >> and this >> >> >> ------------------------ >> >> 65,536 bytes in 1 blocks are definitely lost in loss record 1,149 of 1,151 >> ==26817== at 0x48386AF: malloc (vg_replace_malloc.c:306) >> ==26817== by 0x483ADE7: realloc (vg_replace_malloc.c:834) >> ==26817== by 0x87B284F: ??? >> ==26817== by 0x87B9DF3: ??? >> ==26817== by 0x8790778: ??? >> ==26817== by 0x8796B87: ??? >> ==26817== by 0x873C3E7: ??? 
>> ==26817== by 0x76F66A2: ??? (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) >> ==26817== by 0x76F88CB: ??? (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) >> ==26817== by 0x72AF34E: __pthread_once_slow (pthread_once.c:116) >> ==26817== by 0x76F6C70: clGetPlatformIDs (in /usr/local/cuda-11.2/targets/x86_64-linux/lib/libOpenCL.so.1.0.0) >> ==26817== by 0x67F4409: hwloc_opencl_discover (in /home/mtchakorom/petsc-3.14.2/linux-gnu-debug/lib/libmpi.so.12.1.8) >> >> -------- >> >> >> Should i consider this as normal output for valgrind on a petsc program ? >> >> >> Thanks >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> From medane.tchakorom at univ-fcomte.fr Wed Feb 9 10:15:34 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 9 Feb 2022 17:15:34 +0100 Subject: [petsc-users] stopping KSPSolve and then restarting where it stopped Message-ID: <86176554-468c-a651-a826-89fe9e9d87c4@univ-fcomte.fr> Hello, Is it possible to run KSPSolve till the convergence (or not) , let the program do some work, and then restart KSPSolve where it stopped ? (maybe with different tolerance or max iterations) Thanks From bojan.niceno.scientist at gmail.com Tue Feb 8 22:21:09 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Wed, 9 Feb 2022 05:21:09 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc Message-ID: To whom it may concern, I am working on a Fortran (2003) computational fluid dynamics solver, which is actually quite mature, was parallelized with MPI from the very beginning and it comes with its own suite of Krylov solvers. Although the code is self-sustained, I am inclined to believe that it would be better to use PETSc instead of my own home-grown solvers. In the attempt to do so, I have installed PETSc 3.16.4 with following options: ./configure --with-debugging=yes --download-openmpi=yes --download-fblaslapack=yes --download-metis=yes --download-parmetis=yes --download-cmake=yes on a workstation running Ubuntu 20.04 LTS. The mpif90 command which I use to compile the code, wraps gfortran with OpenMPI, hence the option "--download-openmpi=yes" when configuring PETSc. Anyhow, installation of PETSc went fine, I managed to link and run it with my code, but I am getting the following messages during compilation: Petsc_Mod.f90:18:6: 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY | 1 Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of the same size as elsewhere (4 vs 8 bytes) Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All works, but these messages give me a reason to worry. Can you tell what causes this warnings? I would guess they might appear if one mixes OpenMPI with MPICH, but I don't think I even have MPICH on my system. Please let me know what you think about it? Cheers, Bojan -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Feb 9 10:33:01 2022 From: balay at mcs.anl.gov (Balay, Satish) Date: Wed, 9 Feb 2022 16:33:01 +0000 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: Message-ID: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> Are you using the same MPI to build both PETSc and your appliation? 
Satish On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > To whom it may concern, > > > I am working on a Fortran (2003) computational fluid dynamics solver, > which is actually quite mature, was parallelized with MPI from the > very beginning and it comes with its own suite of Krylov solvers.? > Although the code is self-sustained, I am inclined to believe that it > would be better to use PETSc instead of my own home-grown solvers. > > In the attempt to do so, I have installed PETSc 3.16.4 with following > options: > > ./configure --with-debugging=yes --download-openmpi=yes --download- > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- > download-cmake=yes > > on a workstation running Ubuntu 20.04 LTS.? The mpif90 command which > I use to compile the code, wraps gfortran with OpenMPI, hence the > option "--download-openmpi=yes" when configuring PETSc. > > Anyhow, installation of PETSc went fine, I managed to link and run it > with my code, but I am getting the following messages during > compilation: > > Petsc_Mod.f90:18:6: > > ? ?18 | ? use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > ? ? ? | ? ? ?1 > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of > the same size as elsewhere (4 vs 8 bytes) > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc.? All works, > but these messages give me a reason to worry. > > Can you tell what causes this warnings?? I would guess they might > appear if one mixes OpenMPI with MPICH, but I don't think I even have > MPICH on my system. > > Please let me know what you think about it? > > ? ? Cheers, > > ? ? Bojan > > > > From bsmith at petsc.dev Wed Feb 9 10:42:54 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 9 Feb 2022 11:42:54 -0500 Subject: [petsc-users] Quasi newton SNES change variable In-Reply-To: <98d64373-a5e7-9fd3-c033-ac7a57745a9d@univ-amu.fr> References: <98d64373-a5e7-9fd3-c033-ac7a57745a9d@univ-amu.fr> Message-ID: <84AF7492-20D8-46B5-A56B-148FD0C9ED4D@petsc.dev> It is odd that the columns are not sorted row 125: (125, 0.) (107, 0.) How was this matrix created? It is erroring because it compares the 125 column in the matrix to the 107 column requested and concludes there is no slot for 107 column. Barry > On Feb 9, 2022, at 10:33 AM, Yann Jobic wrote: > > I'm struggling for a very simple error that i can not see. I'm running in a sequential program, for the test. > > MatView is giving me for the row 125 : > row 125: (125, 0.) (107, 0.) > > I'm getting those values with MatGetRow : > row=125 > CALL MatGetRow(MATGLOB,row,nb,testcols,testvalues,IER) > write(*,*)row,testcols(1),testvalues(1), > & testcols(2),testvalues(2) > CALL MatRestoreRow(MATGLOB,row,nb,testcols,testvalues,IER) > > The output is : > 125 125 0.000000000000000E+000 107 0.000000000000000E+000 > Which is what i want. It's ok. > > Then i'm doing the MatSetValue : > val = -1 > row = 125 > col = 107 > CALL MatSetValue(MATGLOB,row,col,val, > & INSERT_VALUES, IER) > > And i've got the error : > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: New nonzero at (125,107) caused a malloc > Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.16.4, unknown > [0]PETSC ERROR: /home/jobic/projet/fe-utils/marcus/3.16/test_MatSetValue_loem_3.16p4_openmpi_intel on a named leto4.iusti-calcul.recherche by jobic Wed Feb 9 16:01:51 2022 > [0]PETSC ERROR: Configure options --prefix=/local/lib/petsc/3.16/p4/17/openmpi_intel-mkl-works --with-single-library=0 --with-large-file-io=1 --with-debugging=0 --with-blacs=1 --with-blacs-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ --download-scalapack=1 --download-parmetis=1 --download-make=1 --download-mumps=1 --LIBS=" -Wl,-rpath,/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/" --with-blaslapack-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ --download-metis=1 --download-parmetis=1 --download-ptscotch=1 --download-cmake=1 --download-slepc=1 --download-hdf5=1 --with-zlib=1 --download-szlib=1 --download-suitesparse=1 --download-p4est=1 --download-netcdf=1 --download-triangle=1 --with-shared-libraries=0 --with-cxx-dialect=C++11 -CFLAGS=" -O3 -mtune=core-avx2 -mkl" --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" -CXXFLAGS=" -O3 -mtune=core-avx2 -mkl" -FFLAGS=" -O3 -mtune=core-avx2 -mkl" PETSC_ARCH=openmpi_intel-mkl-17-works > [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at /home/devel/src_linux/petsc-3.16.4/src/mat/impls/aij/seq/aij.c:520 > [0]PETSC ERROR: #2 MatSetValues() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:1398 > [0]PETSC ERROR: #3 MatGetRow() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:558 > [0]PETSC ERROR: #4 MatRestoreRow_Fortran() at /home/devel/src_linux/petsc-3.16.4/src/mat/interface/ftn-custom/zmatrixf.c:582 > > I'm obviously doing something wrong, but where ? > > Thanks, > > Yann > > Le 2/9/2022 ? 1:58 PM, Matthew Knepley a ?crit : >> On Wed, Feb 9, 2022 at 7:24 AM Bruno Rammon Silva Souza via petsc-users > wrote: >> Hello everyone, >> I am using the LBFGS type in the SNES solver, and it's working fine. >> But I want to change the number of stored updates in this method. >> This variable of quasi-newton methods is usually chosen by the >> runtime option: -snes_qn_m . However, I would like to change this >> variable inside my code, when calling a function, for example, but I >> can't find any kind of PETSc function that changes this variable >> directly. Is there any function like this? If not, is there some way >> to do that without using -snes_qn_m at runtime? >> This is an oversight which we will fix. For now you can use >> https://petsc.org/main/docs/manualpages/Sys/PetscOptionsSetValue.html >> in your code. >> Thanks, >> Matt >> Best regards, >> -- Bruno Souza >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Feb 9 10:47:44 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 9 Feb 2022 11:47:44 -0500 Subject: [petsc-users] stopping KSPSolve and then restarting where it stopped In-Reply-To: <86176554-468c-a651-a826-89fe9e9d87c4@univ-fcomte.fr> References: <86176554-468c-a651-a826-89fe9e9d87c4@univ-fcomte.fr> Message-ID: Yes, in that situation you just need to call KSPSetInitalGuessNonzero() and then call KSPSolve again. 
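In code, that restart pattern looks roughly like the following sketch (ContinueSolve is a hypothetical helper name; it assumes ksp, b and x were created earlier and a first KSPSolve() has already been done; the tolerance values are placeholders):

/* Sketch: continue a solve from the current iterate in x, optionally with  */
/* new tolerances or a new iteration limit. Assumes ksp, b, x already exist. */
#include <petscksp.h>

static PetscErrorCode ContinueSolve(KSP ksp, Vec b, Vec x)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* Optionally change the stopping criteria before restarting (placeholders). */
  ierr = KSPSetTolerances(ksp, 1.0e-10, PETSC_DEFAULT, PETSC_DEFAULT, 500);CHKERRQ(ierr);
  /* Start from the current contents of x instead of zeroing it.               */
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  /* Reset if a later solve should again start from a zero initial guess.      */
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_FALSE);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
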
Note that for GMRES and friends this will begin afresh by starting with a new Krylov space, it won't start from the partially built one. Be sure to reset KSPSetInitalGuessNonzero() for your next solve if that does require an initial guess of zero. > On Feb 9, 2022, at 11:15 AM, Medane TCHAKOROM wrote: > > Hello, > > Is it possible to run KSPSolve till the convergence (or not) , let the program do some work, > > and then restart KSPSolve where it stopped ? (maybe with different tolerance or max iterations) > > > Thanks > From knepley at gmail.com Wed Feb 9 10:49:05 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 9 Feb 2022 11:49:05 -0500 Subject: [petsc-users] stopping KSPSolve and then restarting where it stopped In-Reply-To: <86176554-468c-a651-a826-89fe9e9d87c4@univ-fcomte.fr> References: <86176554-468c-a651-a826-89fe9e9d87c4@univ-fcomte.fr> Message-ID: On Wed, Feb 9, 2022 at 11:15 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Hello, > > Is it possible to run KSPSolve till the convergence (or not) , let the > program do some work, > You can always put work in the Monitor function. > and then restart KSPSolve where it stopped ? (maybe with different > tolerance or max iterations) > These can be changed, but I do not think we have tested this. Thanks, Matt > Thanks > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Feb 9 10:49:20 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 9 Feb 2022 10:49:20 -0600 (CST) Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> Message-ID: <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> To clarify: you are using --download-openmpi=yes with petsc. However you say: > > The mpif90 command which > > I use to compile the code, wraps gfortran with OpenMPI This suggests a different install of OpenMPI is used to build your code. One way to resolve this is - delete current build of PETSc - and rebuild it with this same MPI [that you are using with your application] ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --download-fblaslapack --download-metis --download-parmetis --download-cmake Also PETSc provides makefile format that minimizes such conflicts.. https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications Satish On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > Are you using the same MPI to build both PETSc and your appliation? > > Satish > > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > > To whom it may concern, > > > > > > I am working on a Fortran (2003) computational fluid dynamics solver, > > which is actually quite mature, was parallelized with MPI from the > > very beginning and it comes with its own suite of Krylov solvers.? > > Although the code is self-sustained, I am inclined to believe that it > > would be better to use PETSc instead of my own home-grown solvers. 
> > > > In the attempt to do so, I have installed PETSc 3.16.4 with following > > options: > > > > ./configure --with-debugging=yes --download-openmpi=yes --download- > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- > > download-cmake=yes > > > > on a workstation running Ubuntu 20.04 LTS.? The mpif90 command which > > I use to compile the code, wraps gfortran with OpenMPI, hence the > > option "--download-openmpi=yes" when configuring PETSc. > > > > Anyhow, installation of PETSc went fine, I managed to link and run it > > with my code, but I am getting the following messages during > > compilation: > > > > Petsc_Mod.f90:18:6: > > > > ? ?18 | ? use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > > ? ? ? | ? ? ?1 > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of > > the same size as elsewhere (4 vs 8 bytes) > > > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc.? All works, > > but these messages give me a reason to worry. > > > > Can you tell what causes this warnings?? I would guess they might > > appear if one mixes OpenMPI with MPICH, but I don't think I even have > > MPICH on my system. > > > > Please let me know what you think about it? > > > > ? ? Cheers, > > > > ? ? Bojan > > > > > > > > > > From yann.jobic at univ-amu.fr Wed Feb 9 11:16:12 2022 From: yann.jobic at univ-amu.fr (Yann Jobic) Date: Wed, 9 Feb 2022 18:16:12 +0100 Subject: [petsc-users] MatSetValue in Fortran In-Reply-To: <84AF7492-20D8-46B5-A56B-148FD0C9ED4D@petsc.dev> References: <98d64373-a5e7-9fd3-c033-ac7a57745a9d@univ-amu.fr> <84AF7492-20D8-46B5-A56B-148FD0C9ED4D@petsc.dev> Message-ID: <7d7fa56b-6618-ba31-5877-bdd8c2485d73@univ-amu.fr> Many thanks Barry ! It solved my problem, and my original question !! I'm creating the matrix myself. On this projet, I'm begining to move the ddl to the associated DMPlex datas, allowing much more possibilities. But it's not done yet. Sorry for messing up the threads of the list with my replied error. Thanks again, Yann Le 2/9/2022 ? 5:42 PM, Barry Smith a ?crit?: > > It is odd that the columns are not sorted > > row 125: (125, 0.) ?(107, 0.) > > How was this matrix created? > > It is erroring because it compares the 125 column in the matrix to the > 107 column requested and concludes there is no slot for 107 column. > > Barry > > > >> On Feb 9, 2022, at 10:33 AM, Yann Jobic > > wrote: >> >> I'm struggling for a very simple error that i can not see. I'm running >> in a sequential program, for the test. >> >> MatView is giving me for the row 125 : >> row 125: (125, 0.) ?(107, 0.) >> >> I'm getting those values with MatGetRow : >> ?????row=125 >> ?????CALL MatGetRow(MATGLOB,row,nb,testcols,testvalues,IER) >> ?????write(*,*)row,testcols(1),testvalues(1), >> ????& ?????????testcols(2),testvalues(2) >> ?????CALL MatRestoreRow(MATGLOB,row,nb,testcols,testvalues,IER) >> >> The output is : >> ?????125 ???125 ?0.000000000000000E+000 ????107 ???0.000000000000000E+000 >> Which is what i want. It's ok. 
>> >> Then i'm doing the MatSetValue : >> ?????val = -1 >> ?????row = 125 >> ?????col = 107 >> ?????CALL MatSetValue(MATGLOB,row,col,val, >> ????& ?????????????????INSERT_VALUES, IER) >> >> And i've got the error : >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: New nonzero at (125,107) caused a malloc >> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to >> turn off this check >> [0]PETSC ERROR: See https://petsc.org/release/faq/ >> for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.16.4, unknown >> [0]PETSC ERROR: >> /home/jobic/projet/fe-utils/marcus/3.16/test_MatSetValue_loem_3.16p4_openmpi_intel >> on a ?named leto4.iusti-calcul.recherche by jobic Wed Feb ?9 16:01:51 2022 >> [0]PETSC ERROR: Configure options >> --prefix=/local/lib/petsc/3.16/p4/17/openmpi_intel-mkl-works >> --with-single-library=0 --with-large-file-io=1 --with-debugging=0 >> --with-blacs=1 >> --with-blacs-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ >> --download-scalapack=1 --download-parmetis=1 --download-make=1 >> --download-mumps=1 --LIBS=" >> -Wl,-rpath,/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/" >> --with-blaslapack-dir=/opt/intel/compilers_and_libraries_2017.1.132/linux/compiler/lib/intel64/ >> --download-metis=1 --download-parmetis=1 --download-ptscotch=1 >> --download-cmake=1 --download-slepc=1 --download-hdf5=1 --with-zlib=1 >> --download-szlib=1 --download-suitesparse=1 --download-p4est=1 >> --download-netcdf=1 --download-triangle=1 --with-shared-libraries=0 >> --with-cxx-dialect=C++11 -CFLAGS=" -O3 -mtune=core-avx2 -mkl" >> --COPTFLAGS="-D_POSIX_C_SOURCE=199309L" -CXXFLAGS=" -O3 >> -mtune=core-avx2 -mkl" -FFLAGS=" -O3 -mtune=core-avx2 -mkl" >> PETSC_ARCH=openmpi_intel-mkl-17-works >> [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() at >> /home/devel/src_linux/petsc-3.16.4/src/mat/impls/aij/seq/aij.c:520 >> [0]PETSC ERROR: #2 MatSetValues() at >> /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:1398 >> [0]PETSC ERROR: #3 MatGetRow() at >> /home/devel/src_linux/petsc-3.16.4/src/mat/interface/matrix.c:558 >> [0]PETSC ERROR: #4 MatRestoreRow_Fortran() at >> /home/devel/src_linux/petsc-3.16.4/src/mat/interface/ftn-custom/zmatrixf.c:582 >> >> I'm obviously doing something wrong, but where ? >> >> Thanks, >> >> Yann >> >> Le 2/9/2022 ? 1:58 PM, Matthew Knepley a ?crit?: >>> On Wed, Feb 9, 2022 at 7:24 AM Bruno Rammon Silva Souza via >>> petsc-users >>> >> wrote: >>> ???Hello everyone, >>> ???I am using the LBFGS type in the SNES solver, and it's working fine. >>> ???But I want to change the number of stored updates in this method. >>> ???This variable of quasi-newton methods is usually chosen by the >>> ???runtime option: -snes_qn_m . However, I would like to change this >>> ???variable inside my code, when calling a function, for example, but I >>> ???can't find any kind of PETSc function that changes this variable >>> ???directly. Is there any function like this? If not, is there some way >>> ???to do that without using -snes_qn_m at runtime? >>> This is an oversight which we will fix. For now you can use >>> https://petsc.org/main/docs/manualpages/Sys/PetscOptionsSetValue.html >>> >>> > >>> in your code. >>> ? Thanks, >>> ? ? 
?Matt >>> ???Best regards, >>> ???-- ????Bruno Souza >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > From medane.tchakorom at univ-fcomte.fr Wed Feb 9 12:39:56 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 9 Feb 2022 19:39:56 +0100 Subject: [petsc-users] stopping KSPSolve and then restarting where it stopped In-Reply-To: References: <86176554-468c-a651-a826-89fe9e9d87c4@univ-fcomte.fr> Message-ID: <3a557866-9bb0-1b84-a344-7232b884548d@univ-fcomte.fr> Re: Thanks On 09/02/2022 17:47, Barry Smith wrote: > Yes, in that situation you just need to call KSPSetInitalGuessNonzero() and then call KSPSolve again. Note that for GMRES and friends this will begin afresh by starting with a new Krylov space, it won't start from the partially built one. > > Be sure to reset KSPSetInitalGuessNonzero() for your next solve if that does require an initial guess of zero. > >> On Feb 9, 2022, at 11:15 AM, Medane TCHAKOROM wrote: >> >> Hello, >> >> Is it possible to run KSPSolve till the convergence (or not) , let the program do some work, >> >> and then restart KSPSolve where it stopped ? (maybe with different tolerance or max iterations) >> >> >> Thanks >> From medane.tchakorom at univ-fcomte.fr Wed Feb 9 12:40:10 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 9 Feb 2022 19:40:10 +0100 Subject: [petsc-users] stopping KSPSolve and then restarting where it stopped In-Reply-To: References: <86176554-468c-a651-a826-89fe9e9d87c4@univ-fcomte.fr> Message-ID: Re: Thanks On 09/02/2022 17:49, Matthew Knepley wrote: > On Wed, Feb 9, 2022 at 11:15 AM Medane TCHAKOROM > wrote: > > Hello, > > Is it possible to run KSPSolve till the convergence (or not) , let > the > program do some work, > > > You can always put work?in the Monitor function. > > and then restart KSPSolve where it stopped ? (maybe with different > tolerance or max iterations) > > > These can be changed, but I do not think we have tested this. > > ? Thanks, > > ? ? ?Matt > > Thanks > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From francois.fillion at inrs.ca Wed Feb 9 21:04:22 2022 From: francois.fillion at inrs.ca (Fillion-Gourdeau, Francois) Date: Thu, 10 Feb 2022 03:04:22 +0000 Subject: [petsc-users] Question on DMPlex Message-ID: <2969a20506ca4e5b951c7b6b0efffcb3@inrs.ca> Dear developers, I have a mesh with topological dimension 2 (with triangular faces) embedded in a 3D space stored in a hdf5 file. This file contains the coordinates (x,y,z) of each vertex and also, it contains a list of vertices that make each element. I was able to read the mesh into a DMPlex using DMPlexCreateFromFile (without interpolation, i.e. interpolate=PETSC_FALSE ). When I do this, the elements are stored at stratum height 0 and the vertices are stored at stratum height 1 (as checked via DMPlexGetHeightStratum). However, when I interpolate the mesh (interpolate=PETSC_TRUE), it seems that some entities are created in stratum height 2 and 3. From my understanding of DMPlex, this should not happen for 2D elements because there are edges but no 3D cells. 
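A minimal sketch of that check (using the PETSc 3.16-style DMPlexCreateFromFile signature; "mesh.h5" is a placeholder file name):

/* Read the surface mesh with interpolation and print its strata.            */
#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM             dm;
  PetscInt       dim, cdim, depth, h, pStart, pEnd;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.h5", PETSC_TRUE, &dm);CHKERRQ(ierr);
  ierr = DMGetDimension(dm, &dim);CHKERRQ(ierr);      /* topological dimension: expected 2 */
  ierr = DMGetCoordinateDim(dm, &cdim);CHKERRQ(ierr); /* embedding dimension:   expected 3 */
  ierr = DMPlexGetDepth(dm, &depth);CHKERRQ(ierr);    /* expected 2 for an interpolated triangular surface */
  ierr = PetscPrintf(PETSC_COMM_WORLD, "dim %D cdim %D depth %D\n", dim, cdim, depth);CHKERRQ(ierr);
  for (h = 0; h <= depth; ++h) {
    ierr = DMPlexGetHeightStratum(dm, h, &pStart, &pEnd);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "height %D: [%D, %D)\n", h, pStart, pEnd);CHKERRQ(ierr);
  }
  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Printing dim, cdim, and the strata this way makes it easy to see whether the interpolated mesh still has topological dimension 2 or has picked up an extra level.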
Does DMPlex assume that the mesh in 3D has a topological dimension 3 and is it possible to use 2D mesh embeded in 3D space? Best regards Fran?ois Fillion-Gourdeau, PhD INRS-EMT -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 9 22:12:11 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 9 Feb 2022 23:12:11 -0500 Subject: [petsc-users] Question on DMPlex In-Reply-To: <2969a20506ca4e5b951c7b6b0efffcb3@inrs.ca> References: <2969a20506ca4e5b951c7b6b0efffcb3@inrs.ca> Message-ID: On Wed, Feb 9, 2022 at 10:15 PM Fillion-Gourdeau, Francois < francois.fillion at inrs.ca> wrote: > Dear developers, > > I have a mesh with topological dimension 2 (with triangular faces) > embedded in a 3D space stored in a hdf5 file. This file contains the > coordinates (x,y,z) of each vertex and also, it contains a list of vertices > that make each element. I was able to read the mesh into a DMPlex using DMPlexCreateFromFile > (without interpolation, i.e. interpolate=PETSC_FALSE ). When I do this, > the elements are stored at stratum height 0 and the vertices are stored at > stratum height 1 (as checked via DMPlexGetHeightStratum). However, when I > interpolate the mesh (interpolate=PETSC_TRUE), it seems that some > entities are created in stratum height 2 and 3. From my understanding of > DMPlex, this should not happen for 2D elements because there are edges but > no 3D cells. > > > Does DMPlex assume that the mesh in 3D has a topological dimension 3 and > is it possible to use 2D mesh embeded in 3D space? > No, the coordinate dimension should not affect the topological dimension. Can you send me the mesh you want to read? It would be best to send the smallest one that fails. Thanks, Matt > Best regards > > > Fran?ois Fillion-Gourdeau, PhD > INRS-EMT > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jp.salazar at pm.me Thu Feb 10 07:29:54 2022 From: jp.salazar at pm.me (Juan Salazar) Date: Thu, 10 Feb 2022 13:29:54 +0000 Subject: [petsc-users] Compilation issues on cluster - PETSC ERROR: Unknown Mat type given: mpiaijmkl Message-ID: Hello, I am having issues compiling PETsc on a cluster using the following configure command. 
./configure --force \ --with-64-bit-indices=1 \ --with-precision=double \ --with-debugging=0 \ --COPTFLAGS=-O3 \ --CXXOPTFLAGS=-O3 \ --FOPTFLAGS=-O3 \ PETSC_ARCH=$WM_OPTIONS \ --with-blaslapack-dir=$MKLROOT \ --with-mkl_sparse-dir=$MKLROOT \ --with-mkl_sparse_optimize-dir=$MKLROOT \ --with-mpi-dir=$MPI_ARCH_PATH \ --download-hypre Where MKLROOT=/scratch/app_sequana/intel-oneapi/2021.1.0-2659/mkl/2021.1.1 WM_OPTIONS=linux64GccDPInt64Opt MPI_ARCH_PATH=/scratch/app_sequana/openmpi/2.1.1 ----- $ make --version GNU Make 3.82 Built for x86_64-redhat-linux-gnu ------ ------ $ls $MKLROOT/lib/intel64 libmkl_avx2.so.1 libmkl_blacs_sgimpt_ilp64.so libmkl_gf_lp64.a libmkl_mc3.so.1 libmkl_sycl.so libmkl_avx512_mic.so.1 libmkl_blacs_sgimpt_ilp64.so.1 libmkl_gf_lp64.so libmkl_mc.so.1 libmkl_sycl.so.1 libmkl_avx512.so.1 libmkl_blacs_sgimpt_lp64.a libmkl_gf_lp64.so.1 libmkl_pgi_thread.a libmkl_tbb_thread.a libmkl_avx.so.1 libmkl_blacs_sgimpt_lp64.so libmkl_gnu_thread.a libmkl_pgi_thread.so libmkl_tbb_thread.so libmkl_blacs_intelmpi_ilp64.a libmkl_blacs_sgimpt_lp64.so.1 libmkl_gnu_thread.so libmkl_pgi_thread.so.1 libmkl_tbb_thread.so.1 libmkl_blacs_intelmpi_ilp64.so libmkl_blas95_ilp64.a libmkl_gnu_thread.so.1 libmkl_rt.so libmkl_vml_avx2.so.1 libmkl_blacs_intelmpi_ilp64.so.1 libmkl_blas95_lp64.a libmkl_intel_ilp64.a libmkl_rt.so.1 libmkl_vml_avx512_mic.so.1 libmkl_blacs_intelmpi_lp64.a libmkl_cdft_core.a libmkl_intel_ilp64.so libmkl_scalapack_ilp64.a libmkl_vml_avx512.so.1 libmkl_blacs_intelmpi_lp64.so libmkl_cdft_core.so libmkl_intel_ilp64.so.1 libmkl_scalapack_ilp64.so libmkl_vml_avx.so.1 libmkl_blacs_intelmpi_lp64.so.1 libmkl_cdft_core.so.1 libmkl_intel_lp64.a libmkl_scalapack_ilp64.so.1 libmkl_vml_cmpt.so.1 libmkl_blacs_openmpi_ilp64.a libmkl_core.a libmkl_intel_lp64.so libmkl_scalapack_lp64.a libmkl_vml_def.so.1 libmkl_blacs_openmpi_ilp64.so libmkl_core.so libmkl_intel_lp64.so.1 libmkl_scalapack_lp64.so libmkl_vml_mc2.so.1 libmkl_blacs_openmpi_ilp64.so.1 libmkl_core.so.1 libmkl_intel_thread.a libmkl_scalapack_lp64.so.1 libmkl_vml_mc3.so.1 libmkl_blacs_openmpi_lp64.a libmkl_def.so.1 libmkl_intel_thread.so libmkl_sequential.a libmkl_vml_mc.so.1 libmkl_blacs_openmpi_lp64.so libmkl_gf_ilp64.a libmkl_intel_thread.so.1 libmkl_sequential.so locale libmkl_blacs_openmpi_lp64.so.1 libmkl_gf_ilp64.so libmkl_lapack95_ilp64.a libmkl_sequential.so.1 libmkl_blacs_sgimpt_ilp64.a libmkl_gf_ilp64.so.1 libmkl_lapack95_lp64.a libmkl_sycl.a ------ I am running code that requires mat_type mpiaijmkl, but unfortunately it seems that mpiaijmkl.c is not compiled and I get the error: PETSC ERROR: Unknown Mat type given: mpiaijmkl ------ $ls linux64GccDPInt64Opt/obj/mat/impls/aij/mpi/ aijperm fdmpiaij.d ftn-custom mpb_aij.d mpiaij.o mpimatmatmatmult.d mpimatmatmult.o mpiov.d mpiptap.o aijsell fdmpiaij.o mmaij.d mpb_aij.o mpiaijpc.d mpimatmatmatmult.o mpimattransposematmult.d mpiov.o crl ftn-auto mmaij.o mpiaij.d mpiaijpc.o mpimatmatmult.d mpimattransposematmult.o mpiptap.d ------ In the make.log I see: PETSC_HAVE_MKL 1 But the variable PETSC_HAVE_MKL_SPARSE is not set, and according to src/mat/impls/aij/mpi/aijmkl/makefile it should be set to 1 for the file to be included in the compilation. I have searched in the user list and tried different configure options, but so far without success. Any guidance is highly appreciated. Attached are the configure and make logs. Cheers, Juan S. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1563037 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 116642 bytes Desc: not available URL: From knepley at gmail.com Thu Feb 10 07:48:17 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Feb 2022 08:48:17 -0500 Subject: [petsc-users] Compilation issues on cluster - PETSC ERROR: Unknown Mat type given: mpiaijmkl In-Reply-To: References: Message-ID: On Thu, Feb 10, 2022 at 8:30 AM Juan Salazar via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I am having issues compiling PETsc on a cluster using the following > configure command. > > ./configure --force \ > --with-64-bit-indices=1 \ > --with-precision=double \ > --with-debugging=0 \ > --COPTFLAGS=-O3 \ > --CXXOPTFLAGS=-O3 \ > --FOPTFLAGS=-O3 \ > PETSC_ARCH=$WM_OPTIONS \ > --with-blaslapack-dir=$MKLROOT \ > --with-mkl_sparse-dir=$MKLROOT \ > --with-mkl_sparse_optimize-dir=$MKLROOT \ > --with-mpi-dir=$MPI_ARCH_PATH \ > --download-hypre > > Where > > MKLROOT=/scratch/app_sequana/intel-oneapi/2021.1.0-2659/mkl/2021.1.1 > WM_OPTIONS=linux64GccDPInt64Opt > MPI_ARCH_PATH=/scratch/app_sequana/openmpi/2.1.1 > > ----- > $ make --version > GNU Make 3.82 > Built for x86_64-redhat-linux-gnu > ------ > > > ------ > $ls $MKLROOT/lib/intel64 > > libmkl_avx2.so.1 libmkl_blacs_sgimpt_ilp64.so > libmkl_gf_lp64.a libmkl_mc3.so.1 libmkl_sycl.so > libmkl_avx512_mic.so.1 > libmkl_blacs_sgimpt_ilp64.so.1 libmkl_gf_lp64.so > libmkl_mc.so.1 libmkl_sycl.so.1 > libmkl_avx512.so.1 libmkl_blacs_sgimpt_lp64.a > libmkl_gf_lp64.so.1 libmkl_pgi_thread.a libmkl_tbb_thread.a > libmkl_avx.so.1 libmkl_blacs_sgimpt_lp64.so > libmkl_gnu_thread.a libmkl_pgi_thread.so > libmkl_tbb_thread.so > libmkl_blacs_intelmpi_ilp64.a > libmkl_blacs_sgimpt_lp64.so.1 libmkl_gnu_thread.so > libmkl_pgi_thread.so.1 libmkl_tbb_thread.so.1 > libmkl_blacs_intelmpi_ilp64.so libmkl_blas95_ilp64.a > libmkl_gnu_thread.so.1 libmkl_rt.so > libmkl_vml_avx2.so.1 > libmkl_blacs_intelmpi_ilp64.so.1 libmkl_blas95_lp64.a > libmkl_intel_ilp64.a libmkl_rt.so.1 > libmkl_vml_avx512_mic.so.1 > libmkl_blacs_intelmpi_lp64.a libmkl_cdft_core.a > libmkl_intel_ilp64.so libmkl_scalapack_ilp64.a > libmkl_vml_avx512.so.1 > libmkl_blacs_intelmpi_lp64.so libmkl_cdft_core.so > libmkl_intel_ilp64.so.1 libmkl_scalapack_ilp64.so libmkl_vml_avx.so.1 > libmkl_blacs_intelmpi_lp64.so.1 libmkl_cdft_core.so.1 > libmkl_intel_lp64.a > libmkl_scalapack_ilp64.so.1 libmkl_vml_cmpt.so.1 > libmkl_blacs_openmpi_ilp64.a libmkl_core.a > libmkl_intel_lp64.so libmkl_scalapack_lp64.a libmkl_vml_def.so.1 > libmkl_blacs_openmpi_ilp64.so libmkl_core.so > libmkl_intel_lp64.so.1 libmkl_scalapack_lp64.so libmkl_vml_mc2.so.1 > libmkl_blacs_openmpi_ilp64.so.1 libmkl_core.so.1 > libmkl_intel_thread.a libmkl_scalapack_lp64.so.1 libmkl_vml_mc3.so.1 > libmkl_blacs_openmpi_lp64.a libmkl_def.so.1 > libmkl_intel_thread.so libmkl_sequential.a libmkl_vml_mc.so.1 > libmkl_blacs_openmpi_lp64.so libmkl_gf_ilp64.a > libmkl_intel_thread.so.1 libmkl_sequential.so locale > libmkl_blacs_openmpi_lp64.so.1 libmkl_gf_ilp64.so > libmkl_lapack95_ilp64.a libmkl_sequential.so.1 > libmkl_blacs_sgimpt_ilp64.a libmkl_gf_ilp64.so.1 > libmkl_lapack95_lp64.a libmkl_sycl.a > ------ > > I am running code that requires mat_type mpiaijmkl, but unfortunately it > seems that mpiaijmkl.c is not compiled and I get 
the error: PETSC ERROR: > Unknown Mat type given: mpiaijmkl > > ------ > $ls linux64GccDPInt64Opt/obj/mat/impls/aij/mpi/ > > aijperm fdmpiaij.d ftn-custom mpb_aij.d mpiaij.o > mpimatmatmatmult.d mpimatmatmult.o mpiov.d mpiptap.o > aijsell fdmpiaij.o mmaij.d > mpb_aij.o mpiaijpc.d mpimatmatmatmult.o mpimattransposematmult.d mpiov.o > crl ftn-auto mmaij.o mpiaij.d mpiaijpc.o mpimatmatmult.d > mpimattransposematmult.o mpiptap.d > ------ > > In the make.log I see: > > PETSC_HAVE_MKL 1 > > But the variable PETSC_HAVE_MKL_SPARSE is not set, and according > to src/mat/impls/aij/mpi/aijmkl/makefile it should be set to 1 for the file > to be included in the compilation. > > I have searched in the user list and tried different configure options, > but so far without success. Any guidance is highly appreciated. Attached > are the configure and make logs. > Hi Juan, I believe the problem is that you specify --with-mkl_sparse-dir, but that is not used because the BLAS/LAPACK logic checks for that, and you just need --with-mkl_sparse. Normally the "dir" option would do this automatically, but since it is not used, that logic does not kick in. Please tell me if this works. Thanks, Matt > Cheers, > Juan S. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Thu Feb 10 08:16:47 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Thu, 10 Feb 2022 15:16:47 +0100 Subject: [petsc-users] Update of the buffer Message-ID: <52b90fdd-b142-6d85-a9df-642825d047a9@univ-fcomte.fr> Hello , Sorry if this question does not belong to this mailling list, i'am using Petsc , but with some MPI parts code, when dealing with communication. If a make two consecutive MPI_Isend requests, and if the destination processor has not yet receive the message inbetween the two calls, will the buffer be updated ? I mean if I send message "1" for the first request, then send "0" as the second message. Will the receiver receive "0" as message ? I not, how can I do to update the message ? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Feb 10 09:00:18 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Feb 2022 10:00:18 -0500 Subject: [petsc-users] Update of the buffer In-Reply-To: <52b90fdd-b142-6d85-a9df-642825d047a9@univ-fcomte.fr> References: <52b90fdd-b142-6d85-a9df-642825d047a9@univ-fcomte.fr> Message-ID: On Thu, Feb 10, 2022 at 9:17 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Hello , > > Sorry if this question does not belong to this mailling list, i'am using > Petsc , but with some > > MPI parts code, when dealing with communication. > > If a make two consecutive MPI_Isend requests, and if the destination > processor has not yet receive the message inbetween the two calls, will the > buffer be updated ? I mean if I send message "1" for the first request, > then send "0" as the second message. Will the receiver receive "0" as > message ? I not, how can I do to update the message ? > I believe that MPI guarantees in-order message delivery from a source to a target, so if you send 1 before 0, the receiver should get them in that order. However, someone here should know for sure. 
Thanks, Matt > Thanks > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Thu Feb 10 09:02:54 2022 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 10 Feb 2022 18:02:54 +0300 Subject: [petsc-users] Update of the buffer In-Reply-To: References: <52b90fdd-b142-6d85-a9df-642825d047a9@univ-fcomte.fr> Message-ID: <7561B301-D053-4049-B318-023E39F82E30@gmail.com> > On Feb 10, 2022, at 6:00 PM, Matthew Knepley wrote: > > On Thu, Feb 10, 2022 at 9:17 AM Medane TCHAKOROM > wrote: > Hello , > > Sorry if this question does not belong to this mailling list, i'am using Petsc , but with some > > MPI parts code, when dealing with communication. > > If a make two consecutive MPI_Isend requests, and if the destination processor has not yet receive the message inbetween the two calls, will the buffer be updated ? I mean if I send message "1" for the first request, then send "0" as the second message. Will the receiver receive "0" as message ? I not, how can I do to update the message ? > > I believe that MPI guarantees in-order message delivery from a source to a target, so if you send 1 before 0, the receiver > should get them in that order. However, someone here should know for sure. I don?t think so. You should use tags to flag the proper operation and wait for it if you need the value to arrive. > > Thanks, > > Matt > > Thanks > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From wence at gmx.li Thu Feb 10 09:11:46 2022 From: wence at gmx.li (Lawrence Mitchell) Date: Thu, 10 Feb 2022 15:11:46 +0000 Subject: [petsc-users] Update of the buffer In-Reply-To: <7561B301-D053-4049-B318-023E39F82E30@gmail.com> References: <52b90fdd-b142-6d85-a9df-642825d047a9@univ-fcomte.fr> <7561B301-D053-4049-B318-023E39F82E30@gmail.com> Message-ID: <4EF9F904-BFCD-4B22-9617-63A8DBF3036F@gmx.li> > On 10 Feb 2022, at 15:02, Stefano Zampini wrote: > > > >> On Feb 10, 2022, at 6:00 PM, Matthew Knepley wrote: >> >> On Thu, Feb 10, 2022 at 9:17 AM Medane TCHAKOROM wrote: >> Hello , >> >> Sorry if this question does not belong to this mailling list, i'am using Petsc , but with some >> >> MPI parts code, when dealing with communication. >> >> If a make two consecutive MPI_Isend requests, and if the destination processor has not yet receive the message inbetween the two calls, will the buffer be updated ? I mean if I send message "1" for the first request, then send "0" as the second message. Will the receiver receive "0" as message ? I not, how can I do to update the message ? >> >> I believe that MPI guarantees in-order message delivery from a source to a target, so if you send 1 before 0, the receiver >> should get them in that order. However, someone here should know for sure. > > I don?t think so. You should use tags to flag the proper operation and wait for it if you need the value to arrive. Multiple messages that match the same receive are guaranteed not to overtake (modulo some technical conditions to do with wildcards and waitany, as well as threading). 
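A minimal sketch of the two-send case (the ranks, tag value, and buffer contents below are purely illustrative):

  #include <mpi.h>

  /* Run with at least two ranks, e.g. mpiexec -n 2 ./a.out
     Rank 0 posts two nonblocking sends to rank 1 on the same
     (destination, tag, communicator) triple, so they are matched in the
     order they were posted.  Each pending MPI_Isend keeps ownership of
     its buffer: the data must not be modified until the request has
     completed, which is why two separate variables are used here. */
  int main(int argc, char **argv)
  {
    int         rank, first = 1, second = 0, recvd[2];
    MPI_Request req[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
      MPI_Isend(&first,  1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req[0]);
      MPI_Isend(&second, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req[1]);
      MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
    } else if (rank == 1) {
      MPI_Irecv(&recvd[0], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req[0]);
      MPI_Irecv(&recvd[1], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req[1]);
      MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
      /* recvd[0] == 1 and recvd[1] == 0: posting order is preserved */
    }
    MPI_Finalize();
    return 0;
  }

If the receiver needs to pick out one particular message rather than rely on ordering, giving each send its own tag (as Stefano suggests) and posting a receive with that tag is the simplest approach.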
See https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node58.htm#Node58, and the discussion for semantics of nonblocking messages here https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node65.htm#Node65 Lawrence From susanne.claus at onera.fr Thu Feb 10 09:09:24 2022 From: susanne.claus at onera.fr (Susanne Claus) Date: Thu, 10 Feb 2022 16:09:24 +0100 Subject: [petsc-users] Gmsh 8-noded quadrilateral Message-ID: Hello, I am using DMPlex for the mesh structure of a solid mechanics finite element code. I mainly use gmsh as input file format. When I try to read in 8-noded Quadrilaterals (Element type 16 in gmsh) DMPlex tells me that this element type is unknown. However a 9-noded Quadrilateral can be read without problem. On inspecting the plexgmsh.c source code I can see that 8-noded quadrilaterals are deactivated: #if 0 146: {20, GMSH_TRI, 2, 3, 3, 9, NULL}, 147: {16, GMSH_QUA, 2, 2, 4, 8, NULL}, For our application these 8-noded quadrilateral are very important. Is there any reason why they have not been implemented/deactivated in the dmplex gmsh reader? Thank you for all the great work you are doing. PETSc is amazing. Best wishes, Susanne Claus -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Feb 10 09:23:44 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Feb 2022 10:23:44 -0500 Subject: [petsc-users] Gmsh 8-noded quadrilateral In-Reply-To: References: Message-ID: On Thu, Feb 10, 2022 at 10:12 AM Susanne Claus wrote: > Hello, > > I am using DMPlex for the mesh structure of a solid mechanics finite > element code. I mainly use gmsh as input file format. When I try to read in > 8-noded Quadrilaterals (Element type 16 in gmsh) DMPlex tells me that this > element type is unknown. However a 9-noded Quadrilateral can be read > without problem. On inspecting the plexgmsh.c source code I can see that > 8-noded quadrilaterals are deactivated: > > #if 0146: {20, GMSH_TRI, 2, 3, 3, 9, NULL},147: {16, GMSH_QUA, 2, 2, 4, 8, NULL}, > > For our application these 8-noded quadrilateral are very important. > > Is there any reason why they have not been implemented/deactivated in the dmplex gmsh reader? > > No, we can handle them in the same way I think. Let me look at it. Hopefully it is easy. Thanks, Matt > Thank you for all the great work you are doing. PETSc is amazing. > > Best wishes, > Susanne Claus > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno.scientist at gmail.com Thu Feb 10 09:34:04 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Thu, 10 Feb 2022 16:34:04 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> Message-ID: Dear Satish, Thanks for the answer. 
Your suggestion makes a lot of sense, but this is what I get as a result of that: Running check examples to verify correct installation Using PETSC_DIR=/home/niceno/Development/petsc-debug and PETSC_ARCH=arch-linux-c-debug Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process See http://www.mcs.anl.gov/petsc/documentation/faq.html Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., grashof # = 1. Number of SNES iterations = 2 Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes See http://www.mcs.anl.gov/petsc/documentation/faq.html Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., grashof # = 1. Number of SNES iterations = 2 Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/documentation/faq.html Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 Completed test examples I am getting the "Possible error running Fortran example" warning with this. This somehow looks more severe to me. But I could be wrong. Any suggestions what to do? Kind regards, Bojan On Wed, Feb 9, 2022 at 5:49 PM Satish Balay wrote: > To clarify: > > you are using --download-openmpi=yes with petsc. However you say: > > > > The mpif90 command which > > > I use to compile the code, wraps gfortran with OpenMPI > > This suggests a different install of OpenMPI is used to build your code. > > One way to resolve this is - delete current build of PETSc - and rebuild > it with this same MPI [that you are using with your application] > > ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > --download-fblaslapack --download-metis --download-parmetis --download-cmake > > Also PETSc provides makefile format that minimizes such conflicts.. > > > https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > > Satish > > On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > > > Are you using the same MPI to build both PETSc and your appliation? > > > > Satish > > > > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > > > To whom it may concern, > > > > > > > > > I am working on a Fortran (2003) computational fluid dynamics solver, > > > which is actually quite mature, was parallelized with MPI from the > > > very beginning and it comes with its own suite of Krylov solvers. > > > Although the code is self-sustained, I am inclined to believe that it > > > would be better to use PETSc instead of my own home-grown solvers. > > > > > > In the attempt to do so, I have installed PETSc 3.16.4 with following > > > options: > > > > > > ./configure --with-debugging=yes --download-openmpi=yes --download- > > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- > > > download-cmake=yes > > > > > > on a workstation running Ubuntu 20.04 LTS. The mpif90 command which > > > I use to compile the code, wraps gfortran with OpenMPI, hence the > > > option "--download-openmpi=yes" when configuring PETSc. > > > > > > Anyhow, installation of PETSc went fine, I managed to link and run it > > > with my code, but I am getting the following messages during > > > compilation: > > > > > > Petsc_Mod.f90:18:6: > > > > > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > > > | 1 > > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of > > > the same size as elsewhere (4 vs 8 bytes) > > > > > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All works, > > > but these messages give me a reason to worry. 
> > > > > > Can you tell what causes this warnings? I would guess they might > > > appear if one mixes OpenMPI with MPICH, but I don't think I even have > > > MPICH on my system. > > > > > > Please let me know what you think about it? > > > > > > Cheers, > > > > > > Bojan > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Feb 10 09:37:17 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Feb 2022 10:37:17 -0500 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> Message-ID: On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear Satish, > > Thanks for the answer. Your suggestion makes a lot of sense, but this is > what I get as a result of that: > > Running check examples to verify correct installation > Using PETSC_DIR=/home/niceno/Development/petsc-debug and > PETSC_ARCH=arch-linux-c-debug > Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > See http://www.mcs.anl.gov/petsc/documentation/faq.html > Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > grashof # = 1. > Number of SNES iterations = 2 > Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > See http://www.mcs.anl.gov/petsc/documentation/faq.html > Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > grashof # = 1. > Number of SNES iterations = 2 > Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI > process > See http://www.mcs.anl.gov/petsc/documentation/faq.html > Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 > Completed test examples > > I am getting the "Possible error running Fortran example" warning with > this. This somehow looks more severe to me. But I could be wrong. > You are getting this message because your MPI implementation is printing Invalid MIT-MAGIC-COOKIE-1 key It is still running fine, but this is an MPI configuration issue. Thanks, Matt Any suggestions what to do? > > > Kind regards, > > Bojan > > > > On Wed, Feb 9, 2022 at 5:49 PM Satish Balay wrote: > >> To clarify: >> >> you are using --download-openmpi=yes with petsc. However you say: >> >> > > The mpif90 command which >> > > I use to compile the code, wraps gfortran with OpenMPI >> >> This suggests a different install of OpenMPI is used to build your code. >> >> One way to resolve this is - delete current build of PETSc - and rebuild >> it with this same MPI [that you are using with your application] >> >> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 >> --download-fblaslapack --download-metis --download-parmetis --download-cmake >> >> Also PETSc provides makefile format that minimizes such conflicts.. >> >> >> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications >> >> Satish >> >> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: >> >> > Are you using the same MPI to build both PETSc and your appliation? >> > >> > Satish >> > >> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: >> > > To whom it may concern, >> > > >> > > >> > > I am working on a Fortran (2003) computational fluid dynamics solver, >> > > which is actually quite mature, was parallelized with MPI from the >> > > very beginning and it comes with its own suite of Krylov solvers. 
>> > > Although the code is self-sustained, I am inclined to believe that it >> > > would be better to use PETSc instead of my own home-grown solvers. >> > > >> > > In the attempt to do so, I have installed PETSc 3.16.4 with following >> > > options: >> > > >> > > ./configure --with-debugging=yes --download-openmpi=yes --download- >> > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- >> > > download-cmake=yes >> > > >> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 command which >> > > I use to compile the code, wraps gfortran with OpenMPI, hence the >> > > option "--download-openmpi=yes" when configuring PETSc. >> > > >> > > Anyhow, installation of PETSc went fine, I managed to link and run it >> > > with my code, but I am getting the following messages during >> > > compilation: >> > > >> > > Petsc_Mod.f90:18:6: >> > > >> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY >> > > | 1 >> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of >> > > the same size as elsewhere (4 vs 8 bytes) >> > > >> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All works, >> > > but these messages give me a reason to worry. >> > > >> > > Can you tell what causes this warnings? I would guess they might >> > > appear if one mixes OpenMPI with MPICH, but I don't think I even have >> > > MPICH on my system. >> > > >> > > Please let me know what you think about it? >> > > >> > > Cheers, >> > > >> > > Bojan >> > > >> > > >> > > >> > > >> > >> > >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno.scientist at gmail.com Thu Feb 10 09:40:06 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Thu, 10 Feb 2022 16:40:06 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> Message-ID: Thanks a lot, now I feel much better. By the way, I can't get around the invalid magic cookie. It is occurring ever since I installed the OS (Ubuntu 20.04) so I eventually gave up and decided to live with it :-D Cheers, Bojan On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley wrote: > On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > >> Dear Satish, >> >> Thanks for the answer. Your suggestion makes a lot of sense, but this is >> what I get as a result of that: >> >> Running check examples to verify correct installation >> Using PETSC_DIR=/home/niceno/Development/petsc-debug and >> PETSC_ARCH=arch-linux-c-debug >> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process >> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., >> grashof # = 1. >> Number of SNES iterations = 2 >> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes >> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., >> grashof # = 1. 
>> Number of SNES iterations = 2 >> Possible error running Fortran example src/snes/tutorials/ex5f with 1 MPI >> process >> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 >> Completed test examples >> >> I am getting the "Possible error running Fortran example" warning with >> this. This somehow looks more severe to me. But I could be wrong. >> > > You are getting this message because your MPI implementation is printing > > Invalid MIT-MAGIC-COOKIE-1 key > > It is still running fine, but this is an MPI configuration issue. > > Thanks, > > Matt > > Any suggestions what to do? >> >> >> Kind regards, >> >> Bojan >> >> >> >> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay wrote: >> >>> To clarify: >>> >>> you are using --download-openmpi=yes with petsc. However you say: >>> >>> > > The mpif90 command which >>> > > I use to compile the code, wraps gfortran with OpenMPI >>> >>> This suggests a different install of OpenMPI is used to build your code. >>> >>> One way to resolve this is - delete current build of PETSc - and rebuild >>> it with this same MPI [that you are using with your application] >>> >>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 >>> --download-fblaslapack --download-metis --download-parmetis --download-cmake >>> >>> Also PETSc provides makefile format that minimizes such conflicts.. >>> >>> >>> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications >>> >>> Satish >>> >>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: >>> >>> > Are you using the same MPI to build both PETSc and your appliation? >>> > >>> > Satish >>> > >>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: >>> > > To whom it may concern, >>> > > >>> > > >>> > > I am working on a Fortran (2003) computational fluid dynamics solver, >>> > > which is actually quite mature, was parallelized with MPI from the >>> > > very beginning and it comes with its own suite of Krylov solvers. >>> > > Although the code is self-sustained, I am inclined to believe that it >>> > > would be better to use PETSc instead of my own home-grown solvers. >>> > > >>> > > In the attempt to do so, I have installed PETSc 3.16.4 with following >>> > > options: >>> > > >>> > > ./configure --with-debugging=yes --download-openmpi=yes --download- >>> > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- >>> > > download-cmake=yes >>> > > >>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 command which >>> > > I use to compile the code, wraps gfortran with OpenMPI, hence the >>> > > option "--download-openmpi=yes" when configuring PETSc. >>> > > >>> > > Anyhow, installation of PETSc went fine, I managed to link and run it >>> > > with my code, but I am getting the following messages during >>> > > compilation: >>> > > >>> > > Petsc_Mod.f90:18:6: >>> > > >>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY >>> > > | 1 >>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of >>> > > the same size as elsewhere (4 vs 8 bytes) >>> > > >>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All works, >>> > > but these messages give me a reason to worry. >>> > > >>> > > Can you tell what causes this warnings? I would guess they might >>> > > appear if one mixes OpenMPI with MPICH, but I don't think I even have >>> > > MPICH on my system. >>> > > >>> > > Please let me know what you think about it? 
>>> > > >>> > > Cheers, >>> > > >>> > > Bojan >>> > > >>> > > >>> > > >>> > > >>> > >>> > >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Feb 10 09:43:52 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 10 Feb 2022 10:43:52 -0500 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> Message-ID: On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Thanks a lot, now I feel much better. > > By the way, I can't get around the invalid magic cookie. It is occurring > ever since I installed the OS (Ubuntu 20.04) so I eventually gave up and > decided to live with it :-D > https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely Thanks, Matt > Cheers, > > Bojan > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley wrote: > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < >> bojan.niceno.scientist at gmail.com> wrote: >> >>> Dear Satish, >>> >>> Thanks for the answer. Your suggestion makes a lot of sense, but this >>> is what I get as a result of that: >>> >>> Running check examples to verify correct installation >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and >>> PETSC_ARCH=arch-linux-c-debug >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., >>> grashof # = 1. >>> Number of SNES iterations = 2 >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., >>> grashof # = 1. >>> Number of SNES iterations = 2 >>> Possible error running Fortran example src/snes/tutorials/ex5f with 1 >>> MPI process >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 >>> Completed test examples >>> >>> I am getting the "Possible error running Fortran example" warning with >>> this. This somehow looks more severe to me. But I could be wrong. >>> >> >> You are getting this message because your MPI implementation is printing >> >> Invalid MIT-MAGIC-COOKIE-1 key >> >> It is still running fine, but this is an MPI configuration issue. >> >> Thanks, >> >> Matt >> >> Any suggestions what to do? >>> >>> >>> Kind regards, >>> >>> Bojan >>> >>> >>> >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay wrote: >>> >>>> To clarify: >>>> >>>> you are using --download-openmpi=yes with petsc. However you say: >>>> >>>> > > The mpif90 command which >>>> > > I use to compile the code, wraps gfortran with OpenMPI >>>> >>>> This suggests a different install of OpenMPI is used to build your code. 
>>>> >>>> One way to resolve this is - delete current build of PETSc - and >>>> rebuild it with this same MPI [that you are using with your application] >>>> >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 >>>> --download-fblaslapack --download-metis --download-parmetis --download-cmake >>>> >>>> Also PETSc provides makefile format that minimizes such conflicts.. >>>> >>>> >>>> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications >>>> >>>> Satish >>>> >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: >>>> >>>> > Are you using the same MPI to build both PETSc and your appliation? >>>> > >>>> > Satish >>>> > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: >>>> > > To whom it may concern, >>>> > > >>>> > > >>>> > > I am working on a Fortran (2003) computational fluid dynamics >>>> solver, >>>> > > which is actually quite mature, was parallelized with MPI from the >>>> > > very beginning and it comes with its own suite of Krylov solvers. >>>> > > Although the code is self-sustained, I am inclined to believe that >>>> it >>>> > > would be better to use PETSc instead of my own home-grown solvers. >>>> > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 with >>>> following >>>> > > options: >>>> > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes --download- >>>> > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- >>>> > > download-cmake=yes >>>> > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 command which >>>> > > I use to compile the code, wraps gfortran with OpenMPI, hence the >>>> > > option "--download-openmpi=yes" when configuring PETSc. >>>> > > >>>> > > Anyhow, installation of PETSc went fine, I managed to link and run >>>> it >>>> > > with my code, but I am getting the following messages during >>>> > > compilation: >>>> > > >>>> > > Petsc_Mod.f90:18:6: >>>> > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY >>>> > > | 1 >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of >>>> > > the same size as elsewhere (4 vs 8 bytes) >>>> > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All works, >>>> > > but these messages give me a reason to worry. >>>> > > >>>> > > Can you tell what causes this warnings? I would guess they might >>>> > > appear if one mixes OpenMPI with MPICH, but I don't think I even >>>> have >>>> > > MPICH on my system. >>>> > > >>>> > > Please let me know what you think about it? >>>> > > >>>> > > Cheers, >>>> > > >>>> > > Bojan >>>> > > >>>> > > >>>> > > >>>> > > >>>> > >>>> > >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Thu Feb 10 09:53:08 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 10 Feb 2022 09:53:08 -0600 (CST) Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> Message-ID: <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> Do the compute nodes and frontend share the same NFS? I would try the following [to see if they work): - delete ~/.Xauthority [first check with 'xauth list') - setup ssh to not use X - i.e add the following to ~/.ssh/config ForwardX11 no ForwardX11Trusted no [this can be tailored to apply only to your specific compute nodes - if needed] Satish On Thu, 10 Feb 2022, Matthew Knepley wrote: > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > > > Thanks a lot, now I feel much better. > > > > By the way, I can't get around the invalid magic cookie. It is occurring > > ever since I installed the OS (Ubuntu 20.04) so I eventually gave up and > > decided to live with it :-D > > > > https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely > > Thanks, > > Matt > > > > Cheers, > > > > Bojan > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley wrote: > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > >> bojan.niceno.scientist at gmail.com> wrote: > >> > >>> Dear Satish, > >>> > >>> Thanks for the answer. Your suggestion makes a lot of sense, but this > >>> is what I get as a result of that: > >>> > >>> Running check examples to verify correct installation > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and > >>> PETSC_ARCH=arch-linux-c-debug > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI process > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > >>> grashof # = 1. > >>> Number of SNES iterations = 2 > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI processes > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > >>> grashof # = 1. > >>> Number of SNES iterations = 2 > >>> Possible error running Fortran example src/snes/tutorials/ex5f with 1 > >>> MPI process > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 > >>> Completed test examples > >>> > >>> I am getting the "Possible error running Fortran example" warning with > >>> this. This somehow looks more severe to me. But I could be wrong. > >>> > >> > >> You are getting this message because your MPI implementation is printing > >> > >> Invalid MIT-MAGIC-COOKIE-1 key > >> > >> It is still running fine, but this is an MPI configuration issue. > >> > >> Thanks, > >> > >> Matt > >> > >> Any suggestions what to do? > >>> > >>> > >>> Kind regards, > >>> > >>> Bojan > >>> > >>> > >>> > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay wrote: > >>> > >>>> To clarify: > >>>> > >>>> you are using --download-openmpi=yes with petsc. However you say: > >>>> > >>>> > > The mpif90 command which > >>>> > > I use to compile the code, wraps gfortran with OpenMPI > >>>> > >>>> This suggests a different install of OpenMPI is used to build your code. 
> >>>> > >>>> One way to resolve this is - delete current build of PETSc - and > >>>> rebuild it with this same MPI [that you are using with your application] > >>>> > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > >>>> --download-fblaslapack --download-metis --download-parmetis --download-cmake > >>>> > >>>> Also PETSc provides makefile format that minimizes such conflicts.. > >>>> > >>>> > >>>> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > >>>> > >>>> Satish > >>>> > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > >>>> > >>>> > Are you using the same MPI to build both PETSc and your appliation? > >>>> > > >>>> > Satish > >>>> > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > >>>> > > To whom it may concern, > >>>> > > > >>>> > > > >>>> > > I am working on a Fortran (2003) computational fluid dynamics > >>>> solver, > >>>> > > which is actually quite mature, was parallelized with MPI from the > >>>> > > very beginning and it comes with its own suite of Krylov solvers. > >>>> > > Although the code is self-sustained, I am inclined to believe that > >>>> it > >>>> > > would be better to use PETSc instead of my own home-grown solvers. > >>>> > > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 with > >>>> following > >>>> > > options: > >>>> > > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes --download- > >>>> > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- > >>>> > > download-cmake=yes > >>>> > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 command which > >>>> > > I use to compile the code, wraps gfortran with OpenMPI, hence the > >>>> > > option "--download-openmpi=yes" when configuring PETSc. > >>>> > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to link and run > >>>> it > >>>> > > with my code, but I am getting the following messages during > >>>> > > compilation: > >>>> > > > >>>> > > Petsc_Mod.f90:18:6: > >>>> > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > >>>> > > | 1 > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall be of > >>>> > > the same size as elsewhere (4 vs 8 bytes) > >>>> > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All works, > >>>> > > but these messages give me a reason to worry. > >>>> > > > >>>> > > Can you tell what causes this warnings? I would guess they might > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't think I even > >>>> have > >>>> > > MPICH on my system. > >>>> > > > >>>> > > Please let me know what you think about it? > >>>> > > > >>>> > > Cheers, > >>>> > > > >>>> > > Bojan > >>>> > > > >>>> > > > >>>> > > > >>>> > > > >>>> > > >>>> > > >>>> > >>> > >> > >> -- > >> What most experimenters take for granted before they begin their > >> experiments is infinitely more interesting than any results to which their > >> experiments lead. 
> >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ > >> > >> > > > > From bojan.niceno.scientist at gmail.com Thu Feb 10 09:59:35 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Thu, 10 Feb 2022 16:59:35 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> Message-ID: Dear both, I work on an ASUS ROG laptop and don't use any NFS. Everything is on one computer, one disk. That is why I couldn't resolve the Invalid Magic Cookie, because all the advice I've found about it concerns the remote access/display. It is not an issue for me. My laptop has an Nvidia GeForce RTX graphical card, maybe Ubuntu drivers are simply not able to cope with it. I am out of ideas, really. Cheers, Bojan On Thu, Feb 10, 2022 at 4:53 PM Satish Balay wrote: > Do the compute nodes and frontend share the same NFS? > > I would try the following [to see if they work): > > - delete ~/.Xauthority [first check with 'xauth list') > - setup ssh to not use X - i.e add the following to ~/.ssh/config > > ForwardX11 no > ForwardX11Trusted no > > [this can be tailored to apply only to your specific compute nodes - if > needed] > > Satish > > On Thu, 10 Feb 2022, Matthew Knepley wrote: > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < > > bojan.niceno.scientist at gmail.com> wrote: > > > > > Thanks a lot, now I feel much better. > > > > > > By the way, I can't get around the invalid magic cookie. It is > occurring > > > ever since I installed the OS (Ubuntu 20.04) so I eventually gave up > and > > > decided to live with it :-D > > > > > > > > https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely > > > > Thanks, > > > > Matt > > > > > > > Cheers, > > > > > > Bojan > > > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley > wrote: > > > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > > >> bojan.niceno.scientist at gmail.com> wrote: > > >> > > >>> Dear Satish, > > >>> > > >>> Thanks for the answer. Your suggestion makes a lot of sense, but > this > > >>> is what I get as a result of that: > > >>> > > >>> Running check examples to verify correct installation > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and > > >>> PETSC_ARCH=arch-linux-c-debug > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI > process > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > > >>> grashof # = 1. > > >>> Number of SNES iterations = 2 > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI > processes > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > > >>> grashof # = 1. > > >>> Number of SNES iterations = 2 > > >>> Possible error running Fortran example src/snes/tutorials/ex5f with 1 > > >>> MPI process > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 > > >>> Completed test examples > > >>> > > >>> I am getting the "Possible error running Fortran example" warning > with > > >>> this. This somehow looks more severe to me. But I could be wrong. 
> > >>> > > >> > > >> You are getting this message because your MPI implementation is > printing > > >> > > >> Invalid MIT-MAGIC-COOKIE-1 key > > >> > > >> It is still running fine, but this is an MPI configuration issue. > > >> > > >> Thanks, > > >> > > >> Matt > > >> > > >> Any suggestions what to do? > > >>> > > >>> > > >>> Kind regards, > > >>> > > >>> Bojan > > >>> > > >>> > > >>> > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay > wrote: > > >>> > > >>>> To clarify: > > >>>> > > >>>> you are using --download-openmpi=yes with petsc. However you say: > > >>>> > > >>>> > > The mpif90 command which > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI > > >>>> > > >>>> This suggests a different install of OpenMPI is used to build your > code. > > >>>> > > >>>> One way to resolve this is - delete current build of PETSc - and > > >>>> rebuild it with this same MPI [that you are using with your > application] > > >>>> > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > >>>> --download-fblaslapack --download-metis --download-parmetis > --download-cmake > > >>>> > > >>>> Also PETSc provides makefile format that minimizes such conflicts.. > > >>>> > > >>>> > > >>>> > https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > > >>>> > > >>>> Satish > > >>>> > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > > >>>> > > >>>> > Are you using the same MPI to build both PETSc and your > appliation? > > >>>> > > > >>>> > Satish > > >>>> > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > > >>>> > > To whom it may concern, > > >>>> > > > > >>>> > > > > >>>> > > I am working on a Fortran (2003) computational fluid dynamics > > >>>> solver, > > >>>> > > which is actually quite mature, was parallelized with MPI from > the > > >>>> > > very beginning and it comes with its own suite of Krylov > solvers. > > >>>> > > Although the code is self-sustained, I am inclined to believe > that > > >>>> it > > >>>> > > would be better to use PETSc instead of my own home-grown > solvers. > > >>>> > > > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 with > > >>>> following > > >>>> > > options: > > >>>> > > > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes > --download- > > >>>> > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- > > >>>> > > download-cmake=yes > > >>>> > > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 command > which > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI, hence > the > > >>>> > > option "--download-openmpi=yes" when configuring PETSc. > > >>>> > > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to link and > run > > >>>> it > > >>>> > > with my code, but I am getting the following messages during > > >>>> > > compilation: > > >>>> > > > > >>>> > > Petsc_Mod.f90:18:6: > > >>>> > > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > > >>>> > > | 1 > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall > be of > > >>>> > > the same size as elsewhere (4 vs 8 bytes) > > >>>> > > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All > works, > > >>>> > > but these messages give me a reason to worry. > > >>>> > > > > >>>> > > Can you tell what causes this warnings? I would guess they > might > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't think I even > > >>>> have > > >>>> > > MPICH on my system. 
> > >>>> > > > > >>>> > > Please let me know what you think about it? > > >>>> > > > > >>>> > > Cheers, > > >>>> > > > > >>>> > > Bojan > > >>>> > > > > >>>> > > > > >>>> > > > > >>>> > > > > >>>> > > > >>>> > > > >>>> > > >>> > > >> > > >> -- > > >> What most experimenters take for granted before they begin their > > >> experiments is infinitely more interesting than any results to which > their > > >> experiments lead. > > >> -- Norbert Wiener > > >> > > >> https://www.cse.buffalo.edu/~knepley/ > > >> > > >> > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Feb 10 10:06:16 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 10 Feb 2022 10:06:16 -0600 (CST) Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> Message-ID: <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> Hm - this is strange. Do you have 'xauth' installed? I would make sure xauth is installed, delete ~/.Xauthority - and reboot [or restart the X server] Yeah - it might not work - but perhaps worth a try.. Or perhaps its not X11 related.. I would also try 'strace' on an application that is producing this message - to see if I can narrow down further.. Do you get this message with both (runs)?: cd src/ksp/ksp/tutorials make ex2 mpiexec -n 1 ./ex2 ./ex2 Satish On Thu, 10 Feb 2022, Bojan Niceno wrote: > Dear both, > > I work on an ASUS ROG laptop and don't use any NFS. Everything is on one > computer, one disk. That is why I couldn't resolve the Invalid Magic > Cookie, because all the advice I've found about it concerns the remote > access/display. It is not an issue for me. My laptop has an Nvidia > GeForce RTX graphical card, maybe Ubuntu drivers are simply not able to > cope with it. I am out of ideas, really. > > > Cheers, > > Bojan > > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay wrote: > > > Do the compute nodes and frontend share the same NFS? > > > > I would try the following [to see if they work): > > > > - delete ~/.Xauthority [first check with 'xauth list') > > - setup ssh to not use X - i.e add the following to ~/.ssh/config > > > > ForwardX11 no > > ForwardX11Trusted no > > > > [this can be tailored to apply only to your specific compute nodes - if > > needed] > > > > Satish > > > > On Thu, 10 Feb 2022, Matthew Knepley wrote: > > > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < > > > bojan.niceno.scientist at gmail.com> wrote: > > > > > > > Thanks a lot, now I feel much better. > > > > > > > > By the way, I can't get around the invalid magic cookie. It is > > occurring > > > > ever since I installed the OS (Ubuntu 20.04) so I eventually gave up > > and > > > > decided to live with it :-D > > > > > > > > > > > > https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely > > > > > > Thanks, > > > > > > Matt > > > > > > > > > > Cheers, > > > > > > > > Bojan > > > > > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley > > wrote: > > > > > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > > > >> bojan.niceno.scientist at gmail.com> wrote: > > > >> > > > >>> Dear Satish, > > > >>> > > > >>> Thanks for the answer. 
Your suggestion makes a lot of sense, but > > this > > > >>> is what I get as a result of that: > > > >>> > > > >>> Running check examples to verify correct installation > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and > > > >>> PETSC_ARCH=arch-linux-c-debug > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI > > process > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > > > >>> grashof # = 1. > > > >>> Number of SNES iterations = 2 > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI > > processes > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = 1., > > > >>> grashof # = 1. > > > >>> Number of SNES iterations = 2 > > > >>> Possible error running Fortran example src/snes/tutorials/ex5f with 1 > > > >>> MPI process > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 > > > >>> Completed test examples > > > >>> > > > >>> I am getting the "Possible error running Fortran example" warning > > with > > > >>> this. This somehow looks more severe to me. But I could be wrong. > > > >>> > > > >> > > > >> You are getting this message because your MPI implementation is > > printing > > > >> > > > >> Invalid MIT-MAGIC-COOKIE-1 key > > > >> > > > >> It is still running fine, but this is an MPI configuration issue. > > > >> > > > >> Thanks, > > > >> > > > >> Matt > > > >> > > > >> Any suggestions what to do? > > > >>> > > > >>> > > > >>> Kind regards, > > > >>> > > > >>> Bojan > > > >>> > > > >>> > > > >>> > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay > > wrote: > > > >>> > > > >>>> To clarify: > > > >>>> > > > >>>> you are using --download-openmpi=yes with petsc. However you say: > > > >>>> > > > >>>> > > The mpif90 command which > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI > > > >>>> > > > >>>> This suggests a different install of OpenMPI is used to build your > > code. > > > >>>> > > > >>>> One way to resolve this is - delete current build of PETSc - and > > > >>>> rebuild it with this same MPI [that you are using with your > > application] > > > >>>> > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > > >>>> --download-fblaslapack --download-metis --download-parmetis > > --download-cmake > > > >>>> > > > >>>> Also PETSc provides makefile format that minimizes such conflicts.. > > > >>>> > > > >>>> > > > >>>> > > https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > > > >>>> > > > >>>> Satish > > > >>>> > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > > > >>>> > > > >>>> > Are you using the same MPI to build both PETSc and your > > appliation? > > > >>>> > > > > >>>> > Satish > > > >>>> > > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > > > >>>> > > To whom it may concern, > > > >>>> > > > > > >>>> > > > > > >>>> > > I am working on a Fortran (2003) computational fluid dynamics > > > >>>> solver, > > > >>>> > > which is actually quite mature, was parallelized with MPI from > > the > > > >>>> > > very beginning and it comes with its own suite of Krylov > > solvers. 
> > > >>>> > > Although the code is self-sustained, I am inclined to believe > > that > > > >>>> it > > > >>>> > > would be better to use PETSc instead of my own home-grown > > solvers. > > > >>>> > > > > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 with > > > >>>> following > > > >>>> > > options: > > > >>>> > > > > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes > > --download- > > > >>>> > > fblaslapack=yes --download-metis=yes --download-parmetis=yes -- > > > >>>> > > download-cmake=yes > > > >>>> > > > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 command > > which > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI, hence > > the > > > >>>> > > option "--download-openmpi=yes" when configuring PETSc. > > > >>>> > > > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to link and > > run > > > >>>> it > > > >>>> > > with my code, but I am getting the following messages during > > > >>>> > > compilation: > > > >>>> > > > > > >>>> > > Petsc_Mod.f90:18:6: > > > >>>> > > > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > > > >>>> > > | 1 > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) shall > > be of > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) > > > >>>> > > > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. All > > works, > > > >>>> > > but these messages give me a reason to worry. > > > >>>> > > > > > >>>> > > Can you tell what causes this warnings? I would guess they > > might > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't think I even > > > >>>> have > > > >>>> > > MPICH on my system. > > > >>>> > > > > > >>>> > > Please let me know what you think about it? > > > >>>> > > > > > >>>> > > Cheers, > > > >>>> > > > > > >>>> > > Bojan > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > >>>> > > > > >>>> > > > >>> > > > >> > > > >> -- > > > >> What most experimenters take for granted before they begin their > > > >> experiments is infinitely more interesting than any results to which > > their > > > >> experiments lead. > > > >> -- Norbert Wiener > > > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > > > >> > > > >> > > > > > > > > > > > > > From bojan.niceno.scientist at gmail.com Thu Feb 10 10:08:50 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Thu, 10 Feb 2022 17:08:50 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> Message-ID: Dear Satish, Thanks for the advice. I will try in a few hours because it is almost dinner time with me (I am in Europe) and I am supposed to go out with a friend this evening. Will let you know. Thanks for help, I highly appreciate it. Kind regards, Bojan On Thu, Feb 10, 2022 at 5:06 PM Satish Balay wrote: > Hm - this is strange. > > Do you have 'xauth' installed? > > I would make sure xauth is installed, delete ~/.Xauthority - and reboot > [or restart the X server] > > Yeah - it might not work - but perhaps worth a try.. > > Or perhaps its not X11 related.. > > I would also try 'strace' on an application that is producing this message > - to see if I can narrow down further.. 
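A minimal sketch of that strace idea, using the ex2 tutorial mentioned just below (this assumes strace is installed on the machine; the trace file name and the grep pattern are only illustrative, not part of any PETSc tooling):

  strace -f -o ex2.trace ./ex2              # -f follows child processes, -o writes the syscall log to a file
  grep -iE 'xauthority|x11|xcb' ex2.trace   # shows whether the run opens anything X11-related
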
> > Do you get this message with both (runs)?: > > cd src/ksp/ksp/tutorials > make ex2 > mpiexec -n 1 ./ex2 > ./ex2 > > Satish > > On Thu, 10 Feb 2022, Bojan Niceno wrote: > > > Dear both, > > > > I work on an ASUS ROG laptop and don't use any NFS. Everything is on one > > computer, one disk. That is why I couldn't resolve the Invalid Magic > > Cookie, because all the advice I've found about it concerns the remote > > access/display. It is not an issue for me. My laptop has an Nvidia > > GeForce RTX graphical card, maybe Ubuntu drivers are simply not able to > > cope with it. I am out of ideas, really. > > > > > > Cheers, > > > > Bojan > > > > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay wrote: > > > > > Do the compute nodes and frontend share the same NFS? > > > > > > I would try the following [to see if they work): > > > > > > - delete ~/.Xauthority [first check with 'xauth list') > > > - setup ssh to not use X - i.e add the following to ~/.ssh/config > > > > > > ForwardX11 no > > > ForwardX11Trusted no > > > > > > [this can be tailored to apply only to your specific compute nodes - if > > > needed] > > > > > > Satish > > > > > > On Thu, 10 Feb 2022, Matthew Knepley wrote: > > > > > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < > > > > bojan.niceno.scientist at gmail.com> wrote: > > > > > > > > > Thanks a lot, now I feel much better. > > > > > > > > > > By the way, I can't get around the invalid magic cookie. It is > > > occurring > > > > > ever since I installed the OS (Ubuntu 20.04) so I eventually gave > up > > > and > > > > > decided to live with it :-D > > > > > > > > > > > > > > > > > https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > > > > > > Cheers, > > > > > > > > > > Bojan > > > > > > > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley > > > > wrote: > > > > > > > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > > > > >> bojan.niceno.scientist at gmail.com> wrote: > > > > >> > > > > >>> Dear Satish, > > > > >>> > > > > >>> Thanks for the answer. Your suggestion makes a lot of sense, but > > > this > > > > >>> is what I get as a result of that: > > > > >>> > > > > >>> Running check examples to verify correct installation > > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and > > > > >>> PETSC_ARCH=arch-linux-c-debug > > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI > > > process > > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = > 1., > > > > >>> grashof # = 1. > > > > >>> Number of SNES iterations = 2 > > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI > > > processes > > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # = > 1., > > > > >>> grashof # = 1. > > > > >>> Number of SNES iterations = 2 > > > > >>> Possible error running Fortran example src/snes/tutorials/ex5f > with 1 > > > > >>> MPI process > > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 > > > > >>> Completed test examples > > > > >>> > > > > >>> I am getting the "Possible error running Fortran example" warning > > > with > > > > >>> this. This somehow looks more severe to me. But I could be > wrong. 
> > > > >>> > > > > >> > > > > >> You are getting this message because your MPI implementation is > > > printing > > > > >> > > > > >> Invalid MIT-MAGIC-COOKIE-1 key > > > > >> > > > > >> It is still running fine, but this is an MPI configuration issue. > > > > >> > > > > >> Thanks, > > > > >> > > > > >> Matt > > > > >> > > > > >> Any suggestions what to do? > > > > >>> > > > > >>> > > > > >>> Kind regards, > > > > >>> > > > > >>> Bojan > > > > >>> > > > > >>> > > > > >>> > > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay > > > wrote: > > > > >>> > > > > >>>> To clarify: > > > > >>>> > > > > >>>> you are using --download-openmpi=yes with petsc. However you > say: > > > > >>>> > > > > >>>> > > The mpif90 command which > > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI > > > > >>>> > > > > >>>> This suggests a different install of OpenMPI is used to build > your > > > code. > > > > >>>> > > > > >>>> One way to resolve this is - delete current build of PETSc - and > > > > >>>> rebuild it with this same MPI [that you are using with your > > > application] > > > > >>>> > > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > > > > >>>> --download-fblaslapack --download-metis --download-parmetis > > > --download-cmake > > > > >>>> > > > > >>>> Also PETSc provides makefile format that minimizes such > conflicts.. > > > > >>>> > > > > >>>> > > > > >>>> > > > > https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > > > > >>>> > > > > >>>> Satish > > > > >>>> > > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > > > > >>>> > > > > >>>> > Are you using the same MPI to build both PETSc and your > > > appliation? > > > > >>>> > > > > > >>>> > Satish > > > > >>>> > > > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > > > > >>>> > > To whom it may concern, > > > > >>>> > > > > > > >>>> > > > > > > >>>> > > I am working on a Fortran (2003) computational fluid > dynamics > > > > >>>> solver, > > > > >>>> > > which is actually quite mature, was parallelized with MPI > from > > > the > > > > >>>> > > very beginning and it comes with its own suite of Krylov > > > solvers. > > > > >>>> > > Although the code is self-sustained, I am inclined to > believe > > > that > > > > >>>> it > > > > >>>> > > would be better to use PETSc instead of my own home-grown > > > solvers. > > > > >>>> > > > > > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 with > > > > >>>> following > > > > >>>> > > options: > > > > >>>> > > > > > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes > > > --download- > > > > >>>> > > fblaslapack=yes --download-metis=yes > --download-parmetis=yes -- > > > > >>>> > > download-cmake=yes > > > > >>>> > > > > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 > command > > > which > > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI, > hence > > > the > > > > >>>> > > option "--download-openmpi=yes" when configuring PETSc. > > > > >>>> > > > > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to link > and > > > run > > > > >>>> it > > > > >>>> > > with my code, but I am getting the following messages during > > > > >>>> > > compilation: > > > > >>>> > > > > > > >>>> > > Petsc_Mod.f90:18:6: > > > > >>>> > > > > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > > > > >>>> > > | 1 > > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? 
at (1) > shall > > > be of > > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) > > > > >>>> > > > > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. > All > > > works, > > > > >>>> > > but these messages give me a reason to worry. > > > > >>>> > > > > > > >>>> > > Can you tell what causes this warnings? I would guess they > > > might > > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't think I > even > > > > >>>> have > > > > >>>> > > MPICH on my system. > > > > >>>> > > > > > > >>>> > > Please let me know what you think about it? > > > > >>>> > > > > > > >>>> > > Cheers, > > > > >>>> > > > > > > >>>> > > Bojan > > > > >>>> > > > > > > >>>> > > > > > > >>>> > > > > > > >>>> > > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > >>> > > > > >> > > > > >> -- > > > > >> What most experimenters take for granted before they begin their > > > > >> experiments is infinitely more interesting than any results to > which > > > their > > > > >> experiments lead. > > > > >> -- Norbert Wiener > > > > >> > > > > >> https://www.cse.buffalo.edu/~knepley/ > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Feb 10 10:10:42 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 10 Feb 2022 10:10:42 -0600 Subject: [petsc-users] Update of the buffer In-Reply-To: <52b90fdd-b142-6d85-a9df-642825d047a9@univ-fcomte.fr> References: <52b90fdd-b142-6d85-a9df-642825d047a9@univ-fcomte.fr> Message-ID: On Thu, Feb 10, 2022 at 8:17 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Hello , > > Sorry if this question does not belong to this mailling list, i'am using > Petsc , but with some > > MPI parts code, when dealing with communication. > > If a make two consecutive MPI_Isend requests, and if the destination > processor has not yet receive the message inbetween the two calls, will the > buffer be updated ? I mean if I send message "1" for the first request, > then send "0" as the second message. Will the receiver receive "0" as > message ? I not, how can I do to update the message ? > If you mean send buffer, you need MPI_Wait() to reuse the send buffer, i.e, MPI_Isend(sbuf, .., &req); MPI_Wait(&req,MPI_STATUS_IGNORE); refill sbuf MPI_Isend(sbuf, .., &req); > Thanks > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sasyed at fnal.gov Thu Feb 10 13:21:44 2022 From: sasyed at fnal.gov (Sajid Ali Syed) Date: Thu, 10 Feb 2022 19:21:44 +0000 Subject: [petsc-users] GAMG crash during setup when using multiple GPUs Message-ID: Hi PETSc-developers, I?m seeing the following crash that occurs during the setup phase of the preconditioner when using multiple GPUs. The relevant error trace is shown below: (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, CUDA_ERROR_ALREADY_MAPPED, line no 272 [24]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [24]PETSC ERROR: General MPI error [24]PETSC ERROR: MPI error 1 Invalid buffer pointer [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [24]PETSC ERROR: Petsc Development GIT revision: f351d5494b5462f62c419e00645ac2e477b88cae GIT Date: 2022-02-08 15:08:19 +0000 ... 
[24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54 [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274 [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218 [24]PETSC ERROR: #4 PetscSFBcastEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499 [24]PETSC ERROR: #5 VecScatterEnd_Internal() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87 [24]PETSC ERROR: #6 VecScatterEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366 [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302 [24]PETSC ERROR: #8 MatMult() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438 [24]PETSC ERROR: #9 PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730 [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421 [24]PETSC ERROR: #11 KSPGMRESCycle() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162 [24]PETSC ERROR: #12 KSPSolve_GMRES() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247 [24]PETSC ERROR: #13 KSPSolve_Private() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925 [24]PETSC ERROR: #14 KSPSolve() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103 [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127 [24]PETSC ERROR: #16 PCSetUp_GAMG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626 [24]PETSC ERROR: #17 PCSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017 [24]PETSC ERROR: #18 KSPSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417 [24]PETSC ERROR: #19 main() at poisson3d.c:69 [24]PETSC ERROR: PETSc Option Table entries: [24]PETSC ERROR: -dm_mat_type aijcusparse [24]PETSC ERROR: -dm_vec_type cuda [24]PETSC ERROR: -ksp_monitor [24]PETSC ERROR: -ksp_norm_type unpreconditioned [24]PETSC ERROR: -ksp_type cg [24]PETSC ERROR: -ksp_view [24]PETSC ERROR: -log_view [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg [24]PETSC ERROR: -mg_levels_ksp_type chebyshev [24]PETSC ERROR: -mg_levels_pc_type jacobi [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1 
[24]PETSC ERROR: -pc_gamg_square_graph 1 [24]PETSC ERROR: -pc_gamg_threshold 0.0 [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0 [24]PETSC ERROR: -pc_gamg_type agg [24]PETSC ERROR: -pc_type gamg [24]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Attached with this email is the full error log and the submit script for a 8-node/64-GPU/64 MPI rank job. I?ll also note that the same program did not crash when using either 2 or 4 nodes (with 8 & 16 GPUs/MPI ranks respectively) and attach those logs as well if that helps. Could someone let me know what this error means and what can be done to prevent it? Thank You, Sajid Ali (he/him) | Research Associate Scientific Computing Division Fermi National Accelerator Laboratory s-sajid-ali.github.io ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 2n8g.sh Type: application/x-sh Size: 686 bytes Desc: 2n8g.sh URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 2n8g-log Type: application/octet-stream Size: 88908 bytes Desc: 2n8g-log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 4n16g.sh Type: application/x-sh Size: 687 bytes Desc: 4n16g.sh URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 4n16g-log Type: application/octet-stream Size: 89091 bytes Desc: 4n16g-log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 8n32g.sh Type: application/x-sh Size: 687 bytes Desc: 8n32g.sh URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 8n32g-errlog Type: application/octet-stream Size: 179380 bytes Desc: 8n32g-errlog URL: From junchao.zhang at gmail.com Thu Feb 10 13:40:43 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 10 Feb 2022 13:40:43 -0600 Subject: [petsc-users] GAMG crash during setup when using multiple GPUs In-Reply-To: References: Message-ID: Did it fail without GPU at 64 MPI ranks? --Junchao Zhang On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed wrote: > Hi PETSc-developers, > > I?m seeing the following crash that occurs during the setup phase of the > preconditioner when using multiple GPUs. The relevant error trace is shown > below: > > (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, CUDA_ERROR_ALREADY_MAPPED, line no 272 > [24]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [24]PETSC ERROR: General MPI error > [24]PETSC ERROR: MPI error 1 Invalid buffer pointer > [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [24]PETSC ERROR: Petsc Development GIT revision: f351d5494b5462f62c419e00645ac2e477b88cae GIT Date: 2022-02-08 15:08:19 +0000 > ... 
> [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54 > [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274 > [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218 > [24]PETSC ERROR: #4 PetscSFBcastEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499 > [24]PETSC ERROR: #5 VecScatterEnd_Internal() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87 > [24]PETSC ERROR: #6 VecScatterEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366 > [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302 > [24]PETSC ERROR: #8 MatMult() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438 > [24]PETSC ERROR: #9 PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730 > [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421 > [24]PETSC ERROR: #11 KSPGMRESCycle() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162 > [24]PETSC ERROR: #12 KSPSolve_GMRES() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247 > [24]PETSC ERROR: #13 KSPSolve_Private() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925 > [24]PETSC ERROR: #14 KSPSolve() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103 > [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127 > [24]PETSC ERROR: #16 PCSetUp_GAMG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626 > [24]PETSC ERROR: #17 PCSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017 > [24]PETSC ERROR: #18 KSPSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417 > [24]PETSC ERROR: #19 main() at poisson3d.c:69 > [24]PETSC ERROR: PETSc Option Table entries: > [24]PETSC ERROR: -dm_mat_type aijcusparse > [24]PETSC ERROR: -dm_vec_type cuda > [24]PETSC ERROR: -ksp_monitor > [24]PETSC ERROR: -ksp_norm_type unpreconditioned > [24]PETSC ERROR: -ksp_type cg > [24]PETSC ERROR: -ksp_view > [24]PETSC ERROR: -log_view > [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg > [24]PETSC ERROR: -mg_levels_ksp_type chebyshev > [24]PETSC ERROR: 
-mg_levels_pc_type jacobi > [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1 > [24]PETSC ERROR: -pc_gamg_square_graph 1 > [24]PETSC ERROR: -pc_gamg_threshold 0.0 > [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0 > [24]PETSC ERROR: -pc_gamg_type agg > [24]PETSC ERROR: -pc_type gamg > [24]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > Attached with this email is the full error log and the submit script for a > 8-node/64-GPU/64 MPI rank job. I?ll also note that the same program did not > crash when using either 2 or 4 nodes (with 8 & 16 GPUs/MPI ranks > respectively) and attach those logs as well if that helps. Could someone > let me know what this error means and what can be done to prevent it? > > Thank You, > Sajid Ali (he/him) | Research Associate > > Scientific Computing Division > > Fermi National Accelerator Laboratory > > s-sajid-ali.github.io > ? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Feb 10 13:43:06 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 10 Feb 2022 13:43:06 -0600 Subject: [petsc-users] GAMG crash during setup when using multiple GPUs In-Reply-To: References: Message-ID: Also, try "-use_gpu_aware_mpi 0" to see if there is a difference. --Junchao Zhang On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang wrote: > Did it fail without GPU at 64 MPI ranks? > > --Junchao Zhang > > > On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed wrote: > >> Hi PETSc-developers, >> >> I?m seeing the following crash that occurs during the setup phase of the >> preconditioner when using multiple GPUs. The relevant error trace is shown >> below: >> >> (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, CUDA_ERROR_ALREADY_MAPPED, line no 272 >> [24]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [24]PETSC ERROR: General MPI error >> [24]PETSC ERROR: MPI error 1 Invalid buffer pointer >> [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [24]PETSC ERROR: Petsc Development GIT revision: f351d5494b5462f62c419e00645ac2e477b88cae GIT Date: 2022-02-08 15:08:19 +0000 >> ... 
>> [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54 >> [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274 >> [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218 >> [24]PETSC ERROR: #4 PetscSFBcastEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499 >> [24]PETSC ERROR: #5 VecScatterEnd_Internal() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87 >> [24]PETSC ERROR: #6 VecScatterEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366 >> [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302 >> [24]PETSC ERROR: #8 MatMult() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438 >> [24]PETSC ERROR: #9 PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730 >> [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421 >> [24]PETSC ERROR: #11 KSPGMRESCycle() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162 >> [24]PETSC ERROR: #12 KSPSolve_GMRES() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247 >> [24]PETSC ERROR: #13 KSPSolve_Private() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925 >> [24]PETSC ERROR: #14 KSPSolve() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103 >> [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127 >> [24]PETSC ERROR: #16 PCSetUp_GAMG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626 >> [24]PETSC ERROR: #17 PCSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017 >> [24]PETSC ERROR: #18 KSPSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417 >> [24]PETSC ERROR: #19 main() at poisson3d.c:69 >> [24]PETSC ERROR: PETSc Option Table entries: >> [24]PETSC ERROR: -dm_mat_type aijcusparse >> [24]PETSC ERROR: -dm_vec_type cuda >> [24]PETSC ERROR: -ksp_monitor >> [24]PETSC ERROR: -ksp_norm_type unpreconditioned >> [24]PETSC ERROR: -ksp_type cg >> [24]PETSC ERROR: -ksp_view >> [24]PETSC ERROR: -log_view >> [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg >> [24]PETSC ERROR: -mg_levels_ksp_type chebyshev >> 
[24]PETSC ERROR: -mg_levels_pc_type jacobi >> [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1 >> [24]PETSC ERROR: -pc_gamg_square_graph 1 >> [24]PETSC ERROR: -pc_gamg_threshold 0.0 >> [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0 >> [24]PETSC ERROR: -pc_gamg_type agg >> [24]PETSC ERROR: -pc_type gamg >> [24]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >> >> Attached with this email is the full error log and the submit script for >> a 8-node/64-GPU/64 MPI rank job. I?ll also note that the same program did >> not crash when using either 2 or 4 nodes (with 8 & 16 GPUs/MPI ranks >> respectively) and attach those logs as well if that helps. Could someone >> let me know what this error means and what can be done to prevent it? >> >> Thank You, >> Sajid Ali (he/him) | Research Associate >> >> Scientific Computing Division >> >> Fermi National Accelerator Laboratory >> >> s-sajid-ali.github.io >> ? >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From susanne.claus at onera.fr Thu Feb 10 14:17:49 2022 From: susanne.claus at onera.fr (Susanne Claus) Date: Thu, 10 Feb 2022 21:17:49 +0100 Subject: [petsc-users] Gmsh 8-noded quadrilateral In-Reply-To: References: Message-ID: Dear Matthew, Thank you so much. I have a attached a small 8-noded quadrilateral mesh file (Version 4 ASCII) generated with gmsh 4.8.4. Best wishes, Susanne On 10.02.2022 16:23, Matthew Knepley wrote: > On Thu, Feb 10, 2022 at 10:12 AM Susanne Claus > wrote: > >> Hello, >> >> I am using DMPlex for the mesh structure of a solid mechanics finite >> element code. I mainly use gmsh as input file format. When I try to >> read in 8-noded Quadrilaterals (Element type 16 in gmsh) DMPlex tells >> me that this element type is unknown. However a 9-noded Quadrilateral >> can be read without problem. On inspecting the plexgmsh.c source code >> I can see that 8-noded quadrilaterals are deactivated: >> >> #if 0 >> 146: {20, GMSH_TRI, 2, 3, 3, 9, NULL}, >> 147: {16, GMSH_QUA, 2, 2, 4, 8, NULL}, >> >> For our application these 8-noded quadrilateral are very important. >> >> Is there any reason why they have not been implemented/deactivated in >> the dmplex gmsh reader? > > No, we can handle them in the same way I think. Let me look at it. > Hopefully it is easy. > > Thanks, > > Matt > >> Thank you for all the great work you are doing. PETSc is amazing. >> >> Best wishes, >> Susanne Claus > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [1] -- Susanne Claus Ing?nieur Chercheur Applied Mathematics and Scientific Computing Group DTIS ONERA - The French Aerospace Lab 6 Chemin de la Vauve aux Granges, 91120 Palaiseau Links: ------ [1] http://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 3bb899cc.png Type: image/png Size: 4266 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: RectangleQuad8.msh URL: From jp.salazar at pm.me Thu Feb 10 14:19:08 2022 From: jp.salazar at pm.me (Juan Salazar) Date: Thu, 10 Feb 2022 20:19:08 +0000 Subject: [petsc-users] Compilation issues on cluster - PETSC ERROR: Unknown Mat type given: mpiaijmkl In-Reply-To: References: Message-ID: <2844F321-85E0-4766-AC0F-37350E9F911B@pm.me> > Hi Juan, > > I believe the problem is that you specify --with-mkl_sparse-dir, but that is not used because the BLAS/LAPACK logic checks for that, and you just > need --with-mkl_sparse. Normally the "dir" option would do this automatically, but since it is not used, that logic does not kick in. Please tell me if > this works. It worked! Thank you very much! The only caveat is that these options don?t support 64 bit indices, so I had to set: --with-64-bit-indices=0 Juan S. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Feb 10 17:47:01 2022 From: jed at jedbrown.org (Jed Brown) Date: Thu, 10 Feb 2022 16:47:01 -0700 Subject: [petsc-users] Gmsh 8-noded quadrilateral In-Reply-To: References: Message-ID: <87bkzeuvwq.fsf@jedbrown.org> Susanne, do you want PetscFE to make the serendipity (8-node) finite element space or do you just want to read these meshes? I.e., would it be okay with you if the coordinates were placed in a Q_2 (9-node, biquadratic) finite element space? This won't matter if you're traversing the dofs per edge manually, but there are some efficiency benefits of using the Q_2 space (especially if your code can use the tensor product, perhaps via a library like libCEED). Note that Q_2 spaces have better stability properties. For example, the Q_2 space is inf-sup stable with P_1 discontinuous pressure (gives third order L^2 and second order H^1 convergence), but serendipity (8-node) is only stable with piecewise constant pressure (gives second order L^2 and first order H^1 convergence). Susanne Claus writes: > Dear Matthew, > > Thank you so much. > I have a attached a small 8-noded quadrilateral mesh file (Version 4 > ASCII) generated with gmsh 4.8.4. > > Best wishes, > Susanne > > On 10.02.2022 16:23, Matthew Knepley wrote: > >> On Thu, Feb 10, 2022 at 10:12 AM Susanne Claus >> wrote: >> >>> Hello, >>> >>> I am using DMPlex for the mesh structure of a solid mechanics finite >>> element code. I mainly use gmsh as input file format. When I try to >>> read in 8-noded Quadrilaterals (Element type 16 in gmsh) DMPlex tells >>> me that this element type is unknown. However a 9-noded Quadrilateral >>> can be read without problem. On inspecting the plexgmsh.c source code >>> I can see that 8-noded quadrilaterals are deactivated: >>> >>> #if 0 >>> 146: {20, GMSH_TRI, 2, 3, 3, 9, NULL}, >>> 147: {16, GMSH_QUA, 2, 2, 4, 8, NULL}, >>> >>> For our application these 8-noded quadrilateral are very important. >>> >>> Is there any reason why they have not been implemented/deactivated in >>> the dmplex gmsh reader? >> >> No, we can handle them in the same way I think. Let me look at it. >> Hopefully it is easy. >> >> Thanks, >> >> Matt >> >>> Thank you for all the great work you are doing. PETSc is amazing. >>> >>> Best wishes, >>> Susanne Claus >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [1] > > -- > > Susanne Claus > Ing?nieur Chercheur > Applied Mathematics and Scientific Computing Group > DTIS > > ONERA - The French Aerospace Lab > 6 Chemin de la Vauve aux Granges, 91120 Palaiseau > > Links: > ------ > [1] http://www.cse.buffalo.edu/~knepley/ > $MeshFormat > 4.1 0 8 > $EndMeshFormat > $PhysicalNames > 2 > 1 2 "Neumann" > 2 1 "Domain" > $EndPhysicalNames > $Entities > 4 4 1 0 > 1 0 0 0 0 > 2 1 0 0 0 > 3 1 1 0 0 > 4 0 1 0 0 > 1 -9.999999994736442e-08 -1e-07 -1e-07 1.0000001 1e-07 1e-07 0 2 1 -2 > 2 0.9999999000000001 -9.999999994736442e-08 -1e-07 1.0000001 1.0000001 1e-07 1 2 2 2 -3 > 3 -9.999999994736442e-08 0.9999999000000001 -1e-07 1.0000001 1.0000001 1e-07 0 2 3 -4 > 4 -1e-07 -9.999999994736442e-08 -1e-07 1e-07 1.0000001 1e-07 0 2 4 -1 > 1 -9.999999994736442e-08 -9.999999994736442e-08 -1e-07 1.0000001 1.0000001 1e-07 1 1 4 1 2 3 4 > $EndEntities > $Nodes > 9 21 1 46 > 0 1 0 1 > 1 > 0 0 0 > 0 2 0 1 > 2 > 1 0 0 > 0 3 0 1 > 3 > 1 1 0 > 0 4 0 1 > 4 > 0 1 0 > 1 1 0 3 > 5 > 35 > 36 > 0.5 0 0 > 0.25 0 0 > 0.75 0 0 > 1 2 0 3 > 6 > 37 > 38 > 1 0.5 0 > 1 0.25 0 > 1 0.75 0 > 1 3 0 3 > 7 > 39 > 40 > 0.5 1 0 > 0.75 1 0 > 0.25 1 0 > 1 4 0 3 > 8 > 41 > 42 > 0 0.5 0 > 0 0.75 0 > 0 0.25 0 > 2 1 0 5 > 9 > 43 > 44 > 45 > 46 > 0.5 0.5 0 > 0.75 0.5 0 > 0.5 0.25 0 > 0.25 0.5 0 > 0.5 0.75 0 > $EndNodes > $Elements > 2 6 197 206 > 1 2 8 2 > 197 2 6 37 > 198 6 3 38 > 2 1 16 4 > 203 2 6 9 5 37 43 44 36 > 204 1 5 9 8 35 44 45 42 > 205 4 8 9 7 41 45 46 40 > 206 3 7 9 6 39 46 43 38 > $EndElements From sasyed at fnal.gov Thu Feb 10 18:04:25 2022 From: sasyed at fnal.gov (Sajid Ali Syed) Date: Fri, 11 Feb 2022 00:04:25 +0000 Subject: [petsc-users] GAMG crash during setup when using multiple GPUs In-Reply-To: References: Message-ID: Hi Junchao, With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for this case with this email. I also ran with gpu aware mpi to see if I could reproduce the error and got the error but from a different location. This logfile is also attached. This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me know if I can share further information to help with debugging this. Thank You, Sajid Ali (he/him) | Research Associate Scientific Computing Division Fermi National Accelerator Laboratory s-sajid-ali.github.io ________________________________ From: Junchao Zhang Sent: Thursday, February 10, 2022 1:43 PM To: Sajid Ali Syed Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] GAMG crash during setup when using multiple GPUs Also, try "-use_gpu_aware_mpi 0" to see if there is a difference. --Junchao Zhang On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang > wrote: Did it fail without GPU at 64 MPI ranks? --Junchao Zhang On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed > wrote: Hi PETSc-developers, I?m seeing the following crash that occurs during the setup phase of the preconditioner when using multiple GPUs. The relevant error trace is shown below: (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, CUDA_ERROR_ALREADY_MAPPED, line no 272 [24]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [24]PETSC ERROR: General MPI error [24]PETSC ERROR: MPI error 1 Invalid buffer pointer [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [24]PETSC ERROR: Petsc Development GIT revision: f351d5494b5462f62c419e00645ac2e477b88cae GIT Date: 2022-02-08 15:08:19 +0000 ... 
[24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54 [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274 [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218 [24]PETSC ERROR: #4 PetscSFBcastEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499 [24]PETSC ERROR: #5 VecScatterEnd_Internal() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87 [24]PETSC ERROR: #6 VecScatterEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366 [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302 [24]PETSC ERROR: #8 MatMult() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438 [24]PETSC ERROR: #9 PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730 [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421 [24]PETSC ERROR: #11 KSPGMRESCycle() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162 [24]PETSC ERROR: #12 KSPSolve_GMRES() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247 [24]PETSC ERROR: #13 KSPSolve_Private() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925 [24]PETSC ERROR: #14 KSPSolve() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103 [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127 [24]PETSC ERROR: #16 PCSetUp_GAMG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626 [24]PETSC ERROR: #17 PCSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017 [24]PETSC ERROR: #18 KSPSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417 [24]PETSC ERROR: #19 main() at poisson3d.c:69 [24]PETSC ERROR: PETSc Option Table entries: [24]PETSC ERROR: -dm_mat_type aijcusparse [24]PETSC ERROR: -dm_vec_type cuda [24]PETSC ERROR: -ksp_monitor [24]PETSC ERROR: -ksp_norm_type unpreconditioned [24]PETSC ERROR: -ksp_type cg [24]PETSC ERROR: -ksp_view [24]PETSC ERROR: -log_view [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg [24]PETSC ERROR: -mg_levels_ksp_type chebyshev [24]PETSC ERROR: -mg_levels_pc_type jacobi [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1 
[24]PETSC ERROR: -pc_gamg_square_graph 1 [24]PETSC ERROR: -pc_gamg_threshold 0.0 [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0 [24]PETSC ERROR: -pc_gamg_type agg [24]PETSC ERROR: -pc_type gamg [24]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Attached with this email is the full error log and the submit script for a 8-node/64-GPU/64 MPI rank job. I?ll also note that the same program did not crash when using either 2 or 4 nodes (with 8 & 16 GPUs/MPI ranks respectively) and attach those logs as well if that helps. Could someone let me know what this error means and what can be done to prevent it? Thank You, Sajid Ali (he/him) | Research Associate Scientific Computing Division Fermi National Accelerator Laboratory s-sajid-ali.github.io ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 8n32g-nogpuawarempi-log Type: application/octet-stream Size: 90432 bytes Desc: 8n32g-nogpuawarempi-log URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 8n32g-newerr-log Type: application/octet-stream Size: 170257 bytes Desc: 8n32g-newerr-log URL: From junchao.zhang at gmail.com Thu Feb 10 20:22:33 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 10 Feb 2022 20:22:33 -0600 Subject: [petsc-users] GAMG crash during setup when using multiple GPUs In-Reply-To: References: Message-ID: Hi, Sajid Ali, I have no clue. I have access to perlmutter. I am thinking how to debug that. If your app is open-sourced and easy to build, then I can build and debug it. Otherwise, suppose you build and install petsc (only with options needed by your app) to a shared directory, and I can access your executable (which uses RPATH for libraries), then maybe I can debug it (I only need to install my own petsc to the shared directory) --Junchao Zhang On Thu, Feb 10, 2022 at 6:04 PM Sajid Ali Syed wrote: > Hi Junchao, > > With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for > this case with this email. > > I also ran with gpu aware mpi to see if I could reproduce the error and > got the error but from a different location. This logfile is also attached. > > This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me > know if I can share further information to help with debugging this. > > Thank You, > Sajid Ali (he/him) | Research Associate > Scientific Computing Division > Fermi National Accelerator Laboratory > s-sajid-ali.github.io > > ------------------------------ > *From:* Junchao Zhang > *Sent:* Thursday, February 10, 2022 1:43 PM > *To:* Sajid Ali Syed > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] GAMG crash during setup when using multiple > GPUs > > Also, try "-use_gpu_aware_mpi 0" to see if there is a difference. > > --Junchao Zhang > > > On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang > wrote: > > Did it fail without GPU at 64 MPI ranks? > > --Junchao Zhang > > > On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed wrote: > > Hi PETSc-developers, > > I?m seeing the following crash that occurs during the setup phase of the > preconditioner when using multiple GPUs. 
The relevant error trace is shown > below: > > (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, CUDA_ERROR_ALREADY_MAPPED, line no 272 > [24]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [24]PETSC ERROR: General MPI error > [24]PETSC ERROR: MPI error 1 Invalid buffer pointer > [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [24]PETSC ERROR: Petsc Development GIT revision: f351d5494b5462f62c419e00645ac2e477b88cae GIT Date: 2022-02-08 15:08:19 +0000 > ... > [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54 > [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274 > [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218 > [24]PETSC ERROR: #4 PetscSFBcastEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499 > [24]PETSC ERROR: #5 VecScatterEnd_Internal() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87 > [24]PETSC ERROR: #6 VecScatterEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366 > [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302 > [24]PETSC ERROR: #8 MatMult() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438 > [24]PETSC ERROR: #9 PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730 > [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421 > [24]PETSC ERROR: #11 KSPGMRESCycle() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162 > [24]PETSC ERROR: #12 KSPSolve_GMRES() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247 > [24]PETSC ERROR: #13 KSPSolve_Private() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925 > [24]PETSC ERROR: #14 KSPSolve() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103 > [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127 > [24]PETSC ERROR: #16 PCSetUp_GAMG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626 > [24]PETSC ERROR: #17 PCSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017 > [24]PETSC ERROR: #18 KSPSetUp() at 
/tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417 > [24]PETSC ERROR: #19 main() at poisson3d.c:69 > [24]PETSC ERROR: PETSc Option Table entries: > [24]PETSC ERROR: -dm_mat_type aijcusparse > [24]PETSC ERROR: -dm_vec_type cuda > [24]PETSC ERROR: -ksp_monitor > [24]PETSC ERROR: -ksp_norm_type unpreconditioned > [24]PETSC ERROR: -ksp_type cg > [24]PETSC ERROR: -ksp_view > [24]PETSC ERROR: -log_view > [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg > [24]PETSC ERROR: -mg_levels_ksp_type chebyshev > [24]PETSC ERROR: -mg_levels_pc_type jacobi > [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1 > [24]PETSC ERROR: -pc_gamg_square_graph 1 > [24]PETSC ERROR: -pc_gamg_threshold 0.0 > [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0 > [24]PETSC ERROR: -pc_gamg_type agg > [24]PETSC ERROR: -pc_type gamg > [24]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > > Attached with this email is the full error log and the submit script for a > 8-node/64-GPU/64 MPI rank job. I?ll also note that the same program did not > crash when using either 2 or 4 nodes (with 8 & 16 GPUs/MPI ranks > respectively) and attach those logs as well if that helps. Could someone > let me know what this error means and what can be done to prevent it? > > Thank You, > Sajid Ali (he/him) | Research Associate > > Scientific Computing Division > > Fermi National Accelerator Laboratory > > s-sajid-ali.github.io > > ? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Feb 10 20:47:03 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 10 Feb 2022 21:47:03 -0500 Subject: [petsc-users] GAMG crash during setup when using multiple GPUs In-Reply-To: References: Message-ID: Perlmutter has problems with GPU aware MPI. This is being actively worked on at NERSc. Mark On Thu, Feb 10, 2022 at 9:22 PM Junchao Zhang wrote: > Hi, Sajid Ali, > I have no clue. I have access to perlmutter. I am thinking how to debug > that. > If your app is open-sourced and easy to build, then I can build and > debug it. Otherwise, suppose you build and install petsc (only with options > needed by your app) to a shared directory, and I can access your executable > (which uses RPATH for libraries), then maybe I can debug it (I only need to > install my own petsc to the shared directory) > > --Junchao Zhang > > > On Thu, Feb 10, 2022 at 6:04 PM Sajid Ali Syed wrote: > >> Hi Junchao, >> >> With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for >> this case with this email. >> >> I also ran with gpu aware mpi to see if I could reproduce the error and >> got the error but from a different location. This logfile is also attached. >> >> This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me >> know if I can share further information to help with debugging this. >> >> Thank You, >> Sajid Ali (he/him) | Research Associate >> Scientific Computing Division >> Fermi National Accelerator Laboratory >> s-sajid-ali.github.io >> >> ------------------------------ >> *From:* Junchao Zhang >> *Sent:* Thursday, February 10, 2022 1:43 PM >> *To:* Sajid Ali Syed >> *Cc:* petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] GAMG crash during setup when using multiple >> GPUs >> >> Also, try "-use_gpu_aware_mpi 0" to see if there is a difference. 
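A sketch of a run line with the workaround from this thread applied, assuming a Slurm srun launcher; the executable name and rank count are placeholders, and the solver options are the ones from the original report:

  srun -n 64 ./poisson3d -dm_vec_type cuda -dm_mat_type aijcusparse \
       -ksp_type cg -ksp_norm_type unpreconditioned \
       -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 \
       -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi \
       -use_gpu_aware_mpi 0 -log_view
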
>> >> --Junchao Zhang >> >> >> On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang >> wrote: >> >> Did it fail without GPU at 64 MPI ranks? >> >> --Junchao Zhang >> >> >> On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed wrote: >> >> Hi PETSc-developers, >> >> I?m seeing the following crash that occurs during the setup phase of the >> preconditioner when using multiple GPUs. The relevant error trace is shown >> below: >> >> (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, CUDA_ERROR_ALREADY_MAPPED, line no 272 >> [24]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [24]PETSC ERROR: General MPI error >> [24]PETSC ERROR: MPI error 1 Invalid buffer pointer >> [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. >> [24]PETSC ERROR: Petsc Development GIT revision: f351d5494b5462f62c419e00645ac2e477b88cae GIT Date: 2022-02-08 15:08:19 +0000 >> ... >> [24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54 >> [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274 >> [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218 >> [24]PETSC ERROR: #4 PetscSFBcastEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499 >> [24]PETSC ERROR: #5 VecScatterEnd_Internal() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87 >> [24]PETSC ERROR: #6 VecScatterEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366 >> [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302 >> [24]PETSC ERROR: #8 MatMult() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438 >> [24]PETSC ERROR: #9 PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730 >> [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421 >> [24]PETSC ERROR: #11 KSPGMRESCycle() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162 >> [24]PETSC ERROR: #12 KSPSolve_GMRES() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247 >> [24]PETSC ERROR: #13 KSPSolve_Private() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925 >> [24]PETSC ERROR: #14 KSPSolve() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103 >> [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127 >> 
[24]PETSC ERROR: #16 PCSetUp_GAMG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626 >> [24]PETSC ERROR: #17 PCSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017 >> [24]PETSC ERROR: #18 KSPSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417 >> [24]PETSC ERROR: #19 main() at poisson3d.c:69 >> [24]PETSC ERROR: PETSc Option Table entries: >> [24]PETSC ERROR: -dm_mat_type aijcusparse >> [24]PETSC ERROR: -dm_vec_type cuda >> [24]PETSC ERROR: -ksp_monitor >> [24]PETSC ERROR: -ksp_norm_type unpreconditioned >> [24]PETSC ERROR: -ksp_type cg >> [24]PETSC ERROR: -ksp_view >> [24]PETSC ERROR: -log_view >> [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg >> [24]PETSC ERROR: -mg_levels_ksp_type chebyshev >> [24]PETSC ERROR: -mg_levels_pc_type jacobi >> [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1 >> [24]PETSC ERROR: -pc_gamg_square_graph 1 >> [24]PETSC ERROR: -pc_gamg_threshold 0.0 >> [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0 >> [24]PETSC ERROR: -pc_gamg_type agg >> [24]PETSC ERROR: -pc_type gamg >> [24]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >> >> Attached with this email is the full error log and the submit script for >> an 8-node/64-GPU/64 MPI rank job. I'll also note that the same program did >> not crash when using either 2 or 4 nodes (with 8 & 16 GPUs/MPI ranks >> respectively) and attach those logs as well if that helps. Could someone >> let me know what this error means and what can be done to prevent it? >> >> Thank You, >> Sajid Ali (he/him) | Research Associate >> >> Scientific Computing Division >> >> Fermi National Accelerator Laboratory >> >> s-sajid-ali.github.io >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno.scientist at gmail.com Thu Feb 10 21:37:18 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Fri, 11 Feb 2022 04:37:18 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> Message-ID: Dear both, Allow me to update you on the issue. I tried to re-compile PETSc with different configuration options as Satish suggested, and went further on by specifying exact location of OpenMPI libraries and include files to the ones installed by PETSc (for those configurations for which I used "--download-openmpi=1") and the original problem, the warning Named COMMON block 'mpi_fortran_bottom' at (1) shall be of the same size as elsewhere (4 vs 8 bytes), prevailed. In desperation, I completely removed OpenMPI from my workstation to make sure that only those which are downloaded with PETSc are used, yet the warning was still there. (That resolved the Invalid MIT-MAGIC-COOKIE-1 warning at least) Now I am wondering if the problem originates from the fact that I already have all the necessary MPI routines developed in Fortran? All calls, including the basic MPI_Init, MPI_Comm_Size and MPI_Comm_Rank, are done from Fortran. I actually have a module called Comm_Mod which does all MPI-related calls, and this module contains line include 'mpif.h'.
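In rough outline the module looks like this (a hypothetical sketch only, not the actual Comm_Mod source; the routine bodies and variable names are made up for illustration):

module Comm_Mod

  implicit none
  include 'mpif.h'          ! legacy header: constants only, no interface checking

  integer :: n_proc    = 1  ! number of MPI ranks
  integer :: this_proc = 0  ! rank of the calling process

contains

  ! called once at program start, before any solver work
  subroutine Comm_Start()
    integer :: error
    call MPI_Init(error)
    call MPI_Comm_Size(MPI_COMM_WORLD, n_proc,    error)
    call MPI_Comm_Rank(MPI_COMM_WORLD, this_proc, error)
  end subroutine

  ! called once at the very end of the run
  subroutine Comm_End()
    integer :: error
    call MPI_Finalize(error)
  end subroutine

end module Comm_Mod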
That include statement does take the file from PETSc installation as no other MPI installation is left on my system, but still it somehow seems to be the origin of the warning on common blocks I observe. Now I am wondering if the include 'mpif.h' from Fortran somehow collides with the option include ${PETSC_DIR}/lib/petsc/conf/variables I put in my makefile in order to compile with PETSc. I am really not sure if it is possible to have main program and all MPI initialization done from Fortran (as I have now) and then plug PETSc on top of it? Should that be possible? Kind regards, Bojan P.S. The sequential version works fine, I can compile without warning and can call PETSc solvers from Fortran without a glitch. On Thu, Feb 10, 2022 at 5:08 PM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear Satish, > > Thanks for the advice. I will try in a few hours because it is almost > dinner time with me (I am in Europe) and I am supposed to go out with a > friend this evening. > > Will let you know. Thanks for help, I highly appreciate it. > > > Kind regards, > > Bojan > > > On Thu, Feb 10, 2022 at 5:06 PM Satish Balay wrote: > >> Hm - this is strange. >> >> Do you have 'xauth' installed? >> >> I would make sure xauth is installed, delete ~/.Xauthority - and reboot >> [or restart the X server] >> >> Yeah - it might not work - but perhaps worth a try.. >> >> Or perhaps its not X11 related.. >> >> I would also try 'strace' on an application that is producing this >> message - to see if I can narrow down further.. >> >> Do you get this message with both (runs)?: >> >> cd src/ksp/ksp/tutorials >> make ex2 >> mpiexec -n 1 ./ex2 >> ./ex2 >> >> Satish >> >> On Thu, 10 Feb 2022, Bojan Niceno wrote: >> >> > Dear both, >> > >> > I work on an ASUS ROG laptop and don't use any NFS. Everything is on >> one >> > computer, one disk. That is why I couldn't resolve the Invalid Magic >> > Cookie, because all the advice I've found about it concerns the remote >> > access/display. It is not an issue for me. My laptop has an Nvidia >> > GeForce RTX graphical card, maybe Ubuntu drivers are simply not able to >> > cope with it. I am out of ideas, really. >> > >> > >> > Cheers, >> > >> > Bojan >> > >> > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay wrote: >> > >> > > Do the compute nodes and frontend share the same NFS? >> > > >> > > I would try the following [to see if they work): >> > > >> > > - delete ~/.Xauthority [first check with 'xauth list') >> > > - setup ssh to not use X - i.e add the following to ~/.ssh/config >> > > >> > > ForwardX11 no >> > > ForwardX11Trusted no >> > > >> > > [this can be tailored to apply only to your specific compute nodes - >> if >> > > needed] >> > > >> > > Satish >> > > >> > > On Thu, 10 Feb 2022, Matthew Knepley wrote: >> > > >> > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < >> > > > bojan.niceno.scientist at gmail.com> wrote: >> > > > >> > > > > Thanks a lot, now I feel much better. >> > > > > >> > > > > By the way, I can't get around the invalid magic cookie. 
It is >> > > occurring >> > > > > ever since I installed the OS (Ubuntu 20.04) so I eventually gave >> up >> > > and >> > > > > decided to live with it :-D >> > > > > >> > > > >> > > > >> > > >> https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely >> > > > >> > > > Thanks, >> > > > >> > > > Matt >> > > > >> > > > >> > > > > Cheers, >> > > > > >> > > > > Bojan >> > > > > >> > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley < >> knepley at gmail.com> >> > > wrote: >> > > > > >> > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < >> > > > >> bojan.niceno.scientist at gmail.com> wrote: >> > > > >> >> > > > >>> Dear Satish, >> > > > >>> >> > > > >>> Thanks for the answer. Your suggestion makes a lot of sense, >> but >> > > this >> > > > >>> is what I get as a result of that: >> > > > >>> >> > > > >>> Running check examples to verify correct installation >> > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and >> > > > >>> PETSC_ARCH=arch-linux-c-debug >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI >> > > process >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # >> = 1., >> > > > >>> grashof # = 1. >> > > > >>> Number of SNES iterations = 2 >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI >> > > processes >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # >> = 1., >> > > > >>> grashof # = 1. >> > > > >>> Number of SNES iterations = 2 >> > > > >>> Possible error running Fortran example src/snes/tutorials/ex5f >> with 1 >> > > > >>> MPI process >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 >> > > > >>> Completed test examples >> > > > >>> >> > > > >>> I am getting the "Possible error running Fortran example" >> warning >> > > with >> > > > >>> this. This somehow looks more severe to me. But I could be >> wrong. >> > > > >>> >> > > > >> >> > > > >> You are getting this message because your MPI implementation is >> > > printing >> > > > >> >> > > > >> Invalid MIT-MAGIC-COOKIE-1 key >> > > > >> >> > > > >> It is still running fine, but this is an MPI configuration issue. >> > > > >> >> > > > >> Thanks, >> > > > >> >> > > > >> Matt >> > > > >> >> > > > >> Any suggestions what to do? >> > > > >>> >> > > > >>> >> > > > >>> Kind regards, >> > > > >>> >> > > > >>> Bojan >> > > > >>> >> > > > >>> >> > > > >>> >> > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay >> > > wrote: >> > > > >>> >> > > > >>>> To clarify: >> > > > >>>> >> > > > >>>> you are using --download-openmpi=yes with petsc. However you >> say: >> > > > >>>> >> > > > >>>> > > The mpif90 command which >> > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI >> > > > >>>> >> > > > >>>> This suggests a different install of OpenMPI is used to build >> your >> > > code. 
>> > > > >>>> >> > > > >>>> One way to resolve this is - delete current build of PETSc - >> and >> > > > >>>> rebuild it with this same MPI [that you are using with your >> > > application] >> > > > >>>> >> > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 >> > > > >>>> --download-fblaslapack --download-metis --download-parmetis >> > > --download-cmake >> > > > >>>> >> > > > >>>> Also PETSc provides makefile format that minimizes such >> conflicts.. >> > > > >>>> >> > > > >>>> >> > > > >>>> >> > > >> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications >> > > > >>>> >> > > > >>>> Satish >> > > > >>>> >> > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: >> > > > >>>> >> > > > >>>> > Are you using the same MPI to build both PETSc and your >> > > appliation? >> > > > >>>> > >> > > > >>>> > Satish >> > > > >>>> > >> > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: >> > > > >>>> > > To whom it may concern, >> > > > >>>> > > >> > > > >>>> > > >> > > > >>>> > > I am working on a Fortran (2003) computational fluid >> dynamics >> > > > >>>> solver, >> > > > >>>> > > which is actually quite mature, was parallelized with MPI >> from >> > > the >> > > > >>>> > > very beginning and it comes with its own suite of Krylov >> > > solvers. >> > > > >>>> > > Although the code is self-sustained, I am inclined to >> believe >> > > that >> > > > >>>> it >> > > > >>>> > > would be better to use PETSc instead of my own home-grown >> > > solvers. >> > > > >>>> > > >> > > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 with >> > > > >>>> following >> > > > >>>> > > options: >> > > > >>>> > > >> > > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes >> > > --download- >> > > > >>>> > > fblaslapack=yes --download-metis=yes >> --download-parmetis=yes -- >> > > > >>>> > > download-cmake=yes >> > > > >>>> > > >> > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 >> command >> > > which >> > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI, >> hence >> > > the >> > > > >>>> > > option "--download-openmpi=yes" when configuring PETSc. >> > > > >>>> > > >> > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to link >> and >> > > run >> > > > >>>> it >> > > > >>>> > > with my code, but I am getting the following messages >> during >> > > > >>>> > > compilation: >> > > > >>>> > > >> > > > >>>> > > Petsc_Mod.f90:18:6: >> > > > >>>> > > >> > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY >> > > > >>>> > > | 1 >> > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) >> shall >> > > be of >> > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) >> > > > >>>> > > >> > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. >> All >> > > works, >> > > > >>>> > > but these messages give me a reason to worry. >> > > > >>>> > > >> > > > >>>> > > Can you tell what causes this warnings? I would guess they >> > > might >> > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't think >> I even >> > > > >>>> have >> > > > >>>> > > MPICH on my system. >> > > > >>>> > > >> > > > >>>> > > Please let me know what you think about it? 
>> > > > >>>> > > >> > > > >>>> > > Cheers, >> > > > >>>> > > >> > > > >>>> > > Bojan >> > > > >>>> > > >> > > > >>>> > > >> > > > >>>> > > >> > > > >>>> > > >> > > > >>>> > >> > > > >>>> > >> > > > >>>> >> > > > >>> >> > > > >> >> > > > >> -- >> > > > >> What most experimenters take for granted before they begin their >> > > > >> experiments is infinitely more interesting than any results to >> which >> > > their >> > > > >> experiments lead. >> > > > >> -- Norbert Wiener >> > > > >> >> > > > >> https://www.cse.buffalo.edu/~knepley/ >> > > > >> >> > > > >> >> > > > > >> > > > >> > > > >> > > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Feb 10 22:29:43 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 10 Feb 2022 22:29:43 -0600 (CST) Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> Message-ID: <28149e2c-4389-1629-bdb4-90984aeb1ecb@mcs.anl.gov> 1. you can call MPI_Init() before calling PetscInitialize() For example - check src/sys/tutorials/ex4f90.F90 2. Are you using -i8 -r8 type flags when compiling your code? That might case issues when using mpif.h. Perhaps you can switch from "include 'mpif.h'" to "use mpi" - in your module file - and see if that helps. Satish On Fri, 11 Feb 2022, Bojan Niceno wrote: > Dear both, > > Allow me to update you on the issue. I tried to re-compile PETSc with > different configuration options as Satish suggested, and went further on by > specifying exact location of OpenMPI libraries and include files to the > ones installed by PETSc (for those configurations for which I used > "--download-openmpi=1") and the original problem, the warning Named COMMON > block ?mpi_fortran_bottom? at (1) shall be of the same size as elsewhere (4 > vs 8 bytes), prevailed. > > In desperation, I completely removed OpenMPI from my workstation to make > sure that only those which are downloaded with PETSc are used, yet the > warning was still there. (That resolved the Invalid MIT-MAGIC-COOKIE-1 > warning at least) > > Now I am wondering if the problem originates from the fact that I already > have all the necessary MPI routines developed in Fortran? All calls, > including the basic MPI_Init, MPI_Comm_Size and MPI_Comm_Rank, are done > from Fortran. I actually have a module called Comm_Mod which does all > MPI-related calls, and this module contains line include 'mpif.h'. That > include statement does take the file from PETSc installation as no other > MPI installation is left on my system, but still it somehow seems to be the > origin of the warning on common blocks I observe. Now I am wondering if > the include 'mpif.h' from Fortran somehow collides with the option include > ${PETSC_DIR}/lib/petsc/conf/variables I put in my makefile in order to > compile with PETSc. > > I am really not sure if it is possible to have main program and all MPI > initialization done from Fortran (as I have now) and then plug PETSc on top > of it? Should that be possible? > > Kind regards, > > Bojan > > P.S. The sequential version works fine, I can compile without warning and > can call PETSc solvers from Fortran without a glitch. 
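For reference, the calling order suggested in point 1 above is, in outline, the following. This is only a minimal sketch of the general pattern with made-up names, not the actual src/sys/tutorials/ex4f90.F90 source, and the exact include/module usage can differ between PETSc versions:

program init_order_sketch
#include <petsc/finclude/petscsys.h>
  use petscsys
  implicit none
  PetscErrorCode :: ierr
  integer        :: error, n_proc, this_proc

  ! The application owns MPI: it calls MPI_Init() itself, first ...
  call MPI_Init(error)
  call MPI_Comm_Size(MPI_COMM_WORLD, n_proc,    error)
  call MPI_Comm_Rank(MPI_COMM_WORLD, this_proc, error)

  ! ... and PETSc, seeing that MPI is already initialized, will neither
  ! initialize nor finalize MPI on its own.
  call PetscInitialize(PETSC_NULL_CHARACTER, ierr)

  ! ... create PETSc objects and solve here ...

  call PetscFinalize(ierr)

  ! Because the application called MPI_Init(), it must also call
  ! MPI_Finalize(), after PetscFinalize().
  call MPI_Finalize(error)
end program init_order_sketch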
> > On Thu, Feb 10, 2022 at 5:08 PM Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > > > Dear Satish, > > > > Thanks for the advice. I will try in a few hours because it is almost > > dinner time with me (I am in Europe) and I am supposed to go out with a > > friend this evening. > > > > Will let you know. Thanks for help, I highly appreciate it. > > > > > > Kind regards, > > > > Bojan > > > > > > On Thu, Feb 10, 2022 at 5:06 PM Satish Balay wrote: > > > >> Hm - this is strange. > >> > >> Do you have 'xauth' installed? > >> > >> I would make sure xauth is installed, delete ~/.Xauthority - and reboot > >> [or restart the X server] > >> > >> Yeah - it might not work - but perhaps worth a try.. > >> > >> Or perhaps its not X11 related.. > >> > >> I would also try 'strace' on an application that is producing this > >> message - to see if I can narrow down further.. > >> > >> Do you get this message with both (runs)?: > >> > >> cd src/ksp/ksp/tutorials > >> make ex2 > >> mpiexec -n 1 ./ex2 > >> ./ex2 > >> > >> Satish > >> > >> On Thu, 10 Feb 2022, Bojan Niceno wrote: > >> > >> > Dear both, > >> > > >> > I work on an ASUS ROG laptop and don't use any NFS. Everything is on > >> one > >> > computer, one disk. That is why I couldn't resolve the Invalid Magic > >> > Cookie, because all the advice I've found about it concerns the remote > >> > access/display. It is not an issue for me. My laptop has an Nvidia > >> > GeForce RTX graphical card, maybe Ubuntu drivers are simply not able to > >> > cope with it. I am out of ideas, really. > >> > > >> > > >> > Cheers, > >> > > >> > Bojan > >> > > >> > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay wrote: > >> > > >> > > Do the compute nodes and frontend share the same NFS? > >> > > > >> > > I would try the following [to see if they work): > >> > > > >> > > - delete ~/.Xauthority [first check with 'xauth list') > >> > > - setup ssh to not use X - i.e add the following to ~/.ssh/config > >> > > > >> > > ForwardX11 no > >> > > ForwardX11Trusted no > >> > > > >> > > [this can be tailored to apply only to your specific compute nodes - > >> if > >> > > needed] > >> > > > >> > > Satish > >> > > > >> > > On Thu, 10 Feb 2022, Matthew Knepley wrote: > >> > > > >> > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < > >> > > > bojan.niceno.scientist at gmail.com> wrote: > >> > > > > >> > > > > Thanks a lot, now I feel much better. > >> > > > > > >> > > > > By the way, I can't get around the invalid magic cookie. It is > >> > > occurring > >> > > > > ever since I installed the OS (Ubuntu 20.04) so I eventually gave > >> up > >> > > and > >> > > > > decided to live with it :-D > >> > > > > > >> > > > > >> > > > > >> > > > >> https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely > >> > > > > >> > > > Thanks, > >> > > > > >> > > > Matt > >> > > > > >> > > > > >> > > > > Cheers, > >> > > > > > >> > > > > Bojan > >> > > > > > >> > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley < > >> knepley at gmail.com> > >> > > wrote: > >> > > > > > >> > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > >> > > > >> bojan.niceno.scientist at gmail.com> wrote: > >> > > > >> > >> > > > >>> Dear Satish, > >> > > > >>> > >> > > > >>> Thanks for the answer. 
Your suggestion makes a lot of sense, > >> but > >> > > this > >> > > > >>> is what I get as a result of that: > >> > > > >>> > >> > > > >>> Running check examples to verify correct installation > >> > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and > >> > > > >>> PETSC_ARCH=arch-linux-c-debug > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 MPI > >> > > process > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # > >> = 1., > >> > > > >>> grashof # = 1. > >> > > > >>> Number of SNES iterations = 2 > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 MPI > >> > > processes > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, prandtl # > >> = 1., > >> > > > >>> grashof # = 1. > >> > > > >>> Number of SNES iterations = 2 > >> > > > >>> Possible error running Fortran example src/snes/tutorials/ex5f > >> with 1 > >> > > > >>> MPI process > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = 4 > >> > > > >>> Completed test examples > >> > > > >>> > >> > > > >>> I am getting the "Possible error running Fortran example" > >> warning > >> > > with > >> > > > >>> this. This somehow looks more severe to me. But I could be > >> wrong. > >> > > > >>> > >> > > > >> > >> > > > >> You are getting this message because your MPI implementation is > >> > > printing > >> > > > >> > >> > > > >> Invalid MIT-MAGIC-COOKIE-1 key > >> > > > >> > >> > > > >> It is still running fine, but this is an MPI configuration issue. > >> > > > >> > >> > > > >> Thanks, > >> > > > >> > >> > > > >> Matt > >> > > > >> > >> > > > >> Any suggestions what to do? > >> > > > >>> > >> > > > >>> > >> > > > >>> Kind regards, > >> > > > >>> > >> > > > >>> Bojan > >> > > > >>> > >> > > > >>> > >> > > > >>> > >> > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay > >> > > wrote: > >> > > > >>> > >> > > > >>>> To clarify: > >> > > > >>>> > >> > > > >>>> you are using --download-openmpi=yes with petsc. However you > >> say: > >> > > > >>>> > >> > > > >>>> > > The mpif90 command which > >> > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI > >> > > > >>>> > >> > > > >>>> This suggests a different install of OpenMPI is used to build > >> your > >> > > code. > >> > > > >>>> > >> > > > >>>> One way to resolve this is - delete current build of PETSc - > >> and > >> > > > >>>> rebuild it with this same MPI [that you are using with your > >> > > application] > >> > > > >>>> > >> > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 > >> > > > >>>> --download-fblaslapack --download-metis --download-parmetis > >> > > --download-cmake > >> > > > >>>> > >> > > > >>>> Also PETSc provides makefile format that minimizes such > >> conflicts.. > >> > > > >>>> > >> > > > >>>> > >> > > > >>>> > >> > > > >> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > >> > > > >>>> > >> > > > >>>> Satish > >> > > > >>>> > >> > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > >> > > > >>>> > >> > > > >>>> > Are you using the same MPI to build both PETSc and your > >> > > appliation? 
> >> > > > >>>> > > >> > > > >>>> > Satish > >> > > > >>>> > > >> > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > >> > > > >>>> > > To whom it may concern, > >> > > > >>>> > > > >> > > > >>>> > > > >> > > > >>>> > > I am working on a Fortran (2003) computational fluid > >> dynamics > >> > > > >>>> solver, > >> > > > >>>> > > which is actually quite mature, was parallelized with MPI > >> from > >> > > the > >> > > > >>>> > > very beginning and it comes with its own suite of Krylov > >> > > solvers. > >> > > > >>>> > > Although the code is self-sustained, I am inclined to > >> believe > >> > > that > >> > > > >>>> it > >> > > > >>>> > > would be better to use PETSc instead of my own home-grown > >> > > solvers. > >> > > > >>>> > > > >> > > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 with > >> > > > >>>> following > >> > > > >>>> > > options: > >> > > > >>>> > > > >> > > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes > >> > > --download- > >> > > > >>>> > > fblaslapack=yes --download-metis=yes > >> --download-parmetis=yes -- > >> > > > >>>> > > download-cmake=yes > >> > > > >>>> > > > >> > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 > >> command > >> > > which > >> > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI, > >> hence > >> > > the > >> > > > >>>> > > option "--download-openmpi=yes" when configuring PETSc. > >> > > > >>>> > > > >> > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to link > >> and > >> > > run > >> > > > >>>> it > >> > > > >>>> > > with my code, but I am getting the following messages > >> during > >> > > > >>>> > > compilation: > >> > > > >>>> > > > >> > > > >>>> > > Petsc_Mod.f90:18:6: > >> > > > >>>> > > > >> > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > >> > > > >>>> > > | 1 > >> > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) > >> shall > >> > > be of > >> > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) > >> > > > >>>> > > > >> > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing PETSc. > >> All > >> > > works, > >> > > > >>>> > > but these messages give me a reason to worry. > >> > > > >>>> > > > >> > > > >>>> > > Can you tell what causes this warnings? I would guess they > >> > > might > >> > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't think > >> I even > >> > > > >>>> have > >> > > > >>>> > > MPICH on my system. > >> > > > >>>> > > > >> > > > >>>> > > Please let me know what you think about it? > >> > > > >>>> > > > >> > > > >>>> > > Cheers, > >> > > > >>>> > > > >> > > > >>>> > > Bojan > >> > > > >>>> > > > >> > > > >>>> > > > >> > > > >>>> > > > >> > > > >>>> > > > >> > > > >>>> > > >> > > > >>>> > > >> > > > >>>> > >> > > > >>> > >> > > > >> > >> > > > >> -- > >> > > > >> What most experimenters take for granted before they begin their > >> > > > >> experiments is infinitely more interesting than any results to > >> which > >> > > their > >> > > > >> experiments lead. 
> >> > > > >> -- Norbert Wiener > >> > > > >> > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > >> > > > >> > >> > > > >> > >> > > > > > >> > > > > >> > > > > >> > > > >> > > >> > > > From bojan.niceno.scientist at gmail.com Thu Feb 10 23:51:21 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Fri, 11 Feb 2022 06:51:21 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: <28149e2c-4389-1629-bdb4-90984aeb1ecb@mcs.anl.gov> References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> <28149e2c-4389-1629-bdb4-90984aeb1ecb@mcs.anl.gov> Message-ID: Thanks for the example. I do use -i8 -r8 (I know it is a bad practice and am planning to get rid of it in mid terms), and I did suspect that, but when I tried to compile without those options, the warning remained. When I switched from "include mpif.h" to: use mpi, or even: #include use petscmpi use petscsys I get the following type of messages: Error: There is no specific subroutine for the generic ?mpi_barrier? at (1) Comm_Mod/Parallel/Start.f90:11:22: Error: There is no specific subroutine for the generic ?mpi_init? at (1) Comm_Mod/Parallel/Start.f90:14:52: Error: There is no specific subroutine for the generic ?mpi_comm_size? at (1) Comm_Mod/Parallel/Start.f90:17:54: I was googling for a solution, but StackOverflow seems to be down for maintenance at the moment :-( I did manage to find that "include mpif.h" is obsolete, which I didn't know before :-) On Fri, Feb 11, 2022 at 5:29 AM Satish Balay wrote: > 1. you can call MPI_Init() before calling PetscInitialize() For > example - check src/sys/tutorials/ex4f90.F90 > > 2. Are you using -i8 -r8 type flags when compiling your code? That > might case issues when using mpif.h. Perhaps you can switch from > "include 'mpif.h'" to "use mpi" - in your module file - and see if > that helps. > > Satish > > On Fri, 11 Feb 2022, Bojan Niceno wrote: > > > Dear both, > > > > Allow me to update you on the issue. I tried to re-compile PETSc with > > different configuration options as Satish suggested, and went further on > by > > specifying exact location of OpenMPI libraries and include files to the > > ones installed by PETSc (for those configurations for which I used > > "--download-openmpi=1") and the original problem, the warning Named > COMMON > > block ?mpi_fortran_bottom? at (1) shall be of the same size as elsewhere > (4 > > vs 8 bytes), prevailed. > > > > In desperation, I completely removed OpenMPI from my workstation to make > > sure that only those which are downloaded with PETSc are used, yet the > > warning was still there. (That resolved the Invalid MIT-MAGIC-COOKIE-1 > > warning at least) > > > > Now I am wondering if the problem originates from the fact that I already > > have all the necessary MPI routines developed in Fortran? All calls, > > including the basic MPI_Init, MPI_Comm_Size and MPI_Comm_Rank, are done > > from Fortran. I actually have a module called Comm_Mod which does all > > MPI-related calls, and this module contains line include 'mpif.h'. That > > include statement does take the file from PETSc installation as no other > > MPI installation is left on my system, but still it somehow seems to be > the > > origin of the warning on common blocks I observe. 
Now I am wondering if > > the include 'mpif.h' from Fortran somehow collides with the option > include > > ${PETSC_DIR}/lib/petsc/conf/variables I put in my makefile in order to > > compile with PETSc. > > > > I am really not sure if it is possible to have main program and all MPI > > initialization done from Fortran (as I have now) and then plug PETSc on > top > > of it? Should that be possible? > > > > Kind regards, > > > > Bojan > > > > P.S. The sequential version works fine, I can compile without warning and > > can call PETSc solvers from Fortran without a glitch. > > > > On Thu, Feb 10, 2022 at 5:08 PM Bojan Niceno < > > bojan.niceno.scientist at gmail.com> wrote: > > > > > Dear Satish, > > > > > > Thanks for the advice. I will try in a few hours because it is almost > > > dinner time with me (I am in Europe) and I am supposed to go out with a > > > friend this evening. > > > > > > Will let you know. Thanks for help, I highly appreciate it. > > > > > > > > > Kind regards, > > > > > > Bojan > > > > > > > > > On Thu, Feb 10, 2022 at 5:06 PM Satish Balay > wrote: > > > > > >> Hm - this is strange. > > >> > > >> Do you have 'xauth' installed? > > >> > > >> I would make sure xauth is installed, delete ~/.Xauthority - and > reboot > > >> [or restart the X server] > > >> > > >> Yeah - it might not work - but perhaps worth a try.. > > >> > > >> Or perhaps its not X11 related.. > > >> > > >> I would also try 'strace' on an application that is producing this > > >> message - to see if I can narrow down further.. > > >> > > >> Do you get this message with both (runs)?: > > >> > > >> cd src/ksp/ksp/tutorials > > >> make ex2 > > >> mpiexec -n 1 ./ex2 > > >> ./ex2 > > >> > > >> Satish > > >> > > >> On Thu, 10 Feb 2022, Bojan Niceno wrote: > > >> > > >> > Dear both, > > >> > > > >> > I work on an ASUS ROG laptop and don't use any NFS. Everything is > on > > >> one > > >> > computer, one disk. That is why I couldn't resolve the Invalid > Magic > > >> > Cookie, because all the advice I've found about it concerns the > remote > > >> > access/display. It is not an issue for me. My laptop has an Nvidia > > >> > GeForce RTX graphical card, maybe Ubuntu drivers are simply not > able to > > >> > cope with it. I am out of ideas, really. > > >> > > > >> > > > >> > Cheers, > > >> > > > >> > Bojan > > >> > > > >> > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay > wrote: > > >> > > > >> > > Do the compute nodes and frontend share the same NFS? > > >> > > > > >> > > I would try the following [to see if they work): > > >> > > > > >> > > - delete ~/.Xauthority [first check with 'xauth list') > > >> > > - setup ssh to not use X - i.e add the following to ~/.ssh/config > > >> > > > > >> > > ForwardX11 no > > >> > > ForwardX11Trusted no > > >> > > > > >> > > [this can be tailored to apply only to your specific compute > nodes - > > >> if > > >> > > needed] > > >> > > > > >> > > Satish > > >> > > > > >> > > On Thu, 10 Feb 2022, Matthew Knepley wrote: > > >> > > > > >> > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < > > >> > > > bojan.niceno.scientist at gmail.com> wrote: > > >> > > > > > >> > > > > Thanks a lot, now I feel much better. > > >> > > > > > > >> > > > > By the way, I can't get around the invalid magic cookie. 
It > is > > >> > > occurring > > >> > > > > ever since I installed the OS (Ubuntu 20.04) so I eventually > gave > > >> up > > >> > > and > > >> > > > > decided to live with it :-D > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> > https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely > > >> > > > > > >> > > > Thanks, > > >> > > > > > >> > > > Matt > > >> > > > > > >> > > > > > >> > > > > Cheers, > > >> > > > > > > >> > > > > Bojan > > >> > > > > > > >> > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley < > > >> knepley at gmail.com> > > >> > > wrote: > > >> > > > > > > >> > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > > >> > > > >> bojan.niceno.scientist at gmail.com> wrote: > > >> > > > >> > > >> > > > >>> Dear Satish, > > >> > > > >>> > > >> > > > >>> Thanks for the answer. Your suggestion makes a lot of > sense, > > >> but > > >> > > this > > >> > > > >>> is what I get as a result of that: > > >> > > > >>> > > >> > > > >>> Running check examples to verify correct installation > > >> > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and > > >> > > > >>> PETSC_ARCH=arch-linux-c-debug > > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 1 > MPI > > >> > > process > > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, > prandtl # > > >> = 1., > > >> > > > >>> grashof # = 1. > > >> > > > >>> Number of SNES iterations = 2 > > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with 2 > MPI > > >> > > processes > > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, > prandtl # > > >> = 1., > > >> > > > >>> grashof # = 1. > > >> > > > >>> Number of SNES iterations = 2 > > >> > > > >>> Possible error running Fortran example > src/snes/tutorials/ex5f > > >> with 1 > > >> > > > >>> MPI process > > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = > 4 > > >> > > > >>> Completed test examples > > >> > > > >>> > > >> > > > >>> I am getting the "Possible error running Fortran example" > > >> warning > > >> > > with > > >> > > > >>> this. This somehow looks more severe to me. But I could be > > >> wrong. > > >> > > > >>> > > >> > > > >> > > >> > > > >> You are getting this message because your MPI implementation > is > > >> > > printing > > >> > > > >> > > >> > > > >> Invalid MIT-MAGIC-COOKIE-1 key > > >> > > > >> > > >> > > > >> It is still running fine, but this is an MPI configuration > issue. > > >> > > > >> > > >> > > > >> Thanks, > > >> > > > >> > > >> > > > >> Matt > > >> > > > >> > > >> > > > >> Any suggestions what to do? > > >> > > > >>> > > >> > > > >>> > > >> > > > >>> Kind regards, > > >> > > > >>> > > >> > > > >>> Bojan > > >> > > > >>> > > >> > > > >>> > > >> > > > >>> > > >> > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay < > balay at mcs.anl.gov> > > >> > > wrote: > > >> > > > >>> > > >> > > > >>>> To clarify: > > >> > > > >>>> > > >> > > > >>>> you are using --download-openmpi=yes with petsc. 
However > you > > >> say: > > >> > > > >>>> > > >> > > > >>>> > > The mpif90 command which > > >> > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI > > >> > > > >>>> > > >> > > > >>>> This suggests a different install of OpenMPI is used to > build > > >> your > > >> > > code. > > >> > > > >>>> > > >> > > > >>>> One way to resolve this is - delete current build of PETSc > - > > >> and > > >> > > > >>>> rebuild it with this same MPI [that you are using with your > > >> > > application] > > >> > > > >>>> > > >> > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx > --with-fc=mpif90 > > >> > > > >>>> --download-fblaslapack --download-metis --download-parmetis > > >> > > --download-cmake > > >> > > > >>>> > > >> > > > >>>> Also PETSc provides makefile format that minimizes such > > >> conflicts.. > > >> > > > >>>> > > >> > > > >>>> > > >> > > > >>>> > > >> > > > > >> > https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > > >> > > > >>>> > > >> > > > >>>> Satish > > >> > > > >>>> > > >> > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > > >> > > > >>>> > > >> > > > >>>> > Are you using the same MPI to build both PETSc and your > > >> > > appliation? > > >> > > > >>>> > > > >> > > > >>>> > Satish > > >> > > > >>>> > > > >> > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > > >> > > > >>>> > > To whom it may concern, > > >> > > > >>>> > > > > >> > > > >>>> > > > > >> > > > >>>> > > I am working on a Fortran (2003) computational fluid > > >> dynamics > > >> > > > >>>> solver, > > >> > > > >>>> > > which is actually quite mature, was parallelized with > MPI > > >> from > > >> > > the > > >> > > > >>>> > > very beginning and it comes with its own suite of > Krylov > > >> > > solvers. > > >> > > > >>>> > > Although the code is self-sustained, I am inclined to > > >> believe > > >> > > that > > >> > > > >>>> it > > >> > > > >>>> > > would be better to use PETSc instead of my own > home-grown > > >> > > solvers. > > >> > > > >>>> > > > > >> > > > >>>> > > In the attempt to do so, I have installed PETSc 3.16.4 > with > > >> > > > >>>> following > > >> > > > >>>> > > options: > > >> > > > >>>> > > > > >> > > > >>>> > > ./configure --with-debugging=yes --download-openmpi=yes > > >> > > --download- > > >> > > > >>>> > > fblaslapack=yes --download-metis=yes > > >> --download-parmetis=yes -- > > >> > > > >>>> > > download-cmake=yes > > >> > > > >>>> > > > > >> > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 > > >> command > > >> > > which > > >> > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI, > > >> hence > > >> > > the > > >> > > > >>>> > > option "--download-openmpi=yes" when configuring PETSc. > > >> > > > >>>> > > > > >> > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to > link > > >> and > > >> > > run > > >> > > > >>>> it > > >> > > > >>>> > > with my code, but I am getting the following messages > > >> during > > >> > > > >>>> > > compilation: > > >> > > > >>>> > > > > >> > > > >>>> > > Petsc_Mod.f90:18:6: > > >> > > > >>>> > > > > >> > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY > > >> > > > >>>> > > | 1 > > >> > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at (1) > > >> shall > > >> > > be of > > >> > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) > > >> > > > >>>> > > > > >> > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing > PETSc. 
> > >> All > > >> > > works, > > >> > > > >>>> > > but these messages give me a reason to worry. > > >> > > > >>>> > > > > >> > > > >>>> > > Can you tell what causes this warnings? I would guess > they > > >> > > might > > >> > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't > think > > >> I even > > >> > > > >>>> have > > >> > > > >>>> > > MPICH on my system. > > >> > > > >>>> > > > > >> > > > >>>> > > Please let me know what you think about it? > > >> > > > >>>> > > > > >> > > > >>>> > > Cheers, > > >> > > > >>>> > > > > >> > > > >>>> > > Bojan > > >> > > > >>>> > > > > >> > > > >>>> > > > > >> > > > >>>> > > > > >> > > > >>>> > > > > >> > > > >>>> > > > >> > > > >>>> > > > >> > > > >>>> > > >> > > > >>> > > >> > > > >> > > >> > > > >> -- > > >> > > > >> What most experimenters take for granted before they begin > their > > >> > > > >> experiments is infinitely more interesting than any results > to > > >> which > > >> > > their > > >> > > > >> experiments lead. > > >> > > > >> -- Norbert Wiener > > >> > > > >> > > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > > >> > > > >> > > >> > > > >> > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno.scientist at gmail.com Fri Feb 11 00:26:28 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Fri, 11 Feb 2022 07:26:28 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> <28149e2c-4389-1629-bdb4-90984aeb1ecb@mcs.anl.gov> Message-ID: What does seem to work is: use mpi_f08 and using: call Mpi_Init_08(error) call Mpi_Comm_Size_08(MPI_COMM_WORLD, n_proc, error)? call Mpi_Comm_Rank_08(MPI_COMM_WORLD, this_proc, error) That is, function calls with extension _08 On Fri, Feb 11, 2022 at 6:51 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Thanks for the example. > > I do use -i8 -r8 (I know it is a bad practice and am planning to get rid > of it in mid terms), and I did suspect that, but when I tried to compile > without those options, the warning remained. > > When I switched from "include mpif.h" to: > > use mpi, > > or even: > > #include > > use petscmpi > > use petscsys > > > I get the following type of messages: > > Error: There is no specific subroutine for the generic ?mpi_barrier? at (1) > Comm_Mod/Parallel/Start.f90:11:22: > > Error: There is no specific subroutine for the generic ?mpi_init? at (1) > Comm_Mod/Parallel/Start.f90:14:52: > > Error: There is no specific subroutine for the generic ?mpi_comm_size? at > (1) > Comm_Mod/Parallel/Start.f90:17:54: > > I was googling for a solution, but StackOverflow seems to be down for > maintenance at the moment :-( > > I did manage to find that "include mpif.h" is obsolete, which I didn't > know before :-) > > > > > > > > > > On Fri, Feb 11, 2022 at 5:29 AM Satish Balay wrote: > >> 1. you can call MPI_Init() before calling PetscInitialize() For >> example - check src/sys/tutorials/ex4f90.F90 >> >> 2. Are you using -i8 -r8 type flags when compiling your code? That >> might case issues when using mpif.h. Perhaps you can switch from >> "include 'mpif.h'" to "use mpi" - in your module file - and see if >> that helps. 
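Concretely, the switch suggested here would look roughly as follows inside the communication module (a hypothetical sketch, not the actual Comm_Mod/Parallel/Start.f90). One point worth noting: "use mpi" gives the compiler explicit interfaces for the MPI routines, so argument kinds are checked; if the application is compiled with flags that change the default integer size (such as -i8 / -fdefault-integer-8) while the MPI module itself was not, generic resolution fails and gfortran reports exactly the "There is no specific subroutine for the generic 'mpi_init'" style of error quoted above.

module Comm_Mod
  use mpi                  ! instead of: include 'mpif.h'
  implicit none
  integer :: n_proc, this_proc
contains
  subroutine Comm_Start()
    integer :: error       ! must match the integer kind the mpi module expects
    call MPI_Init(error)
    call MPI_Comm_Size(MPI_COMM_WORLD, n_proc,    error)
    call MPI_Comm_Rank(MPI_COMM_WORLD, this_proc, error)
  end subroutine
end module Comm_Mod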
>> >> Satish >> >> On Fri, 11 Feb 2022, Bojan Niceno wrote: >> >> > Dear both, >> > >> > Allow me to update you on the issue. I tried to re-compile PETSc with >> > different configuration options as Satish suggested, and went further >> on by >> > specifying exact location of OpenMPI libraries and include files to the >> > ones installed by PETSc (for those configurations for which I used >> > "--download-openmpi=1") and the original problem, the warning Named >> COMMON >> > block ?mpi_fortran_bottom? at (1) shall be of the same size as >> elsewhere (4 >> > vs 8 bytes), prevailed. >> > >> > In desperation, I completely removed OpenMPI from my workstation to make >> > sure that only those which are downloaded with PETSc are used, yet the >> > warning was still there. (That resolved the Invalid MIT-MAGIC-COOKIE-1 >> > warning at least) >> > >> > Now I am wondering if the problem originates from the fact that I >> already >> > have all the necessary MPI routines developed in Fortran? All calls, >> > including the basic MPI_Init, MPI_Comm_Size and MPI_Comm_Rank, are done >> > from Fortran. I actually have a module called Comm_Mod which does all >> > MPI-related calls, and this module contains line include 'mpif.h'. That >> > include statement does take the file from PETSc installation as no other >> > MPI installation is left on my system, but still it somehow seems to be >> the >> > origin of the warning on common blocks I observe. Now I am wondering if >> > the include 'mpif.h' from Fortran somehow collides with the option >> include >> > ${PETSC_DIR}/lib/petsc/conf/variables I put in my makefile in order to >> > compile with PETSc. >> > >> > I am really not sure if it is possible to have main program and all MPI >> > initialization done from Fortran (as I have now) and then plug PETSc on >> top >> > of it? Should that be possible? >> > >> > Kind regards, >> > >> > Bojan >> > >> > P.S. The sequential version works fine, I can compile without warning >> and >> > can call PETSc solvers from Fortran without a glitch. >> > >> > On Thu, Feb 10, 2022 at 5:08 PM Bojan Niceno < >> > bojan.niceno.scientist at gmail.com> wrote: >> > >> > > Dear Satish, >> > > >> > > Thanks for the advice. I will try in a few hours because it is almost >> > > dinner time with me (I am in Europe) and I am supposed to go out with >> a >> > > friend this evening. >> > > >> > > Will let you know. Thanks for help, I highly appreciate it. >> > > >> > > >> > > Kind regards, >> > > >> > > Bojan >> > > >> > > >> > > On Thu, Feb 10, 2022 at 5:06 PM Satish Balay >> wrote: >> > > >> > >> Hm - this is strange. >> > >> >> > >> Do you have 'xauth' installed? >> > >> >> > >> I would make sure xauth is installed, delete ~/.Xauthority - and >> reboot >> > >> [or restart the X server] >> > >> >> > >> Yeah - it might not work - but perhaps worth a try.. >> > >> >> > >> Or perhaps its not X11 related.. >> > >> >> > >> I would also try 'strace' on an application that is producing this >> > >> message - to see if I can narrow down further.. >> > >> >> > >> Do you get this message with both (runs)?: >> > >> >> > >> cd src/ksp/ksp/tutorials >> > >> make ex2 >> > >> mpiexec -n 1 ./ex2 >> > >> ./ex2 >> > >> >> > >> Satish >> > >> >> > >> On Thu, 10 Feb 2022, Bojan Niceno wrote: >> > >> >> > >> > Dear both, >> > >> > >> > >> > I work on an ASUS ROG laptop and don't use any NFS. Everything is >> on >> > >> one >> > >> > computer, one disk. 
That is why I couldn't resolve the Invalid >> Magic >> > >> > Cookie, because all the advice I've found about it concerns the >> remote >> > >> > access/display. It is not an issue for me. My laptop has an >> Nvidia >> > >> > GeForce RTX graphical card, maybe Ubuntu drivers are simply not >> able to >> > >> > cope with it. I am out of ideas, really. >> > >> > >> > >> > >> > >> > Cheers, >> > >> > >> > >> > Bojan >> > >> > >> > >> > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay >> wrote: >> > >> > >> > >> > > Do the compute nodes and frontend share the same NFS? >> > >> > > >> > >> > > I would try the following [to see if they work): >> > >> > > >> > >> > > - delete ~/.Xauthority [first check with 'xauth list') >> > >> > > - setup ssh to not use X - i.e add the following to ~/.ssh/config >> > >> > > >> > >> > > ForwardX11 no >> > >> > > ForwardX11Trusted no >> > >> > > >> > >> > > [this can be tailored to apply only to your specific compute >> nodes - >> > >> if >> > >> > > needed] >> > >> > > >> > >> > > Satish >> > >> > > >> > >> > > On Thu, 10 Feb 2022, Matthew Knepley wrote: >> > >> > > >> > >> > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < >> > >> > > > bojan.niceno.scientist at gmail.com> wrote: >> > >> > > > >> > >> > > > > Thanks a lot, now I feel much better. >> > >> > > > > >> > >> > > > > By the way, I can't get around the invalid magic cookie. It >> is >> > >> > > occurring >> > >> > > > > ever since I installed the OS (Ubuntu 20.04) so I eventually >> gave >> > >> up >> > >> > > and >> > >> > > > > decided to live with it :-D >> > >> > > > > >> > >> > > > >> > >> > > > >> > >> > > >> > >> >> https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely >> > >> > > > >> > >> > > > Thanks, >> > >> > > > >> > >> > > > Matt >> > >> > > > >> > >> > > > >> > >> > > > > Cheers, >> > >> > > > > >> > >> > > > > Bojan >> > >> > > > > >> > >> > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley < >> > >> knepley at gmail.com> >> > >> > > wrote: >> > >> > > > > >> > >> > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < >> > >> > > > >> bojan.niceno.scientist at gmail.com> wrote: >> > >> > > > >> >> > >> > > > >>> Dear Satish, >> > >> > > > >>> >> > >> > > > >>> Thanks for the answer. Your suggestion makes a lot of >> sense, >> > >> but >> > >> > > this >> > >> > > > >>> is what I get as a result of that: >> > >> > > > >>> >> > >> > > > >>> Running check examples to verify correct installation >> > >> > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and >> > >> > > > >>> PETSC_ARCH=arch-linux-c-debug >> > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with >> 1 MPI >> > >> > > process >> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, >> prandtl # >> > >> = 1., >> > >> > > > >>> grashof # = 1. >> > >> > > > >>> Number of SNES iterations = 2 >> > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with >> 2 MPI >> > >> > > processes >> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, >> prandtl # >> > >> = 1., >> > >> > > > >>> grashof # = 1. 
>> > >> > > > >>> Number of SNES iterations = 2 >> > >> > > > >>> Possible error running Fortran example >> src/snes/tutorials/ex5f >> > >> with 1 >> > >> > > > >>> MPI process >> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations = >> 4 >> > >> > > > >>> Completed test examples >> > >> > > > >>> >> > >> > > > >>> I am getting the "Possible error running Fortran example" >> > >> warning >> > >> > > with >> > >> > > > >>> this. This somehow looks more severe to me. But I could >> be >> > >> wrong. >> > >> > > > >>> >> > >> > > > >> >> > >> > > > >> You are getting this message because your MPI >> implementation is >> > >> > > printing >> > >> > > > >> >> > >> > > > >> Invalid MIT-MAGIC-COOKIE-1 key >> > >> > > > >> >> > >> > > > >> It is still running fine, but this is an MPI configuration >> issue. >> > >> > > > >> >> > >> > > > >> Thanks, >> > >> > > > >> >> > >> > > > >> Matt >> > >> > > > >> >> > >> > > > >> Any suggestions what to do? >> > >> > > > >>> >> > >> > > > >>> >> > >> > > > >>> Kind regards, >> > >> > > > >>> >> > >> > > > >>> Bojan >> > >> > > > >>> >> > >> > > > >>> >> > >> > > > >>> >> > >> > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay < >> balay at mcs.anl.gov> >> > >> > > wrote: >> > >> > > > >>> >> > >> > > > >>>> To clarify: >> > >> > > > >>>> >> > >> > > > >>>> you are using --download-openmpi=yes with petsc. However >> you >> > >> say: >> > >> > > > >>>> >> > >> > > > >>>> > > The mpif90 command which >> > >> > > > >>>> > > I use to compile the code, wraps gfortran with OpenMPI >> > >> > > > >>>> >> > >> > > > >>>> This suggests a different install of OpenMPI is used to >> build >> > >> your >> > >> > > code. >> > >> > > > >>>> >> > >> > > > >>>> One way to resolve this is - delete current build of >> PETSc - >> > >> and >> > >> > > > >>>> rebuild it with this same MPI [that you are using with >> your >> > >> > > application] >> > >> > > > >>>> >> > >> > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx >> --with-fc=mpif90 >> > >> > > > >>>> --download-fblaslapack --download-metis >> --download-parmetis >> > >> > > --download-cmake >> > >> > > > >>>> >> > >> > > > >>>> Also PETSc provides makefile format that minimizes such >> > >> conflicts.. >> > >> > > > >>>> >> > >> > > > >>>> >> > >> > > > >>>> >> > >> > > >> > >> >> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications >> > >> > > > >>>> >> > >> > > > >>>> Satish >> > >> > > > >>>> >> > >> > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: >> > >> > > > >>>> >> > >> > > > >>>> > Are you using the same MPI to build both PETSc and your >> > >> > > appliation? >> > >> > > > >>>> > >> > >> > > > >>>> > Satish >> > >> > > > >>>> > >> > >> > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: >> > >> > > > >>>> > > To whom it may concern, >> > >> > > > >>>> > > >> > >> > > > >>>> > > >> > >> > > > >>>> > > I am working on a Fortran (2003) computational fluid >> > >> dynamics >> > >> > > > >>>> solver, >> > >> > > > >>>> > > which is actually quite mature, was parallelized with >> MPI >> > >> from >> > >> > > the >> > >> > > > >>>> > > very beginning and it comes with its own suite of >> Krylov >> > >> > > solvers. 
>> > >> > > > >>>> > > Although the code is self-sustained, I am inclined to >> > >> believe >> > >> > > that >> > >> > > > >>>> it >> > >> > > > >>>> > > would be better to use PETSc instead of my own >> home-grown >> > >> > > solvers. >> > >> > > > >>>> > > >> > >> > > > >>>> > > In the attempt to do so, I have installed PETSc >> 3.16.4 with >> > >> > > > >>>> following >> > >> > > > >>>> > > options: >> > >> > > > >>>> > > >> > >> > > > >>>> > > ./configure --with-debugging=yes >> --download-openmpi=yes >> > >> > > --download- >> > >> > > > >>>> > > fblaslapack=yes --download-metis=yes >> > >> --download-parmetis=yes -- >> > >> > > > >>>> > > download-cmake=yes >> > >> > > > >>>> > > >> > >> > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The mpif90 >> > >> command >> > >> > > which >> > >> > > > >>>> > > I use to compile the code, wraps gfortran with >> OpenMPI, >> > >> hence >> > >> > > the >> > >> > > > >>>> > > option "--download-openmpi=yes" when configuring >> PETSc. >> > >> > > > >>>> > > >> > >> > > > >>>> > > Anyhow, installation of PETSc went fine, I managed to >> link >> > >> and >> > >> > > run >> > >> > > > >>>> it >> > >> > > > >>>> > > with my code, but I am getting the following messages >> > >> during >> > >> > > > >>>> > > compilation: >> > >> > > > >>>> > > >> > >> > > > >>>> > > Petsc_Mod.f90:18:6: >> > >> > > > >>>> > > >> > >> > > > >>>> > > 18 | use PetscMat, only: tMat, MAT_FINAL_ASSEMBLY >> > >> > > > >>>> > > | 1 >> > >> > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at >> (1) >> > >> shall >> > >> > > be of >> > >> > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) >> > >> > > > >>>> > > >> > >> > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing >> PETSc. >> > >> All >> > >> > > works, >> > >> > > > >>>> > > but these messages give me a reason to worry. >> > >> > > > >>>> > > >> > >> > > > >>>> > > Can you tell what causes this warnings? I would >> guess they >> > >> > > might >> > >> > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't >> think >> > >> I even >> > >> > > > >>>> have >> > >> > > > >>>> > > MPICH on my system. >> > >> > > > >>>> > > >> > >> > > > >>>> > > Please let me know what you think about it? >> > >> > > > >>>> > > >> > >> > > > >>>> > > Cheers, >> > >> > > > >>>> > > >> > >> > > > >>>> > > Bojan >> > >> > > > >>>> > > >> > >> > > > >>>> > > >> > >> > > > >>>> > > >> > >> > > > >>>> > > >> > >> > > > >>>> > >> > >> > > > >>>> > >> > >> > > > >>>> >> > >> > > > >>> >> > >> > > > >> >> > >> > > > >> -- >> > >> > > > >> What most experimenters take for granted before they begin >> their >> > >> > > > >> experiments is infinitely more interesting than any results >> to >> > >> which >> > >> > > their >> > >> > > > >> experiments lead. >> > >> > > > >> -- Norbert Wiener >> > >> > > > >> >> > >> > > > >> https://www.cse.buffalo.edu/~knepley/ >> > >> > > > >> >> > >> > > > >> >> > >> > > > > >> > >> > > > >> > >> > > > >> > >> > > >> > >> > >> > >> >> > > >> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sasyed at fnal.gov Fri Feb 11 10:17:13 2022 From: sasyed at fnal.gov (Sajid Ali Syed) Date: Fri, 11 Feb 2022 16:17:13 +0000 Subject: [petsc-users] GAMG crash during setup when using multiple GPUs In-Reply-To: References: Message-ID: Hi Mark, Thanks for the information. 
@Junchao: Given that there are known issues with GPU aware MPI, it might be best to wait until there is an updated version of cray-mpich (which hopefully contains the relevant fixes). Thank You, Sajid Ali (he/him) | Research Associate Scientific Computing Division Fermi National Accelerator Laboratory s-sajid-ali.github.io ________________________________ From: Mark Adams Sent: Thursday, February 10, 2022 8:47 PM To: Junchao Zhang Cc: Sajid Ali Syed ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] GAMG crash during setup when using multiple GPUs Perlmutter has problems with GPU aware MPI. This is being actively worked on at NERSc. Mark On Thu, Feb 10, 2022 at 9:22 PM Junchao Zhang > wrote: Hi, Sajid Ali, I have no clue. I have access to perlmutter. I am thinking how to debug that. If your app is open-sourced and easy to build, then I can build and debug it. Otherwise, suppose you build and install petsc (only with options needed by your app) to a shared directory, and I can access your executable (which uses RPATH for libraries), then maybe I can debug it (I only need to install my own petsc to the shared directory) --Junchao Zhang On Thu, Feb 10, 2022 at 6:04 PM Sajid Ali Syed > wrote: Hi Junchao, With "-use_gpu_aware_mpi 0" there is no error. I'm attaching the log for this case with this email. I also ran with gpu aware mpi to see if I could reproduce the error and got the error but from a different location. This logfile is also attached. This was using the newest cray-mpich on NERSC-perlmutter (8.1.12). Let me know if I can share further information to help with debugging this. Thank You, Sajid Ali (he/him) | Research Associate Scientific Computing Division Fermi National Accelerator Laboratory s-sajid-ali.github.io ________________________________ From: Junchao Zhang > Sent: Thursday, February 10, 2022 1:43 PM To: Sajid Ali Syed > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] GAMG crash during setup when using multiple GPUs Also, try "-use_gpu_aware_mpi 0" to see if there is a difference. --Junchao Zhang On Thu, Feb 10, 2022 at 1:40 PM Junchao Zhang > wrote: Did it fail without GPU at 64 MPI ranks? --Junchao Zhang On Thu, Feb 10, 2022 at 1:22 PM Sajid Ali Syed > wrote: Hi PETSc-developers, I?m seeing the following crash that occurs during the setup phase of the preconditioner when using multiple GPUs. The relevant error trace is shown below: (GTL DEBUG: 26) cuIpcOpenMemHandle: resource already mapped, CUDA_ERROR_ALREADY_MAPPED, line no 272 [24]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [24]PETSC ERROR: General MPI error [24]PETSC ERROR: MPI error 1 Invalid buffer pointer [24]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [24]PETSC ERROR: Petsc Development GIT revision: f351d5494b5462f62c419e00645ac2e477b88cae GIT Date: 2022-02-08 15:08:19 +0000 ... 
[24]PETSC ERROR: #1 PetscSFLinkWaitRequests_MPI() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfmpi.c:54 [24]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/../src/vec/is/sf/impls/basic/sfpack.h:274 [24]PETSC ERROR: #3 PetscSFBcastEnd_Basic() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/impls/basic/sfbasic.c:218 [24]PETSC ERROR: #4 PetscSFBcastEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/sf.c:1499 [24]PETSC ERROR: #5 VecScatterEnd_Internal() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:87 [24]PETSC ERROR: #6 VecScatterEnd() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/vec/is/sf/interface/vscat.c:1366 [24]PETSC ERROR: #7 MatMult_MPIAIJCUSPARSE() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/impls/aij/mpi/mpicusparse/mpiaijcusparse.cu:302 [24]PETSC ERROR: #8 MatMult() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/mat/interface/matrix.c:2438 [24]PETSC ERROR: #9 PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:730 [24]PETSC ERROR: #10 KSP_PCApplyBAorAB() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/include/petsc/private/kspimpl.h:421 [24]PETSC ERROR: #11 KSPGMRESCycle() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:162 [24]PETSC ERROR: #12 KSPSolve_GMRES() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/impls/gmres/gmres.c:247 [24]PETSC ERROR: #13 KSPSolve_Private() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:925 [24]PETSC ERROR: #14 KSPSolve() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:1103 [24]PETSC ERROR: #15 PCGAMGOptProlongator_AGG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/agg.c:1127 [24]PETSC ERROR: #16 PCSetUp_GAMG() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/impls/gamg/gamg.c:626 [24]PETSC ERROR: #17 PCSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/pc/interface/precon.c:1017 [24]PETSC ERROR: #18 KSPSetUp() at /tmp/sajid/spack-stage/spack-stage-petsc-main-mnj56kbexro3fipf6kheyttljzwss7fo/spack-src/src/ksp/ksp/interface/itfunc.c:417 [24]PETSC ERROR: #19 main() at poisson3d.c:69 [24]PETSC ERROR: PETSc Option Table entries: [24]PETSC ERROR: -dm_mat_type aijcusparse [24]PETSC ERROR: -dm_vec_type cuda [24]PETSC ERROR: -ksp_monitor [24]PETSC ERROR: -ksp_norm_type unpreconditioned [24]PETSC ERROR: -ksp_type cg [24]PETSC ERROR: -ksp_view [24]PETSC ERROR: -log_view [24]PETSC ERROR: -mg_levels_esteig_ksp_type cg [24]PETSC ERROR: -mg_levels_ksp_type chebyshev [24]PETSC ERROR: -mg_levels_pc_type jacobi [24]PETSC ERROR: -pc_gamg_agg_nsmooths 1 
[24]PETSC ERROR: -pc_gamg_square_graph 1 [24]PETSC ERROR: -pc_gamg_threshold 0.0 [24]PETSC ERROR: -pc_gamg_threshold_scale 0.0 [24]PETSC ERROR: -pc_gamg_type agg [24]PETSC ERROR: -pc_type gamg [24]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Attached with this email is the full error log and the submit script for a 8-node/64-GPU/64 MPI rank job. I?ll also note that the same program did not crash when using either 2 or 4 nodes (with 8 & 16 GPUs/MPI ranks respectively) and attach those logs as well if that helps. Could someone let me know what this error means and what can be done to prevent it? Thank You, Sajid Ali (he/him) | Research Associate Scientific Computing Division Fermi National Accelerator Laboratory s-sajid-ali.github.io ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 11 13:27:06 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 11 Feb 2022 14:27:06 -0500 Subject: [petsc-users] Gmsh 8-noded quadrilateral In-Reply-To: <87bkzeuvwq.fsf@jedbrown.org> References: <87bkzeuvwq.fsf@jedbrown.org> Message-ID: Jed is right about the numerics. However, this does not look hard. Here is my try at it: https://gitlab.com/petsc/petsc/-/merge_requests/4838 Please tell me if this works and I will make a test and merge. Thanks, Matt On Thu, Feb 10, 2022 at 6:47 PM Jed Brown wrote: > Susanne, do you want PetscFE to make the serendipity (8-node) finite > element space or do you just want to read these meshes? I.e., would it be > okay with you if the coordinates were placed in a Q_2 (9-node, biquadratic) > finite element space? > > This won't matter if you're traversing the dofs per edge manually, but > there are some efficiency benefits of using the Q_2 space (especially if > your code can use the tensor product, perhaps via a library like libCEED). > Note that Q_2 spaces have better stability properties. For example, the Q_2 > space is inf-sup stable with P_1 discontinuous pressure (gives third order > L^2 and second order H^1 convergence), but serendipity (8-node) is only > stable with piecewise constant pressure (gives second order L^2 and first > order H^1 convergence). > > Susanne Claus writes: > > > Dear Matthew, > > > > Thank you so much. > > I have a attached a small 8-noded quadrilateral mesh file (Version 4 > > ASCII) generated with gmsh 4.8.4. > > > > Best wishes, > > Susanne > > > > On 10.02.2022 16:23, Matthew Knepley wrote: > > > >> On Thu, Feb 10, 2022 at 10:12 AM Susanne Claus > > >> wrote: > >> > >>> Hello, > >>> > >>> I am using DMPlex for the mesh structure of a solid mechanics finite > >>> element code. I mainly use gmsh as input file format. When I try to > >>> read in 8-noded Quadrilaterals (Element type 16 in gmsh) DMPlex tells > >>> me that this element type is unknown. However a 9-noded Quadrilateral > >>> can be read without problem. On inspecting the plexgmsh.c source code > >>> I can see that 8-noded quadrilaterals are deactivated: > >>> > >>> #if 0 > >>> 146: {20, GMSH_TRI, 2, 3, 3, 9, NULL}, > >>> 147: {16, GMSH_QUA, 2, 2, 4, 8, NULL}, > >>> > >>> For our application these 8-noded quadrilateral are very important. > >>> > >>> Is there any reason why they have not been implemented/deactivated in > >>> the dmplex gmsh reader? > >> > >> No, we can handle them in the same way I think. Let me look at it. > >> Hopefully it is easy. > >> > >> Thanks, > >> > >> Matt > >> > >>> Thank you for all the great work you are doing. 
PETSc is amazing. > >>> > >>> Best wishes, > >>> Susanne Claus > >> > >> -- > >> > >> What most experimenters take for granted before they begin their > >> experiments is infinitely more interesting than any results to which > >> their experiments lead. > >> -- Norbert Wiener > >> > >> https://www.cse.buffalo.edu/~knepley/ [1] > > > > -- > > > > Susanne Claus > > Ing?nieur Chercheur > > Applied Mathematics and Scientific Computing Group > > DTIS > > > > ONERA - The French Aerospace Lab > > 6 Chemin de la Vauve aux Granges, 91120 Palaiseau > > > > Links: > > ------ > > [1] http://www.cse.buffalo.edu/~knepley/ > > $MeshFormat > > 4.1 0 8 > > $EndMeshFormat > > $PhysicalNames > > 2 > > 1 2 "Neumann" > > 2 1 "Domain" > > $EndPhysicalNames > > $Entities > > 4 4 1 0 > > 1 0 0 0 0 > > 2 1 0 0 0 > > 3 1 1 0 0 > > 4 0 1 0 0 > > 1 -9.999999994736442e-08 -1e-07 -1e-07 1.0000001 1e-07 1e-07 0 2 1 -2 > > 2 0.9999999000000001 -9.999999994736442e-08 -1e-07 1.0000001 1.0000001 > 1e-07 1 2 2 2 -3 > > 3 -9.999999994736442e-08 0.9999999000000001 -1e-07 1.0000001 1.0000001 > 1e-07 0 2 3 -4 > > 4 -1e-07 -9.999999994736442e-08 -1e-07 1e-07 1.0000001 1e-07 0 2 4 -1 > > 1 -9.999999994736442e-08 -9.999999994736442e-08 -1e-07 1.0000001 > 1.0000001 1e-07 1 1 4 1 2 3 4 > > $EndEntities > > $Nodes > > 9 21 1 46 > > 0 1 0 1 > > 1 > > 0 0 0 > > 0 2 0 1 > > 2 > > 1 0 0 > > 0 3 0 1 > > 3 > > 1 1 0 > > 0 4 0 1 > > 4 > > 0 1 0 > > 1 1 0 3 > > 5 > > 35 > > 36 > > 0.5 0 0 > > 0.25 0 0 > > 0.75 0 0 > > 1 2 0 3 > > 6 > > 37 > > 38 > > 1 0.5 0 > > 1 0.25 0 > > 1 0.75 0 > > 1 3 0 3 > > 7 > > 39 > > 40 > > 0.5 1 0 > > 0.75 1 0 > > 0.25 1 0 > > 1 4 0 3 > > 8 > > 41 > > 42 > > 0 0.5 0 > > 0 0.75 0 > > 0 0.25 0 > > 2 1 0 5 > > 9 > > 43 > > 44 > > 45 > > 46 > > 0.5 0.5 0 > > 0.75 0.5 0 > > 0.5 0.25 0 > > 0.25 0.5 0 > > 0.5 0.75 0 > > $EndNodes > > $Elements > > 2 6 197 206 > > 1 2 8 2 > > 197 2 6 37 > > 198 6 3 38 > > 2 1 16 4 > > 203 2 6 9 5 37 43 44 36 > > 204 1 5 9 8 35 44 45 42 > > 205 4 8 9 7 41 45 46 40 > > 206 3 7 9 6 39 46 43 38 > > $EndElements > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From susanne.claus at onera.fr Fri Feb 11 15:01:52 2022 From: susanne.claus at onera.fr (Susanne Claus) Date: Fri, 11 Feb 2022 22:01:52 +0100 Subject: [petsc-users] Gmsh 8-noded quadrilateral In-Reply-To: References: <87bkzeuvwq.fsf@jedbrown.org> Message-ID: Dear Matthew and Jed, Brilliant. Thank you so much! Your changes work like a charm Matthew (I tested your branch on the gmsh file I sent) and thank you so much for your advice Jed. The loss of one order of convergence for an inf-sup stable pressure discretization seems indeed a very high price to pay for the moderate increase in efficiency by elimination of the interior modes. You have given me food for thought and I will probably personally not use 8-node quadrilaterals. Nevertheless, for our code it will be important to support 8-node quadrilaterals as it is still an element widely used in solid mechanics simulations. LibCEED looks very interesting. Thank you so much again. Best wishes from Paris, Susanne On 11.02.2022 20:27, Matthew Knepley wrote: > Jed is right about the numerics. However, this does not look hard. 
Here > is my try at it: > > https://gitlab.com/petsc/petsc/-/merge_requests/4838 > > Please tell me if this works and I will make a test and merge. > > Thanks, > > Matt > > On Thu, Feb 10, 2022 at 6:47 PM Jed Brown wrote: > >> Susanne, do you want PetscFE to make the serendipity (8-node) finite >> element space or do you just want to read these meshes? I.e., would it >> be okay with you if the coordinates were placed in a Q_2 (9-node, >> biquadratic) finite element space? >> >> This won't matter if you're traversing the dofs per edge manually, but >> there are some efficiency benefits of using the Q_2 space (especially >> if your code can use the tensor product, perhaps via a library like >> libCEED). Note that Q_2 spaces have better stability properties. For >> example, the Q_2 space is inf-sup stable with P_1 discontinuous >> pressure (gives third order L^2 and second order H^1 convergence), but >> serendipity (8-node) is only stable with piecewise constant pressure >> (gives second order L^2 and first order H^1 convergence). >> >> Susanne Claus writes: >> >>> Dear Matthew, >>> >>> Thank you so much. >>> I have a attached a small 8-noded quadrilateral mesh file (Version 4 >>> ASCII) generated with gmsh 4.8.4. >>> >>> Best wishes, >>> Susanne >>> >>> On 10.02.2022 16:23, Matthew Knepley wrote: >>> >>>> On Thu, Feb 10, 2022 at 10:12 AM Susanne Claus >>>> >>>> wrote: >>>> >>>>> Hello, >>>>> >>>>> I am using DMPlex for the mesh structure of a solid mechanics >>>>> finite >>>>> element code. I mainly use gmsh as input file format. When I try to >>>>> read in 8-noded Quadrilaterals (Element type 16 in gmsh) DMPlex >>>>> tells >>>>> me that this element type is unknown. However a 9-noded >>>>> Quadrilateral >>>>> can be read without problem. On inspecting the plexgmsh.c source >>>>> code >>>>> I can see that 8-noded quadrilaterals are deactivated: >>>>> >>>>> #if 0 >>>>> 146: {20, GMSH_TRI, 2, 3, 3, 9, NULL}, >>>>> 147: {16, GMSH_QUA, 2, 2, 4, 8, NULL}, >>>>> >>>>> For our application these 8-noded quadrilateral are very important. >>>>> >>>>> Is there any reason why they have not been implemented/deactivated >>>>> in >>>>> the dmplex gmsh reader? >>>> >>>> No, we can handle them in the same way I think. Let me look at it. >>>> Hopefully it is easy. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> Thank you for all the great work you are doing. PETSc is amazing. >>>>> >>>>> Best wishes, >>>>> Susanne Claus >>>> >>>> -- >>>> >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which >>>> their experiments lead. 
>>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ [1] >>> >>> -- >>> >>> Susanne Claus >>> Ing?nieur Chercheur >>> Applied Mathematics and Scientific Computing Group >>> DTIS >>> >>> ONERA - The French Aerospace Lab >>> 6 Chemin de la Vauve aux Granges, 91120 Palaiseau >>> >>> Links: >>> ------ >>> [1] http://www.cse.buffalo.edu/~knepley/ >>> $MeshFormat >>> 4.1 0 8 >>> $EndMeshFormat >>> $PhysicalNames >>> 2 >>> 1 2 "Neumann" >>> 2 1 "Domain" >>> $EndPhysicalNames >>> $Entities >>> 4 4 1 0 >>> 1 0 0 0 0 >>> 2 1 0 0 0 >>> 3 1 1 0 0 >>> 4 0 1 0 0 >>> 1 -9.999999994736442e-08 -1e-07 -1e-07 1.0000001 1e-07 1e-07 0 2 1 -2 >>> 2 0.9999999000000001 -9.999999994736442e-08 -1e-07 1.0000001 >>> 1.0000001 1e-07 1 2 2 2 -3 >>> 3 -9.999999994736442e-08 0.9999999000000001 -1e-07 1.0000001 >>> 1.0000001 1e-07 0 2 3 -4 >>> 4 -1e-07 -9.999999994736442e-08 -1e-07 1e-07 1.0000001 1e-07 0 2 4 -1 >>> 1 -9.999999994736442e-08 -9.999999994736442e-08 -1e-07 1.0000001 >>> 1.0000001 1e-07 1 1 4 1 2 3 4 >>> $EndEntities >>> $Nodes >>> 9 21 1 46 >>> 0 1 0 1 >>> 1 >>> 0 0 0 >>> 0 2 0 1 >>> 2 >>> 1 0 0 >>> 0 3 0 1 >>> 3 >>> 1 1 0 >>> 0 4 0 1 >>> 4 >>> 0 1 0 >>> 1 1 0 3 >>> 5 >>> 35 >>> 36 >>> 0.5 0 0 >>> 0.25 0 0 >>> 0.75 0 0 >>> 1 2 0 3 >>> 6 >>> 37 >>> 38 >>> 1 0.5 0 >>> 1 0.25 0 >>> 1 0.75 0 >>> 1 3 0 3 >>> 7 >>> 39 >>> 40 >>> 0.5 1 0 >>> 0.75 1 0 >>> 0.25 1 0 >>> 1 4 0 3 >>> 8 >>> 41 >>> 42 >>> 0 0.5 0 >>> 0 0.75 0 >>> 0 0.25 0 >>> 2 1 0 5 >>> 9 >>> 43 >>> 44 >>> 45 >>> 46 >>> 0.5 0.5 0 >>> 0.75 0.5 0 >>> 0.5 0.25 0 >>> 0.25 0.5 0 >>> 0.5 0.75 0 >>> $EndNodes >>> $Elements >>> 2 6 197 206 >>> 1 2 8 2 >>> 197 2 6 37 >>> 198 6 3 38 >>> 2 1 16 4 >>> 203 2 6 9 5 37 43 44 36 >>> 204 1 5 9 8 35 44 45 42 >>> 205 4 8 9 7 41 45 46 40 >>> 206 3 7 9 6 39 46 43 38 >>> $EndElements > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ [1] -- Susanne Claus Ing?nieur Chercheur Applied Mathematics and Scientific Computing Group DTIS ONERA - The French Aerospace Lab 6 Chemin de la Vauve aux Granges, 91120 Palaiseau Links: ------ [1] http://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 3bb899cc.png Type: image/png Size: 4266 bytes Desc: not available URL: From jed at jedbrown.org Fri Feb 11 15:59:48 2022 From: jed at jedbrown.org (Jed Brown) Date: Fri, 11 Feb 2022 14:59:48 -0700 Subject: [petsc-users] Gmsh 8-noded quadrilateral In-Reply-To: References: <87bkzeuvwq.fsf@jedbrown.org> Message-ID: <87a6ext67f.fsf@jedbrown.org> Sounds good. Note that if you use direct solvers, that extra node is basically free because the vertex separators are unchanged. It's a marginal cost in the storage of assembled matrices and the length of state vectors. And in 3D, even less significant. Susanne Claus writes: > Dear Matthew and Jed, > > Brilliant. Thank you so much! > > Your changes work like a charm Matthew (I tested your branch on the gmsh > file I sent) and thank you so much for your advice Jed. The loss of one > order of convergence for an inf-sup stable pressure discretization seems > indeed a very high price to pay for the moderate increase in efficiency > by elimination of the interior modes. You have given me food for thought > and I will probably personally not use 8-node quadrilaterals. 
> Nevertheless, for our code it will be important to support 8-node > quadrilaterals as it is still an element widely used in solid mechanics > simulations. LibCEED looks very interesting. > > Thank you so much again. > > Best wishes from Paris, > Susanne > > On 11.02.2022 20:27, Matthew Knepley wrote: > >> Jed is right about the numerics. However, this does not look hard. Here >> is my try at it: >> >> https://gitlab.com/petsc/petsc/-/merge_requests/4838 >> >> Please tell me if this works and I will make a test and merge. >> >> Thanks, >> >> Matt >> >> On Thu, Feb 10, 2022 at 6:47 PM Jed Brown wrote: >> >>> Susanne, do you want PetscFE to make the serendipity (8-node) finite >>> element space or do you just want to read these meshes? I.e., would it >>> be okay with you if the coordinates were placed in a Q_2 (9-node, >>> biquadratic) finite element space? >>> >>> This won't matter if you're traversing the dofs per edge manually, but >>> there are some efficiency benefits of using the Q_2 space (especially >>> if your code can use the tensor product, perhaps via a library like >>> libCEED). Note that Q_2 spaces have better stability properties. For >>> example, the Q_2 space is inf-sup stable with P_1 discontinuous >>> pressure (gives third order L^2 and second order H^1 convergence), but >>> serendipity (8-node) is only stable with piecewise constant pressure >>> (gives second order L^2 and first order H^1 convergence). >>> >>> Susanne Claus writes: >>> >>>> Dear Matthew, >>>> >>>> Thank you so much. >>>> I have a attached a small 8-noded quadrilateral mesh file (Version 4 >>>> ASCII) generated with gmsh 4.8.4. >>>> >>>> Best wishes, >>>> Susanne >>>> >>>> On 10.02.2022 16:23, Matthew Knepley wrote: >>>> >>>>> On Thu, Feb 10, 2022 at 10:12 AM Susanne Claus >>>>> >>>>> wrote: >>>>> >>>>>> Hello, >>>>>> >>>>>> I am using DMPlex for the mesh structure of a solid mechanics >>>>>> finite >>>>>> element code. I mainly use gmsh as input file format. When I try to >>>>>> read in 8-noded Quadrilaterals (Element type 16 in gmsh) DMPlex >>>>>> tells >>>>>> me that this element type is unknown. However a 9-noded >>>>>> Quadrilateral >>>>>> can be read without problem. On inspecting the plexgmsh.c source >>>>>> code >>>>>> I can see that 8-noded quadrilaterals are deactivated: >>>>>> >>>>>> #if 0 >>>>>> 146: {20, GMSH_TRI, 2, 3, 3, 9, NULL}, >>>>>> 147: {16, GMSH_QUA, 2, 2, 4, 8, NULL}, >>>>>> >>>>>> For our application these 8-noded quadrilateral are very important. >>>>>> >>>>>> Is there any reason why they have not been implemented/deactivated >>>>>> in >>>>>> the dmplex gmsh reader? >>>>> >>>>> No, we can handle them in the same way I think. Let me look at it. >>>>> Hopefully it is easy. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> Thank you for all the great work you are doing. PETSc is amazing. >>>>>> >>>>>> Best wishes, >>>>>> Susanne Claus >>>>> >>>>> -- >>>>> >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which >>>>> their experiments lead. 
>>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ [1] >>>> >>>> -- >>>> >>>> Susanne Claus >>>> Ing?nieur Chercheur >>>> Applied Mathematics and Scientific Computing Group >>>> DTIS >>>> >>>> ONERA - The French Aerospace Lab >>>> 6 Chemin de la Vauve aux Granges, 91120 Palaiseau >>>> >>>> Links: >>>> ------ >>>> [1] http://www.cse.buffalo.edu/~knepley/ >>>> $MeshFormat >>>> 4.1 0 8 >>>> $EndMeshFormat >>>> $PhysicalNames >>>> 2 >>>> 1 2 "Neumann" >>>> 2 1 "Domain" >>>> $EndPhysicalNames >>>> $Entities >>>> 4 4 1 0 >>>> 1 0 0 0 0 >>>> 2 1 0 0 0 >>>> 3 1 1 0 0 >>>> 4 0 1 0 0 >>>> 1 -9.999999994736442e-08 -1e-07 -1e-07 1.0000001 1e-07 1e-07 0 2 1 -2 >>>> 2 0.9999999000000001 -9.999999994736442e-08 -1e-07 1.0000001 >>>> 1.0000001 1e-07 1 2 2 2 -3 >>>> 3 -9.999999994736442e-08 0.9999999000000001 -1e-07 1.0000001 >>>> 1.0000001 1e-07 0 2 3 -4 >>>> 4 -1e-07 -9.999999994736442e-08 -1e-07 1e-07 1.0000001 1e-07 0 2 4 -1 >>>> 1 -9.999999994736442e-08 -9.999999994736442e-08 -1e-07 1.0000001 >>>> 1.0000001 1e-07 1 1 4 1 2 3 4 >>>> $EndEntities >>>> $Nodes >>>> 9 21 1 46 >>>> 0 1 0 1 >>>> 1 >>>> 0 0 0 >>>> 0 2 0 1 >>>> 2 >>>> 1 0 0 >>>> 0 3 0 1 >>>> 3 >>>> 1 1 0 >>>> 0 4 0 1 >>>> 4 >>>> 0 1 0 >>>> 1 1 0 3 >>>> 5 >>>> 35 >>>> 36 >>>> 0.5 0 0 >>>> 0.25 0 0 >>>> 0.75 0 0 >>>> 1 2 0 3 >>>> 6 >>>> 37 >>>> 38 >>>> 1 0.5 0 >>>> 1 0.25 0 >>>> 1 0.75 0 >>>> 1 3 0 3 >>>> 7 >>>> 39 >>>> 40 >>>> 0.5 1 0 >>>> 0.75 1 0 >>>> 0.25 1 0 >>>> 1 4 0 3 >>>> 8 >>>> 41 >>>> 42 >>>> 0 0.5 0 >>>> 0 0.75 0 >>>> 0 0.25 0 >>>> 2 1 0 5 >>>> 9 >>>> 43 >>>> 44 >>>> 45 >>>> 46 >>>> 0.5 0.5 0 >>>> 0.75 0.5 0 >>>> 0.5 0.25 0 >>>> 0.25 0.5 0 >>>> 0.5 0.75 0 >>>> $EndNodes >>>> $Elements >>>> 2 6 197 206 >>>> 1 2 8 2 >>>> 197 2 6 37 >>>> 198 6 3 38 >>>> 2 1 16 4 >>>> 203 2 6 9 5 37 43 44 36 >>>> 204 1 5 9 8 35 44 45 42 >>>> 205 4 8 9 7 41 45 46 40 >>>> 206 3 7 9 6 39 46 43 38 >>>> $EndElements >> >> -- >> >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ [1] > > -- > > Susanne Claus > Ing?nieur Chercheur > Applied Mathematics and Scientific Computing Group > DTIS > > ONERA - The French Aerospace Lab > 6 Chemin de la Vauve aux Granges, 91120 Palaiseau > > Links: > ------ > [1] http://www.cse.buffalo.edu/~knepley/ From samar.khatiwala at earth.ox.ac.uk Sat Feb 12 07:21:31 2022 From: samar.khatiwala at earth.ox.ac.uk (Samar Khatiwala) Date: Sat, 12 Feb 2022 13:21:31 +0000 Subject: [petsc-users] Creating multiple Vecs with petsc4py Message-ID: <0006BEAC-42C5-45B2-BC94-4DD9731D80E7@earth.ox.ac.uk> Hello, I?d like to create an array of Vecs in petsc4py by calling VecDuplicateVecs but I can?t find the corresponding method (I?ve tried various iterations such as q = x.duplicateVecs(4), etc). Is this not implemented in petsc4py? One workaround I?ve come up with is something like: q={} for i in range(0, 3): q[i]=x.duplicate() Is there another/better way? And how do I then use PETSc functions that operate on Vecs (e.g., VecMAXPY)? Would I just call VecAXPY in a loop as above? Ultimately, what I really want to do is wrap my own C functions with Cython that take an array of Vecs as an argument and then operate on them. (The function needs the entire array of Vecs to do its thing so I can?t loop over the elements of the array.) For instance, I want to pass the q above to myCfunc(Vec *q, Vec *qout). How do I go about doing that? Thanks very much! 
Best, Samar From knepley at gmail.com Sat Feb 12 07:30:09 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 12 Feb 2022 08:30:09 -0500 Subject: [petsc-users] Creating multiple Vecs with petsc4py In-Reply-To: <0006BEAC-42C5-45B2-BC94-4DD9731D80E7@earth.ox.ac.uk> References: <0006BEAC-42C5-45B2-BC94-4DD9731D80E7@earth.ox.ac.uk> Message-ID: On Sat, Feb 12, 2022 at 8:21 AM Samar Khatiwala < samar.khatiwala at earth.ox.ac.uk> wrote: > Hello, > > I?d like to create an array of Vecs in petsc4py by calling > VecDuplicateVecs but I can?t find the corresponding method (I?ve tried > various iterations such as q = x.duplicateVecs(4), etc). > Is this not implemented in petsc4py? One workaround I?ve come up with is > something like: > > q={} > for i in range(0, 3): > q[i]=x.duplicate() > > Is there another/better way? And how do I then use PETSc functions that > operate on Vecs (e.g., VecMAXPY)? Would I just call VecAXPY in a loop as > above? > I don't think so, but maybe Lisandro has a suggestion. You can do this in one line q = [x.duplicate() for i in range(0, 3)] > Ultimately, what I really want to do is wrap my own C functions with > Cython that take an array of Vecs as an argument and then operate on them. > (The function needs the entire array > of Vecs to do its thing so I can?t loop over the elements of the array.) > For instance, I want to pass the q above to myCfunc(Vec *q, Vec *qout). How > do I go about doing that? > I think you can do the same thing as VecMAXPY, def maxpy(self, alphas, vecs): cdef PetscInt n = 0 cdef PetscScalar *a = NULL cdef PetscVec *v = NULL cdef object tmp1 = iarray_s(alphas, &n, &a) cdef object tmp2 = oarray_p(empty_p(n),NULL, &v) assert n == len(vecs) cdef Py_ssize_t i=0 for i from 0 <= i < n: v[i] = ((vecs[i])).vec CHKERR( VecMAXPY(self.vec, n, a, v) ) Thanks, Matt Thanks very much! > > Best, > > Samar > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sat Feb 12 07:33:06 2022 From: jed at jedbrown.org (Jed Brown) Date: Sat, 12 Feb 2022 06:33:06 -0700 Subject: [petsc-users] Creating multiple Vecs with petsc4py In-Reply-To: <0006BEAC-42C5-45B2-BC94-4DD9731D80E7@earth.ox.ac.uk> References: <0006BEAC-42C5-45B2-BC94-4DD9731D80E7@earth.ox.ac.uk> Message-ID: <87a6ew43cd.fsf@jedbrown.org> VecDuplicateVecs isn't implemented in petsc4py, but it internally just loops over VecDuplicate so you can use qs = [x.duplicate() for i in range(4)] y.maxpy(alphas, qs) where the Python binding here handles qs being a Python array. Samar Khatiwala writes: > Hello, > > I?d like to create an array of Vecs in petsc4py by calling VecDuplicateVecs but I can?t find the corresponding method (I?ve tried various iterations such as q = x.duplicateVecs(4), etc). > Is this not implemented in petsc4py? One workaround I?ve come up with is something like: > > q={} > for i in range(0, 3): > q[i]=x.duplicate() > > Is there another/better way? And how do I then use PETSc functions that operate on Vecs (e.g., VecMAXPY)? Would I just call VecAXPY in a loop as above? > > Ultimately, what I really want to do is wrap my own C functions with Cython that take an array of Vecs as an argument and then operate on them. (The function needs the entire array > of Vecs to do its thing so I can?t loop over the elements of the array.) 
For instance, I want to pass the q above to myCfunc(Vec *q, Vec *qout). How do I go about doing that? > > Thanks very much! > > Best, > > Samar From samar.khatiwala at earth.ox.ac.uk Sat Feb 12 07:35:45 2022 From: samar.khatiwala at earth.ox.ac.uk (Samar Khatiwala) Date: Sat, 12 Feb 2022 13:35:45 +0000 Subject: [petsc-users] Creating multiple Vecs with petsc4py In-Reply-To: <87a6ew43cd.fsf@jedbrown.org> References: <0006BEAC-42C5-45B2-BC94-4DD9731D80E7@earth.ox.ac.uk> <87a6ew43cd.fsf@jedbrown.org> Message-ID: Thanks Matt and Jed for the quick replies! That answers my immediate questions. Best, Samar > On Feb 12, 2022, at 1:33 PM, Jed Brown wrote: > > VecDuplicateVecs isn't implemented in petsc4py, but it internally just loops over VecDuplicate so you can use > > qs = [x.duplicate() for i in range(4)] > > y.maxpy(alphas, qs) > > > where the Python binding here handles qs being a Python array. > > Samar Khatiwala writes: > >> Hello, >> >> I?d like to create an array of Vecs in petsc4py by calling VecDuplicateVecs but I can?t find the corresponding method (I?ve tried various iterations such as q = x.duplicateVecs(4), etc). >> Is this not implemented in petsc4py? One workaround I?ve come up with is something like: >> >> q={} >> for i in range(0, 3): >> q[i]=x.duplicate() >> >> Is there another/better way? And how do I then use PETSc functions that operate on Vecs (e.g., VecMAXPY)? Would I just call VecAXPY in a loop as above? >> >> Ultimately, what I really want to do is wrap my own C functions with Cython that take an array of Vecs as an argument and then operate on them. (The function needs the entire array >> of Vecs to do its thing so I can?t loop over the elements of the array.) For instance, I want to pass the q above to myCfunc(Vec *q, Vec *qout). How do I go about doing that? >> >> Thanks very much! >> >> Best, >> >> Samar From bojan.niceno.scientist at gmail.com Mon Feb 14 01:19:46 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Mon, 14 Feb 2022 08:19:46 +0100 Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> <28149e2c-4389-1629-bdb4-90984aeb1ecb@mcs.anl.gov> Message-ID: Dear both, It was the compiler options for integer lengths after all, thanks, Now I corrected it all in my code, all integers have explicitly defined lengths, and I am using the MPI_F08 module instead of obsolete mpi.f. It was a daunting task (> 800 files, > 64000 lines of code), but I am happy with the outcome. Now I can continue with PETSc :-) Cheers Bojan On Fri, Feb 11, 2022 at 7:26 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > What does seem to work is: > > use mpi_f08 > > and using: > > call Mpi_Init_08(error) > > call Mpi_Comm_Size_08(MPI_COMM_WORLD, n_proc, error)? > > call Mpi_Comm_Rank_08(MPI_COMM_WORLD, this_proc, error) > > > That is, function calls with extension _08 > > > > > > On Fri, Feb 11, 2022 at 6:51 AM Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > >> Thanks for the example. >> >> I do use -i8 -r8 (I know it is a bad practice and am planning to get rid >> of it in mid terms), and I did suspect that, but when I tried to compile >> without those options, the warning remained. 
>> >> When I switched from "include mpif.h" to: >> >> use mpi, >> >> or even: >> >> #include >> >> use petscmpi >> >> use petscsys >> >> >> I get the following type of messages: >> >> Error: There is no specific subroutine for the generic ?mpi_barrier? at >> (1) >> Comm_Mod/Parallel/Start.f90:11:22: >> >> Error: There is no specific subroutine for the generic ?mpi_init? at (1) >> Comm_Mod/Parallel/Start.f90:14:52: >> >> Error: There is no specific subroutine for the generic ?mpi_comm_size? at >> (1) >> Comm_Mod/Parallel/Start.f90:17:54: >> >> I was googling for a solution, but StackOverflow seems to be down for >> maintenance at the moment :-( >> >> I did manage to find that "include mpif.h" is obsolete, which I didn't >> know before :-) >> >> >> >> >> >> >> >> >> >> On Fri, Feb 11, 2022 at 5:29 AM Satish Balay wrote: >> >>> 1. you can call MPI_Init() before calling PetscInitialize() For >>> example - check src/sys/tutorials/ex4f90.F90 >>> >>> 2. Are you using -i8 -r8 type flags when compiling your code? That >>> might case issues when using mpif.h. Perhaps you can switch from >>> "include 'mpif.h'" to "use mpi" - in your module file - and see if >>> that helps. >>> >>> Satish >>> >>> On Fri, 11 Feb 2022, Bojan Niceno wrote: >>> >>> > Dear both, >>> > >>> > Allow me to update you on the issue. I tried to re-compile PETSc with >>> > different configuration options as Satish suggested, and went further >>> on by >>> > specifying exact location of OpenMPI libraries and include files to the >>> > ones installed by PETSc (for those configurations for which I used >>> > "--download-openmpi=1") and the original problem, the warning Named >>> COMMON >>> > block ?mpi_fortran_bottom? at (1) shall be of the same size as >>> elsewhere (4 >>> > vs 8 bytes), prevailed. >>> > >>> > In desperation, I completely removed OpenMPI from my workstation to >>> make >>> > sure that only those which are downloaded with PETSc are used, yet the >>> > warning was still there. (That resolved the Invalid MIT-MAGIC-COOKIE-1 >>> > warning at least) >>> > >>> > Now I am wondering if the problem originates from the fact that I >>> already >>> > have all the necessary MPI routines developed in Fortran? All calls, >>> > including the basic MPI_Init, MPI_Comm_Size and MPI_Comm_Rank, are done >>> > from Fortran. I actually have a module called Comm_Mod which does all >>> > MPI-related calls, and this module contains line include 'mpif.h'. >>> That >>> > include statement does take the file from PETSc installation as no >>> other >>> > MPI installation is left on my system, but still it somehow seems to >>> be the >>> > origin of the warning on common blocks I observe. Now I am wondering >>> if >>> > the include 'mpif.h' from Fortran somehow collides with the option >>> include >>> > ${PETSC_DIR}/lib/petsc/conf/variables I put in my makefile in order to >>> > compile with PETSc. >>> > >>> > I am really not sure if it is possible to have main program and all MPI >>> > initialization done from Fortran (as I have now) and then plug PETSc >>> on top >>> > of it? Should that be possible? >>> > >>> > Kind regards, >>> > >>> > Bojan >>> > >>> > P.S. The sequential version works fine, I can compile without warning >>> and >>> > can call PETSc solvers from Fortran without a glitch. >>> > >>> > On Thu, Feb 10, 2022 at 5:08 PM Bojan Niceno < >>> > bojan.niceno.scientist at gmail.com> wrote: >>> > >>> > > Dear Satish, >>> > > >>> > > Thanks for the advice. 
I will try in a few hours because it is >>> almost >>> > > dinner time with me (I am in Europe) and I am supposed to go out >>> with a >>> > > friend this evening. >>> > > >>> > > Will let you know. Thanks for help, I highly appreciate it. >>> > > >>> > > >>> > > Kind regards, >>> > > >>> > > Bojan >>> > > >>> > > >>> > > On Thu, Feb 10, 2022 at 5:06 PM Satish Balay >>> wrote: >>> > > >>> > >> Hm - this is strange. >>> > >> >>> > >> Do you have 'xauth' installed? >>> > >> >>> > >> I would make sure xauth is installed, delete ~/.Xauthority - and >>> reboot >>> > >> [or restart the X server] >>> > >> >>> > >> Yeah - it might not work - but perhaps worth a try.. >>> > >> >>> > >> Or perhaps its not X11 related.. >>> > >> >>> > >> I would also try 'strace' on an application that is producing this >>> > >> message - to see if I can narrow down further.. >>> > >> >>> > >> Do you get this message with both (runs)?: >>> > >> >>> > >> cd src/ksp/ksp/tutorials >>> > >> make ex2 >>> > >> mpiexec -n 1 ./ex2 >>> > >> ./ex2 >>> > >> >>> > >> Satish >>> > >> >>> > >> On Thu, 10 Feb 2022, Bojan Niceno wrote: >>> > >> >>> > >> > Dear both, >>> > >> > >>> > >> > I work on an ASUS ROG laptop and don't use any NFS. Everything >>> is on >>> > >> one >>> > >> > computer, one disk. That is why I couldn't resolve the Invalid >>> Magic >>> > >> > Cookie, because all the advice I've found about it concerns the >>> remote >>> > >> > access/display. It is not an issue for me. My laptop has an >>> Nvidia >>> > >> > GeForce RTX graphical card, maybe Ubuntu drivers are simply not >>> able to >>> > >> > cope with it. I am out of ideas, really. >>> > >> > >>> > >> > >>> > >> > Cheers, >>> > >> > >>> > >> > Bojan >>> > >> > >>> > >> > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay >>> wrote: >>> > >> > >>> > >> > > Do the compute nodes and frontend share the same NFS? >>> > >> > > >>> > >> > > I would try the following [to see if they work): >>> > >> > > >>> > >> > > - delete ~/.Xauthority [first check with 'xauth list') >>> > >> > > - setup ssh to not use X - i.e add the following to >>> ~/.ssh/config >>> > >> > > >>> > >> > > ForwardX11 no >>> > >> > > ForwardX11Trusted no >>> > >> > > >>> > >> > > [this can be tailored to apply only to your specific compute >>> nodes - >>> > >> if >>> > >> > > needed] >>> > >> > > >>> > >> > > Satish >>> > >> > > >>> > >> > > On Thu, 10 Feb 2022, Matthew Knepley wrote: >>> > >> > > >>> > >> > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < >>> > >> > > > bojan.niceno.scientist at gmail.com> wrote: >>> > >> > > > >>> > >> > > > > Thanks a lot, now I feel much better. >>> > >> > > > > >>> > >> > > > > By the way, I can't get around the invalid magic cookie. 
>>> It is >>> > >> > > occurring >>> > >> > > > > ever since I installed the OS (Ubuntu 20.04) so I >>> eventually gave >>> > >> up >>> > >> > > and >>> > >> > > > > decided to live with it :-D >>> > >> > > > > >>> > >> > > > >>> > >> > > > >>> > >> > > >>> > >> >>> https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely >>> > >> > > > >>> > >> > > > Thanks, >>> > >> > > > >>> > >> > > > Matt >>> > >> > > > >>> > >> > > > >>> > >> > > > > Cheers, >>> > >> > > > > >>> > >> > > > > Bojan >>> > >> > > > > >>> > >> > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley < >>> > >> knepley at gmail.com> >>> > >> > > wrote: >>> > >> > > > > >>> > >> > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < >>> > >> > > > >> bojan.niceno.scientist at gmail.com> wrote: >>> > >> > > > >> >>> > >> > > > >>> Dear Satish, >>> > >> > > > >>> >>> > >> > > > >>> Thanks for the answer. Your suggestion makes a lot of >>> sense, >>> > >> but >>> > >> > > this >>> > >> > > > >>> is what I get as a result of that: >>> > >> > > > >>> >>> > >> > > > >>> Running check examples to verify correct installation >>> > >> > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and >>> > >> > > > >>> PETSC_ARCH=arch-linux-c-debug >>> > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with >>> 1 MPI >>> > >> > > process >>> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, >>> prandtl # >>> > >> = 1., >>> > >> > > > >>> grashof # = 1. >>> > >> > > > >>> Number of SNES iterations = 2 >>> > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with >>> 2 MPI >>> > >> > > processes >>> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, >>> prandtl # >>> > >> = 1., >>> > >> > > > >>> grashof # = 1. >>> > >> > > > >>> Number of SNES iterations = 2 >>> > >> > > > >>> Possible error running Fortran example >>> src/snes/tutorials/ex5f >>> > >> with 1 >>> > >> > > > >>> MPI process >>> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations >>> = 4 >>> > >> > > > >>> Completed test examples >>> > >> > > > >>> >>> > >> > > > >>> I am getting the "Possible error running Fortran example" >>> > >> warning >>> > >> > > with >>> > >> > > > >>> this. This somehow looks more severe to me. But I could >>> be >>> > >> wrong. >>> > >> > > > >>> >>> > >> > > > >> >>> > >> > > > >> You are getting this message because your MPI >>> implementation is >>> > >> > > printing >>> > >> > > > >> >>> > >> > > > >> Invalid MIT-MAGIC-COOKIE-1 key >>> > >> > > > >> >>> > >> > > > >> It is still running fine, but this is an MPI configuration >>> issue. >>> > >> > > > >> >>> > >> > > > >> Thanks, >>> > >> > > > >> >>> > >> > > > >> Matt >>> > >> > > > >> >>> > >> > > > >> Any suggestions what to do? >>> > >> > > > >>> >>> > >> > > > >>> >>> > >> > > > >>> Kind regards, >>> > >> > > > >>> >>> > >> > > > >>> Bojan >>> > >> > > > >>> >>> > >> > > > >>> >>> > >> > > > >>> >>> > >> > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay < >>> balay at mcs.anl.gov> >>> > >> > > wrote: >>> > >> > > > >>> >>> > >> > > > >>>> To clarify: >>> > >> > > > >>>> >>> > >> > > > >>>> you are using --download-openmpi=yes with petsc. 
However >>> you >>> > >> say: >>> > >> > > > >>>> >>> > >> > > > >>>> > > The mpif90 command which >>> > >> > > > >>>> > > I use to compile the code, wraps gfortran with >>> OpenMPI >>> > >> > > > >>>> >>> > >> > > > >>>> This suggests a different install of OpenMPI is used to >>> build >>> > >> your >>> > >> > > code. >>> > >> > > > >>>> >>> > >> > > > >>>> One way to resolve this is - delete current build of >>> PETSc - >>> > >> and >>> > >> > > > >>>> rebuild it with this same MPI [that you are using with >>> your >>> > >> > > application] >>> > >> > > > >>>> >>> > >> > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx >>> --with-fc=mpif90 >>> > >> > > > >>>> --download-fblaslapack --download-metis >>> --download-parmetis >>> > >> > > --download-cmake >>> > >> > > > >>>> >>> > >> > > > >>>> Also PETSc provides makefile format that minimizes such >>> > >> conflicts.. >>> > >> > > > >>>> >>> > >> > > > >>>> >>> > >> > > > >>>> >>> > >> > > >>> > >> >>> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications >>> > >> > > > >>>> >>> > >> > > > >>>> Satish >>> > >> > > > >>>> >>> > >> > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: >>> > >> > > > >>>> >>> > >> > > > >>>> > Are you using the same MPI to build both PETSc and your >>> > >> > > appliation? >>> > >> > > > >>>> > >>> > >> > > > >>>> > Satish >>> > >> > > > >>>> > >>> > >> > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: >>> > >> > > > >>>> > > To whom it may concern, >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > I am working on a Fortran (2003) computational fluid >>> > >> dynamics >>> > >> > > > >>>> solver, >>> > >> > > > >>>> > > which is actually quite mature, was parallelized >>> with MPI >>> > >> from >>> > >> > > the >>> > >> > > > >>>> > > very beginning and it comes with its own suite of >>> Krylov >>> > >> > > solvers. >>> > >> > > > >>>> > > Although the code is self-sustained, I am inclined to >>> > >> believe >>> > >> > > that >>> > >> > > > >>>> it >>> > >> > > > >>>> > > would be better to use PETSc instead of my own >>> home-grown >>> > >> > > solvers. >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > In the attempt to do so, I have installed PETSc >>> 3.16.4 with >>> > >> > > > >>>> following >>> > >> > > > >>>> > > options: >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > ./configure --with-debugging=yes >>> --download-openmpi=yes >>> > >> > > --download- >>> > >> > > > >>>> > > fblaslapack=yes --download-metis=yes >>> > >> --download-parmetis=yes -- >>> > >> > > > >>>> > > download-cmake=yes >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. The >>> mpif90 >>> > >> command >>> > >> > > which >>> > >> > > > >>>> > > I use to compile the code, wraps gfortran with >>> OpenMPI, >>> > >> hence >>> > >> > > the >>> > >> > > > >>>> > > option "--download-openmpi=yes" when configuring >>> PETSc. >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > Anyhow, installation of PETSc went fine, I managed >>> to link >>> > >> and >>> > >> > > run >>> > >> > > > >>>> it >>> > >> > > > >>>> > > with my code, but I am getting the following messages >>> > >> during >>> > >> > > > >>>> > > compilation: >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > Petsc_Mod.f90:18:6: >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > 18 | use PetscMat, only: tMat, >>> MAT_FINAL_ASSEMBLY >>> > >> > > > >>>> > > | 1 >>> > >> > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? 
at >>> (1) >>> > >> shall >>> > >> > > be of >>> > >> > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing >>> PETSc. >>> > >> All >>> > >> > > works, >>> > >> > > > >>>> > > but these messages give me a reason to worry. >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > Can you tell what causes this warnings? I would >>> guess they >>> > >> > > might >>> > >> > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't >>> think >>> > >> I even >>> > >> > > > >>>> have >>> > >> > > > >>>> > > MPICH on my system. >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > Please let me know what you think about it? >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > Cheers, >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > Bojan >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > >>> > >> > > > >>>> > >>> > >> > > > >>>> >>> > >> > > > >>> >>> > >> > > > >> >>> > >> > > > >> -- >>> > >> > > > >> What most experimenters take for granted before they begin >>> their >>> > >> > > > >> experiments is infinitely more interesting than any >>> results to >>> > >> which >>> > >> > > their >>> > >> > > > >> experiments lead. >>> > >> > > > >> -- Norbert Wiener >>> > >> > > > >> >>> > >> > > > >> https://www.cse.buffalo.edu/~knepley/ >>> > >> > > > >> >>> > >> > > > >> >>> > >> > > > > >>> > >> > > > >>> > >> > > > >>> > >> > > >>> > >> > >>> > >> >>> > > >>> > >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Feb 14 07:59:42 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 14 Feb 2022 07:59:42 -0600 (CST) Subject: [petsc-users] Warning while compiling Fortran with PETSc In-Reply-To: References: <0afa61d126791d97890f4b375c8ab34ac61c8b85.camel@mcs.anl.gov> <72449390-aac9-55ee-b02b-20e520f566de@mcs.anl.gov> <5c452e1-548a-9fa6-9d82-46ded5142fc4@mcs.anl.gov> <2bed48e9-b141-44f8-807e-14d3e6f4c3fe@mcs.anl.gov> <28149e2c-4389-1629-bdb4-90984aeb1ecb@mcs.anl.gov> Message-ID: <9b7e8759-261f-8a34-6b72-f953914b80e7@mcs.anl.gov> Thanks for the update. Glad you were able to find a fix for this issue. Satish On Mon, 14 Feb 2022, Bojan Niceno wrote: > Dear both, > > It was the compiler options for integer lengths after all, thanks, Now I > corrected it all in my code, all integers have explicitly defined lengths, > and I am using the MPI_F08 module instead of obsolete mpi.f. It was a > daunting task (> 800 files, > 64000 lines of code), but I am happy with the > outcome. Now I can continue with PETSc :-) > > Cheers > > Bojan > > > > > On Fri, Feb 11, 2022 at 7:26 AM Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > > > What does seem to work is: > > > > use mpi_f08 > > > > and using: > > > > call Mpi_Init_08(error) > > > > call Mpi_Comm_Size_08(MPI_COMM_WORLD, n_proc, error)? > > > > call Mpi_Comm_Rank_08(MPI_COMM_WORLD, this_proc, error) > > > > > > That is, function calls with extension _08 > > > > > > > > > > > > On Fri, Feb 11, 2022 at 6:51 AM Bojan Niceno < > > bojan.niceno.scientist at gmail.com> wrote: > > > >> Thanks for the example. > >> > >> I do use -i8 -r8 (I know it is a bad practice and am planning to get rid > >> of it in mid terms), and I did suspect that, but when I tried to compile > >> without those options, the warning remained. 
> >> > >> When I switched from "include mpif.h" to: > >> > >> use mpi, > >> > >> or even: > >> > >> #include > >> > >> use petscmpi > >> > >> use petscsys > >> > >> > >> I get the following type of messages: > >> > >> Error: There is no specific subroutine for the generic ?mpi_barrier? at > >> (1) > >> Comm_Mod/Parallel/Start.f90:11:22: > >> > >> Error: There is no specific subroutine for the generic ?mpi_init? at (1) > >> Comm_Mod/Parallel/Start.f90:14:52: > >> > >> Error: There is no specific subroutine for the generic ?mpi_comm_size? at > >> (1) > >> Comm_Mod/Parallel/Start.f90:17:54: > >> > >> I was googling for a solution, but StackOverflow seems to be down for > >> maintenance at the moment :-( > >> > >> I did manage to find that "include mpif.h" is obsolete, which I didn't > >> know before :-) > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> On Fri, Feb 11, 2022 at 5:29 AM Satish Balay wrote: > >> > >>> 1. you can call MPI_Init() before calling PetscInitialize() For > >>> example - check src/sys/tutorials/ex4f90.F90 > >>> > >>> 2. Are you using -i8 -r8 type flags when compiling your code? That > >>> might case issues when using mpif.h. Perhaps you can switch from > >>> "include 'mpif.h'" to "use mpi" - in your module file - and see if > >>> that helps. > >>> > >>> Satish > >>> > >>> On Fri, 11 Feb 2022, Bojan Niceno wrote: > >>> > >>> > Dear both, > >>> > > >>> > Allow me to update you on the issue. I tried to re-compile PETSc with > >>> > different configuration options as Satish suggested, and went further > >>> on by > >>> > specifying exact location of OpenMPI libraries and include files to the > >>> > ones installed by PETSc (for those configurations for which I used > >>> > "--download-openmpi=1") and the original problem, the warning Named > >>> COMMON > >>> > block ?mpi_fortran_bottom? at (1) shall be of the same size as > >>> elsewhere (4 > >>> > vs 8 bytes), prevailed. > >>> > > >>> > In desperation, I completely removed OpenMPI from my workstation to > >>> make > >>> > sure that only those which are downloaded with PETSc are used, yet the > >>> > warning was still there. (That resolved the Invalid MIT-MAGIC-COOKIE-1 > >>> > warning at least) > >>> > > >>> > Now I am wondering if the problem originates from the fact that I > >>> already > >>> > have all the necessary MPI routines developed in Fortran? All calls, > >>> > including the basic MPI_Init, MPI_Comm_Size and MPI_Comm_Rank, are done > >>> > from Fortran. I actually have a module called Comm_Mod which does all > >>> > MPI-related calls, and this module contains line include 'mpif.h'. > >>> That > >>> > include statement does take the file from PETSc installation as no > >>> other > >>> > MPI installation is left on my system, but still it somehow seems to > >>> be the > >>> > origin of the warning on common blocks I observe. Now I am wondering > >>> if > >>> > the include 'mpif.h' from Fortran somehow collides with the option > >>> include > >>> > ${PETSC_DIR}/lib/petsc/conf/variables I put in my makefile in order to > >>> > compile with PETSc. > >>> > > >>> > I am really not sure if it is possible to have main program and all MPI > >>> > initialization done from Fortran (as I have now) and then plug PETSc > >>> on top > >>> > of it? Should that be possible? > >>> > > >>> > Kind regards, > >>> > > >>> > Bojan > >>> > > >>> > P.S. The sequential version works fine, I can compile without warning > >>> and > >>> > can call PETSc solvers from Fortran without a glitch. 
> >>> > > >>> > On Thu, Feb 10, 2022 at 5:08 PM Bojan Niceno < > >>> > bojan.niceno.scientist at gmail.com> wrote: > >>> > > >>> > > Dear Satish, > >>> > > > >>> > > Thanks for the advice. I will try in a few hours because it is > >>> almost > >>> > > dinner time with me (I am in Europe) and I am supposed to go out > >>> with a > >>> > > friend this evening. > >>> > > > >>> > > Will let you know. Thanks for help, I highly appreciate it. > >>> > > > >>> > > > >>> > > Kind regards, > >>> > > > >>> > > Bojan > >>> > > > >>> > > > >>> > > On Thu, Feb 10, 2022 at 5:06 PM Satish Balay > >>> wrote: > >>> > > > >>> > >> Hm - this is strange. > >>> > >> > >>> > >> Do you have 'xauth' installed? > >>> > >> > >>> > >> I would make sure xauth is installed, delete ~/.Xauthority - and > >>> reboot > >>> > >> [or restart the X server] > >>> > >> > >>> > >> Yeah - it might not work - but perhaps worth a try.. > >>> > >> > >>> > >> Or perhaps its not X11 related.. > >>> > >> > >>> > >> I would also try 'strace' on an application that is producing this > >>> > >> message - to see if I can narrow down further.. > >>> > >> > >>> > >> Do you get this message with both (runs)?: > >>> > >> > >>> > >> cd src/ksp/ksp/tutorials > >>> > >> make ex2 > >>> > >> mpiexec -n 1 ./ex2 > >>> > >> ./ex2 > >>> > >> > >>> > >> Satish > >>> > >> > >>> > >> On Thu, 10 Feb 2022, Bojan Niceno wrote: > >>> > >> > >>> > >> > Dear both, > >>> > >> > > >>> > >> > I work on an ASUS ROG laptop and don't use any NFS. Everything > >>> is on > >>> > >> one > >>> > >> > computer, one disk. That is why I couldn't resolve the Invalid > >>> Magic > >>> > >> > Cookie, because all the advice I've found about it concerns the > >>> remote > >>> > >> > access/display. It is not an issue for me. My laptop has an > >>> Nvidia > >>> > >> > GeForce RTX graphical card, maybe Ubuntu drivers are simply not > >>> able to > >>> > >> > cope with it. I am out of ideas, really. > >>> > >> > > >>> > >> > > >>> > >> > Cheers, > >>> > >> > > >>> > >> > Bojan > >>> > >> > > >>> > >> > On Thu, Feb 10, 2022 at 4:53 PM Satish Balay > >>> wrote: > >>> > >> > > >>> > >> > > Do the compute nodes and frontend share the same NFS? > >>> > >> > > > >>> > >> > > I would try the following [to see if they work): > >>> > >> > > > >>> > >> > > - delete ~/.Xauthority [first check with 'xauth list') > >>> > >> > > - setup ssh to not use X - i.e add the following to > >>> ~/.ssh/config > >>> > >> > > > >>> > >> > > ForwardX11 no > >>> > >> > > ForwardX11Trusted no > >>> > >> > > > >>> > >> > > [this can be tailored to apply only to your specific compute > >>> nodes - > >>> > >> if > >>> > >> > > needed] > >>> > >> > > > >>> > >> > > Satish > >>> > >> > > > >>> > >> > > On Thu, 10 Feb 2022, Matthew Knepley wrote: > >>> > >> > > > >>> > >> > > > On Thu, Feb 10, 2022 at 10:40 AM Bojan Niceno < > >>> > >> > > > bojan.niceno.scientist at gmail.com> wrote: > >>> > >> > > > > >>> > >> > > > > Thanks a lot, now I feel much better. > >>> > >> > > > > > >>> > >> > > > > By the way, I can't get around the invalid magic cookie. 
> >>> It is > >>> > >> > > occurring > >>> > >> > > > > ever since I installed the OS (Ubuntu 20.04) so I > >>> eventually gave > >>> > >> up > >>> > >> > > and > >>> > >> > > > > decided to live with it :-D > >>> > >> > > > > > >>> > >> > > > > >>> > >> > > > > >>> > >> > > > >>> > >> > >>> https://unix.stackexchange.com/questions/199891/invalid-mit-magic-cookie-1-key-when-trying-to-run-program-remotely > >>> > >> > > > > >>> > >> > > > Thanks, > >>> > >> > > > > >>> > >> > > > Matt > >>> > >> > > > > >>> > >> > > > > >>> > >> > > > > Cheers, > >>> > >> > > > > > >>> > >> > > > > Bojan > >>> > >> > > > > > >>> > >> > > > > On Thu, Feb 10, 2022 at 4:37 PM Matthew Knepley < > >>> > >> knepley at gmail.com> > >>> > >> > > wrote: > >>> > >> > > > > > >>> > >> > > > >> On Thu, Feb 10, 2022 at 10:34 AM Bojan Niceno < > >>> > >> > > > >> bojan.niceno.scientist at gmail.com> wrote: > >>> > >> > > > >> > >>> > >> > > > >>> Dear Satish, > >>> > >> > > > >>> > >>> > >> > > > >>> Thanks for the answer. Your suggestion makes a lot of > >>> sense, > >>> > >> but > >>> > >> > > this > >>> > >> > > > >>> is what I get as a result of that: > >>> > >> > > > >>> > >>> > >> > > > >>> Running check examples to verify correct installation > >>> > >> > > > >>> Using PETSC_DIR=/home/niceno/Development/petsc-debug and > >>> > >> > > > >>> PETSC_ARCH=arch-linux-c-debug > >>> > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with > >>> 1 MPI > >>> > >> > > process > >>> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, > >>> prandtl # > >>> > >> = 1., > >>> > >> > > > >>> grashof # = 1. > >>> > >> > > > >>> Number of SNES iterations = 2 > >>> > >> > > > >>> Possible error running C/C++ src/snes/tutorials/ex19 with > >>> 2 MPI > >>> > >> > > processes > >>> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keylid velocity = 0.0016, > >>> prandtl # > >>> > >> = 1., > >>> > >> > > > >>> grashof # = 1. > >>> > >> > > > >>> Number of SNES iterations = 2 > >>> > >> > > > >>> Possible error running Fortran example > >>> src/snes/tutorials/ex5f > >>> > >> with 1 > >>> > >> > > > >>> MPI process > >>> > >> > > > >>> See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>> > >> > > > >>> Invalid MIT-MAGIC-COOKIE-1 keyNumber of SNES iterations > >>> = 4 > >>> > >> > > > >>> Completed test examples > >>> > >> > > > >>> > >>> > >> > > > >>> I am getting the "Possible error running Fortran example" > >>> > >> warning > >>> > >> > > with > >>> > >> > > > >>> this. This somehow looks more severe to me. But I could > >>> be > >>> > >> wrong. > >>> > >> > > > >>> > >>> > >> > > > >> > >>> > >> > > > >> You are getting this message because your MPI > >>> implementation is > >>> > >> > > printing > >>> > >> > > > >> > >>> > >> > > > >> Invalid MIT-MAGIC-COOKIE-1 key > >>> > >> > > > >> > >>> > >> > > > >> It is still running fine, but this is an MPI configuration > >>> issue. > >>> > >> > > > >> > >>> > >> > > > >> Thanks, > >>> > >> > > > >> > >>> > >> > > > >> Matt > >>> > >> > > > >> > >>> > >> > > > >> Any suggestions what to do? 
> >>> > >> > > > >>> > >>> > >> > > > >>> > >>> > >> > > > >>> Kind regards, > >>> > >> > > > >>> > >>> > >> > > > >>> Bojan > >>> > >> > > > >>> > >>> > >> > > > >>> > >>> > >> > > > >>> > >>> > >> > > > >>> On Wed, Feb 9, 2022 at 5:49 PM Satish Balay < > >>> balay at mcs.anl.gov> > >>> > >> > > wrote: > >>> > >> > > > >>> > >>> > >> > > > >>>> To clarify: > >>> > >> > > > >>>> > >>> > >> > > > >>>> you are using --download-openmpi=yes with petsc. However > >>> you > >>> > >> say: > >>> > >> > > > >>>> > >>> > >> > > > >>>> > > The mpif90 command which > >>> > >> > > > >>>> > > I use to compile the code, wraps gfortran with > >>> OpenMPI > >>> > >> > > > >>>> > >>> > >> > > > >>>> This suggests a different install of OpenMPI is used to > >>> build > >>> > >> your > >>> > >> > > code. > >>> > >> > > > >>>> > >>> > >> > > > >>>> One way to resolve this is - delete current build of > >>> PETSc - > >>> > >> and > >>> > >> > > > >>>> rebuild it with this same MPI [that you are using with > >>> your > >>> > >> > > application] > >>> > >> > > > >>>> > >>> > >> > > > >>>> ./configure --with-cc=mpicc --with-cxx=mpicxx > >>> --with-fc=mpif90 > >>> > >> > > > >>>> --download-fblaslapack --download-metis > >>> --download-parmetis > >>> > >> > > --download-cmake > >>> > >> > > > >>>> > >>> > >> > > > >>>> Also PETSc provides makefile format that minimizes such > >>> > >> conflicts.. > >>> > >> > > > >>>> > >>> > >> > > > >>>> > >>> > >> > > > >>>> > >>> > >> > > > >>> > >> > >>> https://petsc.org/release/docs/manual/getting_started/#writing-c-c-or-fortran-applications > >>> > >> > > > >>>> > >>> > >> > > > >>>> Satish > >>> > >> > > > >>>> > >>> > >> > > > >>>> On Wed, 9 Feb 2022, Balay, Satish via petsc-users wrote: > >>> > >> > > > >>>> > >>> > >> > > > >>>> > Are you using the same MPI to build both PETSc and your > >>> > >> > > appliation? > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > Satish > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > On Wed, 2022-02-09 at 05:21 +0100, Bojan Niceno wrote: > >>> > >> > > > >>>> > > To whom it may concern, > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > I am working on a Fortran (2003) computational fluid > >>> > >> dynamics > >>> > >> > > > >>>> solver, > >>> > >> > > > >>>> > > which is actually quite mature, was parallelized > >>> with MPI > >>> > >> from > >>> > >> > > the > >>> > >> > > > >>>> > > very beginning and it comes with its own suite of > >>> Krylov > >>> > >> > > solvers. > >>> > >> > > > >>>> > > Although the code is self-sustained, I am inclined to > >>> > >> believe > >>> > >> > > that > >>> > >> > > > >>>> it > >>> > >> > > > >>>> > > would be better to use PETSc instead of my own > >>> home-grown > >>> > >> > > solvers. > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > In the attempt to do so, I have installed PETSc > >>> 3.16.4 with > >>> > >> > > > >>>> following > >>> > >> > > > >>>> > > options: > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > ./configure --with-debugging=yes > >>> --download-openmpi=yes > >>> > >> > > --download- > >>> > >> > > > >>>> > > fblaslapack=yes --download-metis=yes > >>> > >> --download-parmetis=yes -- > >>> > >> > > > >>>> > > download-cmake=yes > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > on a workstation running Ubuntu 20.04 LTS. 
The > >>> mpif90 > >>> > >> command > >>> > >> > > which > >>> > >> > > > >>>> > > I use to compile the code, wraps gfortran with > >>> OpenMPI, > >>> > >> hence > >>> > >> > > the > >>> > >> > > > >>>> > > option "--download-openmpi=yes" when configuring > >>> PETSc. > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > Anyhow, installation of PETSc went fine, I managed > >>> to link > >>> > >> and > >>> > >> > > run > >>> > >> > > > >>>> it > >>> > >> > > > >>>> > > with my code, but I am getting the following messages > >>> > >> during > >>> > >> > > > >>>> > > compilation: > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > Petsc_Mod.f90:18:6: > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > 18 | use PetscMat, only: tMat, > >>> MAT_FINAL_ASSEMBLY > >>> > >> > > > >>>> > > | 1 > >>> > >> > > > >>>> > > Warning: Named COMMON block ?mpi_fortran_bottom? at > >>> (1) > >>> > >> shall > >>> > >> > > be of > >>> > >> > > > >>>> > > the same size as elsewhere (4 vs 8 bytes) > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > Petsc_Mod.f90 is a module I wrote for interfacing > >>> PETSc. > >>> > >> All > >>> > >> > > works, > >>> > >> > > > >>>> > > but these messages give me a reason to worry. > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > Can you tell what causes this warnings? I would > >>> guess they > >>> > >> > > might > >>> > >> > > > >>>> > > appear if one mixes OpenMPI with MPICH, but I don't > >>> think > >>> > >> I even > >>> > >> > > > >>>> have > >>> > >> > > > >>>> > > MPICH on my system. > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > Please let me know what you think about it? > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > Cheers, > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > Bojan > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > > >>> > >> > > > >>>> > >>> > >> > > > >>> > >>> > >> > > > >> > >>> > >> > > > >> -- > >>> > >> > > > >> What most experimenters take for granted before they begin > >>> their > >>> > >> > > > >> experiments is infinitely more interesting than any > >>> results to > >>> > >> which > >>> > >> > > their > >>> > >> > > > >> experiments lead. > >>> > >> > > > >> -- Norbert Wiener > >>> > >> > > > >> > >>> > >> > > > >> https://www.cse.buffalo.edu/~knepley/ > >>> > >> > > > >> > >>> > >> > > > >> > >>> > >> > > > > > >>> > >> > > > > >>> > >> > > > > >>> > >> > > > >>> > >> > > >>> > >> > >>> > > > >>> > > >>> > >> > From aduarteg at utexas.edu Mon Feb 14 12:29:31 2022 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Mon, 14 Feb 2022 12:29:31 -0600 Subject: [petsc-users] Restarting Multistep methods Message-ID: Good morning PETSC team, I have been working on a code that uses the TSBDF object, and I have been able to run successful restarts with the BDF order 1. However, due to the nonlinearity of the problem, restarting higher order methods from a single initial solution has led to trouble. Are there any built in functions or procedures to save the BDF solution vector with the necessary previous solutions so that it can be easily restarted? Maybe saving a set of binary files? Thank you, -Alfredo -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... 
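A minimal sketch (in C) of the "set of binary files" idea being asked about: it checkpoints only the current TS solution with a binary viewer and hands it back to the TS before the next TSSolve(). Because only the latest solution is saved, a BDF restart from it is effectively first order; the file name and the saved time/step values are placeholders that the application would have to manage itself.

#include <petscts.h>

/* Checkpoint the current solution of a TS to a binary file. */
static PetscErrorCode WriteCheckpoint(TS ts, const char fname[])
{
  PetscErrorCode ierr;
  Vec            u;
  PetscViewer    viewer;

  PetscFunctionBeginUser;
  ierr = TSGetSolution(ts, &u);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PetscObjectComm((PetscObject)ts), fname, FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = VecView(u, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Read the checkpoint back and hand it to the TS before calling TSSolve() again. */
static PetscErrorCode ReadCheckpoint(TS ts, Vec u, const char fname[], PetscReal t0, PetscInt step0)
{
  PetscErrorCode ierr;
  PetscViewer    viewer;

  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(PetscObjectComm((PetscObject)ts), fname, FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = VecLoad(u, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  ierr = TSSetSolution(ts, u);CHKERRQ(ierr);
  ierr = TSSetTime(ts, t0);CHKERRQ(ierr);
  ierr = TSSetStepNumber(ts, step0);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Restarting a higher-order BDF cleanly would additionally require the history of previous solutions, which this sketch does not capture.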
URL: From tangqi at msu.edu Tue Feb 15 09:43:30 2022 From: tangqi at msu.edu (Tang, Qi) Date: Tue, 15 Feb 2022 15:43:30 +0000 Subject: [petsc-users] Hessenberg Index-2 DAE and IMEX Message-ID: Hi, Does PETSc?s ARK directly apply to Hessenberg Index-2 DAE? Do we need to perform a time derivative of the constraint equation by ourselves first? https://petsc.org/main/docs/manual/ts/#hessenberg-index-2-dae If we do not have to, do we expect to get high order in time? Thanks, Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Tue Feb 15 09:51:03 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 15 Feb 2022 15:51:03 +0000 Subject: [petsc-users] Hessenberg Index-2 DAE and IMEX In-Reply-To: References: Message-ID: <64C394A8-F223-4C13-B67F-604E09AD094E@anl.gov> Hi Qi, The index-2 DAE cannot be solved directly with ARK or implicit methods such as backward Euler and Crank-Nicolson. You need to convert the system to an index-1 DAE as illustrated in the documentation. Hong (Mr.) On Feb 15, 2022, at 9:43 AM, Tang, Qi > wrote: Hi, Does PETSc?s ARK directly apply to Hessenberg Index-2 DAE? Do we need to perform a time derivative of the constraint equation by ourselves first? https://petsc.org/main/docs/manual/ts/#hessenberg-index-2-dae If we do not have to, do we expect to get high order in time? Thanks, Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Tue Feb 15 09:58:48 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 15 Feb 2022 15:58:48 +0000 Subject: [petsc-users] Restarting Multistep methods In-Reply-To: References: Message-ID: <54CA4F0E-269C-49AE-9321-FACC5BC6C59B@anl.gov> ALfredo, If you want PETSc to restart BDF, you can use TSRestartStep(). Hong (Mr.) On Feb 14, 2022, at 12:29 PM, Alfredo J Duarte Gomez > wrote: Good morning PETSC team, I have been working on a code that uses the TSBDF object, and I have been able to run successful restarts with the BDF order 1. However, due to the nonlinearity of the problem, restarting higher order methods from a single initial solution has led to trouble. Are there any built in functions or procedures to save the BDF solution vector with the necessary previous solutions so that it can be easily restarted? Maybe saving a set of binary files? Thank you, -Alfredo -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Tue Feb 15 10:10:49 2022 From: facklerpw at ornl.gov (Fackler, Philip) Date: Tue, 15 Feb 2022 16:10:49 +0000 Subject: [petsc-users] Kokkos Interface for PETSc Message-ID: We're intending to transitioning the Xolotl interfaces with PETSc. I am hoping someone (can) point us to some documentation (and examples) for using PETSc's Kokkos-based interface. If this does not yet exist, then perhaps some slides (like the ones Richard Mills showed at the NE-SciDAC all-hands meeting) showing some examples could get us started. Thanks for any help that can be provided, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... 
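As a rough illustration of the kind of usage being asked about, a small C sketch that lets a DM pick up the Kokkos back ends from run-time options; it assumes a PETSc build configured with Kokkos and Kokkos-Kernels support, and the DMDA size used here is arbitrary.

#include <petscdmda.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  DM             da;
  Mat            A;
  Vec            x;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 128, 1, 1, NULL, &da);CHKERRQ(ierr);
  ierr = DMSetFromOptions(da);CHKERRQ(ierr);         /* honours -dm_vec_type / -dm_mat_type */
  ierr = DMSetUp(da);CHKERRQ(ierr);
  ierr = DMCreateMatrix(da, &A);CHKERRQ(ierr);       /* aijkokkos matrix when requested */
  ierr = DMCreateGlobalVector(da, &x);CHKERRQ(ierr); /* kokkos vector when requested */
  /* ... assemble and solve as usual; -log_view shows where the work runs ... */
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Run with -dm_vec_type kokkos -dm_mat_type aijkokkos to select the Kokkos types; without those options the same code falls back to the default host Vec and Mat.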
URL: From tangqi at msu.edu Tue Feb 15 10:27:38 2022 From: tangqi at msu.edu (Tang, Qi) Date: Tue, 15 Feb 2022 16:27:38 +0000 Subject: [petsc-users] Hessenberg Index-2 DAE and IMEX In-Reply-To: <64C394A8-F223-4C13-B67F-604E09AD094E@anl.gov> References: <64C394A8-F223-4C13-B67F-604E09AD094E@anl.gov> Message-ID: <1905DC63-4C56-40FD-9DD2-1669A9C2311E@msu.edu> Thanks a lot, Hong. I would think if one use BDF or backward Euler for incompressible Naiver Stokes, it should work with the index 2 equation. Why do you think it will not work? Or maybe you were talking about a general usage. INS is not my usage but I am still curious. We have some unconventional constraint equation. Qi On Feb 15, 2022, at 8:51 AM, Zhang, Hong > wrote: Hi Qi, The index-2 DAE cannot be solved directly with ARK or implicit methods such as backward Euler and Crank-Nicolson. You need to convert the system to an index-1 DAE as illustrated in the documentation. Hong (Mr.) On Feb 15, 2022, at 9:43 AM, Tang, Qi > wrote: Hi, Does PETSc?s ARK directly apply to Hessenberg Index-2 DAE? Do we need to perform a time derivative of the constraint equation by ourselves first? https://petsc.org/main/docs/manual/ts/#hessenberg-index-2-dae If we do not have to, do we expect to get high order in time? Thanks, Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Feb 15 10:43:12 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 15 Feb 2022 09:43:12 -0700 Subject: [petsc-users] Kokkos Interface for PETSc In-Reply-To: References: Message-ID: <87pmno5bdr.fsf@jedbrown.org> We need to make these docs more explicit, but the short answer is configure with --download-kokkos --download-kokkos-kernels and run almost any example with -dm_mat_type aijkokkos -dm_vec_type kokkos. If you run with -log_view, you should see that all the flops take place on the device and there are few host->device transfers. Message packing is done on the device and it'll use GPU-aware MPI. There are a few examples of residual evaluation and matrix assembly on the device using Kokkos. You can also see libCEED examples for assembly on the device into Kokkos matrices and vectors without touching host memory. "Fackler, Philip via petsc-users" writes: > We're intending to transitioning the Xolotl interfaces with PETSc. > > I am hoping someone (can) point us to some documentation (and examples) for using PETSc's Kokkos-based interface. If this does not yet exist, then perhaps some slides (like the ones Richard Mills showed at the NE-SciDAC all-hands meeting) showing some examples could get us started. > > Thanks for any help that can be provided, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory From balay at mcs.anl.gov Tue Feb 15 10:59:07 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 15 Feb 2022 10:59:07 -0600 (CST) Subject: [petsc-users] Kokkos Interface for PETSc In-Reply-To: <87pmno5bdr.fsf@jedbrown.org> References: <87pmno5bdr.fsf@jedbrown.org> Message-ID: Also - best to use petsc repo - 'main' branch. And for install on crusher - check config/examples/arch-olcf-crusher.py Satish On Tue, 15 Feb 2022, Jed Brown wrote: > We need to make these docs more explicit, but the short answer is configure with --download-kokkos --download-kokkos-kernels and run almost any example with -dm_mat_type aijkokkos -dm_vec_type kokkos. 
If you run with -log_view, you should see that all the flops take place on the device and there are few host->device transfers. Message packing is done on the device and it'll use GPU-aware MPI. There are a few examples of residual evaluation and matrix assembly on the device using Kokkos. You can also see libCEED examples for assembly on the device into Kokkos matrices and vectors without touching host memory. > > "Fackler, Philip via petsc-users" writes: > > > We're intending to transitioning the Xolotl interfaces with PETSc. > > > > I am hoping someone (can) point us to some documentation (and examples) for using PETSc's Kokkos-based interface. If this does not yet exist, then perhaps some slides (like the ones Richard Mills showed at the NE-SciDAC all-hands meeting) showing some examples could get us started. > > > > Thanks for any help that can be provided, > > > > Philip Fackler > > Research Software Engineer, Application Engineering Group > > Advanced Computing Systems Research Section > > Computer Science and Mathematics Division > > Oak Ridge National Laboratory > From balay at mcs.anl.gov Tue Feb 15 11:07:10 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 15 Feb 2022 11:07:10 -0600 (CST) Subject: [petsc-users] Kokkos Interface for PETSc In-Reply-To: References: <87pmno5bdr.fsf@jedbrown.org> Message-ID: Also - perhaps the following info might be useful Satish ---- balay at sb /home/balay/petsc (main=) $ git grep -l download-kokkos-kernels config/examples config/examples/arch-ci-freebsd-cxx-cmplx-pkgs-dbg.py config/examples/arch-ci-linux-cuda-double.py config/examples/arch-ci-linux-gcc-ifc-cmplx.py config/examples/arch-ci-linux-hip-double.py config/examples/arch-ci-linux-pkgs-dbg-ftn-interfaces.py config/examples/arch-ci-linux-pkgs-valgrind.py config/examples/arch-ci-osx-cxx-pkgs-opt.py config/examples/arch-nvhpc.py config/examples/arch-olcf-crusher.py config/examples/arch-olcf-spock.py balay at sb /home/balay/petsc (main=) $ git grep -l "requires:.*kokkos_kernels" src/ksp/ksp/tests/ex3.c src/ksp/ksp/tests/ex43.c src/ksp/ksp/tests/ex60.c src/ksp/ksp/tutorials/ex7.c src/mat/tests/ex123.c src/mat/tests/ex132.c src/mat/tests/ex2.c src/mat/tests/ex250.c src/mat/tests/ex251.c src/mat/tests/ex252.c src/mat/tests/ex254.c src/mat/tests/ex5.c src/mat/tests/ex62.c src/mat/tutorials/ex5k.kokkos.cxx src/snes/tests/ex13.c src/snes/tutorials/ex13.c src/snes/tutorials/ex3k.kokkos.cxx src/snes/tutorials/ex56.c src/ts/utils/dmplexlandau/tutorials/ex1.c src/ts/utils/dmplexlandau/tutorials/ex1f90.F90 src/ts/utils/dmplexlandau/tutorials/ex2.c src/vec/vec/tests/ex21.c src/vec/vec/tests/ex22.c src/vec/vec/tests/ex23.c src/vec/vec/tests/ex28.c src/vec/vec/tests/ex34.c src/vec/vec/tests/ex37.c src/vec/vec/tests/ex38.c src/vec/vec/tests/ex4.c src/vec/vec/tests/ex43.c src/vec/vec/tests/ex60.c src/vec/vec/tutorials/ex1.c balay at sb /home/balay/petsc (main=) $ On Tue, 15 Feb 2022, Satish Balay via petsc-users wrote: > Also - best to use petsc repo - 'main' branch. > > And for install on crusher - check config/examples/arch-olcf-crusher.py > > Satish > > On Tue, 15 Feb 2022, Jed Brown wrote: > > > We need to make these docs more explicit, but the short answer is configure with --download-kokkos --download-kokkos-kernels and run almost any example with -dm_mat_type aijkokkos -dm_vec_type kokkos. If you run with -log_view, you should see that all the flops take place on the device and there are few host->device transfers. Message packing is done on the device and it'll use GPU-aware MPI. 
There are a few examples of residual evaluation and matrix assembly on the device using Kokkos. You can also see libCEED examples for assembly on the device into Kokkos matrices and vectors without touching host memory. > > > > "Fackler, Philip via petsc-users" writes: > > > > > We're intending to transitioning the Xolotl interfaces with PETSc. > > > > > > I am hoping someone (can) point us to some documentation (and examples) for using PETSc's Kokkos-based interface. If this does not yet exist, then perhaps some slides (like the ones Richard Mills showed at the NE-SciDAC all-hands meeting) showing some examples could get us started. > > > > > > Thanks for any help that can be provided, > > > > > > Philip Fackler > > > Research Software Engineer, Application Engineering Group > > > Advanced Computing Systems Research Section > > > Computer Science and Mathematics Division > > > Oak Ridge National Laboratory > > > From hongzhang at anl.gov Tue Feb 15 12:39:41 2022 From: hongzhang at anl.gov (Zhang, Hong) Date: Tue, 15 Feb 2022 18:39:41 +0000 Subject: [petsc-users] Hessenberg Index-2 DAE and IMEX In-Reply-To: <1905DC63-4C56-40FD-9DD2-1669A9C2311E@msu.edu> References: <64C394A8-F223-4C13-B67F-604E09AD094E@anl.gov> <1905DC63-4C56-40FD-9DD2-1669A9C2311E@msu.edu> Message-ID: Yes. I am talking about a general usage. To be accurate, direct application of these methods does not always work. There are examples of high-index DAEs for which backward Euler and all multi-step and RK methods fail. You can find one such example in https://www.cs.usask.ca/~spiteri/M314/notes/AP/chap10.pdf This note also mentions the difficulties in solving the nonlinear system and calculating error estimate for index-2 DAEs. Thus we do not recommend solving index-2 DAEs directly with these methods. Hong On Feb 15, 2022, at 10:27 AM, Tang, Qi > wrote: Thanks a lot, Hong. I would think if one use BDF or backward Euler for incompressible Naiver Stokes, it should work with the index 2 equation. Why do you think it will not work? Or maybe you were talking about a general usage. INS is not my usage but I am still curious. We have some unconventional constraint equation. Qi On Feb 15, 2022, at 8:51 AM, Zhang, Hong > wrote: Hi Qi, The index-2 DAE cannot be solved directly with ARK or implicit methods such as backward Euler and Crank-Nicolson. You need to convert the system to an index-1 DAE as illustrated in the documentation. Hong (Mr.) On Feb 15, 2022, at 9:43 AM, Tang, Qi > wrote: Hi, Does PETSc?s ARK directly apply to Hessenberg Index-2 DAE? Do we need to perform a time derivative of the constraint equation by ourselves first? https://petsc.org/main/docs/manual/ts/#hessenberg-index-2-dae If we do not have to, do we expect to get high order in time? Thanks, Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 15 14:32:55 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 Feb 2022 15:32:55 -0500 Subject: [petsc-users] Hessenberg Index-2 DAE and IMEX In-Reply-To: References: <64C394A8-F223-4C13-B67F-604E09AD094E@anl.gov> <1905DC63-4C56-40FD-9DD2-1669A9C2311E@msu.edu> Message-ID: On Tue, Feb 15, 2022 at 1:39 PM Zhang, Hong via petsc-users < petsc-users at mcs.anl.gov> wrote: > Yes. I am talking about a general usage. To be accurate, direct > application of these methods does not always work. There are examples of > high-index DAEs for which backward Euler and all multi-step and RK methods > fail. 
You can find one such example in > https://www.cs.usask.ca/~spiteri/M314/notes/AP/chap10.pdf > This note also mentions the difficulties in solving the nonlinear system > and calculating error estimate for index-2 DAEs. > > Thus we do not recommend solving index-2 DAEs directly with these methods. > I have an index-2 DAE (earthquake mechanics), and we had to explicitly differentiate the constraint and add terms back in the momentum equation. This is also what is done for Navier-Stokes (I think they call it a segregated timestepping method). Thanks, Matt > Hong > > On Feb 15, 2022, at 10:27 AM, Tang, Qi wrote: > > Thanks a lot, Hong. > > I would think if one use BDF or backward Euler for incompressible Naiver > Stokes, it should work with the index 2 equation. Why do you think it will > not work? Or maybe you were talking about a general usage. > > INS is not my usage but I am still curious. We have some unconventional > constraint equation. > > Qi > > > > On Feb 15, 2022, at 8:51 AM, Zhang, Hong wrote: > > Hi Qi, > > The index-2 DAE cannot be solved directly with ARK or implicit methods > such as backward Euler and Crank-Nicolson. You need to convert the system > to an index-1 DAE as illustrated in the documentation. > > Hong (Mr.) > > On Feb 15, 2022, at 9:43 AM, Tang, Qi wrote: > > Hi, > > Does PETSc?s ARK directly apply to Hessenberg Index-2 DAE? Do we need to > perform a time derivative of the constraint equation by ourselves first? > https://petsc.org/main/docs/manual/ts/#hessenberg-index-2-dae > > If we do not have to, do we expect to get high order in time? > > Thanks, > Qi > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno.scientist at gmail.com Tue Feb 15 15:13:14 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Tue, 15 Feb 2022 22:13:14 +0100 Subject: [petsc-users] Migrating to parallel PETSc Message-ID: Dear PETSc users, I have an in-house computational fluid dynamics (CFD) solver, written in Fortran 2008, parallelized with MPI with its own home-grown suite of linear solvers. The code is unstructured, performs domain decomposition with METIS and all communication buffers, I mean connectivity between processors, has been properly worked out. A couple of weeks back, I decided to try out the PETSc suite of solvers. After some initial setbacks, I managed to compile my code with PETSc and have the sequential version working fine :-) I have essentially using the following PETSc routines to get the code solving linear systems with PETSc: I set up the working space as follows: call PetscInitialize(PETSC_NULL_CHARACTER, Pet % petsc_err) call MatCreateSeqAij(PETSC_COMM_SELF, ... call MatSeqAijSetColumnIndices(.... call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector x call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector b call KSPCreate(PETSC_COMM_SELF, ... Then in order to solve a system, I do: call MatSetValue(Pet % petsc_A, ! Inside a loop through matrix entries i-PETSC_ONE, k-PETSC_ONE, ... call MatAssemblyBegin(Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) call MatAssemblyEnd (Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) call VecSetValue(Pet % petsc_x, ... ! Fill up x call VecSetValue(Pet % petsc_b, ... ! Fill up b call KSPSetType(Pet % petsc_ksp ... ! 
Set solver call KSPGetPC(Pet % petsc_ksp, ... ! get preconditioner context call PCSetType(Pet % petsc_pc, ... ! Set preconditioner call KSPSetFromOptions(Pet % petsc_ksp, Pet % petsc_err) call KSPSetUp (Pet % petsc_ksp, Pet % petsc_err) ! Finally solve call KSPSolve(Pet % petsc_ksp, ... Once this was up and running, I thought that in order to have the parallel version I will merely have to replace the "Seq" versions of the above functions, with their parallel counterparts. I was expecting to find the red function (MatSeqAijSetColumnIndices) for parallel runs, but it doesn't seem to exist. I have found non-seq versions of some other functions ( MatCreateAij, VecCreateSeq), but not something like MatAijSetColumnIndices, which surprised me a bit, because I have this information in my code. Is there a parallel counterpart of this function, and if there is none, what should it be replaced with? I understand that I will have to provide non-zeros in buffers (o_nnz), which is not a big issue, but how to provide information on columns for parallel version is not clear to me. In a nutshell, I would need a hint on which of the above functions could remain the same in parallel, and which should be replaced and with what? Cheers, Bojan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 15 15:43:55 2022 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 15 Feb 2022 16:43:55 -0500 Subject: [petsc-users] Migrating to parallel PETSc In-Reply-To: References: Message-ID: On Tue, Feb 15, 2022 at 4:13 PM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear PETSc users, > > > I have an in-house computational fluid dynamics (CFD) solver, written in > Fortran 2008, parallelized with MPI with its own home-grown suite of linear > solvers. The code is unstructured, performs domain decomposition with > METIS and all communication buffers, I mean connectivity between > processors, has been properly worked out. > > A couple of weeks back, I decided to try out the PETSc suite of solvers. > After some initial setbacks, I managed to compile my code with PETSc and > have the sequential version working fine :-) > > I have essentially using the following PETSc routines to get the code > solving linear systems with PETSc: > > I set up the working space as follows: > > call PetscInitialize(PETSC_NULL_CHARACTER, Pet % petsc_err) > call MatCreateSeqAij(PETSC_COMM_SELF, ... > call MatSeqAijSetColumnIndices(.... > call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector x > call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector b > call KSPCreate(PETSC_COMM_SELF, ... > > Then in order to solve a system, I do: > > call MatSetValue(Pet % petsc_A, ! Inside a loop through matrix entries > i-PETSC_ONE, > k-PETSC_ONE, ... > call MatAssemblyBegin(Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) > call MatAssemblyEnd (Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) > > call VecSetValue(Pet % petsc_x, ... ! Fill up x > call VecSetValue(Pet % petsc_b, ... ! Fill up b > > call KSPSetType(Pet % petsc_ksp ... ! Set solver > call KSPGetPC(Pet % petsc_ksp, ... ! get preconditioner context > call PCSetType(Pet % petsc_pc, ... ! Set preconditioner > > call KSPSetFromOptions(Pet % petsc_ksp, Pet % petsc_err) > > call KSPSetUp (Pet % petsc_ksp, Pet % petsc_err) > > > ! Finally solve > call KSPSolve(Pet % petsc_ksp, ... 
> > Once this was up and running, I thought that in order to have the parallel > version I will merely have to replace the "Seq" versions of the above > functions, with their parallel counterparts. I was expecting to find the > red function (MatSeqAijSetColumnIndices) for parallel runs, but it > doesn't seem to exist. I have found non-seq versions of some other > functions (MatCreateAij, VecCreateSeq), but not something like > MatAijSetColumnIndices, which surprised me a bit, because I have this > information in my code. > > Is there a parallel counterpart of this function, and if there is none, > what should it be replaced with? I understand that I will have to provide > non-zeros in buffers (o_nnz), which is not a big issue, but how to provide > information on columns for parallel version is not clear to me. In a > nutshell, I would need a hint on which of the above functions could remain > the same in parallel, and which should be replaced and with what? > The reason that we do not have the same function is that the parallel storage format separately stores the diagonal and off-diagonal blocks, so you cannot just copy in the indices. I think if you really need this, we could add something that would take the data you have in parallel and produce a nonzero structure, but it could not share your array. More generally, I would say things that intrusively touch the internal data structure, like MatSeqAijSetColumnIndices(), are not to be preferred, but we do have them because some users want them. If I were doing the kind of conversion that you are, I would instead use MatPreallocator. This allows you to run your assembly loop twice, just inserting values the first time, which allows the matrix to be preallocated, and then inserting again. The insertion time is usually small compared to computing the values. I have used this in a few places in my code FEM code and it works efficiently. Alternatively, you could just call https://petsc.org/main/docs/manualpages/Mat/MatXAIJSetPreallocation.html with your preallocation data and let the column indices be calculated. That is usually a very small overhead. Thanks, Matt > Cheers, > > Bojan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tangqi at msu.edu Tue Feb 15 16:17:16 2022 From: tangqi at msu.edu (Tang, Qi) Date: Tue, 15 Feb 2022 22:17:16 +0000 Subject: [petsc-users] Hessenberg Index-2 DAE and IMEX In-Reply-To: References: <64C394A8-F223-4C13-B67F-604E09AD094E@anl.gov> <1905DC63-4C56-40FD-9DD2-1669A9C2311E@msu.edu> Message-ID: Hong and Matt, Thanks a lot. I have a read of the notes as well as segregated RK. I think I have a better understanding now. That answers a lot of my questions I had. Qi On Feb 15, 2022, at 1:32 PM, Matthew Knepley > wrote: segregated -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Feb 15 16:29:35 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 15 Feb 2022 17:29:35 -0500 Subject: [petsc-users] Migrating to parallel PETSc In-Reply-To: References: Message-ID: <7200EA13-95E6-46A7-BE9D-71FE5615D6BB@petsc.dev> MatCreateMPIAIJWithArrays() and MatUpdateMPIAIJWithArrays() may be suitable for your use case. This should also be much more efficient in moving the matrix from your code to PETSc's format. 
Barry > On Feb 15, 2022, at 4:13 PM, Bojan Niceno wrote: > > Dear PETSc users, > > > I have an in-house computational fluid dynamics (CFD) solver, written in Fortran 2008, parallelized with MPI with its own home-grown suite of linear solvers. The code is unstructured, performs domain decomposition with METIS and all communication buffers, I mean connectivity between processors, has been properly worked out. > > A couple of weeks back, I decided to try out the PETSc suite of solvers. After some initial setbacks, I managed to compile my code with PETSc and have the sequential version working fine :-) > > I have essentially using the following PETSc routines to get the code solving linear systems with PETSc: > > I set up the working space as follows: > > call PetscInitialize(PETSC_NULL_CHARACTER, Pet % petsc_err) > call MatCreateSeqAij(PETSC_COMM_SELF, ... > call MatSeqAijSetColumnIndices(.... > call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector x > call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector b > call KSPCreate(PETSC_COMM_SELF, ... > > Then in order to solve a system, I do: > > call MatSetValue(Pet % petsc_A, ! Inside a loop through matrix entries > i-PETSC_ONE, > k-PETSC_ONE, ... > call MatAssemblyBegin(Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) > call MatAssemblyEnd (Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) > > call VecSetValue(Pet % petsc_x, ... ! Fill up x > call VecSetValue(Pet % petsc_b, ... ! Fill up b > > call KSPSetType(Pet % petsc_ksp ... ! Set solver > call KSPGetPC(Pet % petsc_ksp, ... ! get preconditioner context > call PCSetType(Pet % petsc_pc, ... ! Set preconditioner > > call KSPSetFromOptions(Pet % petsc_ksp, Pet % petsc_err) > call KSPSetUp (Pet % petsc_ksp, Pet % petsc_err) > > ! Finally solve > call KSPSolve(Pet % petsc_ksp, ... > > Once this was up and running, I thought that in order to have the parallel version I will merely have to replace the "Seq" versions of the above functions, with their parallel counterparts. I was expecting to find the red function (MatSeqAijSetColumnIndices) for parallel runs, but it doesn't seem to exist. I have found non-seq versions of some other functions (MatCreateAij, VecCreateSeq), but not something like MatAijSetColumnIndices, which surprised me a bit, because I have this information in my code. > > Is there a parallel counterpart of this function, and if there is none, what should it be replaced with? I understand that I will have to provide non-zeros in buffers (o_nnz), which is not a big issue, but how to provide information on columns for parallel version is not clear to me. In a nutshell, I would need a hint on which of the above functions could remain the same in parallel, and which should be replaced and with what? > > Cheers, > > Bojan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno.scientist at gmail.com Wed Feb 16 00:30:00 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Wed, 16 Feb 2022 07:30:00 +0100 Subject: [petsc-users] Migrating to parallel PETSc In-Reply-To: <7200EA13-95E6-46A7-BE9D-71FE5615D6BB@petsc.dev> References: <7200EA13-95E6-46A7-BE9D-71FE5615D6BB@petsc.dev> Message-ID: Thanks you guys for your suggestions, I will have a look at functions you suggested. Have a great day, Bojan On Tue, Feb 15, 2022 at 11:29 PM Barry Smith wrote: > > MatCreateMPIAIJWithArrays() and MatUpdateMPIAIJWithArrays() may be > suitable for your use case. 
> > This should also be much more efficient in moving the matrix from your > code to PETSc's format. > > Barry > > > On Feb 15, 2022, at 4:13 PM, Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > > Dear PETSc users, > > > I have an in-house computational fluid dynamics (CFD) solver, written in > Fortran 2008, parallelized with MPI with its own home-grown suite of linear > solvers. The code is unstructured, performs domain decomposition with > METIS and all communication buffers, I mean connectivity between > processors, has been properly worked out. > > A couple of weeks back, I decided to try out the PETSc suite of solvers. > After some initial setbacks, I managed to compile my code with PETSc and > have the sequential version working fine :-) > > I have essentially using the following PETSc routines to get the code > solving linear systems with PETSc: > > I set up the working space as follows: > > call PetscInitialize(PETSC_NULL_CHARACTER, Pet % petsc_err) > call MatCreateSeqAij(PETSC_COMM_SELF, ... > call MatSeqAijSetColumnIndices(.... > call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector x > call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector b > call KSPCreate(PETSC_COMM_SELF, ... > > Then in order to solve a system, I do: > > call MatSetValue(Pet % petsc_A, ! Inside a loop through matrix entries > i-PETSC_ONE, > k-PETSC_ONE, ... > call MatAssemblyBegin(Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) > call MatAssemblyEnd (Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % petsc_err) > > call VecSetValue(Pet % petsc_x, ... ! Fill up x > call VecSetValue(Pet % petsc_b, ... ! Fill up b > > call KSPSetType(Pet % petsc_ksp ... ! Set solver > call KSPGetPC(Pet % petsc_ksp, ... ! get preconditioner context > call PCSetType(Pet % petsc_pc, ... ! Set preconditioner > > call KSPSetFromOptions(Pet % petsc_ksp, Pet % petsc_err) > > call KSPSetUp (Pet % petsc_ksp, Pet % petsc_err) > > > ! Finally solve > call KSPSolve(Pet % petsc_ksp, ... > > Once this was up and running, I thought that in order to have the parallel > version I will merely have to replace the "Seq" versions of the above > functions, with their parallel counterparts. I was expecting to find the > red function (MatSeqAijSetColumnIndices) for parallel runs, but it > doesn't seem to exist. I have found non-seq versions of some other > functions (MatCreateAij, VecCreateSeq), but not something like > MatAijSetColumnIndices, which surprised me a bit, because I have this > information in my code. > > Is there a parallel counterpart of this function, and if there is none, > what should it be replaced with? I understand that I will have to provide > non-zeros in buffers (o_nnz), which is not a big issue, but how to provide > information on columns for parallel version is not clear to me. In a > nutshell, I would need a hint on which of the above functions could remain > the same in parallel, and which should be replaced and with what? > > Cheers, > > Bojan > > > -------------- next part -------------- An HTML attachment was scrubbed... 
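For reference, a rough C sketch of feeding an existing local CSR structure to the two routines suggested above. The array names (ia, ja, va) and the use of PETSC_DETERMINE for the global sizes are assumptions about the calling code: ia holds m+1 row offsets and ja holds global column indices. The Fortran bindings mirror the same calls.

#include <petscmat.h>

/* Build a parallel AIJ matrix once from a local CSR structure, then refresh
   its values on later iterations without changing the sparsity pattern. */
static PetscErrorCode BuildFromCSR(MPI_Comm comm, PetscInt m, PetscInt n,
                                   const PetscInt ia[], const PetscInt ja[],
                                   const PetscScalar va[], Mat *A)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreateMPIAIJWithArrays(comm, m, n, PETSC_DETERMINE, PETSC_DETERMINE, ia, ja, va, A);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Later, with the same sparsity pattern but new values:
   ierr = MatUpdateMPIAIJWithArrays(A, m, n, PETSC_DETERMINE, PETSC_DETERMINE, ia, ja, va);CHKERRQ(ierr); */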
URL: From bojan.niceno.scientist at gmail.com Wed Feb 16 07:13:19 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Wed, 16 Feb 2022 14:13:19 +0100 Subject: [petsc-users] Migrating to parallel PETSc In-Reply-To: References: <7200EA13-95E6-46A7-BE9D-71FE5615D6BB@petsc.dev> Message-ID: Dear Matthew and Barry, I am going through tutorial examples, and have noticed that calls to MatMPIAIJSetPreallocation and MatSeqAIJSetPreallocation always come in succession of one another, usually in that order. Why would that be? Is it maybe to be able to run on systems without MPI, so that the first function MatMPIAIJSetPreallocation only makes a dummy call? I was checking the manual pages and found no information about this pairing. Kind regards, Bojan On Wed, Feb 16, 2022 at 7:30 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Thanks you guys for your suggestions, I will have a look at functions you > suggested. > > Have a great day, > > Bojan > > On Tue, Feb 15, 2022 at 11:29 PM Barry Smith wrote: > >> >> MatCreateMPIAIJWithArrays() and MatUpdateMPIAIJWithArrays() may be >> suitable for your use case. >> >> This should also be much more efficient in moving the matrix from your >> code to PETSc's format. >> >> Barry >> >> >> On Feb 15, 2022, at 4:13 PM, Bojan Niceno < >> bojan.niceno.scientist at gmail.com> wrote: >> >> Dear PETSc users, >> >> >> I have an in-house computational fluid dynamics (CFD) solver, written in >> Fortran 2008, parallelized with MPI with its own home-grown suite of linear >> solvers. The code is unstructured, performs domain decomposition with >> METIS and all communication buffers, I mean connectivity between >> processors, has been properly worked out. >> >> A couple of weeks back, I decided to try out the PETSc suite of solvers. >> After some initial setbacks, I managed to compile my code with PETSc and >> have the sequential version working fine :-) >> >> I have essentially using the following PETSc routines to get the code >> solving linear systems with PETSc: >> >> I set up the working space as follows: >> >> call PetscInitialize(PETSC_NULL_CHARACTER, Pet % petsc_err) >> call MatCreateSeqAij(PETSC_COMM_SELF, ... >> call MatSeqAijSetColumnIndices(.... >> call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector x >> call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector b >> call KSPCreate(PETSC_COMM_SELF, ... >> >> Then in order to solve a system, I do: >> >> call MatSetValue(Pet % petsc_A, ! Inside a loop through matrix entries >> i-PETSC_ONE, >> k-PETSC_ONE, ... >> call MatAssemblyBegin(Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % >> petsc_err) >> call MatAssemblyEnd (Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % >> petsc_err) >> >> call VecSetValue(Pet % petsc_x, ... ! Fill up x >> call VecSetValue(Pet % petsc_b, ... ! Fill up b >> >> call KSPSetType(Pet % petsc_ksp ... ! Set solver >> call KSPGetPC(Pet % petsc_ksp, ... ! get preconditioner context >> call PCSetType(Pet % petsc_pc, ... ! Set preconditioner >> >> call KSPSetFromOptions(Pet % petsc_ksp, Pet % petsc_err) >> >> call KSPSetUp (Pet % petsc_ksp, Pet % petsc_err) >> >> >> ! Finally solve >> call KSPSolve(Pet % petsc_ksp, ... >> >> Once this was up and running, I thought that in order to have the >> parallel version I will merely have to replace the "Seq" versions of the >> above functions, with their parallel counterparts. I was expecting to find >> the red function (MatSeqAijSetColumnIndices) for parallel runs, but it >> doesn't seem to exist. 
I have found non-seq versions of some other >> functions (MatCreateAij, VecCreateSeq), but not something like >> MatAijSetColumnIndices, which surprised me a bit, because I have this >> information in my code. >> >> Is there a parallel counterpart of this function, and if there is none, >> what should it be replaced with? I understand that I will have to provide >> non-zeros in buffers (o_nnz), which is not a big issue, but how to provide >> information on columns for parallel version is not clear to me. In a >> nutshell, I would need a hint on which of the above functions could remain >> the same in parallel, and which should be replaced and with what? >> >> Cheers, >> >> Bojan >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 16 07:31:20 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 16 Feb 2022 08:31:20 -0500 Subject: [petsc-users] Migrating to parallel PETSc In-Reply-To: References: <7200EA13-95E6-46A7-BE9D-71FE5615D6BB@petsc.dev> Message-ID: On Wed, Feb 16, 2022 at 8:18 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear Matthew and Barry, > > I am going through tutorial examples, and have noticed that calls to > MatMPIAIJSetPreallocation and MatSeqAIJSetPreallocation always come in > succession of one another, usually in that order. Why would that be? Is > it maybe to be able to run on systems without MPI, so that the first > function MatMPIAIJSetPreallocation only makes a dummy call? > > I was checking the manual pages and found no information about this > pairing. > This is because we are too lazy or time-constrained to convert all examples to the new style using https://petsc.org/main/docs/manualpages/Mat/MatXAIJSetPreallocation.html More broadly, PETSc is constructed like Objective-C in that method bindings can change at runtime and calling a method on an object of a different type just ignores it. So if I have a SeqAIJ matrix, only the MatSeqAIJSetPreallocation call works and the other is ignored. This way you can put in customization for many types without lots of checks to see what type the caller has. Thanks, Matt > Kind regards, > > Bojan > > > > On Wed, Feb 16, 2022 at 7:30 AM Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > >> Thanks you guys for your suggestions, I will have a look at functions you >> suggested. >> >> Have a great day, >> >> Bojan >> >> On Tue, Feb 15, 2022 at 11:29 PM Barry Smith wrote: >> >>> >>> MatCreateMPIAIJWithArrays() and MatUpdateMPIAIJWithArrays() may be >>> suitable for your use case. >>> >>> This should also be much more efficient in moving the matrix from your >>> code to PETSc's format. >>> >>> Barry >>> >>> >>> On Feb 15, 2022, at 4:13 PM, Bojan Niceno < >>> bojan.niceno.scientist at gmail.com> wrote: >>> >>> Dear PETSc users, >>> >>> >>> I have an in-house computational fluid dynamics (CFD) solver, written in >>> Fortran 2008, parallelized with MPI with its own home-grown suite of linear >>> solvers. The code is unstructured, performs domain decomposition with >>> METIS and all communication buffers, I mean connectivity between >>> processors, has been properly worked out. >>> >>> A couple of weeks back, I decided to try out the PETSc suite of >>> solvers. 
After some initial setbacks, I managed to compile my code with >>> PETSc and have the sequential version working fine :-) >>> >>> I have essentially using the following PETSc routines to get the code >>> solving linear systems with PETSc: >>> >>> I set up the working space as follows: >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER, Pet % petsc_err) >>> call MatCreateSeqAij(PETSC_COMM_SELF, ... >>> call MatSeqAijSetColumnIndices(.... >>> call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector x >>> call VecCreateSeq(PETSC_COMM_SELF, ... ! Create Vector b >>> call KSPCreate(PETSC_COMM_SELF, ... >>> >>> Then in order to solve a system, I do: >>> >>> call MatSetValue(Pet % petsc_A, ! Inside a loop through matrix >>> entries >>> i-PETSC_ONE, >>> k-PETSC_ONE, ... >>> call MatAssemblyBegin(Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % >>> petsc_err) >>> call MatAssemblyEnd (Pet % petsc_A, MAT_FINAL_ASSEMBLY, Pet % >>> petsc_err) >>> >>> call VecSetValue(Pet % petsc_x, ... ! Fill up x >>> call VecSetValue(Pet % petsc_b, ... ! Fill up b >>> >>> call KSPSetType(Pet % petsc_ksp ... ! Set solver >>> call KSPGetPC(Pet % petsc_ksp, ... ! get preconditioner context >>> call PCSetType(Pet % petsc_pc, ... ! Set preconditioner >>> >>> call KSPSetFromOptions(Pet % petsc_ksp, Pet % petsc_err) >>> >>> call KSPSetUp (Pet % petsc_ksp, Pet % petsc_err) >>> >>> >>> ! Finally solve >>> call KSPSolve(Pet % petsc_ksp, ... >>> >>> Once this was up and running, I thought that in order to have the >>> parallel version I will merely have to replace the "Seq" versions of the >>> above functions, with their parallel counterparts. I was expecting to find >>> the red function (MatSeqAijSetColumnIndices) for parallel runs, but it >>> doesn't seem to exist. I have found non-seq versions of some other >>> functions (MatCreateAij, VecCreateSeq), but not something like >>> MatAijSetColumnIndices, which surprised me a bit, because I have this >>> information in my code. >>> >>> Is there a parallel counterpart of this function, and if there is none, >>> what should it be replaced with? I understand that I will have to provide >>> non-zeros in buffers (o_nnz), which is not a big issue, but how to provide >>> information on columns for parallel version is not clear to me. In a >>> nutshell, I would need a hint on which of the above functions could remain >>> the same in parallel, and which should be replaced and with what? >>> >>> Cheers, >>> >>> Bojan >>> >>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Wed Feb 16 07:54:53 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 16 Feb 2022 14:54:53 +0100 Subject: [petsc-users] Nodes Communication - Petsc Vec Message-ID: Hello, I want to pass a Petsc Vec from one node to another one. Is there any Petsc method for doing this or should I use MPI_Irecv and MPI_Isend for non-blocking communication ? When using MPI_Isend, can i send a Petsc Vec, or should I get the array from it, and then send it ? 
Thanks From knepley at gmail.com Wed Feb 16 08:03:28 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 16 Feb 2022 09:03:28 -0500 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Hello, > > I want to pass a Petsc Vec from one node to another one. Is there any > Petsc method for doing this > You likely want the VecScatter functionality. There is a discussion of this in the manual: https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct Thanks, Matt > or should I use MPI_Irecv and MPI_Isend for non-blocking communication ? > > When using MPI_Isend, can i send a Petsc Vec, or should I get the array > from it, and then send it ? > > Thanks > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Wed Feb 16 08:12:11 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 16 Feb 2022 15:12:11 +0100 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: Re: I forgot to mention that the nodes are in differents communicators. Does VecScatter functionnality still applies ? Thanks. On 16/02/2022 15:03, Matthew Knepley wrote: > On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM > wrote: > > Hello, > > I want to pass a Petsc Vec from one node to another one. Is there any > Petsc method for doing this > > > You likely want the VecScatter functionality. There is a discussion of > this in the manual: > https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct > > ? Thanks, > > ? ? Matt > > or should I use MPI_Irecv and MPI_Isend for non-blocking > communication ? > > When using MPI_Isend, can i send a Petsc Vec, or should I get the > array > from it, and then send it ? > > Thanks > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Feb 16 08:43:11 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 16 Feb 2022 08:43:11 -0600 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: See Notes at https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html Does it apply to your case? --Junchao Zhang On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Re: > > I forgot to mention that the nodes are in differents communicators. Does > VecScatter functionnality still applies ? > > Thanks. > On 16/02/2022 15:03, Matthew Knepley wrote: > > On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM < > medane.tchakorom at univ-fcomte.fr> wrote: > >> Hello, >> >> I want to pass a Petsc Vec from one node to another one. Is there any >> Petsc method for doing this >> > > You likely want the VecScatter functionality. 
There is a discussion of > this in the manual: > https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct > > Thanks, > > Matt > > >> or should I use MPI_Irecv and MPI_Isend for non-blocking communication ? >> >> When using MPI_Isend, can i send a Petsc Vec, or should I get the array >> from it, and then send it ? >> >> Thanks >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Wed Feb 16 08:47:32 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 16 Feb 2022 15:47:32 +0100 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: Re: It does not apply, because "their communicator must be on the same set of processes" . In my case, I have two disjoint subcomm from PETSC_COMM_WORD with same number of processes. I want to send information (Petsc Vec) from one subcomm to another subcomm. Thanks On 16/02/2022 15:43, Junchao Zhang wrote: > See Notes at > https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html > > Does it apply to your case? > > --Junchao Zhang > > > On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM > wrote: > > Re: > > I forgot to mention that the nodes are in differents > communicators. Does VecScatter functionnality still applies ? > > Thanks. > > On 16/02/2022 15:03, Matthew Knepley wrote: >> On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM >> wrote: >> >> Hello, >> >> I want to pass a Petsc Vec from one node to another one. Is >> there any >> Petsc method for doing this >> >> >> You likely want the VecScatter functionality. There is a >> discussion of this in the manual: >> https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct >> >> ? Thanks, >> >> ? ? Matt >> >> or should I use MPI_Irecv and MPI_Isend for non-blocking >> communication ? >> >> When using MPI_Isend, can i send a Petsc Vec, or should I get >> the array >> from it, and then send it ? >> >> Thanks >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 16 08:48:51 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 16 Feb 2022 09:48:51 -0500 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: On Wed, Feb 16, 2022 at 9:47 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Re: > > > It does not apply, because "their communicator must be on the same set of > processes" . > > In my case, I have two disjoint subcomm from PETSC_COMM_WORD with same > number of processes. > > I want to send information (Petsc Vec) from one subcomm to another subcomm. > The way I do this is to go to the supercommunicator, incorporating these two groups and use VecScatter. The alternative is just to use custom MPI calls as you say. Thanks, Matt > Thanks > On 16/02/2022 15:43, Junchao Zhang wrote: > > See Notes at > https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html > > Does it apply to your case? 
> > --Junchao Zhang > > > On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM < > medane.tchakorom at univ-fcomte.fr> wrote: > >> Re: >> >> I forgot to mention that the nodes are in differents communicators. Does >> VecScatter functionnality still applies ? >> >> Thanks. >> On 16/02/2022 15:03, Matthew Knepley wrote: >> >> On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM < >> medane.tchakorom at univ-fcomte.fr> wrote: >> >>> Hello, >>> >>> I want to pass a Petsc Vec from one node to another one. Is there any >>> Petsc method for doing this >>> >> >> You likely want the VecScatter functionality. There is a discussion of >> this in the manual: >> https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct >> >> Thanks, >> >> Matt >> >> >>> or should I use MPI_Irecv and MPI_Isend for non-blocking communication ? >>> >>> When using MPI_Isend, can i send a Petsc Vec, or should I get the array >>> from it, and then send it ? >>> >>> Thanks >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Feb 16 09:00:23 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 16 Feb 2022 09:00:23 -0600 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: We could still do it. See the example at petsc/src/vec/is/sf/tests/ex9.c --Junchao Zhang On Wed, Feb 16, 2022 at 8:47 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Re: > > > It does not apply, because "their communicator must be on the same set of > processes" . > > In my case, I have two disjoint subcomm from PETSC_COMM_WORD with same > number of processes. > > I want to send information (Petsc Vec) from one subcomm to another subcomm. > > Thanks > On 16/02/2022 15:43, Junchao Zhang wrote: > > See Notes at > https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html > > Does it apply to your case? > > --Junchao Zhang > > > On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM < > medane.tchakorom at univ-fcomte.fr> wrote: > >> Re: >> >> I forgot to mention that the nodes are in differents communicators. Does >> VecScatter functionnality still applies ? >> >> Thanks. >> On 16/02/2022 15:03, Matthew Knepley wrote: >> >> On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM < >> medane.tchakorom at univ-fcomte.fr> wrote: >> >>> Hello, >>> >>> I want to pass a Petsc Vec from one node to another one. Is there any >>> Petsc method for doing this >>> >> >> You likely want the VecScatter functionality. There is a discussion of >> this in the manual: >> https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct >> >> Thanks, >> >> Matt >> >> >>> or should I use MPI_Irecv and MPI_Isend for non-blocking communication ? >>> >>> When using MPI_Isend, can i send a Petsc Vec, or should I get the array >>> from it, and then send it ? >>> >>> Thanks >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Wed Feb 16 09:31:27 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 16 Feb 2022 16:31:27 +0100 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: Re: Please, could you if possible, provide a basic example code. I do not understand how to "go to the supercommunicator, incorporating these two groups and use VecScatter" Another question is: If I was to use "MPI calls", can I use Petsc Vec as a buffer ? If so, which type should I indicate (MPI_Datatype) in the let's say MPI_Isend method ? Thanks On 16/02/2022 15:48, Matthew Knepley wrote: > On Wed, Feb 16, 2022 at 9:47 AM Medane TCHAKOROM > wrote: > > Re: > > > It does not apply, because "their communicator must be on the same > set of processes" . > > In my case, I have two disjoint subcomm from PETSC_COMM_WORD with > same number of processes. > > I want to send information (Petsc Vec) from one subcomm to another > subcomm. > > The way I do this is to go to the supercommunicator, incorporating > these two groups and use VecScatter. > The alternative is just to use custom MPI calls as you say. > > ? Thanks, > > ? ? ?Matt > > Thanks > > On 16/02/2022 15:43, Junchao Zhang wrote: >> See Notes at >> https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html >> >> >> Does it apply to your case? >> >> --Junchao Zhang >> >> >> On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM >> wrote: >> >> Re: >> >> I forgot to mention that the nodes are in differents >> communicators. Does VecScatter functionnality still applies ? >> >> Thanks. >> >> On 16/02/2022 15:03, Matthew Knepley wrote: >>> On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM >>> wrote: >>> >>> Hello, >>> >>> I want to pass a Petsc Vec from one node to another one. >>> Is there any >>> Petsc method for doing this >>> >>> >>> You likely want the VecScatter functionality. There is a >>> discussion of this in the manual: >>> https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct >>> >>> ? Thanks, >>> >>> ? ? Matt >>> >>> or should I use MPI_Irecv and MPI_Isend for non-blocking >>> communication ? >>> >>> When using MPI_Isend, can i send a Petsc Vec, or should >>> I get the array >>> from it, and then send it ? >>> >>> Thanks >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Wed Feb 16 09:40:58 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 16 Feb 2022 16:40:58 +0100 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: Message-ID: <74abd611-7905-8b61-8aef-5112be49c8c6@univ-fcomte.fr> Re: Sorry, but I can't find this file in my installation. I'am using Petsc 3.14.2 Thanks On 16/02/2022 16:00, Junchao Zhang wrote: > We could still do it. 
See the example at petsc/src/vec/is/sf/tests/ex9.c > > --Junchao Zhang > > > On Wed, Feb 16, 2022 at 8:47 AM Medane TCHAKOROM > wrote: > > Re: > > > It does not apply, because "their communicator must be on the same > set of processes" . > > In my case, I have two disjoint subcomm from PETSC_COMM_WORD with > same number of processes. > > I want to send information (Petsc Vec) from one subcomm to another > subcomm. > > Thanks > > On 16/02/2022 15:43, Junchao Zhang wrote: >> See Notes at >> https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html >> >> >> Does it apply to your case? >> >> --Junchao Zhang >> >> >> On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM >> wrote: >> >> Re: >> >> I forgot to mention that the nodes are in differents >> communicators. Does VecScatter functionnality still applies ? >> >> Thanks. >> >> On 16/02/2022 15:03, Matthew Knepley wrote: >>> On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM >>> wrote: >>> >>> Hello, >>> >>> I want to pass a Petsc Vec from one node to another one. >>> Is there any >>> Petsc method for doing this >>> >>> >>> You likely want the VecScatter functionality. There is a >>> discussion of this in the manual: >>> https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct >>> >>> ? Thanks, >>> >>> ? ? Matt >>> >>> or should I use MPI_Irecv and MPI_Isend for non-blocking >>> communication ? >>> >>> When using MPI_Isend, can i send a Petsc Vec, or should >>> I get the array >>> from it, and then send it ? >>> >>> Thanks >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Feb 16 09:47:16 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 16 Feb 2022 09:47:16 -0600 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: <74abd611-7905-8b61-8aef-5112be49c8c6@univ-fcomte.fr> References: <74abd611-7905-8b61-8aef-5112be49c8c6@univ-fcomte.fr> Message-ID: https://gitlab.com/petsc/petsc/-/blob/main/src/vec/is/sf/tests/ex9.c --Junchao Zhang On Wed, Feb 16, 2022 at 9:40 AM Medane TCHAKOROM < medane.tchakorom at univ-fcomte.fr> wrote: > Re: > > > Sorry, but I can't find this file in my installation. I'am using Petsc > 3.14.2 > > > Thanks > On 16/02/2022 16:00, Junchao Zhang wrote: > > We could still do it. See the example at petsc/src/vec/is/sf/tests/ex9.c > > --Junchao Zhang > > > On Wed, Feb 16, 2022 at 8:47 AM Medane TCHAKOROM < > medane.tchakorom at univ-fcomte.fr> wrote: > >> Re: >> >> >> It does not apply, because "their communicator must be on the same set >> of processes" . >> >> In my case, I have two disjoint subcomm from PETSC_COMM_WORD with same >> number of processes. >> >> I want to send information (Petsc Vec) from one subcomm to another >> subcomm. >> >> Thanks >> On 16/02/2022 15:43, Junchao Zhang wrote: >> >> See Notes at >> https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html >> >> Does it apply to your case? >> >> --Junchao Zhang >> >> >> On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM < >> medane.tchakorom at univ-fcomte.fr> wrote: >> >>> Re: >>> >>> I forgot to mention that the nodes are in differents communicators. Does >>> VecScatter functionnality still applies ? >>> >>> Thanks. 
>>> On 16/02/2022 15:03, Matthew Knepley wrote: >>> >>> On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM < >>> medane.tchakorom at univ-fcomte.fr> wrote: >>> >>>> Hello, >>>> >>>> I want to pass a Petsc Vec from one node to another one. Is there any >>>> Petsc method for doing this >>>> >>> >>> You likely want the VecScatter functionality. There is a discussion of >>> this in the manual: >>> https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> or should I use MPI_Irecv and MPI_Isend for non-blocking communication ? >>>> >>>> When using MPI_Isend, can i send a Petsc Vec, or should I get the array >>>> from it, and then send it ? >>>> >>>> Thanks >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Wed Feb 16 10:25:15 2022 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Wed, 16 Feb 2022 17:25:15 +0100 Subject: [petsc-users] Nodes Communication - Petsc Vec In-Reply-To: References: <74abd611-7905-8b61-8aef-5112be49c8c6@univ-fcomte.fr> Message-ID: <28cd6657-498b-cde0-ac04-93c47f076638@univ-fcomte.fr> Re: I'am currently going through the code. Thans On 16/02/2022 16:47, Junchao Zhang wrote: > https://gitlab.com/petsc/petsc/-/blob/main/src/vec/is/sf/tests/ex9.c > > --Junchao Zhang > > > On Wed, Feb 16, 2022 at 9:40 AM Medane TCHAKOROM > wrote: > > Re: > > > Sorry, but I can't find this file in my installation. I'am using > Petsc 3.14.2 > > > Thanks > > On 16/02/2022 16:00, Junchao Zhang wrote: >> We could still do it. See the example at >> petsc/src/vec/is/sf/tests/ex9.c >> >> --Junchao Zhang >> >> >> On Wed, Feb 16, 2022 at 8:47 AM Medane TCHAKOROM >> wrote: >> >> Re: >> >> >> It does not apply, because "their communicator must be on the >> same set of processes" . >> >> In my case, I have two disjoint subcomm from PETSC_COMM_WORD >> with same number of processes. >> >> I want to send information (Petsc Vec) from one subcomm to >> another subcomm. >> >> Thanks >> >> On 16/02/2022 15:43, Junchao Zhang wrote: >>> See Notes at >>> https://petsc.org/main/docs/manualpages/PetscSF/VecScatterCreate.html >>> >>> >>> Does it apply to your case? >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Feb 16, 2022 at 8:12 AM Medane TCHAKOROM >>> wrote: >>> >>> Re: >>> >>> I forgot to mention that the nodes are in differents >>> communicators. Does VecScatter functionnality still >>> applies ? >>> >>> Thanks. >>> >>> On 16/02/2022 15:03, Matthew Knepley wrote: >>>> On Wed, Feb 16, 2022 at 8:55 AM Medane TCHAKOROM >>>> wrote: >>>> >>>> Hello, >>>> >>>> I want to pass a Petsc Vec from one node to another >>>> one. Is there any >>>> Petsc method for doing this >>>> >>>> >>>> You likely want the VecScatter functionality. There is >>>> a discussion of this in the manual: >>>> https://petsc.org/main/docs/manual/vec/?highlight=vecscatter#sec-unstruct >>>> >>>> ? Thanks, >>>> >>>> ? ? Matt >>>> >>>> or should I use MPI_Irecv and MPI_Isend for >>>> non-blocking communication ? >>>> >>>> When using MPI_Isend, can i send a Petsc Vec, or >>>> should I get the array >>>> from it, and then send it ? 
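A Vec object itself cannot be handed to MPI directly, but its local array can. Below is a minimal sketch of that route, assuming the two sub-communicators were split from a common parent communicator (e.g. PETSC_COMM_WORLD), that the peer rank and tag are chosen by the application, and that the receiving side owns a Vec of matching local size; the function and variable names here are made up for illustration only.

#include <petscvec.h>

/* Send the local part of a Vec to rank "peer" of the parent communicator.
   MPIU_SCALAR is the MPI datatype matching PetscScalar (real or complex builds). */
PetscErrorCode SendLocalVec(Vec x, PetscMPIInt peer, PetscMPIInt tag, MPI_Comm parent)
{
  const PetscScalar *a;
  PetscInt           n;
  MPI_Request        req;
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  ierr = VecGetLocalSize(x, &n);CHKERRQ(ierr);
  ierr = VecGetArrayRead(x, &a);CHKERRQ(ierr);   /* raw buffer of the local part */
  ierr = MPI_Isend(a, (PetscMPIInt)n, MPIU_SCALAR, peer, tag, parent, &req);CHKERRQ(ierr);
  /* ... other work can overlap here; the buffer must stay checked out ... */
  ierr = MPI_Wait(&req, MPI_STATUS_IGNORE);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(x, &a);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

On the receiving side one would call VecGetArray() on a Vec of the same local size, MPI_Recv() or MPI_Irecv() into that array with MPIU_SCALAR, and then VecRestoreArray().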
>>>> >>>> Thanks >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they >>>> begin their experiments is infinitely more interesting >>>> than any results to which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From heepark at sandia.gov Wed Feb 16 17:28:19 2022 From: heepark at sandia.gov (Park, Heeho) Date: Wed, 16 Feb 2022 23:28:19 +0000 Subject: [petsc-users] Changes in SNES convergence criteria Message-ID: <7D9AFB1C-88AD-4727-B1EB-E38DD7AEAD9C@sandia.gov> Hi Petsc developers, I?m running into two issues related to SNES convergence criteria. First is setting convergence criteria to a large value. This used to work to effectively turn off RTOL, STOL, ATOL, and use PFLOTRAN?s Newton iteration convergence criteria using INF norms, but this trick does not work anymore ( example 1 ). So, we set the condition to PETSc convergence criteria (default) AND PFLOTRAN convergence criteria to declare convergence, and get this error ( example 2). Is there a good way to bypass SNES convergence criteria and use our own? What?s happening in the second example? Thank you, Example 1: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Relative tolerance 1e+20 must be non-negative and less than 1.0 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.16.4-756-g5b3d650 GIT Date: 2022-02-02 17:48:52 +0000 [0]PETSC ERROR: /data/home/heepark/software/pflotran-bcnew-trd/src/pflotran/pflotran on a intel-c-opt named cb007 by tlaforc Tue Feb 15 19:45:39 2022 [0]PETSC ERROR: Configure options --with-petsc-arch=intel-c-opt --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-c language=c --with-shared-libraries=0 --with-debugging=0 --with-mpi=yes --download-sowing=yes --with-valgrind=yes --with-cmak e=yes --with-make=yes --with-fblaslaback=yes --with-blaslapack=yes --FFLAGS="-diag-disable 5462" --CFLAGS="-O3 -g" --CXXFLAG S="-O3 -g" --FFLAGS="-O3 -g" --with-hdf5-include=/data/home/heepark/software/hdf5/build/include --with-hdf5-lib="-L/data/hom e/heepark/software/hdf5/build/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5" --with-hypre-include=/data/home/heepark/ software/hypre/build/include --with-hypre-lib="-L/data/home/heepark/software/hypre/build/lib -lHYPRE" --with-metis-include=/ data/home/heepark/software/metis/include --with-metis-lib=/data/home/heepark/software/metis/build/libmetis/libmetis.a --with -parmetis-include=/data/home/heepark/software/parmetis/include --with-parmetis-lib=/data/home/heepark/software/parmetis/buil d/libparmetis/libparmetis.a --with-scalapack-lib=/data/home/heepark/software/scalapack/build/lib/libscalapack.a --with-hdf5- fortran-bindings=yes [0]PETSC ERROR: #1 SNESSetTolerances() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:3833 [0]PETSC ERROR: #2 User provided function() at User file:0 Example 2: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: toranks[39] 40 not in comm size 40 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.4-756-g5b3d650 GIT Date: 2022-02-02 17:48:52 +0000 [0]PETSC ERROR: /data/home/heepark/software/pflotran-bcnew-trd/src/pflotran/pflotran on a intel-c-opt named cb007 by tlaforc Wed Feb 16 14:13:12 2022 [0]PETSC ERROR: Configure options --with-petsc-arch=intel-c-opt --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-clanguage=c --with-shared-libraries=0 --with-debugging=0 --with-mpi=yes --download-sowing=yes --with-valgrind=yes --with-cmake=yes --with-make=yes --with-fblaslaback=yes --with-blaslapack=yes --FFLAGS="-diag-disable 5462" --CFLAGS="-O3 -g" --CXXFLAGS="-O3 -g" --FFLAGS="-O3 -g" --with-hdf5-include=/data/home/heepark/software/hdf5/build/include --with-hdf5-lib="-L/data/home/heepark/software/hdf5/build/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5" --with-hypre-include=/data/home/heepark/software/hypre/build/include --with-hypre-lib="-L/data/home/heepark/software/hypre/build/lib -lHYPRE" --with-metis-include=/data/home/heepark/software/metis/include --with-metis-lib=/data/home/heepark/software/metis/build/libmetis/libmetis.a --with-parmetis-include=/data/home/heepark/software/parmetis/include --with-parmetis-lib=/data/home/heepark/software/parmetis/build/libparmetis/libparmetis.a --with-scalapack-lib=/data/home/heepark/software/scalapack/build/lib/libscalapack.a --with-hdf5-fortran-bindings=yes [0]PETSC ERROR: #1 PetscCommBuildTwoSidedFReq() at /data/home/heepark/software/petsc-main/src/sys/utils/mpits.c:555 [0]PETSC ERROR: #2 MatStashScatterBegin_BTS() at /data/home/heepark/software/petsc-main/src/mat/utils/matstash.c:940 [0]PETSC ERROR: #3 MatStashScatterBegin_Private() at /data/home/heepark/software/petsc-main/src/mat/utils/matstash.c:461 [0]PETSC ERROR: #4 MatAssemblyBegin_MPIAIJ() at /data/home/heepark/software/petsc-main/src/mat/impls/aij/mpi/mpiaij.c:673 [0]PETSC ERROR: #5 MatAssemblyBegin() at /data/home/heepark/software/petsc-main/src/mat/interface/matrix.c:5592 [0]PETSC ERROR: #6 User provided function() at User file:0 [0]PETSC ERROR: #7 oursnesjacobian() at /data/home/heepark/software/petsc-main/src/snes/interface/ftn-custom/zsnesf.c:173 [0]PETSC ERROR: #8 SNESComputeJacobian() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #9 SNESSolve_NEWTONLS() at /data/home/heepark/software/petsc-main/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #10 SNESSolve() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:4810 [0]PETSC ERROR: #11 User provided function() at User file:0 Heeho Daniel Park ! ------------------------------------ ! Sandia National Laboratories Org: 08844, R&D Work: 505-844-1319 ! ------------------------------------ ! -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Feb 16 18:06:14 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 16 Feb 2022 19:06:14 -0500 Subject: [petsc-users] Changes in SNES convergence criteria In-Reply-To: <7D9AFB1C-88AD-4727-B1EB-E38DD7AEAD9C@sandia.gov> References: <7D9AFB1C-88AD-4727-B1EB-E38DD7AEAD9C@sandia.gov> Message-ID: <72552460-C800-4DD1-941D-821F1676445F@petsc.dev> > On Feb 16, 2022, at 6:28 PM, Park, Heeho via petsc-users wrote: > > Hi Petsc developers, > > I?m running into two issues related to SNES convergence criteria. First is setting convergence criteria to a large value. 
This used to work to effectively turn off RTOL, STOL, ATOL, and use PFLOTRAN?s Newton iteration convergence criteria using INF norms, but this trick does not work anymore ( example 1 ). > So, we set the condition to PETSc convergence criteria (default) AND PFLOTRAN convergence criteria to declare convergence, and get this error ( example 2). > Is there a good way to bypass SNES convergence criteria and use our own? > What?s happening in the second example? > > Thank you, > > Example 1: Do you use SNESSetConvergenceTest? If so, then unless your test uses the PETSc SNES object tolerance values (unlikely), you don't need to change the convergence tolerance values. So you can just not set it to that huge value or any particular value. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Relative tolerance 1e+20 must be non-negative and less than 1.0 > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.16.4-756-g5b3d650 GIT Date: 2022-02-02 17:48:52 +0000 > [0]PETSC ERROR: /data/home/heepark/software/pflotran-bcnew-trd/src/pflotran/pflotran on a intel-c-opt named cb007 by tlaforc > Tue Feb 15 19:45:39 2022 > [0]PETSC ERROR: Configure options --with-petsc-arch=intel-c-opt --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-c > language=c --with-shared-libraries=0 --with-debugging=0 --with-mpi=yes --download-sowing=yes --with-valgrind=yes --with-cmak > e=yes --with-make=yes --with-fblaslaback=yes --with-blaslapack=yes --FFLAGS="-diag-disable 5462" --CFLAGS="-O3 -g" --CXXFLAG > S="-O3 -g" --FFLAGS="-O3 -g" --with-hdf5-include=/data/home/heepark/software/hdf5/build/include --with-hdf5-lib="-L/data/hom > e/heepark/software/hdf5/build/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5" --with-hypre-include=/data/home/heepark/ > software/hypre/build/include --with-hypre-lib="-L/data/home/heepark/software/hypre/build/lib -lHYPRE" --with-metis-include=/ > data/home/heepark/software/metis/include --with-metis-lib=/data/home/heepark/software/metis/build/libmetis/libmetis.a --with > -parmetis-include=/data/home/heepark/software/parmetis/include --with-parmetis-lib=/data/home/heepark/software/parmetis/buil > d/libparmetis/libparmetis.a --with-scalapack-lib=/data/home/heepark/software/scalapack/build/lib/libscalapack.a --with-hdf5- > fortran-bindings=yes > [0]PETSC ERROR: #1 SNESSetTolerances() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:3833 > [0]PETSC ERROR: #2 User provided function() at User file:0 > > Example 2: This should never be happening. Perhaps it is caused by memory corruption. Can you run with valgrind to verify no memory issues. If valgrind does not work on your system you can try with the PETSc option -malloc_debug > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: toranks[39] 40 not in comm size 40 > [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.4-756-g5b3d650 GIT Date: 2022-02-02 17:48:52 +0000 > [0]PETSC ERROR: /data/home/heepark/software/pflotran-bcnew-trd/src/pflotran/pflotran on a intel-c-opt named cb007 by tlaforc Wed Feb 16 14:13:12 2022 > [0]PETSC ERROR: Configure options --with-petsc-arch=intel-c-opt --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-clanguage=c --with-shared-libraries=0 --with-debugging=0 --with-mpi=yes --download-sowing=yes --with-valgrind=yes --with-cmake=yes --with-make=yes --with-fblaslaback=yes --with-blaslapack=yes --FFLAGS="-diag-disable 5462" --CFLAGS="-O3 -g" --CXXFLAGS="-O3 -g" --FFLAGS="-O3 -g" --with-hdf5-include=/data/home/heepark/software/hdf5/build/include --with-hdf5-lib="-L/data/home/heepark/software/hdf5/build/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5" --with-hypre-include=/data/home/heepark/software/hypre/build/include --with-hypre-lib="-L/data/home/heepark/software/hypre/build/lib -lHYPRE" --with-metis-include=/data/home/heepark/software/metis/include --with-metis-lib=/data/home/heepark/software/metis/build/libmetis/libmetis.a --with-parmetis-include=/data/home/heepark/software/parmetis/include --with-parmetis-lib=/data/home/heepark/software/parmetis/build/libparmetis/libparmetis.a --with-scalapack-lib=/data/home/heepark/software/scalapack/build/lib/libscalapack.a --with-hdf5-fortran-bindings=yes > [0]PETSC ERROR: #1 PetscCommBuildTwoSidedFReq() at /data/home/heepark/software/petsc-main/src/sys/utils/mpits.c:555 > [0]PETSC ERROR: #2 MatStashScatterBegin_BTS() at /data/home/heepark/software/petsc-main/src/mat/utils/matstash.c:940 > [0]PETSC ERROR: #3 MatStashScatterBegin_Private() at /data/home/heepark/software/petsc-main/src/mat/utils/matstash.c:461 > [0]PETSC ERROR: #4 MatAssemblyBegin_MPIAIJ() at /data/home/heepark/software/petsc-main/src/mat/impls/aij/mpi/mpiaij.c:673 > [0]PETSC ERROR: #5 MatAssemblyBegin() at /data/home/heepark/software/petsc-main/src/mat/interface/matrix.c:5592 > [0]PETSC ERROR: #6 User provided function() at User file:0 > [0]PETSC ERROR: #7 oursnesjacobian() at /data/home/heepark/software/petsc-main/src/snes/interface/ftn-custom/zsnesf.c:173 > [0]PETSC ERROR: #8 SNESComputeJacobian() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:2864 > [0]PETSC ERROR: #9 SNESSolve_NEWTONLS() at /data/home/heepark/software/petsc-main/src/snes/impls/ls/ls.c:222 > [0]PETSC ERROR: #10 SNESSolve() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:4810 > [0]PETSC ERROR: #11 User provided function() at User file:0 > > > Heeho Daniel Park > > ! ------------------------------------ ! > Sandia National Laboratories > Org: 08844, R&D > Work: 505-844-1319 > ! ------------------------------------ ! -------------- next part -------------- An HTML attachment was scrubbed... URL: From heepark at sandia.gov Wed Feb 16 18:10:37 2022 From: heepark at sandia.gov (Park, Heeho) Date: Thu, 17 Feb 2022 00:10:37 +0000 Subject: [petsc-users] [EXTERNAL] Re: Changes in SNES convergence criteria In-Reply-To: <72552460-C800-4DD1-941D-821F1676445F@petsc.dev> References: <7D9AFB1C-88AD-4727-B1EB-E38DD7AEAD9C@sandia.gov> <72552460-C800-4DD1-941D-821F1676445F@petsc.dev> Message-ID: HI, Jenn: Perhaps, we need to have an option to use or not use SNESSetConvergenceTest. I think PFLOTRAN is set to use that function as default no matter what. Barry: Thank you for your answers. Let me look into the second case deeper. 
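(On the first issue, for reference: a test registered with SNESSetConvergenceTest() completely replaces SNESConvergedDefault(), so the rtol/atol/stol stored in the SNES are ignored unless the custom test chooses to read them, and there should be no need to push huge values into SNESSetTolerances(). A minimal sketch of a fully self-contained test follows; the context struct and the INF-norm tolerance are placeholders, not PFLOTRAN's actual criteria.)

#include <petscsnes.h>

typedef struct { PetscReal inf_tol; } MyConvCtx;   /* hypothetical application context */

static PetscErrorCode MyConvergenceTest(SNES snes, PetscInt it, PetscReal xnorm,
                                        PetscReal snorm, PetscReal fnorm,
                                        SNESConvergedReason *reason, void *ctx)
{
  MyConvCtx      *user = (MyConvCtx*)ctx;
  Vec             r;
  PetscReal       rinf;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  *reason = SNES_CONVERGED_ITERATING;              /* keep iterating by default */
  ierr = SNESGetFunction(snes, &r, NULL, NULL);CHKERRQ(ierr);
  ierr = VecNorm(r, NORM_INFINITY, &rinf);CHKERRQ(ierr);
  if (rinf < user->inf_tol) *reason = SNES_CONVERGED_FNORM_ABS;
  PetscFunctionReturn(0);
}

/* registration, done once after the SNES is created:
   ierr = SNESSetConvergenceTest(snes, MyConvergenceTest, &myctx, NULL);CHKERRQ(ierr); */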
I will have to test this in different machines and debug to see what is causing the issue. - Heeho Daniel Park From: Barry Smith Date: Wednesday, February 16, 2022 at 4:06 PM To: "Park, Heeho" Cc: "petsc-users at mcs.anl.gov" , "LaForce, Tara" Subject: [EXTERNAL] Re: [petsc-users] Changes in SNES convergence criteria On Feb 16, 2022, at 6:28 PM, Park, Heeho via petsc-users > wrote: Hi Petsc developers, I?m running into two issues related to SNES convergence criteria. First is setting convergence criteria to a large value. This used to work to effectively turn off RTOL, STOL, ATOL, and use PFLOTRAN?s Newton iteration convergence criteria using INF norms, but this trick does not work anymore ( example 1 ). So, we set the condition to PETSc convergence criteria (default) AND PFLOTRAN convergence criteria to declare convergence, and get this error ( example 2). Is there a good way to bypass SNES convergence criteria and use our own? What?s happening in the second example? Thank you, Example 1: Do you use SNESSetConvergenceTest? If so, then unless your test uses the PETSc SNES object tolerance values (unlikely), you don't need to change the convergence tolerance values. So you can just not set it to that huge value or any particular value. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Relative tolerance 1e+20 must be non-negative and less than 1.0 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.16.4-756-g5b3d650 GIT Date: 2022-02-02 17:48:52 +0000 [0]PETSC ERROR: /data/home/heepark/software/pflotran-bcnew-trd/src/pflotran/pflotran on a intel-c-opt named cb007 by tlaforc Tue Feb 15 19:45:39 2022 [0]PETSC ERROR: Configure options --with-petsc-arch=intel-c-opt --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-c language=c --with-shared-libraries=0 --with-debugging=0 --with-mpi=yes --download-sowing=yes --with-valgrind=yes --with-cmak e=yes --with-make=yes --with-fblaslaback=yes --with-blaslapack=yes --FFLAGS="-diag-disable 5462" --CFLAGS="-O3 -g" --CXXFLAG S="-O3 -g" --FFLAGS="-O3 -g" --with-hdf5-include=/data/home/heepark/software/hdf5/build/include --with-hdf5-lib="-L/data/hom e/heepark/software/hdf5/build/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5" --with-hypre-include=/data/home/heepark/ software/hypre/build/include --with-hypre-lib="-L/data/home/heepark/software/hypre/build/lib -lHYPRE" --with-metis-include=/ data/home/heepark/software/metis/include --with-metis-lib=/data/home/heepark/software/metis/build/libmetis/libmetis.a --with -parmetis-include=/data/home/heepark/software/parmetis/include --with-parmetis-lib=/data/home/heepark/software/parmetis/buil d/libparmetis/libparmetis.a --with-scalapack-lib=/data/home/heepark/software/scalapack/build/lib/libscalapack.a --with-hdf5- fortran-bindings=yes [0]PETSC ERROR: #1 SNESSetTolerances() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:3833 [0]PETSC ERROR: #2 User provided function() at User file:0 Example 2: This should never be happening. Perhaps it is caused by memory corruption. Can you run with valgrind to verify no memory issues. 
If valgrind does not work on your system you can try with the PETSc option -malloc_debug [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: toranks[39] 40 not in comm size 40 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.16.4-756-g5b3d650 GIT Date: 2022-02-02 17:48:52 +0000 [0]PETSC ERROR: /data/home/heepark/software/pflotran-bcnew-trd/src/pflotran/pflotran on a intel-c-opt named cb007 by tlaforc Wed Feb 16 14:13:12 2022 [0]PETSC ERROR: Configure options --with-petsc-arch=intel-c-opt --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-clanguage=c --with-shared-libraries=0 --with-debugging=0 --with-mpi=yes --download-sowing=yes --with-valgrind=yes --with-cmake=yes --with-make=yes --with-fblaslaback=yes --with-blaslapack=yes --FFLAGS="-diag-disable 5462" --CFLAGS="-O3 -g" --CXXFLAGS="-O3 -g" --FFLAGS="-O3 -g" --with-hdf5-include=/data/home/heepark/software/hdf5/build/include --with-hdf5-lib="-L/data/home/heepark/software/hdf5/build/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5" --with-hypre-include=/data/home/heepark/software/hypre/build/include --with-hypre-lib="-L/data/home/heepark/software/hypre/build/lib -lHYPRE" --with-metis-include=/data/home/heepark/software/metis/include --with-metis-lib=/data/home/heepark/software/metis/build/libmetis/libmetis.a --with-parmetis-include=/data/home/heepark/software/parmetis/include --with-parmetis-lib=/data/home/heepark/software/parmetis/build/libparmetis/libparmetis.a --with-scalapack-lib=/data/home/heepark/software/scalapack/build/lib/libscalapack.a --with-hdf5-fortran-bindings=yes [0]PETSC ERROR: #1 PetscCommBuildTwoSidedFReq() at /data/home/heepark/software/petsc-main/src/sys/utils/mpits.c:555 [0]PETSC ERROR: #2 MatStashScatterBegin_BTS() at /data/home/heepark/software/petsc-main/src/mat/utils/matstash.c:940 [0]PETSC ERROR: #3 MatStashScatterBegin_Private() at /data/home/heepark/software/petsc-main/src/mat/utils/matstash.c:461 [0]PETSC ERROR: #4 MatAssemblyBegin_MPIAIJ() at /data/home/heepark/software/petsc-main/src/mat/impls/aij/mpi/mpiaij.c:673 [0]PETSC ERROR: #5 MatAssemblyBegin() at /data/home/heepark/software/petsc-main/src/mat/interface/matrix.c:5592 [0]PETSC ERROR: #6 User provided function() at User file:0 [0]PETSC ERROR: #7 oursnesjacobian() at /data/home/heepark/software/petsc-main/src/snes/interface/ftn-custom/zsnesf.c:173 [0]PETSC ERROR: #8 SNESComputeJacobian() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:2864 [0]PETSC ERROR: #9 SNESSolve_NEWTONLS() at /data/home/heepark/software/petsc-main/src/snes/impls/ls/ls.c:222 [0]PETSC ERROR: #10 SNESSolve() at /data/home/heepark/software/petsc-main/src/snes/interface/snes.c:4810 [0]PETSC ERROR: #11 User provided function() at User file:0 Heeho Daniel Park ! ------------------------------------ ! Sandia National Laboratories Org: 08844, R&D Work: 505-844-1319 ! ------------------------------------ ! -------------- next part -------------- An HTML attachment was scrubbed... URL: From 459543524 at qq.com Thu Feb 17 02:42:51 2022 From: 459543524 at qq.com (=?ISO-8859-1?B?NDU5NTQzNTI0?=) Date: Thu, 17 Feb 2022 16:42:51 +0800 Subject: [petsc-users] Reuse symbolic factorization with petsc - mumps Message-ID: Sir, I have a problem when using petsc. I want to solve a series of linear equations. A1*x1=b1, A2*x2=b2, A3*x3=b3 ... The A1,A2,A3 have the same sparstiy pattern. 
I want to use MUMPS to solve the system. In order to enhance performance, I want to reuse the symbolic factorization. Here my code for solve a single linear system is ----------------------------------------------------- Mat A, P, F; PC pc; Vec rhs_vec, result_vec; KSPSetOperators(ksp, A, A); KSPSetType(ksp, KSPPREONLY); KSPGetPC(ksp, &pc); PCSetType(pc, PCLU); PCFactorSetMatSolverType(pc, MATSOLVERMUMPS); PCFactorSetUpMatSolverType(pc); PCFactorGetMatrix(pc, &F); MatMumpsSetIcntl(F, 7, 5); // configure mumps. KSPSolve(ksp, rhs_vec, result_vec); ----------------------------------------------------- I have no idea how to reuse symbolic factorization when using MUMPS. I have see the information from interent. The petsc developper have suggested that using: KSPSetOperators(KSP_A, A, A, DIFFERENT_NONZERO_PATTERN) KSPSetOperators(KSP_A, A, A, SAME_NONZERO_PATTERN) However, this API seems depreacted. see https://lists.mcs.anl.gov/pipermail/petsc-users/2013-March/016646.html I have see there exist API: MatLUFactorSymbolic,  MatLUFactorNumeric(). but I have no idea how to call it. Could you please give me an example how to reuse the symbolic factorization when using MUMPS in petsc? Thanks for your time. Xu Hui   -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Thu Feb 17 03:06:44 2022 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Thu, 17 Feb 2022 10:06:44 +0100 Subject: [petsc-users] DMView and DMLoad In-Reply-To: References: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> <056E066F-D596-4254-A44A-60BFFD30FE82@erdw.ethz.ch> <6c4e0656-db99-e9da-000f-ab9f7dd62c07@ovgu.de> Message-ID: <0845e501-e2cd-d7cc-58be-2803ee5ef6cd@ovgu.de> Dear Koki, Many thanks for your help and sorry for the slow reply. I haven't been able to get it to work successfully. I have attached a small example that replicates the main features of our code. In this example a Box with one random field is generated, saved and loaded. The case works for non-periodic domains and fails for periodic ones. I've also included the error output at the bottom of this email. To switch between periodic and non-periodic, please comment/uncomment lines 47 to 52 in src/main.c. To compile, the files "compile" and "CMakeLists.txt" are included in a separate tar file, if you want to use this. Your library paths should be updated in the latter file. The PETSc main distribution is used. Many thanks for your help! Thanks and best regards, Berend. The error message with --with-debugging=no --with-errorchecking=no: [0]PETSC ERROR: --------------------- Error Message ------------------------------------------------ [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Number of coordinates loaded 3168 does not match number of vertices 1000 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.2-499-g9039b796d1 GIT Date: 2021-12-24 23:23:09 +0000 [0]PETSC ERROR: ./bin/restart_periodic on a arch-linux-c-opt named james by serbenlo Thu Dec 30 20:53:22 2021 [0]PETSC ERROR: Configure options --with-debugging=no --with-errorchecking=no --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib/x86_64-linux-gnu/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec [0]PETSC ERROR: #1 DMPlexCoordinatesLoad_HDF5_V0_Private() at /usr/local/petsc_main/src/dm/impls/plex/plexhdf5.c:1387 [0]PETSC ERROR: #2 DMPlexCoordinatesLoad_HDF5_Internal() at /usr/local/petsc_main/src/dm/impls/plex/plexhdf5.c:1419 [0]PETSC ERROR: #3 DMPlexCoordinatesLoad() at /usr/local/petsc_main/src/dm/impls/plex/plex.c:2070 [0]PETSC ERROR: #4 main() at /media/MANNHEIM/Arbeit/OvGU_PostDoc_2021/Projects/MF_Restart/Periodic_DM/restart-periodic/src/main.c:229 [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 The error message with --with-debugging=yes --with-errorchecking=yes: [0]PETSC ERROR: --------------------- Error Message ------------------------------------------------- [0]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: --------------------- Error Message ------------------------------------------------- [1]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: Null Object: Parameter # 1 [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Development GIT revision: v3.16.2-499-g9039b796d1 GIT Date: 2021-12-24 23:23:09 +0000 [1]PETSC ERROR: ./bin/restart_periodic on a arch-linux-c-opt named james by serbenlo Thu Dec 30 20:17:22 2021 [1]PETSC ERROR: Configure options --with-debugging=yes --with-errorchecking=yes --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib/x86_64-linux-gnu/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec [1]PETSC ERROR: #1 PetscSectionGetDof() at /usr/local/petsc_main/src/vec/is/section/interface/section.c:807 [1]PETSC ERROR: [0]PETSC ERROR: Null Object: Parameter # 1 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.2-499-g9039b796d1 GIT Date: 2021-12-24 23:23:09 +0000 [0]PETSC ERROR: ./bin/restart_periodic on a arch-linux-c-opt named james by serbenlo Thu Dec 30 20:17:22 2021 [0]PETSC ERROR: Configure options --with-debugging=yes --with-errorchecking=yes --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib/x86_64-linux-gnu/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec #2 DMDefaultSectionCheckConsistency_Internal() at /usr/local/petsc_main/src/dm/interface/dm.c:4489 [1]PETSC ERROR: #3 DMSetGlobalSection() at /usr/local/petsc_main/src/dm/interface/dm.c:4583 [1]PETSC ERROR: [0]PETSC ERROR: #1 PetscSectionGetDof() at /usr/local/petsc_main/src/vec/is/section/interface/section.c:807 [0]PETSC ERROR: #4 main() at /media/MANNHEIM/Arbeit/OvGU_PostDoc_2021/Projects/MF_Restart/Periodic_DM/restart-periodic/src/main.c:164 [1]PETSC ERROR: No PETSc Option Table entries [1]PETSC ERROR: #2 DMDefaultSectionCheckConsistency_Internal() at /usr/local/petsc_main/src/dm/interface/dm.c:4489 [0]PETSC ERROR: #3 DMSetGlobalSection() at /usr/local/petsc_main/src/dm/interface/dm.c:4583 ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 85) - process 1 [0]PETSC ERROR: #4 main() at /media/MANNHEIM/Arbeit/OvGU_PostDoc_2021/Projects/MF_Restart/Periodic_DM/restart-periodic/src/main.c:164 [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 On 12/7/21 16:50, Sagiyama, Koki wrote: > Hi Berend, > > I made some small changes to your code to successfully compile it and > defined a periodic dm using DMPlexCreateBoxMesh(), but otherwise your > code worked fine. > I think we would like to see a complete minimal failing example. Can you > make the working example that I pasted in earlier email fail just by > modifying the dm(i.e., using the periodic mesh you are actually using)? > > Thanks, > Koki > ------------------------------------------------------------------------ > *From:* Berend van Wachem > *Sent:* Monday, December 6, 2021 3:39 PM > *To:* Sagiyama, Koki ; Hapla Vaclav > ; PETSc users list ; > Lawrence Mitchell > *Subject:* Re: [petsc-users] DMView and DMLoad > Dear Koki, > > Thanks for your email. In the example of your last email > DMPlexCoordinatesLoad() takes sF0 (PetscSF) as a third argument. In our > code this modification does not fix the error when loading a periodic > dm. Are we doing something wrong? I've included an example code at the > bottom of this email, including the error output. > > Thanks and best regards, > Berend > > > /**** Write DM + Vec restart ****/ > PetscViewerHDF5Open(PETSC_COMM_WORLD, "result", FILE_MODE_WRITE, &H5Viewer); > PetscObjectSetName((PetscObject)dm, "plexA"); > PetscViewerPushFormat(H5Viewer, PETSC_VIEWER_HDF5_PETSC); > DMPlexTopologyView(dm, H5Viewer); > DMPlexLabelsView(dm, H5Viewer); > DMPlexCoordinatesView(dm, H5Viewer); > PetscViewerPopFormat(H5Viewer); > > DM sdm; > PetscSection s; > > DMClone(dm, &sdm); > PetscObjectSetName((PetscObject)sdm, "dmA"); > DMGetGlobalSection(dm, &s); > DMSetGlobalSection(sdm, s); > DMPlexSectionView(dm, H5Viewer, sdm); > > Vec? 
vec, vecOld; > PetscScalar *array, *arrayOld, *xVecArray, *xVecArrayOld; > PetscInt numPoints; > > DMGetGlobalVector(sdm, &vec); > DMGetGlobalVector(sdm, &vecOld); > > /*** Fill the vectors vec and vecOld? ***/ > VecGetArray(vec, &array); > VecGetArray(vecOld, &arrayOld); > VecGetLocalSize(xGlobalVector, &numPoints); > VecGetArray(xGlobalVector, &xVecArray); > VecGetArray(xOldGlobalVector, &xVecArrayOld); > > for (i = 0; i < numPoints; i++) /* Loop over all internal mesh points */ > { > ???? array[i]??? = xVecArray[i]; > ???? arrayOld[i] = xVecArrayOld[i]; > } > > VecRestoreArray(vec, &array); > VecRestoreArray(vecOld, &arrayOld); > VecRestoreArray(xGlobalVector, &xVecArray); > VecRestoreArray(xOldGlobalVector, &xVecArrayOld); > > PetscObjectSetName((PetscObject)vec, "vecA"); > PetscObjectSetName((PetscObject)vecOld, "vecB"); > DMPlexGlobalVectorView(dm, H5Viewer, sdm, vec); > DMPlexGlobalVectorView(dm, H5Viewer, sdm, vecOld); > PetscViewerDestroy(&H5Viewer); > /*** end of writing ****/ > > /*** Load ***/ > PetscViewerHDF5Open(PETSC_COMM_WORLD, "result", FILE_MODE_READ, &H5Viewer); > DMCreate(PETSC_COMM_WORLD, &dm); > DMSetType(dm, DMPLEX); > PetscObjectSetName((PetscObject)dm, "plexA"); > PetscViewerPushFormat(H5Viewer, PETSC_VIEWER_HDF5_PETSC); > DMPlexTopologyLoad(dm, H5Viewer, &sfO); > DMPlexLabelsLoad(dm, H5Viewer); > DMPlexCoordinatesLoad(dm, H5Viewer, sfO); > PetscViewerPopFormat(H5Viewer); > > DMPlexDistribute(dm, Options->Mesh.overlap, &sfDist, &distributedDM); > if (distributedDM) { > ???? DMDestroy(&dm); > ???? dm = distributedDM; > ???? PetscObjectSetName((PetscObject)dm, "plexA"); > } > > PetscSFCompose(sfO, sfDist, &sf); > PetscSFDestroy(&sfO); > PetscSFDestroy(&sfDist); > > DMClone(dm, &sdm); > PetscObjectSetName((PetscObject)sdm, "dmA"); > DMPlexSectionLoad(dm, H5Viewer, sdm, sf, &globalDataSF, &localDataSF); > > /** Load the Vectors **/ > DMGetGlobalVector(sdm, &Restart_xGlobalVector); > VecSet(Restart_xGlobalVector,0.0); > > PetscObjectSetName((PetscObject)Restart_xGlobalVector, "vecA"); > DMPlexGlobalVectorLoad(dm, H5Viewer, sdm, > globalDataSF,Restart_xGlobalVector); > DMGetGlobalVector(sdm, &Restart_xOldGlobalVector); > VecSet(Restart_xOldGlobalVector,0.0); > > PetscObjectSetName((PetscObject)Restart_xOldGlobalVector, "vecB"); > DMPlexGlobalVectorLoad(dm, H5Viewer, sdm, globalDataSF, > Restart_xOldGlobalVector); > > PetscViewerDestroy(&H5Viewer); > > > /**** The error message when loading is the following ************/ > > Creating and distributing mesh > [0]PETSC ERROR: --------------------- Error Message > -------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Number of coordinates loaded 17128 does not match number > of vertices 8000 > [0]PETSC ERROR: See https://petsc.org/release/faq/ > for trouble shooting. > [0]PETSC ERROR: Petsc Development GIT revision: v3.16.1-435-g007f11b901 > GIT Date: 2021-12-01 14:31:21 +0000 > [0]PETSC ERROR: ./MF3 on a linux-gcc-openmpi-opt named > ivt24.ads.uni-magdeburg.de by berend Mon Dec? 
6 16:11:21 2021 > [0]PETSC ERROR: Configure options --with-p4est=yes --with-partemis > --with-metis --with-debugging=no --download-metis=yes > --download-parmetis=yes --with-errorchecking=no --download-hdf5 > --download-zlib --download-p4est > [0]PETSC ERROR: #1 DMPlexCoordinatesLoad_HDF5_V0_Private() at > /home/berend/src/petsc_main/src/dm/impls/plex/plexhdf5.c:1387 > [0]PETSC ERROR: #2 DMPlexCoordinatesLoad_HDF5_Internal() at > /home/berend/src/petsc_main/src/dm/impls/plex/plexhdf5.c:1419 > [0]PETSC ERROR: #3 DMPlexCoordinatesLoad() at > /home/berend/src/petsc_main/src/dm/impls/plex/plex.c:2070 > [0]PETSC ERROR: #4 RestartMeshDM() at > /home/berend/src/eclipseworkspace/multiflow/src/io/restartmesh.c:81 > [0]PETSC ERROR: #5 CreateMeshDM() at > /home/berend/src/eclipseworkspace/multiflow/src/mesh/createmesh.c:61 > [0]PETSC ERROR: #6 main() at > /home/berend/src/eclipseworkspace/multiflow/src/general/main.c:132 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: --download-hdf5 > [0]PETSC ERROR: --download-metis=yes > [0]PETSC ERROR: --download-p4est > [0]PETSC ERROR: --download-parmetis=yes > [0]PETSC ERROR: --download-zlib > [0]PETSC ERROR: --with-debugging=no > [0]PETSC ERROR: --with-errorchecking=no > [0]PETSC ERROR: --with-metis > [0]PETSC ERROR: --with-p4est=yes > [0]PETSC ERROR: --with-partemis > [0]PETSC ERROR: -d results > [0]PETSC ERROR: -o run.mf > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 62. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > > > > > > On 11/19/21 00:26, Sagiyama, Koki wrote: >> Hi Berend, >> >> I was not able to reproduce the issue you are having, but the following >> 1D example (and similar 2D examples) worked fine for me using the latest >> PETSc. Please note that DMPlexCoordinatesLoad() now takes a PetscSF >> object as the third argument, but the default behavior is unchanged. >> >> /* test_periodic_io.c */ >> >> #include >> #include >> #include >> >> int main(int argc, char **argv) >> { >>? ? DM ? ? ? ? ? ? ? ? dm; >>? ? Vec ? ? ? ? ? ? ? ?coordinates; >>? ? PetscViewer ? ? ? ?viewer; >>? ? PetscViewerFormat ?format = PETSC_VIEWER_HDF5_PETSC; >>? ? PetscSF ? ? ? ? ? ?sfO; >>? ? PetscErrorCode ? ? ierr; >> >>? ? ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr; >>? ? /* Save */ >>? ? ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "periodic_example.h5", >> FILE_MODE_WRITE, &viewer);CHKERRQ(ierr); >>? ? { >>? ? ? DM ? ? ? ? ? ? ?pdm; >>? ? ? PetscInt ? ? ? ?dim = 1; >>? ? ? const PetscInt ?faces[1] = {4}; >>? ? ? DMBoundaryType ?periodicity[] = {DM_BOUNDARY_PERIODIC}; >>? ? ? PetscInt ? ? ? ?overlap = 1; >> >>? ? ? ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, dim, PETSC_FALSE, >> faces, NULL, NULL, periodicity, PETSC_TRUE, &dm);CHKERRQ(ierr); >>? ? ? ierr = DMPlexDistribute(dm, overlap, NULL, &pdm);CHKERRQ(ierr); >>? ? ? ierr = DMDestroy(&dm);CHKERRQ(ierr); >>? ? ? dm = pdm; >>? ? ? ierr = PetscObjectSetName((PetscObject)dm, "periodicDM");CHKERRQ(ierr); >>? ? } >>? ? ierr = DMGetCoordinates(dm, &coordinates);CHKERRQ(ierr); >>? ? 
ierr = PetscPrintf(PETSC_COMM_WORLD, "Coordinates before >> saving:\n");CHKERRQ(ierr); >>? ? ierr = VecView(coordinates, NULL);CHKERRQ(ierr); >>? ? ierr = PetscViewerPushFormat(viewer, format);CHKERRQ(ierr); >>? ? ierr = DMPlexTopologyView(dm, viewer);CHKERRQ(ierr); >>? ? ierr = DMPlexCoordinatesView(dm, viewer);CHKERRQ(ierr); >>? ? ierr = PetscViewerPopFormat(viewer);CHKERRQ(ierr); >>? ? ierr = DMDestroy(&dm);CHKERRQ(ierr); >>? ? ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr); >>? ? /* Load */ >>? ? ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "periodic_example.h5", >> FILE_MODE_READ, &viewer);CHKERRQ(ierr); >>? ? ierr = DMCreate(PETSC_COMM_WORLD, &dm);CHKERRQ(ierr); >>? ? ierr = DMSetType(dm, DMPLEX);CHKERRQ(ierr); >>? ? ierr = PetscObjectSetName((PetscObject)dm, "periodicDM");CHKERRQ(ierr); >>? ? ierr = PetscViewerPushFormat(viewer, format);CHKERRQ(ierr); >>? ? ierr = DMPlexTopologyLoad(dm, viewer, &sfO);CHKERRQ(ierr); >>? ? ierr = DMPlexCoordinatesLoad(dm, viewer, sfO);CHKERRQ(ierr); >>? ? ierr = PetscViewerPopFormat(viewer);CHKERRQ(ierr); >>? ? ierr = DMGetCoordinates(dm, &coordinates);CHKERRQ(ierr); >>? ? ierr = PetscPrintf(PETSC_COMM_WORLD, "Coordinates after >> loading:\n");CHKERRQ(ierr); >>? ? ierr = VecView(coordinates, NULL);CHKERRQ(ierr); >>? ? ierr = PetscSFDestroy(&sfO);CHKERRQ(ierr); >>? ? ierr = DMDestroy(&dm);CHKERRQ(ierr); >>? ? ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr); >>? ? ierr = PetscFinalize(); >>? ? return ierr; >> } >> >> mpiexec -n 2 ./test_periodic_io >> >> Coordinates before saving: >> Vec Object: coordinates 2 MPI processes >>? ? type: mpi >> Process [0] >> 0. >> Process [1] >> 0.25 >> 0.5 >> 0.75 >> Coordinates after loading: >> Vec Object: vertices 2 MPI processes >>? ? type: mpi >> Process [0] >> 0. >> 0.25 >> 0.5 >> 0.75 >> Process [1] >> >> I would also like to note that, with the latest update, we can >> optionally load coordinates directly on the distributed dm as (using >> your notation): >> >>? ? /* Distribute dm */ >>? ? ... >>? ? PetscSFCompose(sfO, sfDist, &sf); >>? ? DMPlexCoordinatesLoad(dm, viewer, sf); >> >> To use this feature, we need to pass "-dm_plex_view_hdf5_storage_version >> 2.0.0" option when saving topology/coordinates. >> >> >> Thanks, >> Koki >> ------------------------------------------------------------------------ >> *From:* Berend van Wachem >> *Sent:* Wednesday, November 17, 2021 3:16 PM >> *To:* Hapla Vaclav ; PETSc users list >> ; Lawrence Mitchell ; Sagiyama, >> Koki >> *Subject:* Re: [petsc-users] DMView and DMLoad >> >> ******************* >> This email originates from outside Imperial. Do not click on links and >> attachments unless you recognise the sender. >> If you trust the sender, add them to your safe senders list >> https://spam.ic.ac.uk/SpamConsole/Senders.aspx > >> > to disable email >> stamping for this address. >> ******************* >> Dear Vaclav, Lawrence, Koki, >> >> Thanks for your help! Following your advice and following your example >> (https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5 > >> >) > >> >> we are able to save and load the DM with a wrapped Vector in h5 format >> (PETSC_VIEWER_HDF5_PETSC) successfully. >> >> For saving, we use something similar to: >> >>? ???? DMPlexTopologyView(dm, viewer); >>? ???? DMClone(dm, &sdm); >>? ???? ... >>? ???? DMPlexSectionView(dm, viewer, sdm); >>? ???? DMGetLocalVector(sdm, &vec); >>? ???? ... >>? ???? DMPlexLocalVectorView(dm, viewer, sdm, vec); >> >> and for loading: >> >>? ???? DMCreate(PETSC_COMM_WORLD, &dm); >>? ???? 
DMSetType(dm, DMPLEX); >>? ???????? ... >>? ?????? PetscViewerPushFormat(viewer, PETSC_VIEWER_HDF5_PETSC); >>? ???? DMPlexTopologyLoad(dm, viewer, &sfO); >>? ???? DMPlexLabelsLoad(dm, viewer); >>? ???? DMPlexCoordinatesLoad(dm, viewer); >>? ???? PetscViewerPopFormat(viewer); >>? ???? ... >>? ???? PetscSFCompose(sfO, sfDist, &sf); >>? ???? ... >>? ???? DMClone(dm, &sdm); >>? ???? DMPlexSectionLoad(dm, viewer, sdm, sf, &globalDataSF, &localDataSF); >>? ???? DMGetLocalVector(sdm, &vec); >>? ???? ... >>? ???? DMPlexLocalVectorLoad(dm, viewer, sdm, localDataSF, vec); >> >> >> This works fine for non-periodic DMs but for periodic cases the line: >> >>? ???? DMPlexCoordinatesLoad(dm, H5Viewer); >> >> delivers the error message: invalid argument and the number of loaded >> coordinates does not match the number of vertices. >> >> Is this a known shortcoming, or have we forgotten something to load >> periodic DMs? >> >> Best regards, >> >> Berend. >> >> >> >> On 9/22/21 20:59, Hapla Vaclav wrote: >>> To avoid confusions here, Berend seems to be specifically demanding XDMF >>> (PETSC_VIEWER_HDF5_XDMF). The stuff we are now working on is parallel >>> checkpointing in our own HDF5 format?(PETSC_VIEWER_HDF5_PETSC), I will >>> make a series of MRs on this topic in the following days. >>> >>> For XDMF, we are specifically missing the ability to write/load DMLabels >>> properly. XDMF uses specific cell-local numbering for faces for >>> specification of face sets, and face-local numbering for specification >>> of edge sets, which is not great wrt DMPlex design. And ParaView doesn't >>> show any of these properly so it's hard to debug. Matt, we should talk >>> about this soon. >>> >>> Berend, for now, could you just load the mesh initially from XDMF and >>> then use our PETSC_VIEWER_HDF5_PETSC format for subsequent saving/loading? >>> >>> Thanks, >>> >>> Vaclav >>> >>>> On 17 Sep 2021, at 15:46, Lawrence Mitchell >>> >>> wrote: >>>> >>>> Hi Berend, >>>> >>>>> On 14 Sep 2021, at 12:23, Matthew Knepley >>>> >>> wrote: >>>>> >>>>> On Tue, Sep 14, 2021 at 5:15 AM Berend van Wachem >>>>> >>> wrote: >>>>> Dear PETSc-team, >>>>> >>>>> We are trying to save and load distributed DMPlex and its associated >>>>> physical fields (created with DMCreateGlobalVector) ?(Uvelocity, >>>>> VVelocity, ?...) in HDF5_XDMF format. To achieve this, we do the >>>>> following: >>>>> >>>>> 1) save in the same xdmf.h5 file: >>>>> DMView( DM ????????, H5_XDMF_Viewer ); >>>>> VecView( UVelocity, H5_XDMF_Viewer ); >>>>> >>>>> 2) load the dm: >>>>> DMPlexCreateFromfile(PETSC_COMM_WORLD, Filename, PETSC_TRUE, DM); >>>>> >>>>> 3) load the physical field: >>>>> VecLoad( UVelocity, H5_XDMF_Viewer ); >>>>> >>>>> There are no errors in the execution, but the loaded DM is distributed >>>>> differently to the original one, which results in the incorrect >>>>> placement of the values of the physical fields (UVelocity etc.) in the >>>>> domain. >>>>> >>>>> This approach is used to restart the simulation with the last saved DM. >>>>> Is there something we are missing, or there exists alternative routes to >>>>> this goal? Can we somehow get the IS of the redistribution, so we can >>>>> re-distribute the vector data as well? >>>>> >>>>> Many thanks, best regards, >>>>> >>>>> Hi Berend, >>>>> >>>>> We are in the midst of rewriting this. We want to support saving >>>>> multiple meshes, with fields attached to each, >>>>> and preserving the discretization (section) information, and allowing >>>>> us to load up on a different number of >>>>> processes. 
We plan to be done by October. Vaclav and I are doing this >>>>> in collaboration with Koki Sagiyama, >>>>> David Ham, and Lawrence Mitchell from the Firedrake team. >>>> >>>> The core load/save cycle functionality is now in PETSc main. So if >>>> you're using main rather than a release, you can get access to it now. >>>> This section of the manual shows an example of how to do >>>> thingshttps://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5 >>>> >> >> >>>> >>>> Let us know if things aren't clear! >>>> >>>> Thanks, >>>> >>>> Lawrence >>> -------------- next part -------------- A non-text attachment was scrubbed... Name: restart-periodic.tar.gz Type: application/gzip Size: 5661 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: main.c Type: text/x-csrc Size: 7663 bytes Desc: not available URL: From jroman at dsic.upv.es Thu Feb 17 03:17:43 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 17 Feb 2022 10:17:43 +0100 Subject: [petsc-users] Reuse symbolic factorization with petsc - mumps In-Reply-To: References: Message-ID: Since version 3.5, KSPSetOperators() will check if the passed matrix has the same sparse pattern as the previously set one, so you don't have to do anything. The list of changes in version 3.5 has this note: "KSPSetOperators() no longer has the MatStructure argument. The Mat objects now track that information themselves. Use KSP/PCSetReusePreconditioner() to prevent the recomputation of the preconditioner if the operator changed in the way that SAME_PRECONDITIONER did with KSPSetOperators()." You don't call MatLUFactorSymbolic() yourself, it is called internally. You can check with -log_view if the number of calls to MatLUFactorSymbolic() is as expected. Jose > El 17 feb 2022, a las 9:42, 459543524 via petsc-users escribi?: > > Sir, I have a problem when using petsc. > > I want to solve a series of linear equations. > > A1*x1=b1, A2*x2=b2, A3*x3=b3 ... > > The A1,A2,A3 have the same sparstiy pattern. > > I want to use MUMPS to solve the system. > In order to enhance performance, I want to reuse the symbolic factorization. > > Here my code for solve a single linear system is > ----------------------------------------------------- > Mat A, P, F; > PC pc; > Vec rhs_vec, result_vec; > KSPSetOperators(ksp, A, A); > KSPSetType(ksp, KSPPREONLY); > KSPGetPC(ksp, &pc); > PCSetType(pc, PCLU); > PCFactorSetMatSolverType(pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc, &F); > MatMumpsSetIcntl(F, 7, 5); // configure mumps. > KSPSolve(ksp, rhs_vec, result_vec); > ----------------------------------------------------- > > > I have no idea how to reuse symbolic factorization when using MUMPS. > > I have see the information from interent. The petsc developper have suggested that using: > KSPSetOperators(KSP_A, A, A, DIFFERENT_NONZERO_PATTERN) > KSPSetOperators(KSP_A, A, A, SAME_NONZERO_PATTERN) > However, this API seems depreacted. > see https://lists.mcs.anl.gov/pipermail/petsc-users/2013-March/016646.html > > I have see there exist API: MatLUFactorSymbolic, MatLUFactorNumeric(). but I have no idea how to call it. > > Could you please give me an example how to reuse the symbolic factorization when using MUMPS in petsc? > > Thanks for your time. > > Xu Hui > > > From jroman at dsic.upv.es Thu Feb 17 04:49:52 2022 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Thu, 17 Feb 2022 11:49:52 +0100 Subject: [petsc-users] Reuse symbolic factorization with petsc - mumps In-Reply-To: References: Message-ID: Please always respond to the list. Yes, those lines are not needed every time, just the first one. Anyway, they do not imply a big overhead. Jose > El 17 feb 2022, a las 11:45, 459543524 <459543524 at qq.com> escribi?: > > Thanks for your reply sir. > > I now can reuse the sparsity pattern. > I solve two linear system and found call 'MatLUFactorSym' 1 time and 'MatLUFactorNum' 2 time. > I modify my code by following. > > > ----------------------------------------------------- > // stage 1: > > Vec x1, b2; > Vec x1, b2; > Mat A, P, F; > PC pc; > > // solve first system > MatCreateAIJ(A, ...) > MatSetVaules(A, ...) > MatAssembleBegin(A, ...) > MatAssembleBegin(A, ...) > > KSPSetOperators(ksp, A, A); > KSPSetType(ksp, KSPPREONLY); > KSPGetPC(ksp, &pc); > PCSetType(pc, PCLU); > PCFactorSetMatSolverType(pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc, &F); > MatMumpsSetIcntl(F, 7, 5); // configure mumps. > KSPSolve(ksp, b1, x1); > > // solve second system > MatZeroEntries(A); > MatSetVaules(A, ...); > MatAssembleBegin(A, ...); > MatAssembleBegin(A, ...); > > KSPSetOperators(ksp, A, A); > KSPSetType(ksp, KSPPREONLY); > KSPGetPC(ksp, &pc); > PCSetType(pc, PCLU); > PCFactorSetMatSolverType(pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc, &F); > MatMumpsSetIcntl(F, 7, 5); // configure mumps. > KSPSolve(ksp, b2, x2); > > ----------------------------------------------------- > > I question is, in the code, we call follow code block twice > ------------------------------- > KSPSetType(ksp, KSPPREONLY); > KSPGetPC(ksp, &pc); > PCSetType(pc, PCLU); > PCFactorSetMatSolverType(pc, MATSOLVERMUMPS); > PCFactorSetUpMatSolverType(pc); > PCFactorGetMatrix(pc, &F); > MatMumpsSetIcntl(F, 7, 5); // configure mumps. > ------------------------------- > Does this introduce unnecessary big computation overhead? > Can the code further simpilfy to enhance a better performance? > > Thanks for your time. > > > > ------------------ ???? ------------------ > ???: "Jose E. Roman" ; > ????: 2022?2?17?(???) ??5:17 > ???: "459543524"<459543524 at qq.com>; > ??: "petsc-users"; > ??: Re: [petsc-users] Reuse symbolic factorization with petsc - mumps > > Since version 3.5, KSPSetOperators() will check if the passed matrix has the same sparse pattern as the previously set one, so you don't have to do anything. > > The list of changes in version 3.5 has this note: > "KSPSetOperators() no longer has the MatStructure argument. The Mat objects now track that information themselves. Use KSP/PCSetReusePreconditioner() to prevent the recomputation of the preconditioner if the operator changed in the way that SAME_PRECONDITIONER did with KSPSetOperators()." > > You don't call MatLUFactorSymbolic() yourself, it is called internally. You can check with -log_view if the number of calls to MatLUFactorSymbolic() is as expected. > > Jose > > > > > El 17 feb 2022, a las 9:42, 459543524 via petsc-users escribi?: > > > > Sir, I have a problem when using petsc. > > > > I want to solve a series of linear equations. > > > > A1*x1=b1, A2*x2=b2, A3*x3=b3 ... > > > > The A1,A2,A3 have the same sparstiy pattern. > > > > I want to use MUMPS to solve the system. > > In order to enhance performance, I want to reuse the symbolic factorization. 
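(To illustrate "not needed every time, just the first one": the whole sequence can be collapsed into a sketch like the one below, where the PC/MUMPS options are set a single time and only the matrix fill, KSPSetOperators() and KSPSolve() sit inside the loop. The loop bound nsystems and the arrays b[i], x[i] are placeholders. Because the nonzero pattern does not change, the symbolic factorization is performed once and only the numeric factorization is repeated.)

/* one-time setup */
KSPSetOperators(ksp, A, A);
KSPSetType(ksp, KSPPREONLY);
KSPGetPC(ksp, &pc);
PCSetType(pc, PCLU);
PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);
PCFactorSetUpMatSolverType(pc);
PCFactorGetMatrix(pc, &F);
MatMumpsSetIcntl(F, 7, 5);                     /* MUMPS ICNTL(7) ordering choice, as in the original code */

/* repeated solves with the same sparsity pattern */
for (i = 0; i < nsystems; i++) {
  MatZeroEntries(A);
  /* MatSetValues(A, ...) for system i */
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
  KSPSetOperators(ksp, A, A);
  KSPSolve(ksp, b[i], x[i]);
}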
> > > > Here my code for solve a single linear system is > > ----------------------------------------------------- > > Mat A, P, F; > > PC pc; > > Vec rhs_vec, result_vec; > > KSPSetOperators(ksp, A, A); > > KSPSetType(ksp, KSPPREONLY); > > KSPGetPC(ksp, &pc); > > PCSetType(pc, PCLU); > > PCFactorSetMatSolverType(pc, MATSOLVERMUMPS); > > PCFactorSetUpMatSolverType(pc); > > PCFactorGetMatrix(pc, &F); > > MatMumpsSetIcntl(F, 7, 5); // configure mumps. > > KSPSolve(ksp, rhs_vec, result_vec); > > ----------------------------------------------------- > > > > > > I have no idea how to reuse symbolic factorization when using MUMPS. > > > > I have see the information from interent. The petsc developper have suggested that using: > > KSPSetOperators(KSP_A, A, A, DIFFERENT_NONZERO_PATTERN) > > KSPSetOperators(KSP_A, A, A, SAME_NONZERO_PATTERN) > > However, this API seems depreacted. > > see https://lists.mcs.anl.gov/pipermail/petsc-users/2013-March/016646.html > > > > I have see there exist API: MatLUFactorSymbolic, MatLUFactorNumeric(). but I have no idea how to call it. > > > > Could you please give me an example how to reuse the symbolic factorization when using MUMPS in petsc? > > > > Thanks for your time. > > > > Xu Hui > > > > > > From 459543524 at qq.com Thu Feb 17 05:02:33 2022 From: 459543524 at qq.com (=?gb18030?B?NDU5NTQzNTI0?=) Date: Thu, 17 Feb 2022 19:02:33 +0800 Subject: [petsc-users] Reuse symbolic factorization with petsc - mumps In-Reply-To: References: Message-ID: Thanks sir. I now modify my code into following. Everything works good. > ----------------------------------------------------- // stage 1: Vec x1, b2; Vec x1, b2; Mat A, P, F; PC pc; // solve first system MatCreateAIJ(A, ...) MatSetVaules(A, ...) MatAssembleBegin(A, ...) MatAssembleBegin(A, ...) KSPSetOperators(ksp, A, A); KSPSetType(ksp, KSPPREONLY); KSPGetPC(ksp, &pc); PCSetType(pc, PCLU); PCFactorSetMatSolverType(pc, MATSOLVERMUMPS); PCFactorSetUpMatSolverType(pc); PCFactorGetMatrix(pc, &F); MatMumpsSetIcntl(F, 7, 5); // configure mumps. KSPSolve(ksp, b1, x1); // solve second system MatZeroEntries(A); MatSetVaules(A, ...); MatAssembleBegin(A, ...); MatAssembleBegin(A, ...); KSPSetOperators(ksp, A, A); KSPSolve(ksp, b2, x2); > > ----------------------------------------------------- ------------------ ???? ------------------ ???: "Jose E. Roman" From bojan.niceno.scientist at gmail.com Thu Feb 17 06:01:02 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Thu, 17 Feb 2022 13:01:02 +0100 Subject: [petsc-users] Why is there MAPMPIAIJ Message-ID: Dear all, I am coupling my unstructured CFD solver with PETSc. At this moment, sequential version is working fine, but I obviously want to migrate to MPI parallel. My code is MPI parallel since ages. Anyhow, as a part of the migration to parallel, I changed the matrix type from MATSEQAIJ to MATMPIAIJ. The code compiled, but when I executed it one processor, I received an error message that combination of matrix format does not support BICG solver and PCILU preconditoner. I took a look at the compatibility matrix ( https://petsc.org/release/overview/linear_solve_table/#preconditioners) and noticed that MATMPIAIJ supports only MKL CParadiso preconditioner which seems to belong to Intel. I did some more reading and realised that I should probably continue with MATAIJ (which should work in sequential and parallel), but I am wondering why would there even be MATMPIAJ if it supports only one third-party preconditioner? 
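(For concreteness, a minimal sketch of the MATAIJ route, which resolves to MATSEQAIJ on one rank and MATMPIAIJ on several, with the actual solver then chosen at run time, e.g. -ksp_type bicg -pc_type ilu on a single process, or -pc_type lu -pc_factor_mat_solver_type mumps in parallel assuming PETSc was configured with MUMPS; m, M, b and x are placeholders here:)

Mat A;
KSP ksp;

MatCreate(PETSC_COMM_WORLD, &A);
MatSetSizes(A, m, m, M, M);            /* local and global sizes            */
MatSetType(A, MATAIJ);                 /* concrete type picked per comm size */
MatSetFromOptions(A);
/* ... preallocate, fill and assemble A ... */

KSPCreate(PETSC_COMM_WORLD, &ksp);
KSPSetOperators(ksp, A, A);
KSPSetFromOptions(ksp);                /* KSP/PC taken from the options database */
KSPSolve(ksp, b, x);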
Cheers, Bojan Niceno -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Feb 17 06:12:26 2022 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 17 Feb 2022 07:12:26 -0500 Subject: [petsc-users] Why is there MAPMPIAIJ In-Reply-To: References: Message-ID: On Thu, Feb 17, 2022 at 7:01 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear all, > > I am coupling my unstructured CFD solver with PETSc. At this moment, > sequential version is working fine, but I obviously want to migrate to MPI > parallel. My code is MPI parallel since ages. > > Anyhow, as a part of the migration to parallel, I changed the matrix type > from MATSEQAIJ to MATMPIAIJ. The code compiled, but when I executed it one > processor, I received an error message that combination of matrix format > does not support BICG solver and PCILU preconditoner. I took a look at the > compatibility matrix ( > https://petsc.org/release/overview/linear_solve_table/#preconditioners) > and noticed that MATMPIAIJ supports only MKL CParadiso preconditioner which > seems to belong to Intel. > > I did some more reading and realised that I should probably continue with > MATAIJ (which should work in sequential and parallel), but I am wondering > why would there even be MATMPIAJ if it supports only one third-party > preconditioner? > 1) MATAIJ is not a concrete type, it just creates MATSEQAIJ in serial and MATMPIAIJ in parallel 2) MATMPIAIJ supports many parallel direct solvers (see the end of https://petsc.org/main/docs/manual/ksp/), including MUMPS SuperLU_dist Hypre (Euclid) CPardiso There are also parallel AMG solvers, parallel DD solvers, and Krylov solvers. The complaint you got said that a serial LU was being used with a parallel matrix type, so using AIJ is the right solution. Thanks, Matt > Cheers, > > Bojan Niceno > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bojan.niceno.scientist at gmail.com Thu Feb 17 10:45:50 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Thu, 17 Feb 2022 17:45:50 +0100 Subject: [petsc-users] (no subject) Message-ID: Dear all, I am experiencing difficulties when using PETSc in parallel in an unstructured CFD code. It uses CRS format to store its matrices. I use the following sequence of PETSc call in the hope to get PETSc solving my linear systems in parallel. Before I continue, I would just like to say that the code is MPI parallel since long time ago, and performs its own domain decomposition through METIS, and it works out its communication patterns which work with its home-grown (non-PETSc) linear solvers. Anyhow, I issue the following calls: err = PetscInitialize(0, NULL, (char*)0, help); err = MatCreate(MPI_COMM_WORLD, A); In the above, I use MPI_COMM_WORLD instead of PETSC_COMM_SELF because the call to MPI_Init is invoked outside of PETSc, from the main program. err = MatSetSizes(A, m, m, M, M); Since my matrices are exclusively square, M is set to the total number of computational cells, while m is equal to the number of computational cells within each subdomain/processor. (Not all processors necessarily have the same m, it depends on domain decomposition.) I do not distinguish between m (M) and n (N) since matrices are all square. 
Am I wrong to assume that? err = MatSetType(A, MATAIJ); I set the matrix to be of type MATAIJ, to cover runs on one and on more processors. By the way, on one processors everything works fine err = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); err = MatSeqAIJSetPreallocation(A, 0, d_nnz); The two lines above specify matrix preallocation. Since d_nz and o_nz vary from cell to cell (row to row), I set them to zero and provide arrays with number of diagonal and off diagonal zeroes instead. To my understanding, that is legit since d_nz and o_nz are neglected if d_nnz and o_nnz are provided. Am I wrong? Finally, inside a loop through rows and columns I call: err = MatSetValue(A, row, col, value, INSERT_VALUES); Here I make sure that row and col point to global cell (unknown) numbers. Yet, when I run the code on more than one processor, I get the error: [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Argument out of range [3]PETSC ERROR: New nonzero at (21,356) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [3]PETSC ERROR: #1 MatSetValues_MPIAIJ() at /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517 [3]PETSC ERROR: #2 MatSetValues() at /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398 [3]PETSC ERROR: #3 MatSetValues_MPIAIJ() at /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517 [3]PETSC ERROR: #4 MatSetValues() at /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398 and so forth, for roughly 10% of all matrix entries. I checked if these errors occur only for off-diagonal parts of the matrix entries, but that is not the case. Error code is 63; PETSC_ERR_ARG_OUTOFRANGE Does anyone have an idea what am I doing wrong? Is any of my assumptions above (like thinking n(N) is always m(M) for square matrices, that I can send zeros as d_nz and o_nz if I provide arrays d_nnz[] and o_nnz[] wrong? Any idea how to debug it, where to look for an error? I am carefully checking all the data I send to PETSc functions and looks correct to me, but maybe I lack some fundamental understanding of what should be provided to PETSc, as I write above? Best regards, Bojan Niceno -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Feb 17 11:05:35 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 17 Feb 2022 12:05:35 -0500 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: On Thu, Feb 17, 2022 at 11:46 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear all, > > > I am experiencing difficulties when using PETSc in parallel in an > unstructured CFD code. It uses CRS format to store its matrices. I use > the following sequence of PETSc call in the hope to get PETSc solving my > linear systems in parallel. Before I continue, I would just like to say > that the code is MPI parallel since long time ago, and performs its own > domain decomposition through METIS, and it works out its communication > patterns which work with its home-grown (non-PETSc) linear solvers. > Anyhow, I issue the following calls: > > err = PetscInitialize(0, NULL, (char*)0, help); > > err = MatCreate(MPI_COMM_WORLD, A); > In the above, I use MPI_COMM_WORLD instead of PETSC_COMM_SELF because the > call to MPI_Init is invoked outside of PETSc, from the main program. 
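A brief aside on the initialization order mentioned above: when MPI_Init() has already been called by the host code, PetscInitialize() detects this and adopts the existing MPI, and PETSC_COMM_WORLD then refers to MPI_COMM_WORLD unless it is explicitly set to a subcommunicator beforehand. A minimal sketch, with my_sub_comm being an illustrative name:

    MPI_Init(&argc, &argv);              /* done by the host CFD code */
    /* optional, must come before PetscInitialize():
       PETSC_COMM_WORLD = my_sub_comm;      restricts PETSc to a subcommunicator */
    PetscInitialize(&argc, &argv, NULL, help);
    /* ... */
    MatCreate(PETSC_COMM_WORLD, &A);     /* same as MPI_COMM_WORLD when no subcommunicator is set */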
> > err = MatSetSizes(A, m, m, M, M); > Since my matrices are exclusively square, M is set to the total number of > computational cells, while m is equal to the number of computational cells > within each subdomain/processor. (Not all processors necessarily have the > same m, it depends on domain decomposition.) I do not distinguish between > m (M) and n (N) since matrices are all square. Am I wrong to assume that? > > err = MatSetType(A, MATAIJ); > I set the matrix to be of type MATAIJ, to cover runs on one and on more > processors. By the way, on one processors everything works fine > > err = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); > err = MatSeqAIJSetPreallocation(A, 0, d_nnz); > The two lines above specify matrix preallocation. Since d_nz and o_nz > vary from cell to cell (row to row), I set them to zero and provide arrays > with number of diagonal and off diagonal zeroes instead. To my > understanding, that is legit since d_nz and o_nz are neglected if d_nnz and > o_nnz are provided. Am I wrong? > > Finally, inside a loop through rows and columns I call: > > err = MatSetValue(A, row, col, value, INSERT_VALUES); > Here I make sure that row and col point to global cell (unknown) numbers. > > Yet, when I run the code on more than one processor, I get the error: > > [3]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [3]PETSC ERROR: Argument out of range > [3]PETSC ERROR: New nonzero at (21,356) caused a malloc > Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn > off this check > > [3]PETSC ERROR: #1 MatSetValues_MPIAIJ() at > /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517 > [3]PETSC ERROR: #2 MatSetValues() at > /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398 > [3]PETSC ERROR: #3 MatSetValues_MPIAIJ() at > /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517 > [3]PETSC ERROR: #4 MatSetValues() at > /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398 > > and so forth, for roughly 10% of all matrix entries. I checked if these > errors occur only for off-diagonal parts of the matrix entries, but that is > not the case. > > Error code is 63; PETSC_ERR_ARG_OUTOFRANGE > > Does anyone have an idea what am I doing wrong? Is any of my assumptions > above (like thinking n(N) is always m(M) for square matrices, that I can > send zeros as d_nz and o_nz if I provide arrays d_nnz[] and o_nnz[] wrong? > That is correct. > Any idea how to debug it, where to look for an error? > > I would guess that you are counting your o_nnz incorrectly. It looks like a small number of equations per process because the 4th process has row 21, apparently. Does that sound right? And column 356 is going to be in the off-diagonal block (ie, "o"). I would start with a serial matrix and run with -info. This will be noisy but you will see things like"number of unneeded..." that you can verify that you have set d_nnz perfectly (there should be 0 unneeded). Then try two processors. If it fails you could a print statement in everytime the row (eg, 21) is added to and check what your code for computing o_nnz is doing. I am carefully checking all the data I send to PETSc functions and looks > correct to me, but maybe I lack some fundamental understanding of what > should be provided to PETSc, as I write above? > It is a bit confusing at first. 
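To make that counting concrete, a rough sketch of computing d_nnz/o_nnz before the two preallocation calls; rstart/rend are this rank's global row range (known from the host code's own decomposition), and ncols[i]/cols[i][k] are illustrative names for the column structure of local row i, not the actual arrays in the code above:

    for (i = 0; i < m; i++) {                 /* m = rend - rstart local rows */
      d_nnz[i] = 0;
      o_nnz[i] = 0;
      for (k = 0; k < ncols[i]; k++) {
        PetscInt col = cols[i][k];            /* global column index */
        if (col >= rstart && col < rend) d_nnz[i]++;   /* column owned by this rank: "diagonal" block */
        else                             o_nnz[i]++;   /* otherwise: off-diagonal block */
      }
    }
    MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz);
    MatSeqAIJSetPreallocation(A, 0, d_nnz);

If the counts are exact, running with -info should report 0 unneeded storage, and the "new nonzero caused a malloc" error goes away as long as every (row, col) passed to MatSetValue() was counted.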
The man page gives a concrete example https://petsc.org/main/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ > > > Best regards, > > > Bojan Niceno > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Feb 17 11:59:22 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 17 Feb 2022 12:59:22 -0500 Subject: [petsc-users] (no subject) In-Reply-To: References: Message-ID: Please keep on list, On Thu, Feb 17, 2022 at 12:36 PM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear Mark, > > Sorry for mistakenly calling you Adam before. > > I was thinking about the o_nnz as you suggested, but then something else > occurred to me. So, I determine the d_nnz and o_nnz based on METIS domain > decomposition which I perform outside of PETSc, before I even call PETSc > initialize. Hence, if PETSc works out its own domain decomposition > PETSc does not work out its own decomposition. You specify the decomposition completely with https://petsc.org/main/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ > and communication patterns, they might be different from mine, and > therefore MatSetValue misses some entries. Will PETSc follow the same > domain decomposition which it works from calls to MatSetValue from > different processors, or will it re-shuffle the matrix entries? > > Cheers, > > Bojan > > On Thu, Feb 17, 2022 at 6:14 PM Bojan Niceno < > bojan.niceno.scientist at gmail.com> wrote: > >> Thanks a lot for the hints Adam :-) >> >> Cheers, >> >> Bojan >> >> On Thu, Feb 17, 2022 at 6:05 PM Mark Adams wrote: >> >>> >>> >>> On Thu, Feb 17, 2022 at 11:46 AM Bojan Niceno < >>> bojan.niceno.scientist at gmail.com> wrote: >>> >>>> Dear all, >>>> >>>> >>>> I am experiencing difficulties when using PETSc in parallel in an >>>> unstructured CFD code. It uses CRS format to store its matrices. I use >>>> the following sequence of PETSc call in the hope to get PETSc solving my >>>> linear systems in parallel. Before I continue, I would just like to say >>>> that the code is MPI parallel since long time ago, and performs its own >>>> domain decomposition through METIS, and it works out its communication >>>> patterns which work with its home-grown (non-PETSc) linear solvers. >>>> Anyhow, I issue the following calls: >>>> >>>> err = PetscInitialize(0, NULL, (char*)0, help); >>>> >>>> err = MatCreate(MPI_COMM_WORLD, A); >>>> In the above, I use MPI_COMM_WORLD instead of PETSC_COMM_SELF because >>>> the call to MPI_Init is invoked outside of PETSc, from the main program. >>>> >>>> err = MatSetSizes(A, m, m, M, M); >>>> Since my matrices are exclusively square, M is set to the total number >>>> of computational cells, while m is equal to the number of computational >>>> cells within each subdomain/processor. (Not all processors necessarily >>>> have the same m, it depends on domain decomposition.) I do not distinguish >>>> between m (M) and n (N) since matrices are all square. Am I wrong to >>>> assume that? >>>> >>>> err = MatSetType(A, MATAIJ); >>>> I set the matrix to be of type MATAIJ, to cover runs on one and on more >>>> processors. By the way, on one processors everything works fine >>>> >>>> err = MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz); >>>> err = MatSeqAIJSetPreallocation(A, 0, d_nnz); >>>> The two lines above specify matrix preallocation. Since d_nz and o_nz >>>> vary from cell to cell (row to row), I set them to zero and provide arrays >>>> with number of diagonal and off diagonal zeroes instead. 
To my >>>> understanding, that is legit since d_nz and o_nz are neglected if d_nnz and >>>> o_nnz are provided. Am I wrong? >>>> >>>> Finally, inside a loop through rows and columns I call: >>>> >>>> err = MatSetValue(A, row, col, value, INSERT_VALUES); >>>> Here I make sure that row and col point to global cell (unknown) >>>> numbers. >>>> >>>> Yet, when I run the code on more than one processor, I get the error: >>>> >>>> [3]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [3]PETSC ERROR: Argument out of range >>>> [3]PETSC ERROR: New nonzero at (21,356) caused a malloc >>>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to >>>> turn off this check >>>> >>>> [3]PETSC ERROR: #1 MatSetValues_MPIAIJ() at >>>> /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517 >>>> [3]PETSC ERROR: #2 MatSetValues() at >>>> /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398 >>>> [3]PETSC ERROR: #3 MatSetValues_MPIAIJ() at >>>> /home/niceno/Development/petsc-debug/src/mat/impls/aij/mpi/mpiaij.c:517 >>>> [3]PETSC ERROR: #4 MatSetValues() at >>>> /home/niceno/Development/petsc-debug/src/mat/interface/matrix.c:1398 >>>> >>>> and so forth, for roughly 10% of all matrix entries. I checked if >>>> these errors occur only for off-diagonal parts of the matrix entries, but >>>> that is not the case. >>>> >>>> Error code is 63; PETSC_ERR_ARG_OUTOFRANGE >>>> >>>> Does anyone have an idea what am I doing wrong? Is any of my >>>> assumptions above (like thinking n(N) is always m(M) for square matrices, >>>> that I can send zeros as d_nz and o_nz if I provide arrays d_nnz[] and >>>> o_nnz[] wrong? >>>> >>> >>> That is correct. >>> >>> >>>> Any idea how to debug it, where to look for an error? >>>> >>>> >>> I would guess that you are counting your o_nnz incorrectly. It looks >>> like a small number of equations per process because the 4th process has >>> row 21, apparently. Does that sound right? >>> >>> And column 356 is going to be in the off-diagonal block (ie, "o"). I >>> would start with a serial matrix and run with -info. This will be noisy but >>> you will see things like"number of unneeded..." that you can verify that >>> you have set d_nnz perfectly (there should be 0 unneeded). >>> Then try two processors. If it fails you could a print statement in >>> everytime the row (eg, 21) is added to and check what your code for >>> computing o_nnz is doing. >>> >>> I am carefully checking all the data I send to PETSc functions and looks >>>> correct to me, but maybe I lack some fundamental understanding of what >>>> should be provided to PETSc, as I write above? >>>> >>> >>> It is a bit confusing at first. The man page gives a concrete example >>> https://petsc.org/main/docs/manualpages/Mat/MatCreateAIJ.html#MatCreateAIJ >>> >>> >>>> >>>> >>>> Best regards, >>>> >>>> >>>> Bojan Niceno >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From rtmills at anl.gov Thu Feb 17 17:33:17 2022 From: rtmills at anl.gov (Richard Tran Mills) Date: Thu, 17 Feb 2022 15:33:17 -0800 Subject: [petsc-users] Kokkos Interface for PETSc In-Reply-To: References: <87pmno5bdr.fsf@jedbrown.org> Message-ID: Hi Philip, Sorry to be a bit late in my reply. 
Jed has explained the gist of what's involved with using the Kokkos/Kokkos-kernels back-end for the PETSc solves, though, depending on exactly how Xolotl creates its vectors, there may be a bit of work required to ensure that the command-line options specifying the matrix and GPU types get applied to the right objects, and that non-GPU types are not being hardcoded somewhere (by a call like "DMSetMatType(dm,MATAIJ)"). In addition to looking at the -log_view output, since Xolotl uses TS you can specify "-ts_view" and look at the output that describes the solver hierarchy that Xolotl sets up. If matrix types are being set correctly, you'll see things like ????? Mat Object: 1 MPI processes ??????? type: seqaijkokkos (I note that I've also sent a related message about getting Xolotl working with Kokkos back-ends on Summit to you, Sophie, and Phil in reply to old thread about this.) Were you also asking about how to use Kokkos for PETSc matrix assembly, or is that a question for later? Cheers, Richard On 2/15/22 09:07, Satish Balay via petsc-users wrote: > Also - perhaps the following info might be useful > > Satish > > ---- > > balay at sb /home/balay/petsc (main=) > $ git grep -l download-kokkos-kernels config/examples > config/examples/arch-ci-freebsd-cxx-cmplx-pkgs-dbg.py > config/examples/arch-ci-linux-cuda-double.py > config/examples/arch-ci-linux-gcc-ifc-cmplx.py > config/examples/arch-ci-linux-hip-double.py > config/examples/arch-ci-linux-pkgs-dbg-ftn-interfaces.py > config/examples/arch-ci-linux-pkgs-valgrind.py > config/examples/arch-ci-osx-cxx-pkgs-opt.py > config/examples/arch-nvhpc.py > config/examples/arch-olcf-crusher.py > config/examples/arch-olcf-spock.py > balay at sb /home/balay/petsc (main=) > $ git grep -l "requires:.*kokkos_kernels" > src/ksp/ksp/tests/ex3.c > src/ksp/ksp/tests/ex43.c > src/ksp/ksp/tests/ex60.c > src/ksp/ksp/tutorials/ex7.c > src/mat/tests/ex123.c > src/mat/tests/ex132.c > src/mat/tests/ex2.c > src/mat/tests/ex250.c > src/mat/tests/ex251.c > src/mat/tests/ex252.c > src/mat/tests/ex254.c > src/mat/tests/ex5.c > src/mat/tests/ex62.c > src/mat/tutorials/ex5k.kokkos.cxx > src/snes/tests/ex13.c > src/snes/tutorials/ex13.c > src/snes/tutorials/ex3k.kokkos.cxx > src/snes/tutorials/ex56.c > src/ts/utils/dmplexlandau/tutorials/ex1.c > src/ts/utils/dmplexlandau/tutorials/ex1f90.F90 > src/ts/utils/dmplexlandau/tutorials/ex2.c > src/vec/vec/tests/ex21.c > src/vec/vec/tests/ex22.c > src/vec/vec/tests/ex23.c > src/vec/vec/tests/ex28.c > src/vec/vec/tests/ex34.c > src/vec/vec/tests/ex37.c > src/vec/vec/tests/ex38.c > src/vec/vec/tests/ex4.c > src/vec/vec/tests/ex43.c > src/vec/vec/tests/ex60.c > src/vec/vec/tutorials/ex1.c > balay at sb /home/balay/petsc (main=) > $ > > On Tue, 15 Feb 2022, Satish Balay via petsc-users wrote: > >> Also - best to use petsc repo - 'main' branch. >> >> And for install on crusher - check config/examples/arch-olcf-crusher.py >> >> Satish >> >> On Tue, 15 Feb 2022, Jed Brown wrote: >> >>> We need to make these docs more explicit, but the short answer is configure with --download-kokkos --download-kokkos-kernels and run almost any example with -dm_mat_type aijkokkos -dm_vec_type kokkos. If you run with -log_view, you should see that all the flops take place on the device and there are few host->device transfers. Message packing is done on the device and it'll use GPU-aware MPI. There are a few examples of residual evaluation and matrix assembly on the device using Kokkos. 
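To make the type switch concrete, a minimal sketch of letting the matrix and vector types come from the options database instead of hardcoding them (N and the object names are placeholders); with a DM the equivalent runtime options are -dm_mat_type aijkokkos and -dm_vec_type kokkos, as mentioned above:

    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);
    MatSetFromOptions(A);      /* picks up -mat_type aijkokkos */
    VecCreate(PETSC_COMM_WORLD, &x);
    VecSetSizes(x, PETSC_DECIDE, N);
    VecSetFromOptions(x);      /* picks up -vec_type kokkos */

Then -ts_view and -log_view can confirm that the Kokkos types are actually the ones being used.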
You can also see libCEED examples for assembly on the device into Kokkos matrices and vectors without touching host memory. >>> >>> "Fackler, Philip via petsc-users" writes: >>> >>>> We're intending to transitioning the Xolotl interfaces with PETSc. >>>> >>>> I am hoping someone (can) point us to some documentation (and examples) for using PETSc's Kokkos-based interface. If this does not yet exist, then perhaps some slides (like the ones Richard Mills showed at the NE-SciDAC all-hands meeting) showing some examples could get us started. >>>> >>>> Thanks for any help that can be provided, >>>> >>>> Philip Fackler >>>> Research Software Engineer, Application Engineering Group >>>> Advanced Computing Systems Research Section >>>> Computer Science and Mathematics Division >>>> Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From shouronghao at sjtu.edu.cn Mon Feb 21 01:16:52 2022 From: shouronghao at sjtu.edu.cn (Shourong Hao) Date: Mon, 21 Feb 2022 15:16:52 +0800 Subject: [petsc-users] Problem about using TAU for profiling in PETSc Message-ID: <68C7C39F-F63D-490D-9F3C-80838783C88E@sjtu.edu.cn> An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Feb 21 08:26:55 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 21 Feb 2022 08:26:55 -0600 (CST) Subject: [petsc-users] Problem about using TAU for profiling in PETSc In-Reply-To: <68C7C39F-F63D-490D-9F3C-80838783C88E@sjtu.edu.cn> References: <68C7C39F-F63D-490D-9F3C-80838783C88E@sjtu.edu.cn> Message-ID: We haven't tried using tau in many years - so don't really know what issues exist. Previously we attempted a compile wrapper - so you might have to check the code in lib/petsc/bin/taucc.py BTW: One relevant part: cmd2 += ' -c -rn PetscFunctionReturn -rv PetscFunctionReturnVoid\\(\\)' Hope this helps. Satish On Mon, 21 Feb 2022, Shourong Hao wrote: > Hello, > > I'm using PETSc for a finite element computation.? > > Now I want to use TAU (https://www.cs.uoregon.edu/research/tau/home.php) for a performance profiling after I used?-log_view?option. > > However, I encountered many errors in the compiling stage, and there is little information about how to use TAU in PETSc. > > Is there some reference materials I can learn from? Or what should I do to use TAU in PETSc correctly? > > Thank you very much for your help! > > Best wishes, > > Shourong > [defaultAvatar.png] > Shourong Hao > [027208ca514b662477fc5a0020160906.png] > Shanghai Jiao Tong University > shouronghao at sjtu.edu.cn > > From a.croucher at auckland.ac.nz Mon Feb 21 16:38:08 2022 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Tue, 22 Feb 2022 11:38:08 +1300 Subject: [petsc-users] Cray perftools Message-ID: hi, We have our PETSc-based code compiled on a Cray XC-50 machine, and it has just recently started running about 2.5 times slower on there. Neither the code nor PETSc has been recompiled lately. Turning the PETSc logging on, it appears to be spending more time on I/O than it used to. The cluster admins have suggested we rebuild with the Cray "perftools" module loaded to get profiling info. It's a slight hassle to rebuild everything, so I wondered, would this actually tell us anything that we don't already know from the PETSc logs? 
- Adrian -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 From jed at jedbrown.org Mon Feb 21 16:45:49 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 21 Feb 2022 15:45:49 -0700 Subject: [petsc-users] Cray perftools In-Reply-To: References: Message-ID: <8735kbvnxe.fsf@jedbrown.org> If you can share before/after output from -log_view, it would likely help localize. Another unintrusive thing (if you're allowed to run Linux perf) is to $ perf record --call-graph dwarf -F99 ./app [... runs ...] $ perf script | stackcollapse-perf | flamegraph > flame.svg and open flame.svg in a browser (it's interactive). This uses the flamegraph tools (https://github.com/brendangregg/FlameGraph). You can direct `perf script` to a file and share that if you can't/won't install flamegraph. This doesn't require compiling any special way and yet helps understand where time is spent. Adrian Croucher writes: > hi, > > We have our PETSc-based code compiled on a Cray XC-50 machine, and it > has just recently started running about 2.5 times slower on there. > Neither the code nor PETSc has been recompiled lately. > > Turning the PETSc logging on, it appears to be spending more time on I/O > than it used to. > > The cluster admins have suggested we rebuild with the Cray "perftools" > module loaded to get profiling info. It's a slight hassle to rebuild > everything, so I wondered, would this actually tell us anything that we > don't already know from the PETSc logs? > > - Adrian > > -- > Dr Adrian Croucher > Senior Research Fellow > Department of Engineering Science > University of Auckland, New Zealand > email: a.croucher at auckland.ac.nz > tel: +64 (0)9 923 4611 From bsmith at petsc.dev Mon Feb 21 17:10:39 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 21 Feb 2022 18:10:39 -0500 Subject: [petsc-users] Cray perftools In-Reply-To: References: Message-ID: <6609AAFF-3F6E-4DC0-9BA3-5275B82A5766@petsc.dev> Have any of the modules loaded changed? Or the shared libraries that are in any of the modules? > On Feb 21, 2022, at 5:38 PM, Adrian Croucher wrote: > > hi, > > We have our PETSc-based code compiled on a Cray XC-50 machine, and it has just recently started running about 2.5 times slower on there. Neither the code nor PETSc has been recompiled lately. > > Turning the PETSc logging on, it appears to be spending more time on I/O than it used to. > > The cluster admins have suggested we rebuild with the Cray "perftools" module loaded to get profiling info. It's a slight hassle to rebuild everything, so I wondered, would this actually tell us anything that we don't already know from the PETSc logs? > > - Adrian > > -- > Dr Adrian Croucher > Senior Research Fellow > Department of Engineering Science > University of Auckland, New Zealand > email: a.croucher at auckland.ac.nz > tel: +64 (0)9 923 4611 > From nabw91 at gmail.com Mon Feb 21 17:20:54 2022 From: nabw91 at gmail.com (=?UTF-8?Q?Nicol=C3=A1s_Barnafi?=) Date: Tue, 22 Feb 2022 00:20:54 +0100 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects In-Reply-To: References: Message-ID: Dear all, I carried out the required impementation of the PCSetCoordinates for the fieldsplit and wanted to put it upstream. I am not sure about what could be missing (if anything), so perhaps I should open the pull request from a fork in order for you to take a look. 
Would you advise this or think it is better if I iterate with someone by email? Best regards, Nicolas On Thu, Jan 13, 2022 at 7:21 PM Matthew Knepley wrote: > On Thu, Jan 13, 2022 at 1:15 PM Nicol?s Barnafi wrote: > >> Dear all, >> >> I have created a first implementation. For now it must be called after >> setting the fields, eventually I would like to move it to the setup phase. >> The implementation seems clean, but it is giving me some memory errors >> (free() corrupted unsorted chunks). >> >> You may find the code below. After some work with gdb, I found out that >> the errors appears when calling the ISDestroy(&is_coords) line, which to me >> is not very clear, as I am indeed within the while scope creating and then >> destroying the is_coords object. I would greatly appreciate it if you could >> give me a hint on what the problem is. >> >> After debugging this, and working on your suggestion, I will open a PR. >> >> Best regards, >> NB >> >> >> ----- CODE ------ >> >> static PetscErrorCode PCSetCoordinates_FieldSplit(PC pc, PetscInt dim, >> PetscInt nloc, PetscReal coords[]) >> { >> PetscErrorCode ierr; >> PC_FieldSplit *jac = (PC_FieldSplit*)pc->data; >> PC_FieldSplitLink ilink_current = jac->head; >> PC pc_current; >> PetscInt nmin, nmax, ii, ndofs; >> PetscInt *owned_dofs; // Indexes owned by this processor >> PetscReal *coords_block; // Coordinates to be given to the current PC >> IS is_owned; >> >> PetscFunctionBegin; >> // Extract matrix ownership range to then compute subindexes for >> coordinates. This results in an IS object (is_owned). >> // TODO: This would be simpler with a general MatGetOwnershipIS >> (currently supported only by Elemental and BLAS matrices). >> ierr = MatGetOwnershipRange(pc->mat,&nmin,&nmax);CHKERRQ(ierr); >> ndofs = nmax - nmin; >> ierr = PetscMalloc1(ndofs, &owned_dofs); CHKERRQ(ierr); >> for(PetscInt i=nmin;i> owned_dofs[i] = nmin + i; >> ierr = ISCreateGeneral(MPI_COMM_WORLD, ndofs, owned_dofs, >> PETSC_OWN_POINTER, &is_owned); CHKERRQ(ierr); >> > > Here you tell PETSc to take control of the memory for owned_dofs, but > below you PetscFree(owned_dofs). This is not compatible. > You should just destroy the IS when you are done. > > Thanks, > > Matt > > >> // For each IS, embed it to get local coords indces and then set >> coordinates in the subPC. >> ii=0; >> while(ilink_current) >> { >> IS is_coords; >> PetscInt ndofs_block; >> const PetscInt *block_dofs_enumeration; // Numbering of the dofs >> relevant to the current block >> >> ierr = ISEmbed(ilink_current->is, is_owned, PETSC_TRUE, &is_coords); >> CHKERRQ(ierr); // Setting drop to TRUE, although it should make no >> difference. >> ierr = PetscMalloc1(ndofs_block, &coords_block); CHKERRQ(ierr); >> ierr = ISGetLocalSize(is_coords, &ndofs_block); CHKERRQ(ierr); >> ierr = ISGetIndices(is_coords, &block_dofs_enumeration); >> CHKERRQ(ierr); >> >> // Having the indices computed and the memory allocated, we can copy >> the relevant coords and set them to the subPC. 
>> for(PetscInt dof=0;dof> for(PetscInt d=0;d> { >> coords_block[dim*dof + d] = coords[dim * >> block_dofs_enumeration[dof] + d]; >> // printf("Dof: %d, Global: %f\n", block_dofs_enumeration[dof], >> coords[dim * block_dofs_enumeration[dof] + d]); >> } >> ierr = ISRestoreIndices(is_coords, &block_dofs_enumeration); >> CHKERRQ(ierr); >> ierr = ISDestroy(&is_coords); CHKERRQ(ierr); >> ierr = KSPGetPC(ilink_current->ksp, &pc_current); CHKERRQ(ierr); >> ierr = PCSetCoordinates(pc_current, dim, ndofs_block, coords_block); >> CHKERRQ(ierr); >> ierr = PetscFree(coords_block); CHKERRQ(ierr); >> if(!pc_current) >> SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ORDER,"Setting >> coordinates to PCFIELDSPLIT but a subPC is null."); >> >> ilink_current = ilink_current->next; >> ++ii; >> } >> ierr = PetscFree(owned_dofs); CHKERRQ(ierr); >> PetscFunctionReturn(0); >> } >> >> On Wed, Jan 12, 2022 at 6:22 AM Barry Smith wrote: >> >>> >>> >>> On Jan 11, 2022, at 9:51 PM, Matthew Knepley wrote: >>> >>> On Tue, Jan 11, 2022 at 3:31 PM Barry Smith wrote: >>> >>>> >>>> Nicolas, >>>> >>>> For "simple" PCFIELDSPLIT it is possible to pass down the attached >>>> coordinate information. By simple I mean where the splitting is done by >>>> fields and not by general lists of IS (where one does not have enough >>>> information to know what the coordinates would mean to the subPCS). >>>> >>>> Look in fieldsplit.c PCFieldSplitSetFields_FieldSplit() where it >>>> does the KSPCreate(). I think you can do a KSPGetPC() on that ksp and >>>> PCSetCoordinates on that PC to supply the coordinates to the subPC. In the >>>> function PCFieldSplitSetIS_FieldSplit() you can also attach the coordinates >>>> to the subPCs IF defaultsplit is true. >>>> >>>> Sadly this is not the full story. The outer PC will not have any >>>> coordinates because calling PCSetCoordinates on a PCFIELDSPLIT does nothing >>>> since fieldsplit doesn't handle coordinates. So you need to do more, you >>>> need to provide a PCSetCoordinates_FieldSplit() that saves the coordinates >>>> in new entries in the PC_FieldSplit struct and then in >>>> PCFieldSplitSetFields_FieldSplit() you need to access those saved values >>>> and pass them into the PCSetCoordinates() that you call on the subPCs. Once >>>> you write >>>> PCSetCoordinates_FieldSplit() you need to call >>>> >>>> ierr = >>>> PetscObjectComposeFunction((PetscObject)pc,"PCSetCoordinates_C",PCSetCoordinates_FieldSplit);CHKERRQ(ierr); >>>> >>>> >>>> inside PCCreate_FieldSplit(). >>>> >>>> Any questions just let us know. >>>> >>> >>> I will add "Why is this so cumbersome?". This is a workaround in order >>> to get geometric information into GAMG. It should really be >>> PCGAMGSetCoordinates(), which >>> are used to calculate the rigid body modes, and assume a bunch of stuff >>> about the coordinate space. This would not help you, because it would still >>> force you to pull >>> out the correct subPC. The "right" way now to give geometric information >>> to a TS/SNES/KSP/PC is through a DM, which are passed down through >>> PCFIELDSPLIT, >>> PCPATCH, etc. However they are heavier weight than just some coordinates. >>> >>> >>> This is not cumbersome at all. It is a simple natural way to pass >>> around coordinates to PC's and, when possible, their children. 
>>> >>> Barry >>> >>> Note that we could also have a DMGetCoordinates() that pulled >>> coordinates from a DM (that happended to have them) in this form associated >>> with the PC and the PC could call it to get the coordinates and use them as >>> needed. But this simple PCSetCoordinates() is a nice complement to that >>> approach. >>> >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Barry >>>> >>>> >>>> > On Jan 11, 2022, at 11:58 AM, Nicol?s Barnafi >>>> wrote: >>>> > >>>> > Dear community, >>>> > >>>> > I am working on a block preconditioner, where one of the blocks uses >>>> HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to >>>> the PC object. I expected the coordinates to be inherited down to the >>>> subblocks, is this not the case? (it seems so as I couldn't find a >>>> specialized FIELDSPLIT SetCoordinates function). >>>> > >>>> > If this feature is missing, please give me some hints on where to add >>>> the missing function, I would gladly do it. If not, please let me know why >>>> it was dismissed, in order to do things the hard way [as in hard-coded ;)]. >>>> > >>>> > Kind regards, >>>> > Nicolas >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> >>> >> >> -- >> Nicol?s Alejandro Barnafi Wittwer >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- Nicol?s Alejandro Barnafi Wittwer -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Feb 21 17:26:06 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 21 Feb 2022 18:26:06 -0500 Subject: [petsc-users] PCSetCoordinates does not set coordinates of sub PC (fieldsplit) objects In-Reply-To: References: Message-ID: <73BF9DD2-E18A-47EC-9C17-E14B07E1EA0D@petsc.dev> Please open a MR from your fork and we'll take a look at it. Thanks Barry > On Feb 21, 2022, at 6:20 PM, Nicol?s Barnafi wrote: > > Dear all, > > I carried out the required impementation of the PCSetCoordinates for the fieldsplit and wanted to put it upstream. I am not sure about what could be missing (if anything), so perhaps I should open the pull request from a fork in order for you to take a look. Would you advise this or think it is better if I iterate with someone by email? > > Best regards, > Nicolas > > On Thu, Jan 13, 2022 at 7:21 PM Matthew Knepley > wrote: > On Thu, Jan 13, 2022 at 1:15 PM Nicol?s Barnafi > wrote: > Dear all, > > I have created a first implementation. For now it must be called after setting the fields, eventually I would like to move it to the setup phase. The implementation seems clean, but it is giving me some memory errors (free() corrupted unsorted chunks). > > You may find the code below. After some work with gdb, I found out that the errors appears when calling the ISDestroy(&is_coords) line, which to me is not very clear, as I am indeed within the while scope creating and then destroying the is_coords object. I would greatly appreciate it if you could give me a hint on what the problem is. > > After debugging this, and working on your suggestion, I will open a PR. 
> > Best regards, > NB > > > ----- CODE ------ > > static PetscErrorCode PCSetCoordinates_FieldSplit(PC pc, PetscInt dim, PetscInt nloc, PetscReal coords[]) > { > PetscErrorCode ierr; > PC_FieldSplit *jac = (PC_FieldSplit*)pc->data; > PC_FieldSplitLink ilink_current = jac->head; > PC pc_current; > PetscInt nmin, nmax, ii, ndofs; > PetscInt *owned_dofs; // Indexes owned by this processor > PetscReal *coords_block; // Coordinates to be given to the current PC > IS is_owned; > > PetscFunctionBegin; > // Extract matrix ownership range to then compute subindexes for coordinates. This results in an IS object (is_owned). > // TODO: This would be simpler with a general MatGetOwnershipIS (currently supported only by Elemental and BLAS matrices). > ierr = MatGetOwnershipRange(pc->mat,&nmin,&nmax);CHKERRQ(ierr); > ndofs = nmax - nmin; > ierr = PetscMalloc1(ndofs, &owned_dofs); CHKERRQ(ierr); > for(PetscInt i=nmin;i owned_dofs[i] = nmin + i; > ierr = ISCreateGeneral(MPI_COMM_WORLD, ndofs, owned_dofs, PETSC_OWN_POINTER, &is_owned); CHKERRQ(ierr); > > Here you tell PETSc to take control of the memory for owned_dofs, but below you PetscFree(owned_dofs). This is not compatible. > You should just destroy the IS when you are done. > > Thanks, > > Matt > > // For each IS, embed it to get local coords indces and then set coordinates in the subPC. > ii=0; > while(ilink_current) > { > IS is_coords; > PetscInt ndofs_block; > const PetscInt *block_dofs_enumeration; // Numbering of the dofs relevant to the current block > > ierr = ISEmbed(ilink_current->is, is_owned, PETSC_TRUE, &is_coords); CHKERRQ(ierr); // Setting drop to TRUE, although it should make no difference. > ierr = PetscMalloc1(ndofs_block, &coords_block); CHKERRQ(ierr); > ierr = ISGetLocalSize(is_coords, &ndofs_block); CHKERRQ(ierr); > ierr = ISGetIndices(is_coords, &block_dofs_enumeration); CHKERRQ(ierr); > > // Having the indices computed and the memory allocated, we can copy the relevant coords and set them to the subPC. > for(PetscInt dof=0;dof for(PetscInt d=0;d { > coords_block[dim*dof + d] = coords[dim * block_dofs_enumeration[dof] + d]; > // printf("Dof: %d, Global: %f\n", block_dofs_enumeration[dof], coords[dim * block_dofs_enumeration[dof] + d]); > } > ierr = ISRestoreIndices(is_coords, &block_dofs_enumeration); CHKERRQ(ierr); > ierr = ISDestroy(&is_coords); CHKERRQ(ierr); > ierr = KSPGetPC(ilink_current->ksp, &pc_current); CHKERRQ(ierr); > ierr = PCSetCoordinates(pc_current, dim, ndofs_block, coords_block); CHKERRQ(ierr); > ierr = PetscFree(coords_block); CHKERRQ(ierr); > if(!pc_current) > SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ORDER,"Setting coordinates to PCFIELDSPLIT but a subPC is null."); > > ilink_current = ilink_current->next; > ++ii; > } > ierr = PetscFree(owned_dofs); CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > On Wed, Jan 12, 2022 at 6:22 AM Barry Smith > wrote: > > >> On Jan 11, 2022, at 9:51 PM, Matthew Knepley > wrote: >> >> On Tue, Jan 11, 2022 at 3:31 PM Barry Smith > wrote: >> >> Nicolas, >> >> For "simple" PCFIELDSPLIT it is possible to pass down the attached coordinate information. By simple I mean where the splitting is done by fields and not by general lists of IS (where one does not have enough information to know what the coordinates would mean to the subPCS). >> >> Look in fieldsplit.c PCFieldSplitSetFields_FieldSplit() where it does the KSPCreate(). I think you can do a KSPGetPC() on that ksp and PCSetCoordinates on that PC to supply the coordinates to the subPC. 
In the function PCFieldSplitSetIS_FieldSplit() you can also attach the coordinates to the subPCs IF defaultsplit is true. >> >> Sadly this is not the full story. The outer PC will not have any coordinates because calling PCSetCoordinates on a PCFIELDSPLIT does nothing since fieldsplit doesn't handle coordinates. So you need to do more, you need to provide a PCSetCoordinates_FieldSplit() that saves the coordinates in new entries in the PC_FieldSplit struct and then in PCFieldSplitSetFields_FieldSplit() you need to access those saved values and pass them into the PCSetCoordinates() that you call on the subPCs. Once you write >> PCSetCoordinates_FieldSplit() you need to call >> >> ierr = PetscObjectComposeFunction((PetscObject)pc,"PCSetCoordinates_C",PCSetCoordinates_FieldSplit);CHKERRQ(ierr); >> >> inside PCCreate_FieldSplit(). >> >> Any questions just let us know. >> >> I will add "Why is this so cumbersome?". This is a workaround in order to get geometric information into GAMG. It should really be PCGAMGSetCoordinates(), which >> are used to calculate the rigid body modes, and assume a bunch of stuff about the coordinate space. This would not help you, because it would still force you to pull >> out the correct subPC. The "right" way now to give geometric information to a TS/SNES/KSP/PC is through a DM, which are passed down through PCFIELDSPLIT, >> PCPATCH, etc. However they are heavier weight than just some coordinates. > > This is not cumbersome at all. It is a simple natural way to pass around coordinates to PC's and, when possible, their children. > > Barry > > Note that we could also have a DMGetCoordinates() that pulled coordinates from a DM (that happended to have them) in this form associated with the PC and the PC could call it to get the coordinates and use them as needed. But this simple PCSetCoordinates() is a nice complement to that approach. > >> >> Thanks, >> >> Matt >> >> Barry >> >> >> > On Jan 11, 2022, at 11:58 AM, Nicol?s Barnafi > wrote: >> > >> > Dear community, >> > >> > I am working on a block preconditioner, where one of the blocks uses HYPRE's AMS. As it requires the coordinates of the dofs, I have done so to the PC object. I expected the coordinates to be inherited down to the subblocks, is this not the case? (it seems so as I couldn't find a specialized FIELDSPLIT SetCoordinates function). >> > >> > If this feature is missing, please give me some hints on where to add the missing function, I would gladly do it. If not, please let me know why it was dismissed, in order to do things the hard way [as in hard-coded ;)]. >> > >> > Kind regards, >> > Nicolas >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ > > > > -- > Nicol?s Alejandro Barnafi Wittwer > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > Nicol?s Alejandro Barnafi Wittwer -------------- next part -------------- An HTML attachment was scrubbed... 
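A minimal sketch of the memory-ownership pattern discussed in the quoted exchange (illustrative only, not the final merged code): with PETSC_OWN_POINTER the IS takes over the array, so the caller must not free it again, and the allocation of coords_block has to happen after ISGetLocalSize() has set ndofs_block:

    PetscMalloc1(ndofs, &owned_dofs);
    for (i = 0; i < ndofs; i++) owned_dofs[i] = nmin + i;
    ISCreateGeneral(PetscObjectComm((PetscObject)pc), ndofs, owned_dofs, PETSC_OWN_POINTER, &is_owned);
    /* ... use is_owned ... */
    ISDestroy(&is_owned);      /* this frees owned_dofs; no PetscFree(owned_dofs) afterwards */

    /* inside the loop over splits: */
    ISGetLocalSize(is_coords, &ndofs_block);
    PetscMalloc1(ndofs_block, &coords_block);   /* allocate only once the size is known */

The alternative is PETSC_COPY_VALUES, in which case the caller keeps ownership of owned_dofs and calling PetscFree() on it afterwards is the right thing to do.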
URL: From a.croucher at auckland.ac.nz Mon Feb 21 20:02:42 2022 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Tue, 22 Feb 2022 15:02:42 +1300 Subject: [petsc-users] Cray perftools In-Reply-To: <8735kbvnxe.fsf@jedbrown.org> References: <8735kbvnxe.fsf@jedbrown.org> Message-ID: <19d2faad-9461-7323-80a3-5f028965fa20@auckland.ac.nz> hi Jed, On 2/22/22 11:45, Jed Brown wrote: > If you can share before/after output from -log_view, it would likely > help localize. Unfortunately there aren't any "before" logs, because -log_view wasn't used while everything was going well. It looks like it might be related to I/O partly based on comparing logs with those from runs on other machines (not very comparable ones, admittedly) and also experimenting with things like turning file (HDF5) output from the code off. > > Another unintrusive thing (if you're allowed to run Linux perf) is to Looks like that is not available on the cluster, but will check and see if some module can be loaded to get it. On 2/22/22 12:10, Barry Smith wrote: > Have any of the modules loaded changed? Or the shared libraries that are in any of the modules? Will check. We build most of the important dependencies (e.g. blaslapack, netcdf, exodusii, hdf5, scotch, chaco) ourselves using --download-xxx in the PETSc build. - Adrian -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 From Bruce.Palmer at pnnl.gov Tue Feb 22 10:03:01 2022 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 22 Feb 2022 16:03:01 +0000 Subject: [petsc-users] Configuring with CMake Message-ID: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> Hi, We recently switched the CMake configuration on our GridPACK application to use the PkgConfig utility instead of Jeb Brown?s FindPETSc.cmake module. This seems to work on a number of platforms but it is failing to link on others. It appears that the build cannot find the LAPACK and BLAS libraries. 
The PETSc library I?m linking to (v3.16.3) was configured with -download-f2cblaslapack so it should have these libraries, but when I try and link one of the test applications in GridPACK I get the errors /share/apps/gcc/6.1.0/bin/g++ -pthread -g -rdynamic CMakeFiles/greetings.dir/test/greetings.cpp.o -o greetings -Wl,-rpath,/qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib ../math/libgridpack_math.a libgridpack_parallel.a ../timer/libgridpack_timer.a ../environment/libgridpack_environment.a ../math/libgridpack_math.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_mpi.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_serialization.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_random.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_filesystem.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_system.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga++.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libarmci.a -lrt /usr/lib64/librt.so /usr/lib64/libdl.so /qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib/libmpi.so ../timer/libgridpack_timer.a libgridpack_parallel.a ../configuration/libgridpack_configuration.a /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(zstart.o): In function `petscinitializef_': /pic/projects/gridpack/software/petsc-3.16.3/src/sys/objects/ftn-custom/zstart.c:280: undefined reference to `mpi_init_' /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N': /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1462: undefined reference to `zgemv_' /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1475: undefined reference to `zgemv_' /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1478: undefined reference to `zgemv_' /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N_NaturalOrdering': /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1407: undefined reference to `zgemv_' /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1420: undefined reference to `zgemv_' /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o):/pic/projects/gridpack/software/petsc-3.16.3/sr I suspect that the reason it worked for others and not for me is that they had viable blas and lapack libraries in their path and I don?t. Is there anything special you need to do to make sure that the build is pointed at the libraries that get created with the -download-f2cblaslapack option? Bruce Palmer -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Feb 22 10:15:13 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 22 Feb 2022 10:15:13 -0600 (CST) Subject: [petsc-users] Configuring with CMake In-Reply-To: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> Message-ID: Perhaps the easy fix is to build PETSc as shared libraries. 
[this is the default anyway] However wrt pkgconfig: >>>> $ cat arch-linux-c-debug/lib/pkgconfig/petsc.pc prefix=/home/balay/petsc/arch-linux-c-debug exec_prefix=${prefix} includedir=${prefix}/include libdir=${prefix}/lib ccompiler=mpicc cflags_extra=-fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wno-stringop-overflow -fstack-protector -fvisibility=hidden -g3 -O0 cflags_dep=-MMD -MP ldflag_rpath=-Wl,-rpath, cxxcompiler=mpicxx cxxflags_extra=-Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O0 -std=gnu++17 -fPIC fcompiler=mpif90 fflags_extra=-fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O0 Name: PETSc Description: Library to solve ODEs and algebraic equations Version: 3.16.99 Cflags: -I${includedir} -I/home/balay/petsc/include Libs: -L${libdir} -lpetsc Libs.private: -L/home/balay/soft/mpich-3.4.2/lib -L/usr/lib/gcc/x86_64-redhat-linux/11 -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl <<<< i.e 'Libs' provides the petsc library [usable when built as shared] - and 'Libs.private' provides all the dependencies [that are required in a static build] Satish On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > Hi, > > We recently switched the CMake configuration on our GridPACK application to use the PkgConfig utility instead of Jeb Brown?s FindPETSc.cmake module. This seems to work on a number of platforms but it is failing to link on others. It appears that the build cannot find the LAPACK and BLAS libraries. The PETSc library I?m linking to (v3.16.3) was configured with -download-f2cblaslapack so it should have these libraries, but when I try and link one of the test applications in GridPACK I get the errors > > > /share/apps/gcc/6.1.0/bin/g++ -pthread -g -rdynamic CMakeFiles/greetings.dir/test/greetings.cpp.o -o greetings -Wl,-rpath,/qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib ../math/libgridpack_math.a libgridpack_parallel.a ../timer/libgridpack_timer.a ../environment/libgridpack_environment.a ../math/libgridpack_math.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_mpi.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_serialization.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_random.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_filesystem.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_system.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga++.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libarmci.a -lrt /usr/lib64/librt.so /usr/lib64/libdl.so /qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib/libmpi.so ../timer/libgridpack_timer.a libgri dpack_pa rallel.a ../configuration/libgridpack_configuration.a /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(zstart.o): In function `petscinitializef_': > > /pic/projects/gridpack/software/petsc-3.16.3/src/sys/objects/ftn-custom/zstart.c:280: undefined reference to `mpi_init_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N': > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1462: undefined reference to `zgemv_' > > 
/pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1475: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1478: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N_NaturalOrdering': > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1407: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1420: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o):/pic/projects/gridpack/software/petsc-3.16.3/sr > > I suspect that the reason it worked for others and not for me is that they had viable blas and lapack libraries in their path and I don?t. Is there anything special you need to do to make sure that the build is pointed at the libraries that get created with the -download-f2cblaslapack option? > > Bruce Palmer > From bsmith at petsc.dev Tue Feb 22 10:23:32 2022 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 22 Feb 2022 11:23:32 -0500 Subject: [petsc-users] Configuring with CMake In-Reply-To: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> Message-ID: <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> Bruce, Can you please send the PkgConfig calls that you make to get the PETSc values? And then exactly what PETSc PkgConfig returns. Thanks Barry > On Feb 22, 2022, at 11:03 AM, Palmer, Bruce J via petsc-users wrote: > > Hi, > > We recently switched the CMake configuration on our GridPACK application to use the PkgConfig utility instead of Jeb Brown?s FindPETSc.cmake module. This seems to work on a number of platforms but it is failing to link on others. It appears that the build cannot find the LAPACK and BLAS libraries. 
The PETSc library I?m linking to (v3.16.3) was configured with -download-f2cblaslapack so it should have these libraries, but when I try and link one of the test applications in GridPACK I get the errors > > /share/apps/gcc/6.1.0/bin/g++ -pthread -g -rdynamic CMakeFiles/greetings.dir/test/greetings.cpp.o -o greetings -Wl,-rpath,/qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib ../math/libgridpack_math.a libgridpack_parallel.a ../timer/libgridpack_timer.a ../environment/libgridpack_environment.a ../math/libgridpack_math.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_mpi.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_serialization.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_random.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_filesystem.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_system.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga++.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libarmci.a -lrt /usr/lib64/librt.so /usr/lib64/libdl.so /qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib/libmpi.so ../timer/libgridpack_timer.a libgridpack_parallel.a ../configuration/libgridpack_configuration.a /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(zstart.o): In function `petscinitializef_': > /pic/projects/gridpack/software/petsc-3.16.3/src/sys/objects/ftn-custom/zstart.c:280: undefined reference to `mpi_init_' > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N': > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1462: undefined reference to `zgemv_' > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1475: undefined reference to `zgemv_' > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1478: undefined reference to `zgemv_' > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N_NaturalOrdering': > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1407: undefined reference to `zgemv_' > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1420: undefined reference to `zgemv_' > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o):/pic/projects/gridpack/software/petsc-3.16.3/sr > > I suspect that the reason it worked for others and not for me is that they had viable blas and lapack libraries in their path and I don?t. Is there anything special you need to do to make sure that the build is pointed at the libraries that get created with the -download-f2cblaslapack option? > > Bruce Palmer -------------- next part -------------- An HTML attachment was scrubbed... 
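For the CMake side of this, a sketch of consuming petsc.pc through CMake's FindPkgConfig module; the target name comes from the link line above, the paths are illustrative, and it assumes PKG_CONFIG_PATH contains $PETSC_DIR/$PETSC_ARCH/lib/pkgconfig. With a static libpetsc.a the Libs.private entries described above must also reach the link line, which is what the _STATIC_ variables carry:

    find_package(PkgConfig REQUIRED)
    pkg_check_modules(PETSC REQUIRED IMPORTED_TARGET petsc)

    # shared libpetsc: the imported target is enough
    target_link_libraries(greetings PkgConfig::PETSC)

    # static libpetsc.a: also pull in the private dependencies (BLAS/LAPACK, MPI Fortran, ...)
    # target_link_libraries(greetings ${PETSC_STATIC_LDFLAGS})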
URL: From balay at mcs.anl.gov Tue Feb 22 10:37:40 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 22 Feb 2022 10:37:40 -0600 (CST) Subject: [petsc-users] Configuring with CMake In-Reply-To: <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> Message-ID: <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> The relevant pkg-config commands are: balay at sb /home/balay/petsc (release=) $ pkg-config --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc balay at sb /home/balay/petsc (release=) $ pkg-config --shared --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc balay at sb /home/balay/petsc (release=) $ pkg-config --static --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc -L/home/balay/soft/mpich-3.4.2/lib -L/usr/lib/gcc/x86_64-redhat-linux/11 -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl And more example usages in share/petsc/Makefile.user Satish On Tue, 22 Feb 2022, Barry Smith wrote: > Bruce, > > Can you please send the PkgConfig calls that you make to get the PETSc values? And then exactly what PETSc PkgConfig returns. > > Thanks > > Barry > > > > On Feb 22, 2022, at 11:03 AM, Palmer, Bruce J via petsc-users wrote: > > > > Hi, > > > > We recently switched the CMake configuration on our GridPACK application to use the PkgConfig utility instead of Jeb Brown?s FindPETSc.cmake module. This seems to work on a number of platforms but it is failing to link on others. It appears that the build cannot find the LAPACK and BLAS libraries. 
The PETSc library I?m linking to (v3.16.3) was configured with -download-f2cblaslapack so it should have these libraries, but when I try and link one of the test applications in GridPACK I get the errors > > > > /share/apps/gcc/6.1.0/bin/g++ -pthread -g -rdynamic CMakeFiles/greetings.dir/test/greetings.cpp.o -o greetings -Wl,-rpath,/qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib ../math/libgridpack_math.a libgridpack_parallel.a ../timer/libgridpack_timer.a ../environment/libgridpack_environment.a ../math/libgridpack_math.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_mpi.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_serialization.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_random.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_filesystem.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_system.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga++.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libarmci.a -lrt /usr/lib64/librt.so /usr/lib64/libdl.so /qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib/libmpi.so ../timer/libgridpack_timer.a libg ridpack_ parallel.a ../configuration/libgridpack_configuration.a /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(zstart.o): In function `petscinitializef_': > > /pic/projects/gridpack/software/petsc-3.16.3/src/sys/objects/ftn-custom/zstart.c:280: undefined reference to `mpi_init_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N': > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1462: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1475: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1478: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N_NaturalOrdering': > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1407: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1420: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o):/pic/projects/gridpack/software/petsc-3.16.3/sr > > > > I suspect that the reason it worked for others and not for me is that they had viable blas and lapack libraries in their path and I don?t. Is there anything special you need to do to make sure that the build is pointed at the libraries that get created with the -download-f2cblaslapack option? > > > > Bruce Palmer > > From Bruce.Palmer at pnnl.gov Tue Feb 22 12:13:47 2022 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 22 Feb 2022 18:13:47 +0000 Subject: [petsc-users] Configuring with CMake In-Reply-To: <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> Message-ID: The contents of the petsc.pc file are listed below. It looks good to me. 
The Libs.private variable seems to include the -lf2clapack and -lf2cblas libraries. I don't know how this info gets propagated up the build chain. Bruce prefix=/pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt exec_prefix=${prefix} includedir=${prefix}/include libdir=${prefix}/lib ccompiler=mpicc cflags_extra=-fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O cflags_dep=-MMD -MP ldflag_rpath=-Wl,-rpath, cxxcompiler=mpicxx cxxflags_extra=-Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O -std=gnu++11 fcompiler=mpif90 fflags_extra=-Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O Name: PETSc Description: Library to solve ODEs and algebraic equations Version: 3.16.3 Cflags: -I${includedir} -I/pic/projects/gridpack/software/petsc-3.16.3/include Libs: -L${libdir} -lpetsc Libs.private: -L/pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib -L/share/apps/openmpi/3.0.1/gcc/6.1.0/lib -L/qfs/projects/ops/rh6/gcc/6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0 -L/qfs/projects/ops/rh6/gcc/6.1.0/lib/gcc -L/qfs/projects/ops/rh6/gcc/6.1.0/lib64 -L/qfs/projects/ops/rh6/gcc/6.1.0/lib -lspqr -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -lrt -lsuperlu -lsuperlu_dist -lf2clapack -lf2cblas -lparmetis -lmetis -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl ?On 2/22/22, 8:39 AM, "Satish Balay" wrote: The relevant pkg-config commands are: balay at sb /home/balay/petsc (release=) $ pkg-config --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc balay at sb /home/balay/petsc (release=) $ pkg-config --shared --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc balay at sb /home/balay/petsc (release=) $ pkg-config --static --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc -L/home/balay/soft/mpich-3.4.2/lib -L/usr/lib/gcc/x86_64-redhat-linux/11 -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl And more example usages in share/petsc/Makefile.user Satish On Tue, 22 Feb 2022, Barry Smith wrote: > Bruce, > > Can you please send the PkgConfig calls that you make to get the PETSc values? And then exactly what PETSc PkgConfig returns. > > Thanks > > Barry > > > > On Feb 22, 2022, at 11:03 AM, Palmer, Bruce J via petsc-users wrote: > > > > Hi, > > > > We recently switched the CMake configuration on our GridPACK application to use the PkgConfig utility instead of Jeb Brown?s FindPETSc.cmake module. This seems to work on a number of platforms but it is failing to link on others. It appears that the build cannot find the LAPACK and BLAS libraries. 
The PETSc library I?m linking to (v3.16.3) was configured with -download-f2cblaslapack so it should have these libraries, but when I try and link one of the test applications in GridPACK I get the errors > > > > /share/apps/gcc/6.1.0/bin/g++ -pthread -g -rdynamic CMakeFiles/greetings.dir/test/greetings.cpp.o -o greetings -Wl,-rpath,/qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib ../math/libgridpack_math.a libgridpack_parallel.a ../timer/libgridpack_timer.a ../environment/libgridpack_environment.a ../math/libgridpack_math.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_mpi.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_serialization.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_random.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_filesystem.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_system.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga++.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libarmci.a -lrt /usr/lib64/librt.so /usr/lib64/libdl.so /qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib/libmpi.so ../timer/libgridpack_timer.a libgridpack_ parallel.a ../configuration/libgridpack_configuration.a /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(zstart.o): In function `petscinitializef_': > > /pic/projects/gridpack/software/petsc-3.16.3/src/sys/objects/ftn-custom/zstart.c:280: undefined reference to `mpi_init_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N': > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1462: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1475: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1478: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N_NaturalOrdering': > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1407: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1420: undefined reference to `zgemv_' > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o):/pic/projects/gridpack/software/petsc-3.16.3/sr > > > > I suspect that the reason it worked for others and not for me is that they had viable blas and lapack libraries in their path and I don?t. Is there anything special you need to do to make sure that the build is pointed at the libraries that get created with the -download-f2cblaslapack option? 
> > > > Bruce Palmer > > From balay at mcs.anl.gov Tue Feb 22 12:21:32 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 22 Feb 2022 12:21:32 -0600 (CST) Subject: [petsc-users] Configuring with CMake In-Reply-To: References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> Message-ID: https://cmake.org/cmake/help/latest/module/FindPkgConfig.html >>> Two sets of values exist: One for the common case ( = ) and another for the information pkg-config provides when called with the --static option ( = _STATIC). <<< So perhaps CMAKE is already setting the _STATIC variant of (for PETSC_LIB or equivalent) variable that's currently used)? Satish On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > The contents of the petsc.pc file are listed below. It looks good to me. The Libs.private variable seems to include the -lf2clapack and -lf2cblas libraries. I don't know how this info gets propagated up the build chain. > > Bruce > > prefix=/pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt > exec_prefix=${prefix} > includedir=${prefix}/include > libdir=${prefix}/lib > ccompiler=mpicc > cflags_extra=-fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O > cflags_dep=-MMD -MP > ldflag_rpath=-Wl,-rpath, > cxxcompiler=mpicxx > cxxflags_extra=-Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -fvisibility=hidden -g -O -std=gnu++11 > fcompiler=mpif90 > fflags_extra=-Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O > > Name: PETSc > Description: Library to solve ODEs and algebraic equations > Version: 3.16.3 > Cflags: -I${includedir} -I/pic/projects/gridpack/software/petsc-3.16.3/include > Libs: -L${libdir} -lpetsc > Libs.private: -L/pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib -L/share/apps/openmpi/3.0.1/gcc/6.1.0/lib -L/qfs/projects/ops/rh6/gcc/6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0 -L/qfs/projects/ops/rh6/gcc/6.1.0/lib/gcc -L/qfs/projects/ops/rh6/gcc/6.1.0/lib64 -L/qfs/projects/ops/rh6/gcc/6.1.0/lib -lspqr -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -lrt -lsuperlu -lsuperlu_dist -lf2clapack -lf2cblas -lparmetis -lmetis -lm -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lquadmath -lstdc++ -ldl > > ?On 2/22/22, 8:39 AM, "Satish Balay" wrote: > > The relevant pkg-config commands are: > > balay at sb /home/balay/petsc (release=) > $ pkg-config --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc > -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc > balay at sb /home/balay/petsc (release=) > $ pkg-config --shared --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc > -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc > balay at sb /home/balay/petsc (release=) > $ pkg-config --static --libs arch-linux-c-debug/lib/pkgconfig/petsc.pc > -L/home/balay/petsc/arch-linux-c-debug/lib -lpetsc -L/home/balay/soft/mpich-3.4.2/lib -L/usr/lib/gcc/x86_64-redhat-linux/11 -llapack -lblas -lm -lX11 -lstdc++ -ldl -lmpifort -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -ldl > > > And more example usages in share/petsc/Makefile.user > > Satish > > > On Tue, 22 Feb 2022, Barry Smith wrote: > > > Bruce, > > > > Can you please send the PkgConfig calls that you make to get the PETSc values? 
And then exactly what PETSc PkgConfig returns. > > > > Thanks > > > > Barry > > > > > > > On Feb 22, 2022, at 11:03 AM, Palmer, Bruce J via petsc-users wrote: > > > > > > Hi, > > > > > > We recently switched the CMake configuration on our GridPACK application to use the PkgConfig utility instead of Jeb Brown?s FindPETSc.cmake module. This seems to work on a number of platforms but it is failing to link on others. It appears that the build cannot find the LAPACK and BLAS libraries. The PETSc library I?m linking to (v3.16.3) was configured with -download-f2cblaslapack so it should have these libraries, but when I try and link one of the test applications in GridPACK I get the errors > > > > > > /share/apps/gcc/6.1.0/bin/g++ -pthread -g -rdynamic CMakeFiles/greetings.dir/test/greetings.cpp.o -o greetings -Wl,-rpath,/qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib ../math/libgridpack_math.a libgridpack_parallel.a ../timer/libgridpack_timer.a ../environment/libgridpack_environment.a ../math/libgridpack_math.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_mpi.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_serialization.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_random.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_filesystem.a /pic/projects/gridpack/software/boost_1_65_0/lib/libboost_system.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga++.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libga.a /pic/projects/gridpack/software/ga-5.7/build_pr/lib/libarmci.a -lrt /usr/lib64/librt.so /usr/lib64/libdl.so /qfs/projects/ops/rh6/openmpi/3.0.1/gcc/6.1.0/lib/libmpi.so ../timer/libgridpack_timer. a libgri dpack_ > parallel.a ../configuration/libgridpack_configuration.a /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a > > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(zstart.o): In function `petscinitializef_': > > > /pic/projects/gridpack/software/petsc-3.16.3/src/sys/objects/ftn-custom/zstart.c:280: undefined reference to `mpi_init_' > > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N': > > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1462: undefined reference to `zgemv_' > > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1475: undefined reference to `zgemv_' > > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1478: undefined reference to `zgemv_' > > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o): In function `MatSolve_SeqBAIJ_N_NaturalOrdering': > > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1407: undefined reference to `zgemv_' > > > /pic/projects/gridpack/software/petsc-3.16.3/src/mat/impls/baij/seq/baijfact.c:1420: undefined reference to `zgemv_' > > > /pic/projects/gridpack/software/petsc-3.16.3/linux-openmpi-gnu-cxx-complex-opt/lib/libpetsc.a(baijfact.o):/pic/projects/gridpack/software/petsc-3.16.3/sr > > > > > > I suspect that the reason it worked for others and not for me is that they had viable blas and lapack libraries in their path and I don?t. Is there anything special you need to do to make sure that the build is pointed at the libraries that get created with the -download-f2cblaslapack option? 
> > > > > > Bruce Palmer > > > > > > From Bruce.Palmer at pnnl.gov Tue Feb 22 12:31:42 2022 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 22 Feb 2022 18:31:42 +0000 Subject: [petsc-users] Configuring with CMake In-Reply-To: References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> Message-ID: The static versions of the variables exist (PETSC_STATIC), but they appear to have the same values as the non-static variables. As I mentioned, I'm a complete novice at pkgconfig, but it looks like if you could add the contents of Libs.private to the link line, you'd be in business. Any idea how to access this information from CMake? Bruce ?On 2/22/22, 10:22 AM, "Satish Balay" wrote: https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcmake.org%2Fcmake%2Fhelp%2Flatest%2Fmodule%2FFindPkgConfig.html&data=04%7C01%7CBruce.Palmer%40pnnl.gov%7Ca582313d16214bb7771708d9f63037c4%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C637811509227495697%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=xE81D5%2FNF5PR3KSc48FD9CQSkQ%2F%2BdLqe8wKNjvW7xOM%3D&reserved=0 >>> Two sets of values exist: One for the common case ( = ) and another for the information pkg-config provides when called with the --static option ( = _STATIC). <<< So perhaps CMAKE is already setting the _STATIC variant of (for PETSC_LIB or equivalent) variable that's currently used)? Satish On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > The contents of the petsc.pc file are listed below. It looks good to me. The Libs.private variable seems to include the -lf2clapack and -lf2cblas libraries. I don't know how this info gets propagated up the build chain. 
> > > > > > Bruce Palmer > > > > > > From balay at mcs.anl.gov Tue Feb 22 12:39:39 2022 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 22 Feb 2022 12:39:39 -0600 (CST) Subject: [petsc-users] Configuring with CMake In-Reply-To: References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> Message-ID: <249144c-4765-c4d9-529-e3891c5caabf@mcs.anl.gov> You can run 'pkg-config --static --libs PETSC_DIR/PETSC_ARCH/lib/pkgconfig/petsc.pc' to verify if pkg-config is able to obtain 'Libs.private' values. And then you would need help from someone who can debug cmake - on why PETSC_STATIC set by cmake does not reflect this value [as it should - per the FindPkgConfig doc] [sorry - I don't understand cmake - or how one would debug cmake issues] Satish On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > The static versions of the variables exist (PETSC_STATIC), but they appear to have the same values as the non-static variables. > > As I mentioned, I'm a complete novice at pkgconfig, but it looks like if you could add the contents of Libs.private to the link line, you'd be in business. Any idea how to access this information from CMake? > > Bruce > > ?On 2/22/22, 10:22 AM, "Satish Balay" wrote: > > https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcmake.org%2Fcmake%2Fhelp%2Flatest%2Fmodule%2FFindPkgConfig.html&data=04%7C01%7CBruce.Palmer%40pnnl.gov%7Ca582313d16214bb7771708d9f63037c4%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C637811509227495697%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=xE81D5%2FNF5PR3KSc48FD9CQSkQ%2F%2BdLqe8wKNjvW7xOM%3D&reserved=0 > > >>> > Two sets of values exist: One for the common case ( = ) and another for the information pkg-config provides when called with the --static option ( = _STATIC). > <<< > > So perhaps CMAKE is already setting the _STATIC variant of (for PETSC_LIB or equivalent) variable that's currently used)? > > Satish > > On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > > > The contents of the petsc.pc file are listed below. It looks good to me. The Libs.private variable seems to include the -lf2clapack and -lf2cblas libraries. I don't know how this info gets propagated up the build chain. 
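For reference, inspecting this from the CMake side only takes a few lines. The sketch below assumes petsc.pc is reachable through PKG_CONFIG_PATH and that the prefix handed to pkg_check_modules is PETSC, matching the variable names used in this thread; everything else (project layout, option names) is hypothetical rather than GridPACK's actual build:

# Hypothetical CMake fragment for inspecting what FindPkgConfig reports for petsc.pc.
# Run cmake with PKG_CONFIG_PATH=$PETSC_DIR/$PETSC_ARCH/lib/pkgconfig in the environment.
find_package(PkgConfig REQUIRED)
pkg_check_modules(PETSC REQUIRED petsc)

# FindPkgConfig fills both the default and the --static variants of each output variable.
message(STATUS "PETSC_LDFLAGS        = ${PETSC_LDFLAGS}")        # 'Libs:' only, i.e. -L... -lpetsc
message(STATUS "PETSC_STATIC_LDFLAGS = ${PETSC_STATIC_LDFLAGS}") # 'Libs:' plus 'Libs.private'

The difference between the two printed lists should be exactly the Libs.private entries shown above (-lf2clapack, -lf2cblas, the MPI Fortran libraries, and so on), which is precisely what the failing static link is missing.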
> > > > > > > > Bruce Palmer > > > > > > > > > > > > From Bruce.Palmer at pnnl.gov Tue Feb 22 14:35:40 2022 From: Bruce.Palmer at pnnl.gov (Palmer, Bruce J) Date: Tue, 22 Feb 2022 20:35:40 +0000 Subject: [petsc-users] Configuring with CMake In-Reply-To: <249144c-4765-c4d9-529-e3891c5caabf@mcs.anl.gov> References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> <249144c-4765-c4d9-529-e3891c5caabf@mcs.anl.gov> Message-ID: <241B5C7C-14E4-479E-A843-AA392AC1F17D@pnnl.gov> Argh, I'm an idiot. I can't write a proper print statement in CMake. The PETSC_STATIC_LDFLAGS variable is showing all the libraries so probably all I need to do is substitute that for PETSC_LDFLAGS in the GridPACK CMake build (once I find it) when the build is static. ?On 2/22/22, 10:39 AM, "Satish Balay" wrote: You can run 'pkg-config --static --libs PETSC_DIR/PETSC_ARCH/lib/pkgconfig/petsc.pc' to verify if pkg-config is able to obtain 'Libs.private' values. And then you would need help from someone who can debug cmake - on why PETSC_STATIC set by cmake does not reflect this value [as it should - per the FindPkgConfig doc] [sorry - I don't understand cmake - or how one would debug cmake issues] Satish On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > The static versions of the variables exist (PETSC_STATIC), but they appear to have the same values as the non-static variables. > > As I mentioned, I'm a complete novice at pkgconfig, but it looks like if you could add the contents of Libs.private to the link line, you'd be in business. Any idea how to access this information from CMake? > > Bruce > > On 2/22/22, 10:22 AM, "Satish Balay" wrote: > > https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcmake.org%2Fcmake%2Fhelp%2Flatest%2Fmodule%2FFindPkgConfig.html&data=04%7C01%7CBruce.Palmer%40pnnl.gov%7C5533bb688b344af7fb1a08d9f632b7dd%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C637811519962671755%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=O1tG6y4zgckU2lmyBM9F4hxiOhKNYz8O5WmFLH4k%2Fa0%3D&reserved=0 > > >>> > Two sets of values exist: One for the common case ( = ) and another for the information pkg-config provides when called with the --static option ( = _STATIC). > <<< > > So perhaps CMAKE is already setting the _STATIC variant of (for PETSC_LIB or equivalent) variable that's currently used)? > > Satish > > On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > > > The contents of the petsc.pc file are listed below. It looks good to me. The Libs.private variable seems to include the -lf2clapack and -lf2cblas libraries. I don't know how this info gets propagated up the build chain. 
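That substitution would look roughly like the fragment below. It is only a sketch: the thread does not show GridPACK's actual CMakeLists, so the greetings target is borrowed from the link line quoted earlier and BUILD_SHARED_LIBS stands in for whatever switch GridPACK really uses to decide that it is linking a static libpetsc.a:

# Hypothetical fragment: use the --static pkg-config link line for a static libpetsc.a.
find_package(PkgConfig REQUIRED)
pkg_check_modules(PETSC REQUIRED petsc)

add_executable(greetings test/greetings.cpp)
target_include_directories(greetings PRIVATE ${PETSC_INCLUDE_DIRS})

if(BUILD_SHARED_LIBS)
  target_link_libraries(greetings PRIVATE ${PETSC_LDFLAGS})
else()
  # Libs.private is needed as well: -lf2clapack -lf2cblas, the MPI Fortran libraries,
  # -lgfortran, etc., in the order pkg-config emits them.
  target_link_libraries(greetings PRIVATE ${PETSC_STATIC_LDFLAGS})
endif()

Because pkg-config already emits the libraries in dependency order, passing the whole PETSC_STATIC_LDFLAGS list through unchanged should resolve the mpi_init_ and zgemv_ undefined references above.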
> > > > > > > > Bruce Palmer > > > > > > > > > > > > From jed at jedbrown.org Tue Feb 22 15:09:45 2022 From: jed at jedbrown.org (Jed Brown) Date: Tue, 22 Feb 2022 14:09:45 -0700 Subject: [petsc-users] Configuring with CMake In-Reply-To: <241B5C7C-14E4-479E-A843-AA392AC1F17D@pnnl.gov> References: <1D44E11E-7219-4260-8B58-0274EE1BA3A6@pnnl.gov> <66DE95E7-67B2-4DEF-9C76-8DFF24647480@petsc.dev> <3649bdda-ac31-10d8-7951-4bbacfa6c05e@mcs.anl.gov> <249144c-4765-c4d9-529-e3891c5caabf@mcs.anl.gov> <241B5C7C-14E4-479E-A843-AA392AC1F17D@pnnl.gov> Message-ID: <87ee3utxpi.fsf@jedbrown.org> It would be good to report a reduced test case upstream. They may not fix it, but a lot of things related to static libraries don't work without coaxing and they'll never get fixed if people who use CMake with static libraries don't make their voices heard. "Palmer, Bruce J via petsc-users" writes: > Argh, I'm an idiot. I can't write a proper print statement in CMake. > > The PETSC_STATIC_LDFLAGS variable is showing all the libraries so probably all I need to do is substitute that for PETSC_LDFLAGS in the GridPACK CMake build (once I find it) when the build is static. > > ?On 2/22/22, 10:39 AM, "Satish Balay" wrote: > > You can run 'pkg-config --static --libs PETSC_DIR/PETSC_ARCH/lib/pkgconfig/petsc.pc' to verify if pkg-config is able to obtain 'Libs.private' values. > > And then you would need help from someone who can debug cmake - on why PETSC_STATIC set by cmake does not reflect this value [as it should - per the FindPkgConfig doc] > > [sorry - I don't understand cmake - or how one would debug cmake issues] > > Satish > > On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > > > The static versions of the variables exist (PETSC_STATIC), but they appear to have the same values as the non-static variables. > > > > As I mentioned, I'm a complete novice at pkgconfig, but it looks like if you could add the contents of Libs.private to the link line, you'd be in business. Any idea how to access this information from CMake? > > > > Bruce > > > > On 2/22/22, 10:22 AM, "Satish Balay" wrote: > > > > https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcmake.org%2Fcmake%2Fhelp%2Flatest%2Fmodule%2FFindPkgConfig.html&data=04%7C01%7CBruce.Palmer%40pnnl.gov%7C5533bb688b344af7fb1a08d9f632b7dd%7Cd6faa5f90ae240338c0130048a38deeb%7C0%7C0%7C637811519962671755%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=O1tG6y4zgckU2lmyBM9F4hxiOhKNYz8O5WmFLH4k%2Fa0%3D&reserved=0 > > > > >>> > > Two sets of values exist: One for the common case ( = ) and another for the information pkg-config provides when called with the --static option ( = _STATIC). > > <<< > > > > So perhaps CMAKE is already setting the _STATIC variant of (for PETSC_LIB or equivalent) variable that's currently used)? > > > > Satish > > > > On Tue, 22 Feb 2022, Palmer, Bruce J via petsc-users wrote: > > > > > The contents of the petsc.pc file are listed below. It looks good to me. The Libs.private variable seems to include the -lf2clapack and -lf2cblas libraries. I don't know how this info gets propagated up the build chain. 
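A reduced test case for such a report can be a complete two-file project: any tiny PETSc main.c plus a CMakeLists.txt along these lines (all names hypothetical, and CMAKE_C_COMPILER pointed at the same MPI wrapper used to build PETSc):

# Hypothetical minimal reproducer for FindPkgConfig with a static libpetsc.a.
cmake_minimum_required(VERSION 3.12)
project(petsc_pkgconfig_static C)

find_package(PkgConfig REQUIRED)
# Requires PKG_CONFIG_PATH to contain $PETSC_DIR/$PETSC_ARCH/lib/pkgconfig.
pkg_check_modules(PETSC REQUIRED petsc)

add_executable(main main.c)   # main.c: just PetscInitialize()/PetscFinalize()
target_include_directories(main PRIVATE ${PETSC_INCLUDE_DIRS})

# Linking against ${PETSC_LDFLAGS} reproduces the undefined references with a static
# libpetsc.a; switching to ${PETSC_STATIC_LDFLAGS} is the workaround discussed above.
target_link_libraries(main PRIVATE ${PETSC_LDFLAGS})

Configuring with, for example, cmake -DCMAKE_C_COMPILER=mpicc and PKG_CONFIG_PATH set, then building, should show the missing BLAS/LAPACK and MPI Fortran symbols.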
> > > > > > > > > > Bruce Palmer > > > > > > > > > > > > > > > > > > From berend.vanwachem at ovgu.de Wed Feb 23 03:21:29 2022 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Wed, 23 Feb 2022 10:21:29 +0100 Subject: [petsc-users] error in DMSetGlobalSection after update to v3.16.4 Message-ID: <1409ddbd-edd8-d841-5602-34acb371f289@ovgu.de> Dear PETSc Team, We are currently having a problem with the function DMSetGlobalSection - this worked with v3.16.2 and 3.16.1, but doesn't seem to work anymore in PETSc versions after 3.16.2. I've attached an example which replicates the issue. It creates a box with a random field is generated, saved and loaded. The code worked for PETSc v3.16.1 and v3.16.2 (main branch) but fails for the last main (and release) version v3.16.4. The issue arises in DMSetGlobalSection(sdm,s) (line 160) as it detects the cloned DM sdm as Null argument. I've also included the error message at the bottom. Are we doing something wrong? Or has the 'Cloning' of the dm changed recently? Thanks and best regards, Berend ********************** Error message *********************************** [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Null argument, when expecting valid pointer [0]PETSC ERROR: Null Object: Parameter # 1 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.16.4-1031-g8387aa1ef1 GIT Date: 2022-02-21 23:19:21 -0600 [0]PETSC ERROR: ./bin/restart_periodic on a arch-linux-c-opt named zeldovich by serbenlo Tue Feb 22 14:32:56 2022 [0]PETSC ERROR: Configure options --with-debugging=yes --with-errorchecking=yes --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib/x86_64-linux-gnu/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec [0]PETSC ERROR: #1 PetscSectionGetDof() at /usr/local/petsc_main/src/vec/is/section/interface/section.c:807 [0]PETSC ERROR: #2 DMDefaultSectionCheckConsistency_Internal() at /usr/local/petsc_main/src/dm/interface/dm.c:4520 [0]PETSC ERROR: #3 DMSetGlobalSection() at /usr/local/petsc_main/src/dm/interface/dm.c:4614 [0]PETSC ERROR: #4 main() at /home/serbenlo/DATA/Arbeit/OvGU_PostDoc_2021/Projects/MF_Restart/Periodic_DM/restart-periodic/src/main.c:161 [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-example.c Type: text/x-csrc Size: 6551 bytes Desc: not available URL: From facklerpw at ornl.gov Wed Feb 23 08:47:03 2022 From: facklerpw at ornl.gov (Fackler, Philip) Date: Wed, 23 Feb 2022 14:47:03 +0000 Subject: [petsc-users] [EXTERNAL] Re: Kokkos Interface for PETSc In-Reply-To: References: <87pmno5bdr.fsf@jedbrown.org> Message-ID: Thanks Jed, Satish, and Richard for the quick and thorough responses. 
Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: petsc-users on behalf of Richard Tran Mills via petsc-users Sent: Thursday, February 17, 2022 18:33 To: petsc-users Cc: Blondel, Sophie ; Roth, Philip ; xolotl-psi-development at lists.sourceforge.net Subject: [EXTERNAL] Re: [petsc-users] Kokkos Interface for PETSc Hi Philip, Sorry to be a bit late in my reply. Jed has explained the gist of what's involved with using the Kokkos/Kokkos-kernels back-end for the PETSc solves, though, depending on exactly how Xolotl creates its vectors, there may be a bit of work required to ensure that the command-line options specifying the matrix and GPU types get applied to the right objects, and that non-GPU types are not being hardcoded somewhere (by a call like "DMSetMatType(dm,MATAIJ)"). In addition to looking at the -log_view output, since Xolotl uses TS you can specify "-ts_view" and look at the output that describes the solver hierarchy that Xolotl sets up. If matrix types are being set correctly, you'll see things like Mat Object: 1 MPI processes type: seqaijkokkos (I note that I've also sent a related message about getting Xolotl working with Kokkos back-ends on Summit to you, Sophie, and Phil in reply to old thread about this.) Were you also asking about how to use Kokkos for PETSc matrix assembly, or is that a question for later? Cheers, Richard On 2/15/22 09:07, Satish Balay via petsc-users wrote: Also - perhaps the following info might be useful Satish ---- balay at sb /home/balay/petsc (main=) $ git grep -l download-kokkos-kernels config/examples config/examples/arch-ci-freebsd-cxx-cmplx-pkgs-dbg.py config/examples/arch-ci-linux-cuda-double.py config/examples/arch-ci-linux-gcc-ifc-cmplx.py config/examples/arch-ci-linux-hip-double.py config/examples/arch-ci-linux-pkgs-dbg-ftn-interfaces.py config/examples/arch-ci-linux-pkgs-valgrind.py config/examples/arch-ci-osx-cxx-pkgs-opt.py config/examples/arch-nvhpc.py config/examples/arch-olcf-crusher.py config/examples/arch-olcf-spock.py balay at sb /home/balay/petsc (main=) $ git grep -l "requires:.*kokkos_kernels" src/ksp/ksp/tests/ex3.c src/ksp/ksp/tests/ex43.c src/ksp/ksp/tests/ex60.c src/ksp/ksp/tutorials/ex7.c src/mat/tests/ex123.c src/mat/tests/ex132.c src/mat/tests/ex2.c src/mat/tests/ex250.c src/mat/tests/ex251.c src/mat/tests/ex252.c src/mat/tests/ex254.c src/mat/tests/ex5.c src/mat/tests/ex62.c src/mat/tutorials/ex5k.kokkos.cxx src/snes/tests/ex13.c src/snes/tutorials/ex13.c src/snes/tutorials/ex3k.kokkos.cxx src/snes/tutorials/ex56.c src/ts/utils/dmplexlandau/tutorials/ex1.c src/ts/utils/dmplexlandau/tutorials/ex1f90.F90 src/ts/utils/dmplexlandau/tutorials/ex2.c src/vec/vec/tests/ex21.c src/vec/vec/tests/ex22.c src/vec/vec/tests/ex23.c src/vec/vec/tests/ex28.c src/vec/vec/tests/ex34.c src/vec/vec/tests/ex37.c src/vec/vec/tests/ex38.c src/vec/vec/tests/ex4.c src/vec/vec/tests/ex43.c src/vec/vec/tests/ex60.c src/vec/vec/tutorials/ex1.c balay at sb /home/balay/petsc (main=) $ On Tue, 15 Feb 2022, Satish Balay via petsc-users wrote: Also - best to use petsc repo - 'main' branch. 
And for install on crusher - check config/examples/arch-olcf-crusher.py Satish On Tue, 15 Feb 2022, Jed Brown wrote: We need to make these docs more explicit, but the short answer is configure with --download-kokkos --download-kokkos-kernels and run almost any example with -dm_mat_type aijkokkos -dm_vec_type kokkos. If you run with -log_view, you should see that all the flops take place on the device and there are few host->device transfers. Message packing is done on the device and it'll use GPU-aware MPI. There are a few examples of residual evaluation and matrix assembly on the device using Kokkos. You can also see libCEED examples for assembly on the device into Kokkos matrices and vectors without touching host memory. "Fackler, Philip via petsc-users" writes: We're intending to transitioning the Xolotl interfaces with PETSc. I am hoping someone (can) point us to some documentation (and examples) for using PETSc's Kokkos-based interface. If this does not yet exist, then perhaps some slides (like the ones Richard Mills showed at the NE-SciDAC all-hands meeting) showing some examples could get us started. Thanks for any help that can be provided, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From qiyang at oakland.edu Wed Feb 23 06:00:33 2022 From: qiyang at oakland.edu (Qi Yang) Date: Wed, 23 Feb 2022 20:00:33 +0800 Subject: [petsc-users] Petsc support for BoomerAMG from PCHYPRE to run on Nvidia and AMD GPUs Message-ID: To whom it may concern, Hope this email finds you well. I am a student from Oakland University who is trying to use Petsc in my project. I found that the Petsc 3.16 now is supporting AMG preconditioner of hypre with gpu on Nvidia and AMD card, however, do you use openCL to realize it ? or do you realize it by using cuda on Nvidia and hip on AMD, if it is not openCL, how do you enable with flag? like -pc_hypre_gamg_mat_type aijcusparse or just -mat_type aijcusparse(which is already used in former edition)? Sorry for the disturbance, I did not find the answer in your document, looking forward to hearing from you soon. Appreciate your time. Best regards, Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Feb 23 11:21:06 2022 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 23 Feb 2022 12:21:06 -0500 Subject: [petsc-users] Petsc support for BoomerAMG from PCHYPRE to run on Nvidia and AMD GPUs In-Reply-To: References: Message-ID: Answered in petsc-maint at mcs.anl.gov Please do not post the same question in multiple venues, it is confusing and can result in no response. > On Feb 23, 2022, at 7:00 AM, Qi Yang wrote: > > To whom it may concern, > > Hope this email finds you well. > > I am a student from Oakland University who is trying to use Petsc in my project. > > I found that the Petsc 3.16 now is supporting AMG preconditioner of hypre with gpu on Nvidia and AMD card, however, do you use openCL to realize it ? or do you realize it by using cuda on Nvidia and hip on AMD, if it is not openCL, how do you enable with flag? like -pc_hypre_gamg_mat_type aijcusparse or just -mat_type aijcusparse(which is already used in former edition)? > > Sorry for the disturbance, I did not find the answer in your document, looking forward to hearing from you soon. > > Appreciate your time. 
> > Best regards, > Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Feb 23 11:23:07 2022 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 23 Feb 2022 12:23:07 -0500 Subject: [petsc-users] Petsc support for BoomerAMG from PCHYPRE to run on Nvidia and AMD GPUs In-Reply-To: References: Message-ID: On Wed, Feb 23, 2022 at 11:30 AM Qi Yang wrote: > To whom it may concern, > > Hope this email finds you well. > > I am a student from Oakland University who is trying to use Petsc in my > project. > > I found that the Petsc 3.16 now is supporting AMG preconditioner of hypre > with gpu on Nvidia and AMD card, however, do you use openCL to realize it ? > No > or do you realize it by using cuda on Nvidia and hip on AMD, if it is not > openCL, how do you enable with flag? like -pc_hypre_gamg_mat_type > aijcusparse or just -mat_type aijcusparse(which is already used in former > edition)? > See the docs to install PETSc with CUDA support, etc, https://petsc.org/release/overview/ You can look at examples that use hypre, but first get your code working with built-in solver on the CPU, then turn CUDA on (--with-cuda=1) and use built-in solvers with CUDA (eg, -mat_type aijcusparse), Then configure with hypre (--download-hypre) and try it with -pc_type hypre. Good luck, mark > > Sorry for the disturbance, I did not find the answer in your document, > looking forward to hearing from you soon. > > Appreciate your time. > > Best regards, > Qi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aduarteg at utexas.edu Wed Feb 23 17:00:52 2022 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Wed, 23 Feb 2022 17:00:52 -0600 Subject: [petsc-users] Reusing Jacobian across time steps Message-ID: Good morning PETSC team, I am currently using a TS object to advance a set of PDEs in time. However, the computation of the Jacobian is quite expensive and I wish to reuse it across time steps if possible. I am well aware of the options -snes_lag_jacobian and -snes_lag_jacobian_persists, but I do not quite understand how to combine them for what I want. In summary I want to compute the Jacobian only at the beginning of a given time step, reuse that Jacobian for N time steps, and then recompute again at the beginning of the next time step. With my current understanding I have been able to reuse it across time steps by recomputing every N snes iterations, however, this leads to recomputations in the middle of time steps which is not what I desire. Thank you, -Alfredo -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 23 17:44:50 2022 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 23 Feb 2022 18:44:50 -0500 Subject: [petsc-users] Reusing Jacobian across time steps In-Reply-To: References: Message-ID: On Wed, Feb 23, 2022 at 6:01 PM Alfredo J Duarte Gomez wrote: > Good morning PETSC team, > > I am currently using a TS object to advance a set of PDEs in time. > > However, the computation of the Jacobian is quite expensive and I wish to > reuse it across time steps if possible. > > I am well aware of the options -snes_lag_jacobian and > -snes_lag_jacobian_persists, but I do not quite understand how to combine > them for what I want. 
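A rough sketch of one way those two lag options can be combined for this use case; it anticipates the TSMonitor idea suggested in the reply that follows. The context struct, names, and interval are illustrative and untested:

```c
#include <petsc.h>

/* Sketch: keep the Jacobian frozen across nonlinear solves and trigger a
   rebuild only every "interval" time steps. */
typedef struct { PetscInt interval; } LagCtx;

static PetscErrorCode RelagJacobian(TS ts, PetscInt step, PetscReal time, Vec u, void *ctx)
{
  LagCtx        *lag = (LagCtx *)ctx;
  SNES           snes;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
  /* do not reset the lag counter at every SNESSolve */
  ierr = SNESSetLagJacobianPersists(snes, PETSC_TRUE);CHKERRQ(ierr);
  if (step % lag->interval == 0) {
    /* -2: rebuild the Jacobian at the next Newton step, then freeze it again */
    ierr = SNESSetLagJacobian(snes, -2);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

/* attached once before TSSolve, e.g.:
   LagCtx lag = {10};
   TSMonitorSet(ts, RelagJacobian, &lag, NULL); */
```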
> > In summary I want to compute the Jacobian only at the beginning of a given > time step, reuse that Jacobian for N time steps, and then recompute again > at the beginning of the next time step. > > With my current understanding I have been able to reuse it across time > steps by recomputing every N snes iterations, however, this leads to > recomputations in the middle of time steps which is not what I desire. > I think the right way to handle this is for the TS to set the lag to (-2, persist) at each timestep where it wants a recompute. I think putting this in a TSMonitor should work. Is that okay for you? Thanks, Matt > Thank you, > > -Alfredo > > > > -- > Alfredo Duarte > Graduate Research Assistant > The University of Texas at Austin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From k.sagiyama at imperial.ac.uk Thu Feb 24 05:07:29 2022 From: k.sagiyama at imperial.ac.uk (Sagiyama, Koki) Date: Thu, 24 Feb 2022 11:07:29 +0000 Subject: [petsc-users] DMView and DMLoad In-Reply-To: <0845e501-e2cd-d7cc-58be-2803ee5ef6cd@ovgu.de> References: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> <056E066F-D596-4254-A44A-60BFFD30FE82@erdw.ethz.ch> <6c4e0656-db99-e9da-000f-ab9f7dd62c07@ovgu.de> <0845e501-e2cd-d7cc-58be-2803ee5ef6cd@ovgu.de> Message-ID: Dear Berend, DMClone() on a DMPlex object does not clone the PetscSection that that DMPlex object carries (https://petsc.org/main/docs/manualpages/DM/DMClone.html). I think you intended to do something like the following: ``` DMClone(dm, &sdm); PetscObjectSetName((PetscObject)sdm, "dmA"); DMSetLocalSection(sdm, section); ... DMCreateGlobalVector(sdm, &xGlobalVector); ... ``` Regarding save/load, current default I/O seems not working for some reason for periodic meshes as you reported. The latest implementation, however, seems working, so you can try using `-dm_plex_view_hdf5_storage_version 2.0.0` option when saving and see if it works. Thanks, Koki ________________________________ From: Berend van Wachem Sent: Thursday, February 17, 2022 9:06 AM To: Sagiyama, Koki ; Hapla Vaclav ; PETSc users list ; Lawrence Mitchell Subject: Re: [petsc-users] DMView and DMLoad Dear Koki, Many thanks for your help and sorry for the slow reply. I haven't been able to get it to work successfully. I have attached a small example that replicates the main features of our code. In this example a Box with one random field is generated, saved and loaded. The case works for non-periodic domains and fails for periodic ones. I've also included the error output at the bottom of this email. To switch between periodic and non-periodic, please comment/uncomment lines 47 to 52 in src/main.c. To compile, the files "compile" and "CMakeLists.txt" are included in a separate tar file, if you want to use this. Your library paths should be updated in the latter file. The PETSc main distribution is used. Many thanks for your help! Thanks and best regards, Berend. The error message with --with-debugging=no --with-errorchecking=no: [0]PETSC ERROR: --------------------- Error Message ------------------------------------------------ [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Number of coordinates loaded 3168 does not match number of vertices 1000 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.2-499-g9039b796d1 GIT Date: 2021-12-24 23:23:09 +0000 [0]PETSC ERROR: ./bin/restart_periodic on a arch-linux-c-opt named james by serbenlo Thu Dec 30 20:53:22 2021 [0]PETSC ERROR: Configure options --with-debugging=no --with-errorchecking=no --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib/x86_64-linux-gnu/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec [0]PETSC ERROR: #1 DMPlexCoordinatesLoad_HDF5_V0_Private() at /usr/local/petsc_main/src/dm/impls/plex/plexhdf5.c:1387 [0]PETSC ERROR: #2 DMPlexCoordinatesLoad_HDF5_Internal() at /usr/local/petsc_main/src/dm/impls/plex/plexhdf5.c:1419 [0]PETSC ERROR: #3 DMPlexCoordinatesLoad() at /usr/local/petsc_main/src/dm/impls/plex/plex.c:2070 [0]PETSC ERROR: #4 main() at /media/MANNHEIM/Arbeit/OvGU_PostDoc_2021/Projects/MF_Restart/Periodic_DM/restart-periodic/src/main.c:229 [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 62) - process 0 The error message with --with-debugging=yes --with-errorchecking=yes: [0]PETSC ERROR: --------------------- Error Message ------------------------------------------------- [0]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: --------------------- Error Message ------------------------------------------------- [1]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: Null Object: Parameter # 1 [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. [1]PETSC ERROR: Petsc Development GIT revision: v3.16.2-499-g9039b796d1 GIT Date: 2021-12-24 23:23:09 +0000 [1]PETSC ERROR: ./bin/restart_periodic on a arch-linux-c-opt named james by serbenlo Thu Dec 30 20:17:22 2021 [1]PETSC ERROR: Configure options --with-debugging=yes --with-errorchecking=yes --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib/x86_64-linux-gnu/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec [1]PETSC ERROR: #1 PetscSectionGetDof() at /usr/local/petsc_main/src/vec/is/section/interface/section.c:807 [1]PETSC ERROR: [0]PETSC ERROR: Null Object: Parameter # 1 [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.16.2-499-g9039b796d1 GIT Date: 2021-12-24 23:23:09 +0000 [0]PETSC ERROR: ./bin/restart_periodic on a arch-linux-c-opt named james by serbenlo Thu Dec 30 20:17:22 2021 [0]PETSC ERROR: Configure options --with-debugging=yes --with-errorchecking=yes --with-clean --download-metis=yes --download-parmetis=yes --download-hdf5 --download-p4est --download-triangle --download-tetgen --with-zlib-lib=/usr/lib/x86_64-linux-gnu/libz.a --with-zlib-include=/usr/include --with-mpi=yes --with-mpi-dir=/usr --with-mpiexec=/usr/bin/mpiexec #2 DMDefaultSectionCheckConsistency_Internal() at /usr/local/petsc_main/src/dm/interface/dm.c:4489 [1]PETSC ERROR: #3 DMSetGlobalSection() at /usr/local/petsc_main/src/dm/interface/dm.c:4583 [1]PETSC ERROR: [0]PETSC ERROR: #1 PetscSectionGetDof() at /usr/local/petsc_main/src/vec/is/section/interface/section.c:807 [0]PETSC ERROR: #4 main() at /media/MANNHEIM/Arbeit/OvGU_PostDoc_2021/Projects/MF_Restart/Periodic_DM/restart-periodic/src/main.c:164 [1]PETSC ERROR: No PETSc Option Table entries [1]PETSC ERROR: #2 DMDefaultSectionCheckConsistency_Internal() at /usr/local/petsc_main/src/dm/interface/dm.c:4489 [0]PETSC ERROR: #3 DMSetGlobalSection() at /usr/local/petsc_main/src/dm/interface/dm.c:4583 ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 85) - process 1 [0]PETSC ERROR: #4 main() at /media/MANNHEIM/Arbeit/OvGU_PostDoc_2021/Projects/MF_Restart/Periodic_DM/restart-periodic/src/main.c:164 [0]PETSC ERROR: No PETSc Option Table entries [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 85) - process 0 On 12/7/21 16:50, Sagiyama, Koki wrote: > Hi Berend, > > I made some small changes to your code to successfully compile it and > defined a periodic dm using DMPlexCreateBoxMesh(), but otherwise your > code worked fine. > I think we would like to see a complete minimal failing example. Can you > make the working example that I pasted in earlier email fail just by > modifying the dm(i.e., using the periodic mesh you are actually using)? > > Thanks, > Koki > ------------------------------------------------------------------------ > *From:* Berend van Wachem > *Sent:* Monday, December 6, 2021 3:39 PM > *To:* Sagiyama, Koki ; Hapla Vaclav > ; PETSc users list ; > Lawrence Mitchell > *Subject:* Re: [petsc-users] DMView and DMLoad > Dear Koki, > > Thanks for your email. In the example of your last email > DMPlexCoordinatesLoad() takes sF0 (PetscSF) as a third argument. In our > code this modification does not fix the error when loading a periodic > dm. Are we doing something wrong? I've included an example code at the > bottom of this email, including the error output. 
> > Thanks and best regards, > Berend > > > /**** Write DM + Vec restart ****/ > PetscViewerHDF5Open(PETSC_COMM_WORLD, "result", FILE_MODE_WRITE, &H5Viewer); > PetscObjectSetName((PetscObject)dm, "plexA"); > PetscViewerPushFormat(H5Viewer, PETSC_VIEWER_HDF5_PETSC); > DMPlexTopologyView(dm, H5Viewer); > DMPlexLabelsView(dm, H5Viewer); > DMPlexCoordinatesView(dm, H5Viewer); > PetscViewerPopFormat(H5Viewer); > > DM sdm; > PetscSection s; > > DMClone(dm, &sdm); > PetscObjectSetName((PetscObject)sdm, "dmA"); > DMGetGlobalSection(dm, &s); > DMSetGlobalSection(sdm, s); > DMPlexSectionView(dm, H5Viewer, sdm); > > Vec vec, vecOld; > PetscScalar *array, *arrayOld, *xVecArray, *xVecArrayOld; > PetscInt numPoints; > > DMGetGlobalVector(sdm, &vec); > DMGetGlobalVector(sdm, &vecOld); > > /*** Fill the vectors vec and vecOld ***/ > VecGetArray(vec, &array); > VecGetArray(vecOld, &arrayOld); > VecGetLocalSize(xGlobalVector, &numPoints); > VecGetArray(xGlobalVector, &xVecArray); > VecGetArray(xOldGlobalVector, &xVecArrayOld); > > for (i = 0; i < numPoints; i++) /* Loop over all internal mesh points */ > { > array[i] = xVecArray[i]; > arrayOld[i] = xVecArrayOld[i]; > } > > VecRestoreArray(vec, &array); > VecRestoreArray(vecOld, &arrayOld); > VecRestoreArray(xGlobalVector, &xVecArray); > VecRestoreArray(xOldGlobalVector, &xVecArrayOld); > > PetscObjectSetName((PetscObject)vec, "vecA"); > PetscObjectSetName((PetscObject)vecOld, "vecB"); > DMPlexGlobalVectorView(dm, H5Viewer, sdm, vec); > DMPlexGlobalVectorView(dm, H5Viewer, sdm, vecOld); > PetscViewerDestroy(&H5Viewer); > /*** end of writing ****/ > > /*** Load ***/ > PetscViewerHDF5Open(PETSC_COMM_WORLD, "result", FILE_MODE_READ, &H5Viewer); > DMCreate(PETSC_COMM_WORLD, &dm); > DMSetType(dm, DMPLEX); > PetscObjectSetName((PetscObject)dm, "plexA"); > PetscViewerPushFormat(H5Viewer, PETSC_VIEWER_HDF5_PETSC); > DMPlexTopologyLoad(dm, H5Viewer, &sfO); > DMPlexLabelsLoad(dm, H5Viewer); > DMPlexCoordinatesLoad(dm, H5Viewer, sfO); > PetscViewerPopFormat(H5Viewer); > > DMPlexDistribute(dm, Options->Mesh.overlap, &sfDist, &distributedDM); > if (distributedDM) { > DMDestroy(&dm); > dm = distributedDM; > PetscObjectSetName((PetscObject)dm, "plexA"); > } > > PetscSFCompose(sfO, sfDist, &sf); > PetscSFDestroy(&sfO); > PetscSFDestroy(&sfDist); > > DMClone(dm, &sdm); > PetscObjectSetName((PetscObject)sdm, "dmA"); > DMPlexSectionLoad(dm, H5Viewer, sdm, sf, &globalDataSF, &localDataSF); > > /** Load the Vectors **/ > DMGetGlobalVector(sdm, &Restart_xGlobalVector); > VecSet(Restart_xGlobalVector,0.0); > > PetscObjectSetName((PetscObject)Restart_xGlobalVector, "vecA"); > DMPlexGlobalVectorLoad(dm, H5Viewer, sdm, > globalDataSF,Restart_xGlobalVector); > DMGetGlobalVector(sdm, &Restart_xOldGlobalVector); > VecSet(Restart_xOldGlobalVector,0.0); > > PetscObjectSetName((PetscObject)Restart_xOldGlobalVector, "vecB"); > DMPlexGlobalVectorLoad(dm, H5Viewer, sdm, globalDataSF, > Restart_xOldGlobalVector); > > PetscViewerDestroy(&H5Viewer); > > > /**** The error message when loading is the following ************/ > > Creating and distributing mesh > [0]PETSC ERROR: --------------------- Error Message > -------------------------- > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Number of coordinates loaded 17128 does not match number > of vertices 8000 > [0]PETSC ERROR: See https://petsc.org/release/faq/ > for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.1-435-g007f11b901 > GIT Date: 2021-12-01 14:31:21 +0000 > [0]PETSC ERROR: ./MF3 on a linux-gcc-openmpi-opt named > ivt24.ads.uni-magdeburg.de by berend Mon Dec 6 16:11:21 2021 > [0]PETSC ERROR: Configure options --with-p4est=yes --with-partemis > --with-metis --with-debugging=no --download-metis=yes > --download-parmetis=yes --with-errorchecking=no --download-hdf5 > --download-zlib --download-p4est > [0]PETSC ERROR: #1 DMPlexCoordinatesLoad_HDF5_V0_Private() at > /home/berend/src/petsc_main/src/dm/impls/plex/plexhdf5.c:1387 > [0]PETSC ERROR: #2 DMPlexCoordinatesLoad_HDF5_Internal() at > /home/berend/src/petsc_main/src/dm/impls/plex/plexhdf5.c:1419 > [0]PETSC ERROR: #3 DMPlexCoordinatesLoad() at > /home/berend/src/petsc_main/src/dm/impls/plex/plex.c:2070 > [0]PETSC ERROR: #4 RestartMeshDM() at > /home/berend/src/eclipseworkspace/multiflow/src/io/restartmesh.c:81 > [0]PETSC ERROR: #5 CreateMeshDM() at > /home/berend/src/eclipseworkspace/multiflow/src/mesh/createmesh.c:61 > [0]PETSC ERROR: #6 main() at > /home/berend/src/eclipseworkspace/multiflow/src/general/main.c:132 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: --download-hdf5 > [0]PETSC ERROR: --download-metis=yes > [0]PETSC ERROR: --download-p4est > [0]PETSC ERROR: --download-parmetis=yes > [0]PETSC ERROR: --download-zlib > [0]PETSC ERROR: --with-debugging=no > [0]PETSC ERROR: --with-errorchecking=no > [0]PETSC ERROR: --with-metis > [0]PETSC ERROR: --with-p4est=yes > [0]PETSC ERROR: --with-partemis > [0]PETSC ERROR: -d results > [0]PETSC ERROR: -o run.mf > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 62. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > > > > > > On 11/19/21 00:26, Sagiyama, Koki wrote: >> Hi Berend, >> >> I was not able to reproduce the issue you are having, but the following >> 1D example (and similar 2D examples) worked fine for me using the latest >> PETSc. Please note that DMPlexCoordinatesLoad() now takes a PetscSF >> object as the third argument, but the default behavior is unchanged. 
>> >> /* test_periodic_io.c */ >> >> #include >> #include >> #include >> >> int main(int argc, char **argv) >> { >> DM dm; >> Vec coordinates; >> PetscViewer viewer; >> PetscViewerFormat format = PETSC_VIEWER_HDF5_PETSC; >> PetscSF sfO; >> PetscErrorCode ierr; >> >> ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr; >> /* Save */ >> ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "periodic_example.h5", >> FILE_MODE_WRITE, &viewer);CHKERRQ(ierr); >> { >> DM pdm; >> PetscInt dim = 1; >> const PetscInt faces[1] = {4}; >> DMBoundaryType periodicity[] = {DM_BOUNDARY_PERIODIC}; >> PetscInt overlap = 1; >> >> ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, dim, PETSC_FALSE, >> faces, NULL, NULL, periodicity, PETSC_TRUE, &dm);CHKERRQ(ierr); >> ierr = DMPlexDistribute(dm, overlap, NULL, &pdm);CHKERRQ(ierr); >> ierr = DMDestroy(&dm);CHKERRQ(ierr); >> dm = pdm; >> ierr = PetscObjectSetName((PetscObject)dm, "periodicDM");CHKERRQ(ierr); >> } >> ierr = DMGetCoordinates(dm, &coordinates);CHKERRQ(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD, "Coordinates before >> saving:\n");CHKERRQ(ierr); >> ierr = VecView(coordinates, NULL);CHKERRQ(ierr); >> ierr = PetscViewerPushFormat(viewer, format);CHKERRQ(ierr); >> ierr = DMPlexTopologyView(dm, viewer);CHKERRQ(ierr); >> ierr = DMPlexCoordinatesView(dm, viewer);CHKERRQ(ierr); >> ierr = PetscViewerPopFormat(viewer);CHKERRQ(ierr); >> ierr = DMDestroy(&dm);CHKERRQ(ierr); >> ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr); >> /* Load */ >> ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "periodic_example.h5", >> FILE_MODE_READ, &viewer);CHKERRQ(ierr); >> ierr = DMCreate(PETSC_COMM_WORLD, &dm);CHKERRQ(ierr); >> ierr = DMSetType(dm, DMPLEX);CHKERRQ(ierr); >> ierr = PetscObjectSetName((PetscObject)dm, "periodicDM");CHKERRQ(ierr); >> ierr = PetscViewerPushFormat(viewer, format);CHKERRQ(ierr); >> ierr = DMPlexTopologyLoad(dm, viewer, &sfO);CHKERRQ(ierr); >> ierr = DMPlexCoordinatesLoad(dm, viewer, sfO);CHKERRQ(ierr); >> ierr = PetscViewerPopFormat(viewer);CHKERRQ(ierr); >> ierr = DMGetCoordinates(dm, &coordinates);CHKERRQ(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD, "Coordinates after >> loading:\n");CHKERRQ(ierr); >> ierr = VecView(coordinates, NULL);CHKERRQ(ierr); >> ierr = PetscSFDestroy(&sfO);CHKERRQ(ierr); >> ierr = DMDestroy(&dm);CHKERRQ(ierr); >> ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr); >> ierr = PetscFinalize(); >> return ierr; >> } >> >> mpiexec -n 2 ./test_periodic_io >> >> Coordinates before saving: >> Vec Object: coordinates 2 MPI processes >> type: mpi >> Process [0] >> 0. >> Process [1] >> 0.25 >> 0.5 >> 0.75 >> Coordinates after loading: >> Vec Object: vertices 2 MPI processes >> type: mpi >> Process [0] >> 0. >> 0.25 >> 0.5 >> 0.75 >> Process [1] >> >> I would also like to note that, with the latest update, we can >> optionally load coordinates directly on the distributed dm as (using >> your notation): >> >> /* Distribute dm */ >> ... >> PetscSFCompose(sfO, sfDist, &sf); >> DMPlexCoordinatesLoad(dm, viewer, sf); >> >> To use this feature, we need to pass "-dm_plex_view_hdf5_storage_version >> 2.0.0" option when saving topology/coordinates. >> >> >> Thanks, >> Koki >> ------------------------------------------------------------------------ >> *From:* Berend van Wachem >> *Sent:* Wednesday, November 17, 2021 3:16 PM >> *To:* Hapla Vaclav ; PETSc users list >> ; Lawrence Mitchell ; Sagiyama, >> Koki >> *Subject:* Re: [petsc-users] DMView and DMLoad >> >> ******************* >> This email originates from outside Imperial. 
Do not click on links and >> attachments unless you recognise the sender. >> If you trust the sender, add them to your safe senders list >> https://spam.ic.ac.uk/SpamConsole/Senders.aspx > >> > to disable email >> stamping for this address. >> ******************* >> Dear Vaclav, Lawrence, Koki, >> >> Thanks for your help! Following your advice and following your example >> (https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5 > >> >) > >> >> we are able to save and load the DM with a wrapped Vector in h5 format >> (PETSC_VIEWER_HDF5_PETSC) successfully. >> >> For saving, we use something similar to: >> >> DMPlexTopologyView(dm, viewer); >> DMClone(dm, &sdm); >> ... >> DMPlexSectionView(dm, viewer, sdm); >> DMGetLocalVector(sdm, &vec); >> ... >> DMPlexLocalVectorView(dm, viewer, sdm, vec); >> >> and for loading: >> >> DMCreate(PETSC_COMM_WORLD, &dm); >> DMSetType(dm, DMPLEX); >> ... >> PetscViewerPushFormat(viewer, PETSC_VIEWER_HDF5_PETSC); >> DMPlexTopologyLoad(dm, viewer, &sfO); >> DMPlexLabelsLoad(dm, viewer); >> DMPlexCoordinatesLoad(dm, viewer); >> PetscViewerPopFormat(viewer); >> ... >> PetscSFCompose(sfO, sfDist, &sf); >> ... >> DMClone(dm, &sdm); >> DMPlexSectionLoad(dm, viewer, sdm, sf, &globalDataSF, &localDataSF); >> DMGetLocalVector(sdm, &vec); >> ... >> DMPlexLocalVectorLoad(dm, viewer, sdm, localDataSF, vec); >> >> >> This works fine for non-periodic DMs but for periodic cases the line: >> >> DMPlexCoordinatesLoad(dm, H5Viewer); >> >> delivers the error message: invalid argument and the number of loaded >> coordinates does not match the number of vertices. >> >> Is this a known shortcoming, or have we forgotten something to load >> periodic DMs? >> >> Best regards, >> >> Berend. >> >> >> >> On 9/22/21 20:59, Hapla Vaclav wrote: >>> To avoid confusions here, Berend seems to be specifically demanding XDMF >>> (PETSC_VIEWER_HDF5_XDMF). The stuff we are now working on is parallel >>> checkpointing in our own HDF5 format (PETSC_VIEWER_HDF5_PETSC), I will >>> make a series of MRs on this topic in the following days. >>> >>> For XDMF, we are specifically missing the ability to write/load DMLabels >>> properly. XDMF uses specific cell-local numbering for faces for >>> specification of face sets, and face-local numbering for specification >>> of edge sets, which is not great wrt DMPlex design. And ParaView doesn't >>> show any of these properly so it's hard to debug. Matt, we should talk >>> about this soon. >>> >>> Berend, for now, could you just load the mesh initially from XDMF and >>> then use our PETSC_VIEWER_HDF5_PETSC format for subsequent saving/loading? >>> >>> Thanks, >>> >>> Vaclav >>> >>>> On 17 Sep 2021, at 15:46, Lawrence Mitchell >>> >>> wrote: >>>> >>>> Hi Berend, >>>> >>>>> On 14 Sep 2021, at 12:23, Matthew Knepley >>>> >>> wrote: >>>>> >>>>> On Tue, Sep 14, 2021 at 5:15 AM Berend van Wachem >>>>> >>> wrote: >>>>> Dear PETSc-team, >>>>> >>>>> We are trying to save and load distributed DMPlex and its associated >>>>> physical fields (created with DMCreateGlobalVector) (Uvelocity, >>>>> VVelocity, ...) in HDF5_XDMF format. 
To achieve this, we do the >>>>> following: >>>>> >>>>> 1) save in the same xdmf.h5 file: >>>>> DMView( DM , H5_XDMF_Viewer ); >>>>> VecView( UVelocity, H5_XDMF_Viewer ); >>>>> >>>>> 2) load the dm: >>>>> DMPlexCreateFromfile(PETSC_COMM_WORLD, Filename, PETSC_TRUE, DM); >>>>> >>>>> 3) load the physical field: >>>>> VecLoad( UVelocity, H5_XDMF_Viewer ); >>>>> >>>>> There are no errors in the execution, but the loaded DM is distributed >>>>> differently to the original one, which results in the incorrect >>>>> placement of the values of the physical fields (UVelocity etc.) in the >>>>> domain. >>>>> >>>>> This approach is used to restart the simulation with the last saved DM. >>>>> Is there something we are missing, or there exists alternative routes to >>>>> this goal? Can we somehow get the IS of the redistribution, so we can >>>>> re-distribute the vector data as well? >>>>> >>>>> Many thanks, best regards, >>>>> >>>>> Hi Berend, >>>>> >>>>> We are in the midst of rewriting this. We want to support saving >>>>> multiple meshes, with fields attached to each, >>>>> and preserving the discretization (section) information, and allowing >>>>> us to load up on a different number of >>>>> processes. We plan to be done by October. Vaclav and I are doing this >>>>> in collaboration with Koki Sagiyama, >>>>> David Ham, and Lawrence Mitchell from the Firedrake team. >>>> >>>> The core load/save cycle functionality is now in PETSc main. So if >>>> you're using main rather than a release, you can get access to it now. >>>> This section of the manual shows an example of how to do >>>> thingshttps://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5 >>>> >> >> >>>> >>>> Let us know if things aren't clear! >>>> >>>> Thanks, >>>> >>>> Lawrence >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaog6 at lsec.cc.ac.cn Thu Feb 24 05:23:51 2022 From: zhaog6 at lsec.cc.ac.cn (=?UTF-8?B?6LW15Yia?=) Date: Thu, 24 Feb 2022 19:23:51 +0800 (GMT+08:00) Subject: [petsc-users] An issue about two parameter options in PETSc/GAMG Message-ID: <1ba3a140.1684e.17f2b791973.Coremail.zhaog6@lsec.cc.ac.cn> Dear Mark and PETSc team, I have a question when using PETSc/GAMG. For the parameter options `-pc_gamg_process_eq_limit ` and `-pc_gamg_coarse_eq_limit `, when or for what kind of problems can it bring significant performance improvement? In addition, generally, what relationship do the `a`, `b` and number of processes need to meet? If there are some papers or numerical examples that show the performance advantages and importance of using the both parameters, could you provide it to me? Thanks a lot. Best Regards, Gang Zhao From mfadams at lbl.gov Thu Feb 24 09:06:15 2022 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 24 Feb 2022 10:06:15 -0500 Subject: [petsc-users] An issue about two parameter options in PETSc/GAMG In-Reply-To: <1ba3a140.1684e.17f2b791973.Coremail.zhaog6@lsec.cc.ac.cn> References: <1ba3a140.1684e.17f2b791973.Coremail.zhaog6@lsec.cc.ac.cn> Message-ID: On Thu, Feb 24, 2022 at 6:24 AM ?? wrote: > Dear Mark and PETSc team, > > I have a question when using PETSc/GAMG. For the parameter options > `-pc_gamg_process_eq_limit ` and `-pc_gamg_coarse_eq_limit `, when or > for what kind of problems can it bring significant performance improvement? > In addition, generally, what relationship do the `a`, `b` and number of > processes need to meet? `-pc_gamg_process_eq_limit ` guides the process reduction on coarse grids. 
'a' is the number of equations that you want to aim for on each active process. So if you have 1000 processors and a = 100, then GAMG should make 10 processes active and leave the rest empty on that level. `-pc_gamg_coarse_eq_limit ` tells GAMG when to stop coarsening. So 'b' should be the size problem that you can solve (eg, not factor) fast on your machine and problem. You can run with '-info' and grep on GAMG to see what GAMG is doing with respect to the active number of processes, size of each level, average number of non-zeros per level. -log_view will print the time in the coarse grid solver. > > If there are some papers or numerical examples that show the performance > advantages and importance of using the both parameters, could you provide > it to me? Thanks a lot. > These parameters are not too important until you get to a very large scale, but you can play with them (on as large a problem as you can deal with so that you see something). -log_view prints info about time on each level that can help to see the effects of this, but it is a bit hard to interpret. Some of my old papers had some data on this but it is so machine specific, and problem specific, that you really just need to test it. Setup a problem, again as large as possible, start with the defaults and search for the minimum. It will be a convex function so it is not hard. The two parameters are almost orthogonal. Mark > > > Best Regards, > Gang Zhao > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Feb 24 15:00:59 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 24 Feb 2022 15:00:59 -0600 Subject: [petsc-users] NVIDIA HPC SDK and complex data type In-Reply-To: References: Message-ID: FYI: I am notified that the nvc compiler bug was fixed in nvhpc 22.2-0 --Junchao Zhang On Mon, Dec 20, 2021 at 8:19 AM Jonathan D. Halverson < halverson at princeton.edu> wrote: > Hi Junchao, > > Thank you very much for your quick work. The simple build procedure now > works. > > Jon > ------------------------------ > *From:* Junchao Zhang > *Sent:* Sunday, December 19, 2021 6:38 PM > *To:* petsc-users at mcs.anl.gov > *Cc:* Jonathan D. Halverson > *Subject:* Re: [petsc-users] NVIDIA HPC SDK and complex data type > > Since it will take a while for NVIDIA to fix the bug in their NVCHPC 21.11 > (December 2021), I added a workaround to the MR in petsc, > https://gitlab.com/petsc/petsc/-/merge_requests/4663 > I tested it and it works with debugging (-O0) or no debugging (-O, or > -O2). > > --Junchao Zhang > > > On Sat, Dec 18, 2021 at 7:51 PM Barry Smith wrote: > > > Yes, Junchao deserves a bounty from NVIDIA for this find. > > On Dec 18, 2021, at 8:22 PM, Matthew Knepley wrote: > > On Sat, Dec 18, 2021 at 7:03 PM Junchao Zhang > wrote: > > I found it is a NVIDIA C/C++ compiler bug. I can reproduce it with > > > Great find! 
> > Matt > > > #include > #include > #include > > typedef double _Complex PetscScalar; > typedef struct { > int row; > PetscScalar *valaddr; > } MatEntry2; > > int main(int arc, char** argv) > { > int i=2; > MatEntry2 *Jentry2 = (MatEntry2*)malloc(64*sizeof(MatEntry2)); > PetscScalar a=1, b=1; > > printf("sizeof(MatEntry2)=%lu\n",sizeof(MatEntry2)); > Jentry2[2].valaddr = (PetscScalar*)malloc(16*sizeof(PetscScalar)); > *(Jentry2[i].valaddr) = a*b; // Segfault > > free(Jentry2[2].valaddr); > free(Jentry2); > return 0; > } > > $ nvc -O0 -o test test.c > $ ./test > sizeof(MatEntry2)=16 > Segmentation fault (core dumped) > > If I change *(Jentry2[i].valaddr) = a*b; to > > PetscScalar *p = Jentry2[2].valaddr; > *p = a*b; > > Then the code works fine. Using -O0 to -O2 will also avoid this error for > this simple test, but not for PETSc. In PETSc, I could apply the above > silly trick, but I am not sure it is worth it. We should instead report it > to NVIDIA. > > Looking at the assembly code for the segfault line, we can find the > problem > movslq 52(%rsp), %rcx > movq 40(%rsp), %rax > movq 8(%rax,%rcx,8), %rax // Here %rax = &Jentry2, %rcx = i; The > instruction wrongly calculates Jentry2[2].valaddr as (%rax + %rcx*8)+8, > which should instead be (%rax + %rcx*16)+8 > vmovsd %xmm1, 8(%rax) > vmovsd %xmm0, (%rax) > > --Junchao Zhang > > > On Fri, Dec 17, 2021 at 7:58 PM Junchao Zhang > wrote: > > Hi, Jon, > I could reproduce the error exactly. I will have a look. > Thanks for reporting. > --Junchao Zhang > > > On Fri, Dec 17, 2021 at 2:56 PM Jonathan D. Halverson < > halverson at princeton.edu> wrote: > > Hello, > > We are unable to build PETSc using the NVIDIA HPC SDK and > --with-scalar-type=complex. Below is our procedure: > > $ module load nvhpc/21.11 > $ module load openmpi/nvhpc-21.11/4.1.2/64 > $ git clone -b release https://gitlab.com/petsc/petsc.git petsc; cd petsc > $ ./configure --with-debugging=1 --with-scalar-type=complex > PETSC_ARCH=openmpi-power > $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power all > $ make PETSC_DIR=/home/$USER/software/petsc PETSC_ARCH=openmpi-power check > > "make check" fails with a segmentation fault when running ex19. The > fortran test ex5f passes. > > The procedure above fails on x86_64 and POWER both running RHEL8. It also > fails using nvhpc 20.7. > > The procedure above works for "real" instead of "complex". > > A "hello world" MPI code using a complex data type works with our nvhpc > modules. > > The procedure above works successfully when GCC and an Open MPI library > built using GCC is used. > > The only trouble is the combination of PETSc with nvhpc and complex. Any > known issues? > > The build log for the procedure above is here: > https://tigress-web.princeton.edu/~jdh4/petsc_nvhpc_complex_17dec2021.log > > Jon > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gongding at cn.cogenda.com Thu Feb 24 20:56:56 2022 From: gongding at cn.cogenda.com (Gong Ding) Date: Fri, 25 Feb 2022 10:56:56 +0800 Subject: [petsc-users] mkl_cpardiso halt on petsc-3.16.x Message-ID: <4cd90f97-99e9-11e5-4c66-5d6559be74b6@cn.cogenda.com> Hi all, On petsc3.12, I successfully run the linear solver mkl cpardiso on parallel environment.? 
However, with exactly the same code, petsc-3.16 halts [0] PCSetUp(): Setting up PC for first time ---------------------------------------- MKL_CPARDISO Options: -mat_mkl_cpardiso_65 : Suggested number of threads to use within MKL_CPARDISO (None) -mat_mkl_cpardiso_66 : Maximum number of factors with identical sparsity structure that must be kept in memory at the same time (None) -mat_mkl_cpardiso_67 : Indicates the actual matrix for the solution phase (None) -mat_mkl_cpardiso_68 : Message level information (None) -mat_mkl_cpardiso_69 : Defines the matrix type (None) -mat_mkl_cpardiso_1 : Use default values (None) <--------- halts here Regards Gong Ding From bourdin at mcmaster.ca Fri Feb 25 09:20:18 2022 From: bourdin at mcmaster.ca (Blaise Bourdin) Date: Fri, 25 Feb 2022 15:20:18 +0000 Subject: [petsc-users] Message length unit in log_view Message-ID: <1FE574BD-10B7-4DF5-BB8F-049D6D7E258F@mcmaster.ca> Hi, What is the unit for message lengths in PetscLogView? MPI Messages: 3.500e+00 1.000 3.500e+00 7.000e+00 MPI Message Lengths: 9.200e+01 1.000 2.629e+01 1.840e+02 MPI Reductions: 4.000e+01 1.000 Are these bytes, words, KB? Regards, Blaise -- Professor, Department of Mathematics & Statistics Hamilton Hall room 409A, McMaster University 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada Tel. +1 (905) 525 9140 ext. 27243 From junchao.zhang at gmail.com Fri Feb 25 09:30:56 2022 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 25 Feb 2022 09:30:56 -0600 Subject: [petsc-users] Message length unit in log_view In-Reply-To: <1FE574BD-10B7-4DF5-BB8F-049D6D7E258F@mcmaster.ca> References: <1FE574BD-10B7-4DF5-BB8F-049D6D7E258F@mcmaster.ca> Message-ID: On Fri, Feb 25, 2022 at 9:20 AM Blaise Bourdin wrote: > Hi, > > What is the unit for message lengths in PetscLogView? > MPI Messages: 3.500e+00 1.000 3.500e+00 7.000e+00 > MPI Message Lengths: 9.200e+01 1.000 2.629e+01 1.840e+02 > MPI Reductions: 4.000e+01 1.000 > > Are these bytes, words, KB? > bytes. Yes, we need to make it clear. > > Regards, > Blaise > > -- > Professor, Department of Mathematics & Statistics > Hamilton Hall room 409A, McMaster University > 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada > Tel. +1 (905) 525 9140 ext. 27243 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From giavancini at usp.br Fri Feb 25 10:06:01 2022 From: giavancini at usp.br (Giovane Avancini) Date: Fri, 25 Feb 2022 13:06:01 -0300 Subject: [petsc-users] [KSP] PETSc not reporting a KSP fail when true residual is NaN Message-ID: Dear PETSc users, I'm working on an in-house code that solves the Navier-Stokes equation in a Lagrangian fashion for free surface flows. Because of the large distortions and pressure gradients, it is quite common to encounter some issues with iterative solvers for some time steps, and because of that, I implemented a function that changes the solver type based on the flag KSPConvergedReason. If this flag is negative after a call to KSPSolve, I solve the same linear system again using a direct method. The problem is that, sometimes, KSP keeps converging even though the residual is NaN, and because of that, I'm not able to identify the problem and change the solver, which leads to a solution vector equal to INF and obviously the code ends up crashing. Is it normal to observe this kind of behaviour?
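Until the reporting issue itself is understood, a small guard after KSPSolve can catch the NaN case even when the returned reason is positive. A minimal sketch (the names mirror the solveLinearSystem function attached below; this is a workaround for the symptom, not a fix of the underlying problem):

```c
#include <petsc.h>

/* Sketch: returns PETSC_TRUE if the iterative solve should be redone with a
   direct solver, even if KSPGetConvergedReason() came back positive. */
static PetscBool SolveNeedsFallback(KSP ksp, Vec x)
{
  KSPConvergedReason reason;
  PetscReal          rnorm, xnorm;

  KSPGetConvergedReason(ksp, &reason);
  KSPGetResidualNorm(ksp, &rnorm);  /* norm used by the convergence test */
  VecNorm(x, NORM_2, &xnorm);       /* catches an INF/NaN solution directly */
  return (PetscBool)(reason < 0 || PetscIsInfOrNanReal(rnorm) || PetscIsInfOrNanReal(xnorm));
}
```

In the attached function, this check could replace the plain reason > 0 test before deciding whether to rebuild the solver with LU/MUMPS.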
Please find attached the log produced with the options -ksp_monitor_lg_residualnorm -ksp_log -ksp_view -ksp_monitor_true_residual -ksp_converged_reason and the function that changes the solver. I'm currently using FGMRES and BJACOBI preconditioner with LU for each block. The problem still happens with ILU for example. We can see in the log file that for the time step 921, the true residual is NaN and within just one iteration, the solver fails and it gives the reason DIVERGED_PC_FAILED. I simply changed the solver to MUMPS and it converged for that time step. However, when solving time step 922 we can see that FGMRES converges while the true residual is NaN. Why is that possible? I would appreciate it if someone could clarify this issue to me. Kind regards, Giovane -- Giovane Avancini Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o Carlos, USP PhD researcher in Structural Engineering - School of Engineering of S?o Carlos. USP -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- void FluidDomain::solveLinearSystem(KSP& ksp, Mat& mat, Vec& rhs, Vec& solution) { auto start_timer = std::chrono::high_resolution_clock::now(); KSPReset(ksp); KSPSetOperators(ksp, mat, mat); PC pc; KSPGetPC(ksp, &pc); PetscBool isbjacobi; PetscObjectTypeCompare((PetscObject)pc, PCBJACOBI, &isbjacobi); if (isbjacobi) { PetscInt nlocal; KSP *subksp; PC subpc; KSPSetUp(ksp); PCBJacobiGetSubKSP(pc, &nlocal, NULL, &subksp); for (int i = 0; i < nlocal; i++) { KSPGetPC(subksp[i], &subpc); PCSetType(subpc, PCLU); // PCFactorReorderForNonzeroDiagonal(subpc, 1.0e-10); // PCFactorSetShiftType(subpc, MAT_SHIFT_NONZERO); //PCFactorSetShiftAmount(subpc, 1.0e-10); } } else { //PCFactorReorderForNonzeroDiagonal(pc, 1.0e-10); //PCFactorSetShiftType(pc, MAT_SHIFT_NONZERO); // PCFactorSetShiftAmount(pc, 1.0e-10); } PetscReal matnorm, vecnorm; VecNorm(rhs, NORM_INFINITY, &vecnorm); MatNorm(mat, NORM_INFINITY, &matnorm); PetscPrintf(PETSC_COMM_WORLD, "MatNorm: %g VecNorm: %g\n", (double)matnorm, (double)vecnorm); KSPSolve(ksp, rhs, solution); KSPView(ksp, PETSC_VIEWER_STDOUT_WORLD); KSPConvergedReason reason; KSPGetConvergedReason(ksp,&reason); PetscInt nit; KSPGetIterationNumber(ksp, &nit); PetscReal norm; KSPGetResidualNorm(ksp, &norm); auto end_timer = std::chrono::high_resolution_clock::now(); std::chrono::duration elapsed = end_timer - start_timer; if (reason > 0) { PetscPrintf(PETSC_COMM_WORLD, "Solver converged within %d iterations. Elapsed time: %f\n", nit, elapsed.count()); } else { if (reason == -3) PetscPrintf(PETSC_COMM_WORLD, "Solver convergence is very slow. Modifying the solver in order to improve the convergence...\n"); else PetscPrintf(PETSC_COMM_WORLD, "Solver diverged, reason %d. 
Modifying the solver in order to improve the convergence...\n", reason); KSP ksp2; KSPCreate(PETSC_COMM_WORLD, &ksp2); KSPSetType(ksp2, KSPPREONLY); KSPSetTolerances(ksp2, 1.0e-8, PETSC_DEFAULT, PETSC_DEFAULT, 5000); KSPGMRESSetRestart(ksp2, 30); PC pc2; KSPGetPC(ksp2, &pc2); PCSetType(pc2, PCLU); KSPSetOperators(ksp2, mat, mat); PetscObjectTypeCompare((PetscObject)pc2, PCBJACOBI, &isbjacobi); if (isbjacobi) { PetscInt nlocal; KSP *subksp; PC subpc; KSPSetUp(ksp2); PCBJacobiGetSubKSP(pc2, &nlocal, NULL, &subksp); for (int i = 0; i < nlocal; i++) { KSPGetPC(subksp[i], &subpc); //PCFactorSetShiftType(subpc, MAT_SHIFT_NONZERO); } } else { //PCFactorSetShiftType(pc2, MAT_SHIFT_NONZERO); } VecNorm(rhs, NORM_INFINITY, &vecnorm); MatNorm(mat, NORM_INFINITY, &matnorm); PetscPrintf(PETSC_COMM_WORLD, "MatNorm: %g VecNorm: %g\n", (double)matnorm, (double)vecnorm); KSPSolve(ksp2, rhs, solution); KSPGetConvergedReason(ksp2, &reason); KSPGetIterationNumber(ksp2, &nit); if (reason > 0) { PetscPrintf(PETSC_COMM_WORLD, "Solver converged within %d iterations.\n", nit); } else { PetscPrintf(PETSC_COMM_WORLD, "Changing the solver did not improve the convergence.\n"); } KSPDestroy(&ksp2); } } -------------- next part -------------- ----------------------- TIME STEP = 921, time = 0.184200 ----------------------- Mesh Regenerated. Elapsed time: 0.011534 Isolated nodes: 0 Assemble Linear System. Elapsed time: 0.023297 MatNorm: 3.04644e+06 VecNorm: 1305. 0 KSP unpreconditioned resid norm 1.466259843490e+04 true resid norm -nan ||r(i)||/||b|| -nan Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR KSP Object: 4 MPI processes type: fgmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=500, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 4 MPI processes type: bjacobi number of blocks = 4 Local solver information for first block is in the following KSP and PC objects on rank 0: Use -ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 2.90995 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=1091, cols=1091 package used to perform factorization: petsc total: nonzeros=58039, allocated nonzeros=58039 using I-node routines: found 364 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (sub_) 1 MPI processes type: seqaij rows=1091, cols=1091 total: nonzeros=19945, allocated nonzeros=19945 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 364 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=4362, cols=4362 total: nonzeros=88470, allocated nonzeros=88470 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 364 nodes, limit used is 5 Solver diverged, reason -11. Modifying the solver in order to improve the convergence... MatNorm: 3.04644e+06 VecNorm: 1305. 
Linear solve converged due to CONVERGED_ITS iterations 1 Solver converged within 1 iterations. Newton iteration: 0 - L2 Position Norm: 1.203626E-03 - L2 Pressure Norm: 2.537266E-01 Memory used by each processor: 36.636719 Mb Assemble Linear System. Elapsed time: 0.016010 MatNorm: 3.04644e+06 VecNorm: 0.0239994 0 KSP unpreconditioned resid norm 6.218477255232e-02 true resid norm -nan ||r(i)||/||b|| -nan Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR KSP Object: 4 MPI processes type: fgmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=500, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 4 MPI processes type: bjacobi number of blocks = 4 Local solver information for first block is in the following KSP and PC objects on rank 0: Use -ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 2.90995 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=1091, cols=1091 package used to perform factorization: petsc total: nonzeros=58039, allocated nonzeros=58039 using I-node routines: found 364 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (sub_) 1 MPI processes type: seqaij rows=1091, cols=1091 total: nonzeros=19945, allocated nonzeros=19945 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 364 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=4362, cols=4362 total: nonzeros=88470, allocated nonzeros=88470 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 364 nodes, limit used is 5 Solver diverged, reason -11. Modifying the solver in order to improve the convergence... MatNorm: 3.04644e+06 VecNorm: 0.0239994 Linear solve converged due to CONVERGED_ITS iterations 1 Solver converged within 1 iterations. Newton iteration: 1 - L2 Position Norm: 1.796085E-07 - L2 Pressure Norm: 9.187252E-02 Memory used by each processor: 36.695312 Mb Assemble Linear System. Elapsed time: 0.020556 MatNorm: 3.04644e+06 VecNorm: 2.81116e-06 0 KSP unpreconditioned resid norm 1.136884066004e-05 true resid norm -nan ||r(i)||/||b|| -nan Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to SUBPC_ERROR KSP Object: 4 MPI processes type: fgmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=500, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. 
right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 4 MPI processes type: bjacobi number of blocks = 4 Local solver information for first block is in the following KSP and PC objects on rank 0: Use -ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 2.90995 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=1091, cols=1091 package used to perform factorization: petsc total: nonzeros=58039, allocated nonzeros=58039 using I-node routines: found 364 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (sub_) 1 MPI processes type: seqaij rows=1091, cols=1091 total: nonzeros=19945, allocated nonzeros=19945 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 364 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=4362, cols=4362 total: nonzeros=88470, allocated nonzeros=88470 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 364 nodes, limit used is 5 Solver diverged, reason -11. Modifying the solver in order to improve the convergence... MatNorm: 3.04644e+06 VecNorm: 2.81116e-06 Linear solve converged due to CONVERGED_ITS iterations 1 Solver converged within 1 iterations. Newton iteration: 2 - L2 Position Norm: 1.868159E-12 - L2 Pressure Norm: 2.037029E-07 Memory used by each processor: 36.808594 Mb ----------------------- TIME STEP = 922, time = 0.184400 ----------------------- Mesh Regenerated. Elapsed time: 0.019474 Isolated nodes: 1 Assemble Linear System. 
Elapsed time: 0.030308 MatNorm: 3.04642e+06 VecNorm: 1305.09 0 KSP unpreconditioned resid norm 1.466597558465e+04 true resid norm -nan ||r(i)||/||b|| -nan 1 KSP unpreconditioned resid norm 3.992657613692e+02 true resid norm -nan ||r(i)||/||b|| -nan 2 KSP unpreconditioned resid norm 6.865492930467e+01 true resid norm -nan ||r(i)||/||b|| -nan 3 KSP unpreconditioned resid norm 1.488490448891e+01 true resid norm -nan ||r(i)||/||b|| -nan 4 KSP unpreconditioned resid norm 6.459160528254e+00 true resid norm -nan ||r(i)||/||b|| -nan 5 KSP unpreconditioned resid norm 2.684190657780e+00 true resid norm -nan ||r(i)||/||b|| -nan 6 KSP unpreconditioned resid norm 1.583730558735e+00 true resid norm -nan ||r(i)||/||b|| -nan 7 KSP unpreconditioned resid norm 7.857636392042e-01 true resid norm -nan ||r(i)||/||b|| -nan 8 KSP unpreconditioned resid norm 5.609287021479e-01 true resid norm -nan ||r(i)||/||b|| -nan 9 KSP unpreconditioned resid norm 4.240869629805e-01 true resid norm -nan ||r(i)||/||b|| -nan 10 KSP unpreconditioned resid norm 3.545861070917e-01 true resid norm -nan ||r(i)||/||b|| -nan 11 KSP unpreconditioned resid norm 2.796829041968e-01 true resid norm -nan ||r(i)||/||b|| -nan 12 KSP unpreconditioned resid norm 2.415853017221e-01 true resid norm -nan ||r(i)||/||b|| -nan 13 KSP unpreconditioned resid norm 1.933876557197e-01 true resid norm -nan ||r(i)||/||b|| -nan 14 KSP unpreconditioned resid norm 1.820288353613e-01 true resid norm -nan ||r(i)||/||b|| -nan 15 KSP unpreconditioned resid norm 1.657259644747e-01 true resid norm -nan ||r(i)||/||b|| -nan 16 KSP unpreconditioned resid norm 1.563463788745e-01 true resid norm -nan ||r(i)||/||b|| -nan 17 KSP unpreconditioned resid norm 1.272726963049e-01 true resid norm -nan ||r(i)||/||b|| -nan 18 KSP unpreconditioned resid norm 1.137797079759e-01 true resid norm -nan ||r(i)||/||b|| -nan 19 KSP unpreconditioned resid norm 8.582335118209e-02 true resid norm -nan ||r(i)||/||b|| -nan 20 KSP unpreconditioned resid norm 7.628931493998e-02 true resid norm -nan ||r(i)||/||b|| -nan 21 KSP unpreconditioned resid norm 5.901409359786e-02 true resid norm -nan ||r(i)||/||b|| -nan 22 KSP unpreconditioned resid norm 5.496262106550e-02 true resid norm -nan ||r(i)||/||b|| -nan 23 KSP unpreconditioned resid norm 4.367683601600e-02 true resid norm -nan ||r(i)||/||b|| -nan 24 KSP unpreconditioned resid norm 3.767769610963e-02 true resid norm -nan ||r(i)||/||b|| -nan 25 KSP unpreconditioned resid norm 2.758466841864e-02 true resid norm -nan ||r(i)||/||b|| -nan 26 KSP unpreconditioned resid norm 2.401068925144e-02 true resid norm -nan ||r(i)||/||b|| -nan 27 KSP unpreconditioned resid norm 1.918366114227e-02 true resid norm -nan ||r(i)||/||b|| -nan 28 KSP unpreconditioned resid norm 1.796891532704e-02 true resid norm -nan ||r(i)||/||b|| -nan 29 KSP unpreconditioned resid norm 1.646774691070e-02 true resid norm -nan ||r(i)||/||b|| -nan 30 KSP unpreconditioned resid norm 1.581043087339e-02 true resid norm -nan ||r(i)||/||b|| -nan 31 KSP unpreconditioned resid norm 1.451402784393e-02 true resid norm -nan ||r(i)||/||b|| -nan 32 KSP unpreconditioned resid norm 1.365719226793e-02 true resid norm -nan ||r(i)||/||b|| -nan 33 KSP unpreconditioned resid norm 1.221815466293e-02 true resid norm -nan ||r(i)||/||b|| -nan 34 KSP unpreconditioned resid norm 1.170507483612e-02 true resid norm -nan ||r(i)||/||b|| -nan 35 KSP unpreconditioned resid norm 1.112121419983e-02 true resid norm -nan ||r(i)||/||b|| -nan 36 KSP unpreconditioned resid norm 1.041368299534e-02 true resid norm -nan 
||r(i)||/||b|| -nan 37 KSP unpreconditioned resid norm 8.898468360233e-03 true resid norm -nan ||r(i)||/||b|| -nan 38 KSP unpreconditioned resid norm 7.828540090048e-03 true resid norm -nan ||r(i)||/||b|| -nan 39 KSP unpreconditioned resid norm 6.804894322652e-03 true resid norm -nan ||r(i)||/||b|| -nan 40 KSP unpreconditioned resid norm 5.932441731922e-03 true resid norm -nan ||r(i)||/||b|| -nan 41 KSP unpreconditioned resid norm 5.038590720204e-03 true resid norm -nan ||r(i)||/||b|| -nan 42 KSP unpreconditioned resid norm 4.352003569050e-03 true resid norm -nan ||r(i)||/||b|| -nan 43 KSP unpreconditioned resid norm 3.340851172402e-03 true resid norm -nan ||r(i)||/||b|| -nan 44 KSP unpreconditioned resid norm 2.489084471832e-03 true resid norm -nan ||r(i)||/||b|| -nan 45 KSP unpreconditioned resid norm 1.982062096221e-03 true resid norm -nan ||r(i)||/||b|| -nan 46 KSP unpreconditioned resid norm 1.543532665899e-03 true resid norm -nan ||r(i)||/||b|| -nan 47 KSP unpreconditioned resid norm 1.041250067680e-03 true resid norm -nan ||r(i)||/||b|| -nan 48 KSP unpreconditioned resid norm 7.072998665082e-04 true resid norm -nan ||r(i)||/||b|| -nan 49 KSP unpreconditioned resid norm 4.326826499956e-04 true resid norm -nan ||r(i)||/||b|| -nan 50 KSP unpreconditioned resid norm 3.114665876716e-04 true resid norm -nan ||r(i)||/||b|| -nan 51 KSP unpreconditioned resid norm 1.971230239174e-04 true resid norm -nan ||r(i)||/||b|| -nan 52 KSP unpreconditioned resid norm 1.513573312329e-04 true resid norm -nan ||r(i)||/||b|| -nan 53 KSP unpreconditioned resid norm 8.825285013709e-05 true resid norm -nan ||r(i)||/||b|| -nan Linear solve converged due to CONVERGED_RTOL iterations 53 KSP Object: 4 MPI processes type: fgmres restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=500, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 4 MPI processes type: bjacobi number of blocks = 4 Local solver information for first block is in the following KSP and PC objects on rank 0: Use -ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 3.8053 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=1089, cols=1089 package used to perform factorization: petsc total: nonzeros=77571, allocated nonzeros=77571 using I-node routines: found 363 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (sub_) 1 MPI processes type: seqaij rows=1089, cols=1089 total: nonzeros=20385, allocated nonzeros=20385 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 363 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=4353, cols=4353 total: nonzeros=88389, allocated nonzeros=88389 total number of mallocs used during MatSetValues calls=0 using I-node (on process 0) routines: found 363 nodes, limit used is 5 Solver converged within 53 iterations. 
Elapsed time: 0.112512 Newton iteration: 0 - L2 Position Norm: INF - L2 Pressure Norm: INF From mfadams at lbl.gov Fri Feb 25 10:36:33 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 25 Feb 2022 11:36:33 -0500 Subject: [petsc-users] [KSP] PETSc not reporting a KSP fail when true residual is NaN In-Reply-To: References: Message-ID: That is a bug. Others might have a better idea, but you could run with '-info :ksp' and see if you see any messages like "Linear solver has created a not a number (NaN) as the residual norm, declaring divergence \n" You could also run with -log_trace and see if it is using KSPConvergedDefault. I'm not sure if this is the method used given your parameters, but I think it is. Mark On Fri, Feb 25, 2022 at 11:06 AM Giovane Avancini via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc users, > > I'm working on an inhouse code that solves the Navier-Stokes equation in a > Lagrangian fashion for free surface flows. Because of the large distortions > and pressure gradients, it is quite common to encounter some issues with > iterative solvers for some time steps, and because of that, I implemented a > function that changes the solver type based on the flag KSPConvergedReason. > If this flag is negative after a call to KSPSolve, I solve the same linear > system again using a direct method. > > The problem is that, sometimes, KSP keeps converging even though the > residual is NaN, and because of that, I'm not able to identify the problem > and change the solver, which leads to a solution vector equals to INF and > obviously the code ends up crashing. Is it normal to observe this kind of > behaviour? > > Please find attached the log produced with the options > -ksp_monitor_lg_residualnorm -ksp_log -ksp_view -ksp_monitor_true_residual > -ksp_converged_reason and the function that changes the solver. I'm > currently using FGMRES and BJACOBI preconditioner with LU for each block. > The problem still happens with ILU for example. We can see in the log file > that for the time step 921, the true residual is NaN and within just one > iteration, the solver fails and it gives the reason DIVERGED_PC_FAILED. I > simply changed the solver to MUMPS and it converged for that time step. > However, when solving time step 922 we can see that FGMRES converges while > the true residual is NaN. Why is that possible? I would appreciate it if > someone could clarify this issue to me. > > Kind regards, > Giovane > > > > -- > Giovane Avancini > Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o > Carlos, USP > > PhD researcher in Structural Engineering - School of Engineering of S?o > Carlos. USP > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Feb 25 10:37:14 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 25 Feb 2022 11:37:14 -0500 Subject: [petsc-users] [KSP] PETSc not reporting a KSP fail when true residual is NaN In-Reply-To: References: Message-ID: On Fri, Feb 25, 2022 at 11:06 AM Giovane Avancini via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc users, > > I'm working on an inhouse code that solves the Navier-Stokes equation in a > Lagrangian fashion for free surface flows. Because of the large distortions > and pressure gradients, it is quite common to encounter some issues with > iterative solvers for some time steps, and because of that, I implemented a > function that changes the solver type based on the flag KSPConvergedReason. 
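(A minimal, hypothetical sketch of that kind of reason-based fallback, for reference — standard PETSc C calls, but SolveWithFallback and the choice of MUMPS are purely illustrative and not the poster's actual in-house code:)

    #include <petscksp.h>

    PetscErrorCode SolveWithFallback(KSP ksp, Vec b, Vec x)
    {
      KSPConvergedReason reason;
      PC                 pc;
      PetscErrorCode     ierr;

      PetscFunctionBeginUser;
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
      ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
      if (reason < 0) {                             /* any KSP_DIVERGED_* reason */
        ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
        ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
        ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
        ierr = PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
        ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);   /* retry with a direct solver */
      }
      PetscFunctionReturn(0);
    }

Such a fallback can only trigger if the converged reason actually comes back negative, which is exactly what fails to happen in the case reported in this thread.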
> If this flag is negative after a call to KSPSolve, I solve the same linear > system again using a direct method. > > The problem is that, sometimes, KSP keeps converging even though the > residual is NaN, and because of that, I'm not able to identify the problem > and change the solver, which leads to a solution vector equals to INF and > obviously the code ends up crashing. Is it normal to observe this kind of > behaviour? > > Please find attached the log produced with the options > -ksp_monitor_lg_residualnorm -ksp_log -ksp_view -ksp_monitor_true_residual > -ksp_converged_reason and the function that changes the solver. I'm > currently using FGMRES and BJACOBI preconditioner with LU for each block. > The problem still happens with ILU for example. We can see in the log file > that for the time step 921, the true residual is NaN and within just one > iteration, the solver fails and it gives the reason DIVERGED_PC_FAILED. I > simply changed the solver to MUMPS and it converged for that time step. > However, when solving time step 922 we can see that FGMRES converges while > the true residual is NaN. Why is that possible? I would appreciate it if > someone could clarify this issue to me. > We check for NaN or Inf, for example, in KSPCheckDot(). if you have the KSP set to error ( https://petsc.org/main/docs/manualpages/KSP/KSPSetErrorIfNotConverged.html) then we throw an error, but the return codes do not seem to be checked in your implementation. If not, then we set the flag for divergence. Thanks, Matt > Kind regards, > Giovane > > > > -- > Giovane Avancini > Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o > Carlos, USP > > PhD researcher in Structural Engineering - School of Engineering of S?o > Carlos. USP > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Feb 25 10:48:22 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 25 Feb 2022 11:48:22 -0500 Subject: [petsc-users] [KSP] PETSc not reporting a KSP fail when true residual is NaN In-Reply-To: References: Message-ID: Giovane, Thanks for the complete report. It looks like we may be missing a check in our FGMRES implementation that allows the iteration to continue after a NaN/Inf. I will explain how we handle the checking and then attach a patch that you can apply to see if it resolves the problem. Whenever our KSP solvers compute a norm we check after that calculation to verify that the norm is not an Inf or Nan. This is an inexpensive global check across all MPI ranks because immediately after the norm computation all ranks that share the KSP have the same value. If the norm is a Inf or Nan we "short-circuit" the KSP solve and return immediately with an appropriate not converged code. A quick eye-ball inspection of the FGMRES code found a missing check. You can apply the attached patch file in the PETSC_DIR with patch -p1 < fgmres.patch make libs then rerun your code and see if it now handles the Inf/NaN correctly. If so we'll patch our release branch with the fix. Barry > Giovane > On Feb 25, 2022, at 11:06 AM, Giovane Avancini via petsc-users wrote: > > Dear PETSc users, > > I'm working on an inhouse code that solves the Navier-Stokes equation in a Lagrangian fashion for free surface flows. 
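(To make the check Barry describes above concrete — a simplified sketch of the general pattern only, using PETSc's private KSP header; this is not the contents of fgmres.patch:)

    #include <petsc/private/kspimpl.h>   /* internal header, needed for ksp->reason */

    /* After a Krylov method computes a residual norm, PETSc-style code guards
       against Inf/NaN and short-circuits the solve with a divergence reason. */
    static PetscErrorCode CheckNormSketch(KSP ksp, PetscReal res)
    {
      PetscFunctionBegin;
      if (PetscIsInfOrNanReal(res)) ksp->reason = KSP_DIVERGED_NANORINF;
      PetscFunctionReturn(0);
    }

Barry's point is that one such guard appears to be missing on the FGMRES path, so the iteration keeps going on a NaN residual instead of returning KSP_DIVERGED_NANORINF.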
Because of the large distortions and pressure gradients, it is quite common to encounter some issues with iterative solvers for some time steps, and because of that, I implemented a function that changes the solver type based on the flag KSPConvergedReason. If this flag is negative after a call to KSPSolve, I solve the same linear system again using a direct method. > > The problem is that, sometimes, KSP keeps converging even though the residual is NaN, and because of that, I'm not able to identify the problem and change the solver, which leads to a solution vector equals to INF and obviously the code ends up crashing. Is it normal to observe this kind of behaviour? > > Please find attached the log produced with the options -ksp_monitor_lg_residualnorm -ksp_log -ksp_view -ksp_monitor_true_residual -ksp_converged_reason and the function that changes the solver. I'm currently using FGMRES and BJACOBI preconditioner with LU for each block. The problem still happens with ILU for example. We can see in the log file that for the time step 921, the true residual is NaN and within just one iteration, the solver fails and it gives the reason DIVERGED_PC_FAILED. I simply changed the solver to MUMPS and it converged for that time step. However, when solving time step 922 we can see that FGMRES converges while the true residual is NaN. Why is that possible? I would appreciate it if someone could clarify this issue to me. > > Kind regards, > Giovane > > > > -- > Giovane Avancini > Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o Carlos, USP > > PhD researcher in Structural Engineering - School of Engineering of S?o Carlos. USP > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fgmres.patch Type: application/octet-stream Size: 538 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at mcmaster.ca Fri Feb 25 11:43:14 2022 From: bourdin at mcmaster.ca (Blaise Bourdin) Date: Fri, 25 Feb 2022 17:43:14 +0000 Subject: [petsc-users] Message length unit in log_view In-Reply-To: References: <1FE574BD-10B7-4DF5-BB8F-049D6D7E258F@mcmaster.ca> Message-ID: <24314987-2D21-4BD8-975B-3249EF6600A7@mcmaster.ca> An HTML attachment was scrubbed... URL: From solomon.sundar.n at gmail.com Fri Feb 25 13:38:38 2022 From: solomon.sundar.n at gmail.com (Sundar Namala) Date: Fri, 25 Feb 2022 13:38:38 -0600 Subject: [petsc-users] Questions regarding nested field split Message-ID: Hi, I am currently using fieldsplit and I am creating the fields using ISCreateGeneral.programming is being carried out in FORTRAN. I have a couple of questions regarding fieldsplit in parallel. Do we need to create the index list of all the fields separately for each processor? For example, say I have 3 fields and the indices for field_0 is 0-99, field_1 is 100-299 and field_2 is 300-349. In case of 2 processors do I have to specify the indices for the first processor as field_0 is 0-99, field_1 is 100-174 and field_2 is null. On the second processor field_0 is null, field_1 is 175-299 and field_2 is 300-349. my second question is if the indices need to be listed separately how do you assign the null index list using ISCreateGeneral. Thanks, Sundar. -------------- next part -------------- An HTML attachment was scrubbed... 
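(A sketch of the per-rank index sets for the exact layout described in the question — in C for brevity; the Fortran calls take the same arguments plus the trailing ierr. As the reply below confirms, a rank that owns no entries of a field simply passes a zero-length array. Here pc is assumed to be an existing PCFIELDSPLIT preconditioner and error checking is omitted:)

    IS          is0, is1, is2;
    PetscInt   *idx0, *idx1, *idx2, n0, n1, n2, i;
    PetscMPIInt rank;

    MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
    /* rank 0: field_0 = 0..99, field_1 = 100..174, nothing of field_2
       rank 1: nothing of field_0, field_1 = 175..299, field_2 = 300..349 */
    if (rank == 0) { n0 = 100; n1 = 75;  n2 = 0;  }
    else           { n0 = 0;   n1 = 125; n2 = 50; }
    PetscMalloc3(n0, &idx0, n1, &idx1, n2, &idx2);
    for (i = 0; i < n0; i++) idx0[i] = i;
    for (i = 0; i < n1; i++) idx1[i] = 100 + (rank ? 75 : 0) + i;
    for (i = 0; i < n2; i++) idx2[i] = 300 + i;
    ISCreateGeneral(PETSC_COMM_WORLD, n0, idx0, PETSC_COPY_VALUES, &is0);
    ISCreateGeneral(PETSC_COMM_WORLD, n1, idx1, PETSC_COPY_VALUES, &is1);
    ISCreateGeneral(PETSC_COMM_WORLD, n2, idx2, PETSC_COPY_VALUES, &is2); /* n2 = 0 on rank 0 is fine */
    PCFieldSplitSetIS(pc, "0", is0);
    PCFieldSplitSetIS(pc, "1", is1);
    PCFieldSplitSetIS(pc, "2", is2);
    PetscFree3(idx0, idx1, idx2);
    ISDestroy(&is0); ISDestroy(&is1); ISDestroy(&is2);  /* PCFieldSplitSetIS keeps its own reference */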
URL: From knepley at gmail.com Fri Feb 25 13:53:26 2022 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 25 Feb 2022 14:53:26 -0500 Subject: [petsc-users] Questions regarding nested field split In-Reply-To: References: Message-ID: On Fri, Feb 25, 2022 at 2:40 PM Sundar Namala wrote: > Hi, I am currently using fieldsplit and I am creating the fields using > ISCreateGeneral.programming is being carried out in FORTRAN. I have a > couple of questions regarding fieldsplit in parallel. > > Do we need to create the index list of all the fields separately for each > processor? > > For example, say I have 3 fields and the indices for field_0 is 0-99, > field_1 is 100-299 and field_2 is 300-349. In case of 2 processors do I > have to specify the indices for the first processor as field_0 is 0-99, > field_1 is 100-174 and field_2 is null. On the second processor field_0 is > null, field_1 is 175-299 and field_2 is 300-349. > Yes, each process specifies its indices independently (and for each field). > my second question is if the indices need to be listed separately how do > you assign the null index list using ISCreateGeneral. > You can use an IS with size 0. Thanks, Matt > Thanks, > Sundar. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From giavancini at usp.br Fri Feb 25 14:59:38 2022 From: giavancini at usp.br (Giovane Avancini) Date: Fri, 25 Feb 2022 17:59:38 -0300 Subject: [petsc-users] [KSP] PETSc not reporting a KSP fail when true residual is NaN In-Reply-To: References: Message-ID: Mark, Matthew and Barry, Thank you all for the quick responses. Others might have a better idea, but you could run with '-info :ksp' and see if you see any messages like "Linear solver has created a not a number (NaN) as the residual norm, declaring divergence \n" You could also run with -log_trace and see if it is using KSPConvergedDefault. I'm not sure if this is the method used given your parameters, but I think it is. Mark, I ran with both options. I didn't get any messages like "linear solver has created a not a number..." when using -info: ksp. When turning on -log_trace, I could verify that it is using KSPConvergedDefault but what does it mean exactly? When FGMRES converges with the true residual being NaN, I get the following message: [0] KSPConvergedDefault(): Linear solver has converged. Residual norm 8.897908325511e-05 is less than relative tolerance 1.000000000000e-08 times initial right hand side norm 1.466597558465e+04 at iteration 53. No information about NaN whatsoever. We check for NaN or Inf, for example, in KSPCheckDot(). if you have the KSP set to error ( https://petsc.org/main/docs/manualpages/KSP/KSPSetErrorIfNotConverged.html) then we throw an error, but the return codes do not seem to be checked in your implementation. If not, then we set the flag for divergence. Matthew, I do not check the return code in this case because I don't want PETSc to stop if an error occurs during the solving step. I just want to know that it didn't converge and treat this error inside my code. The problem is that the flag for divergence is not always being set when FGMRES is not converging. I was just wondering why it was set during time step 921 and why not for time step 922 as well. Thanks for the complete report. 
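(Spelling out the return-code route Matt mentions above, next to the reason-based route already in use — a hedged sketch only, where ksp, b, x are the existing solver objects; with the default error handler PETSc will still print its error traceback before control returns to the calling code:)

    PetscErrorCode ierr;

    ierr = KSPSetErrorIfNotConverged(ksp, PETSC_TRUE);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);          /* deliberately not wrapped in CHKERRQ */
    if (ierr) {
      /* non-convergence (including a NaN/Inf residual caught by the library)
         now surfaces here as a nonzero error code; switch solvers and retry */
    }

Of course this route, too, relies on the library noticing the NaN in the first place, which is the missing piece discussed next.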
It looks like we may be missing a check in our FGMRES implementation that allows the iteration to continue after a NaN/Inf. I will explain how we handle the checking and then attach a patch that you can apply to see if it resolves the problem. Whenever our KSP solvers compute a norm we check after that calculation to verify that the norm is not an Inf or Nan. This is an inexpensive global check across all MPI ranks because immediately after the norm computation all ranks that share the KSP have the same value. If the norm is a Inf or Nan we "short-circuit" the KSP solve and return immediately with an appropriate not converged code. A quick eye-ball inspection of the FGMRES code found a missing check. You can apply the attached patch file in the PETSC_DIR with patch -p1 < fgmres.patch make libs then rerun your code and see if it now handles the Inf/NaN correctly. If so we'll patch our release branch with the fix. Thank you for checking this, Barry. I applied the patch exactly the way you instructed, however, the problem is still happening. Is there a way to check if the patch was in fact applied? You can see in the attached screenshot the terminal information. Kind regards, Giovane Em sex., 25 de fev. de 2022 ?s 13:48, Barry Smith escreveu: > > Giovane, > > Thanks for the complete report. It looks like we may be missing a > check in our FGMRES implementation that allows the iteration to continue > after a NaN/Inf. > > I will explain how we handle the checking and then attach a patch that > you can apply to see if it resolves the problem. Whenever our KSP solvers > compute a norm we > check after that calculation to verify that the norm is not an Inf or Nan. > This is an inexpensive global check across all MPI ranks because > immediately after the norm computation all ranks that share the KSP have > the same value. If the norm is a Inf or Nan we "short-circuit" the KSP > solve and return immediately with an appropriate not converged code. A > quick eye-ball inspection of the FGMRES code found a missing check. > > You can apply the attached patch file in the PETSC_DIR with > > patch -p1 < fgmres.patch > make libs > > then rerun your code and see if it now handles the Inf/NaN correctly. If > so we'll patch our release branch with the fix. > > Barry > > > > Giovane > > > > On Feb 25, 2022, at 11:06 AM, Giovane Avancini via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Dear PETSc users, > > I'm working on an inhouse code that solves the Navier-Stokes equation in a > Lagrangian fashion for free surface flows. Because of the large distortions > and pressure gradients, it is quite common to encounter some issues with > iterative solvers for some time steps, and because of that, I implemented a > function that changes the solver type based on the flag KSPConvergedReason. > If this flag is negative after a call to KSPSolve, I solve the same linear > system again using a direct method. > > The problem is that, sometimes, KSP keeps converging even though the > residual is NaN, and because of that, I'm not able to identify the problem > and change the solver, which leads to a solution vector equals to INF and > obviously the code ends up crashing. Is it normal to observe this kind of > behaviour? > > Please find attached the log produced with the options > -ksp_monitor_lg_residualnorm -ksp_log -ksp_view -ksp_monitor_true_residual > -ksp_converged_reason and the function that changes the solver. I'm > currently using FGMRES and BJACOBI preconditioner with LU for each block. 
> The problem still happens with ILU for example. We can see in the log file > that for the time step 921, the true residual is NaN and within just one > iteration, the solver fails and it gives the reason DIVERGED_PC_FAILED. I > simply changed the solver to MUMPS and it converged for that time step. > However, when solving time step 922 we can see that FGMRES converges while > the true residual is NaN. Why is that possible? I would appreciate it if > someone could clarify this issue to me. > > Kind regards, > Giovane > > > > -- > Giovane Avancini > Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o > Carlos, USP > > PhD researcher in Structural Engineering - School of Engineering of S?o > Carlos. USP > > > > -- Giovane Avancini Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o Carlos, USP PhD researcher in Structural Engineering - School of Engineering of S?o Carlos. USP -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ----------------------- TIME STEP = 922, time = 0.184400 ----------------------- Mesh Regenerated. Elapsed time: 0.010348 Isolated nodes: 1 [0] 8412.37 Event begin: MatAssemblyBegin [0] 8412.37 Event begin: BuildTwoSidedF [0] 8412.37 Event begin: BuildTwoSided [0] 8412.37 Event end: BuildTwoSided [0] 8412.37 Event end: BuildTwoSidedF [0] 8412.37 Event end: MatAssemblyBegin [0] 8412.37 Event begin: MatAssemblyEnd [0] 8412.38 Event begin: VecSet [0] 8412.38 Event end: VecSet [0] 8412.38 Event begin: SFSetGraph [0] 8412.38 Event end: SFSetGraph [0] 8412.38 Event end: MatAssemblyEnd [0] 8412.38 Event begin: VecAssemblyBegin [0] 8412.38 Event begin: BuildTwoSidedF [0] 8412.38 Event begin: BuildTwoSided [0] 8412.38 Event end: BuildTwoSided [0] 8412.38 Event end: BuildTwoSidedF [0] 8412.38 Event end: VecAssemblyBegin [0] 8412.38 Event begin: VecAssemblyEnd [0] 8412.38 Event end: VecAssemblyEnd Assemble Linear System. 
Elapsed time: 0.015676 [0] 8412.38 Event begin: MatView [0] 8412.38 Event end: MatView [0] 8412.38 Event begin: SFSetGraph [0] 8412.38 Event end: SFSetGraph [0] 8412.38 Event begin: SFSetUp [0] 8412.38 Event begin: BuildTwoSided [0] 8412.38 Event end: BuildTwoSided [0] 8412.38 Event end: SFSetUp [0] 8412.38 Event begin: SFReduceBegin [0] 8412.38 Event begin: SFPack [0] 8412.38 Event end: SFPack [0] 8412.38 Event end: SFReduceBegin [0] 8412.38 Event begin: SFReduceEnd [0] 8412.38 Event begin: SFUnpack [0] 8412.38 Event end: SFUnpack [0] 8412.38 Event end: SFReduceEnd [0] 8412.38 Event begin: VecSet [0] 8412.38 Event end: VecSet [0] 8412.38 Event begin: VecScatterBegin [0] 8412.38 Event begin: SFSetUp [0] 8412.38 Event begin: BuildTwoSided [0] 8412.38 Event end: BuildTwoSided [0] 8412.38 Event end: SFSetUp [0] 8412.38 Event begin: SFPack [0] 8412.38 Event end: SFPack [0] 8412.38 Event end: VecScatterBegin [0] 8412.38 Event begin: VecScatterEnd [0] 8412.38 Event begin: SFUnpack [0] 8412.38 Event end: SFUnpack [0] 8412.38 Event end: VecScatterEnd [0] 8412.38 Event begin: VecScatterBegin [0] 8412.38 Event begin: SFPack [0] 8412.38 Event end: SFPack [0] 8412.38 Event end: VecScatterBegin [0] 8412.38 Event begin: VecScatterEnd [0] 8412.38 Event begin: SFUnpack [0] 8412.38 Event end: SFUnpack [0] 8412.38 Event end: VecScatterEnd [0] 8412.38 Event begin: KSPSetUp [0] 8412.38 Event end: KSPSetUp [0] 8412.38 Event begin: PCSetUp [0] 8412.38 Event end: PCSetUp [0] 8412.38 Event begin: VecNorm [0] 8412.38 Event end: VecNorm MatNorm: 3.04642e+06 VecNorm: 1305.09 [0] 8412.38 Event begin: PCSetUpOnBlocks [0] 8412.38 Event begin: KSPSetUp [0] 8412.38 Event end: KSPSetUp [0] 8412.38 Event begin: PCSetUp [0] 8412.38 Event begin: MatGetOrdering [0] 8412.38 Event begin: MatGetRowIJ [0] 8412.38 Event end: MatGetRowIJ [0] 8412.38 Event end: MatGetOrdering [0] 8412.38 Event begin: MatLUFactorSym [0] 8412.38 Event end: MatLUFactorSym [0] 8412.38 Event begin: MatLUFactorNum [0] 8412.39 Event end: MatLUFactorNum [0] 8412.39 Event end: PCSetUp [0] 8412.39 Event end: PCSetUpOnBlocks [0] 8412.39 Event begin: KSPSolve [0] 8412.39 Event begin: VecSet [0] 8412.39 Event end: VecSet [0] 8412.39 Event begin: VecCopy [0] 8412.39 Event end: VecCopy [0] 8412.39 Event begin: VecNorm [0] 8412.39 Event end: VecNorm [0] 8412.39 Event begin: VecCopy [0] 8412.39 Event end: VecCopy [0] 8412.39 Event begin: MatMult [0] 8412.39 Event begin: VecScatterBegin [0] 8412.39 Event begin: SFPack [0] 8412.39 Event end: SFPack [0] 8412.39 Event end: VecScatterBegin [0] 8412.39 Event begin: VecScatterEnd [0] 8412.39 Event begin: SFUnpack [0] 8412.39 Event end: SFUnpack [0] 8412.39 Event end: VecScatterEnd [0] 8412.39 Event end: MatMult [0] 8412.39 Event begin: VecAYPX [0] 8412.39 Event end: VecAYPX [0] 8412.39 Event begin: VecNorm [0] 8412.39 Event end: VecNorm [0] 8412.39 Event begin: VecNorm [0] 8412.39 Event end: VecNorm 0 KSP unpreconditioned resid norm 1.466597558465e+04 true resid norm -nan ||r(i)||/||b|| -nan [0] 8412.39 Event begin: VecScale [0] 8412.39 Event end: VecScale [0] 8412.39 Event begin: PCApply [0] 8412.39 Event begin: VecSet [0] 8412.39 Event end: VecSet [0] 8412.39 Event begin: MatSolve [0] 8412.39 Event end: MatSolve [0] 8412.39 Event end: PCApply [0] 8412.39 Event begin: MatMult [0] 8412.39 Event begin: VecScatterBegin [0] 8412.39 Event begin: SFPack [0] 8412.39 Event end: SFPack [0] 8412.39 Event end: VecScatterBegin [0] 8412.39 Event begin: VecScatterEnd [0] 8412.39 Event begin: SFUnpack [0] 8412.39 Event end: SFUnpack [0] 
8412.39 Event end: VecScatterEnd [0] 8412.39 Event end: MatMult [0] 8412.39 Event begin: VecMDot [0] 8412.39 Event end: VecMDot [0] 8412.39 Event begin: VecMAXPY [0] 8412.39 Event end: VecMAXPY [0] 8412.39 Event begin: VecNorm [0] 8412.39 Event end: VecNorm [0] 8412.39 Event begin: VecScale [0] 8412.39 Event end: VecScale [0] 8412.39 Event begin: VecSet [0] 8412.39 Event end: VecSet [0] 8412.39 Event begin: VecMAXPY [0] 8412.39 Event end: VecMAXPY [0] 8412.39 Event begin: VecCopy [0] 8412.39 Event end: VecCopy [0] 8412.39 Event begin: VecAXPY [0] 8412.39 Event end: VecAXPY [0] 8412.39 Event begin: MatMult [0] 8412.39 Event begin: VecScatterBegin [0] 8412.39 Event begin: SFPack [0] 8412.39 Event end: SFPack [0] 8412.39 Event end: VecScatterBegin [0] 8412.39 Event begin: VecScatterEnd [0] 8412.39 Event begin: SFUnpack [0] 8412.39 Event end: SFUnpack [0] 8412.39 Event end: VecScatterEnd [0] 8412.39 Event end: MatMult [0] 8412.39 Event begin: VecAYPX [0] 8412.39 Event end: VecAYPX [0] 8412.39 Event begin: VecNorm [0] 8412.39 Event end: VecNorm 1 KSP unpreconditioned resid norm 3.992657613675e+02 true resid norm -nan ||r(i)||/||b|| -nan [0] 8412.39 Event begin: PCApply [0] 8412.39 Event begin: VecSet [0] 8412.39 Event end: VecSet [0] 8412.39 Event begin: MatSolve [0] 8412.39 Event end: MatSolve [0] 8412.39 Event end: PCApply [0] 8412.39 Event begin: MatMult [0] 8412.39 Event begin: VecScatterBegin [0] 8412.39 Event begin: SFPack [0] 8412.39 Event end: SFPack [0] 8412.39 Event end: VecScatterBegin [0] 8412.39 Event begin: VecScatterEnd [0] 8413.76 Event begin: SFUnpack [0] 8413.76 Event end: SFUnpack [0] 8413.76 Event end: VecScatterEnd [0] 8413.76 Event end: MatMult [0] 8413.76 Event begin: VecMDot [0] 8413.76 Event end: VecMDot [0] 8413.76 Event begin: VecMAXPY [0] 8413.76 Event end: VecMAXPY [0] 8413.76 Event begin: VecNorm [0] 8413.76 Event end: VecNorm [0] 8413.76 Event begin: VecScale [0] 8413.76 Event end: VecScale [0] 8413.76 Event begin: VecSet [0] 8413.76 Event end: VecSet [0] 8413.76 Event begin: VecMAXPY [0] 8413.76 Event end: VecMAXPY [0] 8413.76 Event begin: VecCopy [0] 8413.76 Event end: VecCopy [0] 8413.76 Event begin: VecAXPY [0] 8413.76 Event end: VecAXPY [0] 8413.76 Event begin: MatMult [0] 8413.76 Event begin: VecScatterBegin [0] 8413.76 Event begin: SFPack [0] 8413.76 Event end: SFPack [0] 8413.76 Event end: VecScatterBegin [0] 8413.76 Event begin: VecScatterEnd [0] 8413.76 Event begin: SFUnpack [0] 8413.76 Event end: SFUnpack [0] 8413.76 Event end: VecScatterEnd [0] 8413.76 Event end: MatMult [0] 8413.76 Event begin: VecAYPX [0] 8413.76 Event end: VecAYPX [0] 8413.76 Event begin: VecNorm [0] 8413.76 Event end: VecNorm 2 KSP unpreconditioned resid norm 6.865492930488e+01 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.76 Event begin: PCApply [0] 8413.76 Event begin: VecSet [0] 8413.76 Event end: VecSet [0] 8413.76 Event begin: MatSolve [0] 8413.76 Event end: MatSolve [0] 8413.76 Event end: PCApply [0] 8413.76 Event begin: MatMult [0] 8413.76 Event begin: VecScatterBegin [0] 8413.76 Event begin: SFPack [0] 8413.76 Event end: SFPack [0] 8413.76 Event end: VecScatterBegin [0] 8413.76 Event begin: VecScatterEnd [0] 8413.76 Event begin: SFUnpack [0] 8413.76 Event end: SFUnpack [0] 8413.76 Event end: VecScatterEnd [0] 8413.76 Event end: MatMult [0] 8413.76 Event begin: VecMDot [0] 8413.76 Event end: VecMDot [0] 8413.76 Event begin: VecMAXPY [0] 8413.76 Event end: VecMAXPY [0] 8413.76 Event begin: VecNorm [0] 8413.76 Event end: VecNorm [0] 8413.76 Event begin: VecScale [0] 
8413.76 Event end: VecScale [0] 8413.76 Event begin: VecSet [0] 8413.76 Event end: VecSet [0] 8413.76 Event begin: VecMAXPY [0] 8413.76 Event end: VecMAXPY [0] 8413.76 Event begin: VecCopy [0] 8413.76 Event end: VecCopy [0] 8413.76 Event begin: VecAXPY [0] 8413.76 Event end: VecAXPY [0] 8413.76 Event begin: MatMult [0] 8413.76 Event begin: VecScatterBegin [0] 8413.76 Event begin: SFPack [0] 8413.76 Event end: SFPack [0] 8413.76 Event end: VecScatterBegin [0] 8413.76 Event begin: VecScatterEnd [0] 8413.76 Event begin: SFUnpack [0] 8413.76 Event end: SFUnpack [0] 8413.76 Event end: VecScatterEnd [0] 8413.76 Event end: MatMult [0] 8413.76 Event begin: VecAYPX [0] 8413.76 Event end: VecAYPX [0] 8413.76 Event begin: VecNorm [0] 8413.76 Event end: VecNorm 3 KSP unpreconditioned resid norm 1.488490448911e+01 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.76 Event begin: PCApply [0] 8413.76 Event begin: VecSet [0] 8413.76 Event end: VecSet [0] 8413.76 Event begin: MatSolve [0] 8413.76 Event end: MatSolve [0] 8413.76 Event end: PCApply [0] 8413.76 Event begin: MatMult [0] 8413.76 Event begin: VecScatterBegin [0] 8413.76 Event begin: SFPack [0] 8413.76 Event end: SFPack [0] 8413.76 Event end: VecScatterBegin [0] 8413.76 Event begin: VecScatterEnd [0] 8413.76 Event begin: SFUnpack [0] 8413.76 Event end: SFUnpack [0] 8413.76 Event end: VecScatterEnd [0] 8413.76 Event end: MatMult [0] 8413.76 Event begin: VecMDot [0] 8413.77 Event end: VecMDot [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecNorm [0] 8413.77 Event end: VecNorm [0] 8413.77 Event begin: VecScale [0] 8413.77 Event end: VecScale [0] 8413.77 Event begin: VecSet [0] 8413.77 Event end: VecSet [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecCopy [0] 8413.77 Event end: VecCopy [0] 8413.77 Event begin: VecAXPY [0] 8413.77 Event end: VecAXPY [0] 8413.77 Event begin: MatMult [0] 8413.77 Event begin: VecScatterBegin [0] 8413.77 Event begin: SFPack [0] 8413.77 Event end: SFPack [0] 8413.77 Event end: VecScatterBegin [0] 8413.77 Event begin: VecScatterEnd [0] 8413.77 Event begin: SFUnpack [0] 8413.77 Event end: SFUnpack [0] 8413.77 Event end: VecScatterEnd [0] 8413.77 Event end: MatMult [0] 8413.77 Event begin: VecAYPX [0] 8413.77 Event end: VecAYPX [0] 8413.77 Event begin: VecNorm [0] 8413.77 Event end: VecNorm 4 KSP unpreconditioned resid norm 6.459160528255e+00 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.77 Event begin: PCApply [0] 8413.77 Event begin: VecSet [0] 8413.77 Event end: VecSet [0] 8413.77 Event begin: MatSolve [0] 8413.77 Event end: MatSolve [0] 8413.77 Event end: PCApply [0] 8413.77 Event begin: MatMult [0] 8413.77 Event begin: VecScatterBegin [0] 8413.77 Event begin: SFPack [0] 8413.77 Event end: SFPack [0] 8413.77 Event end: VecScatterBegin [0] 8413.77 Event begin: VecScatterEnd [0] 8413.77 Event begin: SFUnpack [0] 8413.77 Event end: SFUnpack [0] 8413.77 Event end: VecScatterEnd [0] 8413.77 Event end: MatMult [0] 8413.77 Event begin: VecMDot [0] 8413.77 Event end: VecMDot [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecNorm [0] 8413.77 Event end: VecNorm [0] 8413.77 Event begin: VecScale [0] 8413.77 Event end: VecScale [0] 8413.77 Event begin: VecSet [0] 8413.77 Event end: VecSet [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecCopy [0] 8413.77 Event end: VecCopy [0] 8413.77 Event begin: VecAXPY [0] 8413.77 Event end: VecAXPY [0] 8413.77 Event 
begin: MatMult [0] 8413.77 Event begin: VecScatterBegin [0] 8413.77 Event begin: SFPack [0] 8413.77 Event end: SFPack [0] 8413.77 Event end: VecScatterBegin [0] 8413.77 Event begin: VecScatterEnd [0] 8413.77 Event begin: SFUnpack [0] 8413.77 Event end: SFUnpack [0] 8413.77 Event end: VecScatterEnd [0] 8413.77 Event end: MatMult [0] 8413.77 Event begin: VecAYPX [0] 8413.77 Event end: VecAYPX [0] 8413.77 Event begin: VecNorm [0] 8413.77 Event end: VecNorm 5 KSP unpreconditioned resid norm 2.684190657710e+00 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.77 Event begin: PCApply [0] 8413.77 Event begin: VecSet [0] 8413.77 Event end: VecSet [0] 8413.77 Event begin: MatSolve [0] 8413.77 Event end: MatSolve [0] 8413.77 Event end: PCApply [0] 8413.77 Event begin: MatMult [0] 8413.77 Event begin: VecScatterBegin [0] 8413.77 Event begin: SFPack [0] 8413.77 Event end: SFPack [0] 8413.77 Event end: VecScatterBegin [0] 8413.77 Event begin: VecScatterEnd [0] 8413.77 Event begin: SFUnpack [0] 8413.77 Event end: SFUnpack [0] 8413.77 Event end: VecScatterEnd [0] 8413.77 Event end: MatMult [0] 8413.77 Event begin: VecMDot [0] 8413.77 Event end: VecMDot [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecNorm [0] 8413.77 Event end: VecNorm [0] 8413.77 Event begin: VecScale [0] 8413.77 Event end: VecScale [0] 8413.77 Event begin: VecSet [0] 8413.77 Event end: VecSet [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecCopy [0] 8413.77 Event end: VecCopy [0] 8413.77 Event begin: VecAXPY [0] 8413.77 Event end: VecAXPY [0] 8413.77 Event begin: MatMult [0] 8413.77 Event begin: VecScatterBegin [0] 8413.77 Event begin: SFPack [0] 8413.77 Event end: SFPack [0] 8413.77 Event end: VecScatterBegin [0] 8413.77 Event begin: VecScatterEnd [0] 8413.77 Event begin: SFUnpack [0] 8413.77 Event end: SFUnpack [0] 8413.77 Event end: VecScatterEnd [0] 8413.77 Event end: MatMult [0] 8413.77 Event begin: VecAYPX [0] 8413.77 Event end: VecAYPX [0] 8413.77 Event begin: VecNorm [0] 8413.77 Event end: VecNorm 6 KSP unpreconditioned resid norm 1.583730558621e+00 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.77 Event begin: PCApply [0] 8413.77 Event begin: VecSet [0] 8413.77 Event end: VecSet [0] 8413.77 Event begin: MatSolve [0] 8413.77 Event end: MatSolve [0] 8413.77 Event end: PCApply [0] 8413.77 Event begin: MatMult [0] 8413.77 Event begin: VecScatterBegin [0] 8413.77 Event begin: SFPack [0] 8413.77 Event end: SFPack [0] 8413.77 Event end: VecScatterBegin [0] 8413.77 Event begin: VecScatterEnd [0] 8413.77 Event begin: SFUnpack [0] 8413.77 Event end: SFUnpack [0] 8413.77 Event end: VecScatterEnd [0] 8413.77 Event end: MatMult [0] 8413.77 Event begin: VecMDot [0] 8413.77 Event end: VecMDot [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecNorm [0] 8413.77 Event end: VecNorm [0] 8413.77 Event begin: VecScale [0] 8413.77 Event end: VecScale [0] 8413.77 Event begin: VecSet [0] 8413.77 Event end: VecSet [0] 8413.77 Event begin: VecMAXPY [0] 8413.77 Event end: VecMAXPY [0] 8413.77 Event begin: VecCopy [0] 8413.77 Event end: VecCopy [0] 8413.77 Event begin: VecAXPY [0] 8413.77 Event end: VecAXPY [0] 8413.77 Event begin: MatMult [0] 8413.77 Event begin: VecScatterBegin [0] 8413.77 Event begin: SFPack [0] 8413.77 Event end: SFPack [0] 8413.77 Event end: VecScatterBegin [0] 8413.77 Event begin: VecScatterEnd [0] 8413.77 Event begin: SFUnpack [0] 8413.77 Event end: SFUnpack [0] 8413.77 Event end: VecScatterEnd [0] 
8413.77 Event end: MatMult [0] 8413.77 Event begin: VecAYPX [0] 8413.77 Event end: VecAYPX [0] 8413.77 Event begin: VecNorm [0] 8413.78 Event end: VecNorm 7 KSP unpreconditioned resid norm 7.857636390655e-01 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.78 Event begin: PCApply [0] 8413.78 Event begin: VecSet [0] 8413.78 Event end: VecSet [0] 8413.78 Event begin: MatSolve [0] 8413.78 Event end: MatSolve [0] 8413.78 Event end: PCApply [0] 8413.78 Event begin: MatMult [0] 8413.78 Event begin: VecScatterBegin [0] 8413.78 Event begin: SFPack [0] 8413.78 Event end: SFPack [0] 8413.78 Event end: VecScatterBegin [0] 8413.78 Event begin: VecScatterEnd [0] 8413.78 Event begin: SFUnpack [0] 8413.78 Event end: SFUnpack [0] 8413.78 Event end: VecScatterEnd [0] 8413.78 Event end: MatMult [0] 8413.78 Event begin: VecMDot [0] 8413.78 Event end: VecMDot [0] 8413.78 Event begin: VecMAXPY [0] 8413.78 Event end: VecMAXPY [0] 8413.78 Event begin: VecNorm [0] 8413.78 Event end: VecNorm [0] 8413.78 Event begin: VecScale [0] 8413.78 Event end: VecScale [0] 8413.78 Event begin: VecSet [0] 8413.78 Event end: VecSet [0] 8413.78 Event begin: VecMAXPY [0] 8413.78 Event end: VecMAXPY [0] 8413.78 Event begin: VecCopy [0] 8413.78 Event end: VecCopy [0] 8413.78 Event begin: VecAXPY [0] 8413.78 Event end: VecAXPY [0] 8413.78 Event begin: MatMult [0] 8413.78 Event begin: VecScatterBegin [0] 8413.78 Event begin: SFPack [0] 8413.78 Event end: SFPack [0] 8413.78 Event end: VecScatterBegin [0] 8413.78 Event begin: VecScatterEnd [0] 8413.78 Event begin: SFUnpack [0] 8413.78 Event end: SFUnpack [0] 8413.78 Event end: VecScatterEnd [0] 8413.78 Event end: MatMult [0] 8413.78 Event begin: VecAYPX [0] 8413.78 Event end: VecAYPX [0] 8413.78 Event begin: VecNorm [0] 8413.78 Event end: VecNorm 8 KSP unpreconditioned resid norm 5.609287018950e-01 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.78 Event begin: PCApply [0] 8413.78 Event begin: VecSet [0] 8413.78 Event end: VecSet [0] 8413.78 Event begin: MatSolve [1] 8412.25 Event begin: VecSet [1] 8412.25 Event end: VecSet [1] 8412.25 Event begin: VecScatterBegin [1] 8412.25 Event begin: SFSetUp [1] 8412.25 Event end: SFSetUp [1] 8412.25 Event begin: SFPack [1] 8412.25 Event end: SFPack [1] 8412.25 Event end: VecScatterBegin [1] 8412.25 Event begin: VecScatterEnd [1] 8412.25 Event begin: SFUnpack [1] 8412.25 Event end: SFUnpack [1] 8412.25 Event end: VecScatterEnd [1] 8412.37 Event begin: MatAssemblyBegin [1] 8412.38 Event begin: BuildTwoSidedF [1] 8412.38 Event begin: BuildTwoSided [1] 8412.38 Event end: BuildTwoSided [1] 8412.38 Event end: BuildTwoSidedF [1] 8412.38 Event end: MatAssemblyBegin [1] 8412.38 Event begin: MatAssemblyEnd [1] 8412.38 Event begin: VecSet [1] 8412.38 Event end: VecSet [1] 8412.38 Event begin: SFSetGraph [1] 8412.38 Event end: SFSetGraph [1] 8412.38 Event end: MatAssemblyEnd [1] 8412.38 Event begin: VecAssemblyBegin [1] 8412.38 Event begin: BuildTwoSidedF [1] 8412.38 Event begin: BuildTwoSided [1] 8412.38 Event end: BuildTwoSided [1] 8412.38 Event end: BuildTwoSidedF [1] 8412.38 Event end: VecAssemblyBegin [1] 8412.38 Event begin: VecAssemblyEnd [1] 8412.38 Event end: VecAssemblyEnd [1] 8412.38 Event begin: MatView [1] 8412.38 Event end: MatView [1] 8412.38 Event begin: SFSetGraph [1] 8412.38 Event end: SFSetGraph [1] 8412.38 Event begin: SFSetUp [1] 8412.38 Event begin: BuildTwoSided [1] 8412.38 Event end: BuildTwoSided [1] 8412.38 Event end: SFSetUp [1] 8412.38 Event begin: SFReduceBegin [1] 8412.38 Event begin: SFPack [1] 8412.38 Event end: SFPack [1] 
8412.38 Event end: SFReduceBegin [1] 8412.38 Event begin: SFReduceEnd [1] 8412.38 Event begin: SFUnpack [1] 8412.38 Event end: SFUnpack [1] 8412.38 Event end: SFReduceEnd [1] 8412.38 Event begin: VecSet [1] 8412.38 Event end: VecSet [1] 8412.38 Event begin: VecScatterBegin [1] 8412.38 Event begin: SFSetUp [1] 8412.38 Event begin: BuildTwoSided [1] 8412.38 Event end: BuildTwoSided [1] 8412.38 Event end: SFSetUp [1] 8412.38 Event begin: SFPack [1] 8412.38 Event end: SFPack [1] 8412.38 Event end: VecScatterBegin [1] 8412.38 Event begin: VecScatterEnd [1] 8412.38 Event begin: SFUnpack [1] 8412.38 Event end: SFUnpack [1] 8412.38 Event end: VecScatterEnd [1] 8412.38 Event begin: VecScatterBegin [1] 8412.38 Event begin: SFPack [1] 8412.38 Event end: SFPack [1] 8412.38 Event end: VecScatterBegin [1] 8412.38 Event begin: VecScatterEnd [1] 8412.38 Event begin: SFUnpack [1] 8412.38 Event end: SFUnpack [1] 8412.38 Event end: VecScatterEnd [1] 8412.38 Event begin: KSPSetUp [1] 8412.38 Event end: KSPSetUp [1] 8412.38 Event begin: PCSetUp [1] 8412.38 Event end: PCSetUp [1] 8412.38 Event begin: VecNorm [1] 8412.38 Event end: VecNorm [1] 8412.38 Event begin: PCSetUpOnBlocks [1] 8412.38 Event begin: KSPSetUp [1] 8412.38 Event end: KSPSetUp [1] 8412.38 Event begin: PCSetUp [1] 8412.38 Event begin: MatGetOrdering [1] 8412.38 Event begin: MatGetRowIJ [1] 8412.38 Event end: MatGetRowIJ [1] 8412.38 Event end: MatGetOrdering [1] 8412.38 Event begin: MatLUFactorSym [1] 8412.38 Event end: MatLUFactorSym [1] 8412.38 Event begin: MatLUFactorNum [1] 8412.38 Event end: MatLUFactorNum [1] 8412.38 Event end: PCSetUp [1] 8412.38 Event end: PCSetUpOnBlocks [1] 8412.38 Event begin: KSPSolve [1] 8412.38 Event begin: VecSet [1] 8412.38 Event end: VecSet [1] 8412.38 Event begin: VecCopy [1] 8412.38 Event end: VecCopy [1] 8412.38 Event begin: VecNorm [1] 8412.39 Event end: VecNorm [1] 8412.39 Event begin: VecCopy [1] 8412.39 Event end: VecCopy [1] 8412.39 Event begin: MatMult [1] 8412.39 Event begin: VecScatterBegin [1] 8412.39 Event begin: SFPack [1] 8412.39 Event end: SFPack [1] 8412.39 Event end: VecScatterBegin [1] 8412.39 Event begin: VecScatterEnd [1] 8412.39 Event begin: SFUnpack [1] 8412.39 Event end: SFUnpack [1] 8412.39 Event end: VecScatterEnd [1] 8412.39 Event end: MatMult [1] 8412.39 Event begin: VecAYPX [1] 8412.39 Event end: VecAYPX [1] 8412.39 Event begin: VecNorm [1] 8412.39 Event end: VecNorm [1] 8412.39 Event begin: VecNorm [1] 8412.39 Event end: VecNorm [1] 8412.39 Event begin: VecScale [1] 8412.39 Event end: VecScale [1] 8412.39 Event begin: PCApply [1] 8412.39 Event begin: VecSet [1] 8412.39 Event end: VecSet [1] 8412.39 Event begin: MatSolve [1] 8412.39 Event end: MatSolve [1] 8412.39 Event end: PCApply [1] 8412.39 Event begin: MatMult [1] 8412.39 Event begin: VecScatterBegin [1] 8412.39 Event begin: SFPack [1] 8412.39 Event end: SFPack [1] 8412.39 Event end: VecScatterBegin [1] 8412.39 Event begin: VecScatterEnd [1] 8412.39 Event begin: SFUnpack [1] 8412.39 Event end: SFUnpack [1] 8412.39 Event end: VecScatterEnd [1] 8412.39 Event end: MatMult [1] 8412.39 Event begin: VecMDot [1] 8412.39 Event end: VecMDot [1] 8412.39 Event begin: VecMAXPY [1] 8412.39 Event end: VecMAXPY [1] 8412.39 Event begin: VecNorm [1] 8412.39 Event end: VecNorm [1] 8412.39 Event begin: VecScale [1] 8412.39 Event end: VecScale [1] 8412.39 Event begin: VecSet [1] 8412.39 Event end: VecSet [1] 8412.39 Event begin: VecMAXPY [1] 8412.39 Event end: VecMAXPY [1] 8412.39 Event begin: VecCopy [1] 8412.39 Event end: VecCopy [1] 8412.39 Event 
begin: VecAXPY [1] 8412.39 Event end: VecAXPY [1] 8412.39 Event begin: MatMult [1] 8412.39 Event begin: VecScatterBegin [1] 8412.39 Event begin: SFPack [1] 8412.39 Event end: SFPack [1] 8412.39 Event end: VecScatterBegin [1] 8412.39 Event begin: VecScatterEnd [1] 8412.39 Event begin: SFUnpack [1] 8412.39 Event end: SFUnpack [1] 8412.39 Event end: VecScatterEnd [1] 8412.39 Event end: MatMult [1] 8412.39 Event begin: VecAYPX [1] 8412.39 Event end: VecAYPX [1] 8412.39 Event begin: VecNorm [1] 8412.39 Event end: VecNorm [1] 8412.39 Event begin: PCApply [1] 8412.39 Event begin: VecSet [1] 8412.39 Event end: VecSet [1] 8412.39 Event begin: MatSolve [1] 8412.39 Event end: MatSolve [1] 8412.39 Event end: PCApply [1] 8412.39 Event begin: MatMult [1] 8412.39 Event begin: VecScatterBegin [1] 8412.39 Event begin: SFPack [1] 8412.39 Event end: SFPack [1] 8412.39 Event end: VecScatterBegin [1] 8412.39 Event begin: VecScatterEnd [1] 8413.76 Event begin: SFUnpack [1] 8413.76 Event end: SFUnpack [1] 8413.76 Event end: VecScatterEnd [1] 8413.76 Event end: MatMult [1] 8413.76 Event begin: VecMDot [1] 8413.76 Event end: VecMDot [1] 8413.76 Event begin: VecMAXPY [1] 8413.76 Event end: VecMAXPY [1] 8413.76 Event begin: VecNorm [1] 8413.76 Event end: VecNorm [1] 8413.76 Event begin: VecScale [1] 8413.76 Event end: VecScale [1] 8413.76 Event begin: VecSet [1] 8413.76 Event end: VecSet [1] 8413.76 Event begin: VecMAXPY [1] 8413.76 Event end: VecMAXPY [1] 8413.76 Event begin: VecCopy [1] 8413.76 Event end: VecCopy [1] 8413.76 Event begin: VecAXPY [1] 8413.76 Event end: VecAXPY [1] 8413.76 Event begin: MatMult [1] 8413.76 Event begin: VecScatterBegin [1] 8413.76 Event begin: SFPack [1] 8413.76 Event end: SFPack [1] 8413.76 Event end: VecScatterBegin [1] 8413.76 Event begin: VecScatterEnd [1] 8413.76 Event begin: SFUnpack [1] 8413.76 Event end: SFUnpack [1] 8413.76 Event end: VecScatterEnd [1] 8413.76 Event end: MatMult [1] 8413.76 Event begin: VecAYPX [1] 8413.76 Event end: VecAYPX [1] 8413.76 Event begin: VecNorm [1] 8413.76 Event end: VecNorm [1] 8413.76 Event begin: PCApply [1] 8413.76 Event begin: VecSet [1] 8413.76 Event end: VecSet [1] 8413.76 Event begin: MatSolve [1] 8413.76 Event end: MatSolve [1] 8413.76 Event end: PCApply [1] 8413.76 Event begin: MatMult [1] 8413.76 Event begin: VecScatterBegin [1] 8413.76 Event begin: SFPack [1] 8413.76 Event end: SFPack [1] 8413.76 Event end: VecScatterBegin [1] 8413.76 Event begin: VecScatterEnd [1] 8413.76 Event begin: SFUnpack [1] 8413.76 Event end: SFUnpack [1] 8413.76 Event end: VecScatterEnd [1] 8413.76 Event end: MatMult [1] 8413.76 Event begin: VecMDot [1] 8413.76 Event end: VecMDot [1] 8413.76 Event begin: VecMAXPY [1] 8413.76 Event end: VecMAXPY [1] 8413.76 Event begin: VecNorm [1] 8413.76 Event end: VecNorm [1] 8413.76 Event begin: VecScale [1] 8413.76 Event end: VecScale [1] 8413.76 Event begin: VecSet [1] 8413.76 Event end: VecSet [1] 8413.76 Event begin: VecMAXPY [1] 8413.76 Event end: VecMAXPY [1] 8413.76 Event begin: VecCopy [1] 8413.76 Event end: VecCopy [1] 8413.76 Event begin: VecAXPY [1] 8413.76 Event end: VecAXPY [1] 8413.76 Event begin: MatMult [1] 8413.76 Event begin: VecScatterBegin [1] 8413.76 Event begin: SFPack [1] 8413.76 Event end: SFPack [1] 8413.76 Event end: VecScatterBegin [1] 8413.76 Event begin: VecScatterEnd [1] 8413.76 Event begin: SFUnpack [1] 8413.76 Event end: SFUnpack [1] 8413.76 Event end: VecScatterEnd [1] 8413.76 Event end: MatMult [1] 8413.76 Event begin: VecAYPX [1] 8413.76 Event end: VecAYPX [1] 8413.76 Event begin: VecNorm 
[1] 8413.76 Event end: VecNorm [1] 8413.76 Event begin: PCApply [1] 8413.76 Event begin: VecSet [1] 8413.76 Event end: VecSet [1] 8413.76 Event begin: MatSolve [1] 8413.76 Event end: MatSolve [1] 8413.76 Event end: PCApply [1] 8413.76 Event begin: MatMult [1] 8413.76 Event begin: VecScatterBegin [1] 8413.76 Event begin: SFPack [1] 8413.76 Event end: SFPack [1] 8413.76 Event end: VecScatterBegin [1] 8413.76 Event begin: VecScatterEnd [1] 8413.77 Event begin: SFUnpack [1] 8413.77 Event end: SFUnpack [1] 8413.77 Event end: VecScatterEnd [1] 8413.77 Event end: MatMult [1] 8413.77 Event begin: VecMDot [1] 8413.77 Event end: VecMDot [1] 8413.77 Event begin: VecMAXPY [1] 8413.77 Event end: VecMAXPY [1] 8413.77 Event begin: VecNorm [1] 8413.77 Event end: VecNorm [1] 8413.77 Event begin: VecScale [1] 8413.77 Event end: VecScale [1] 8413.77 Event begin: VecSet [1] 8413.77 Event end: VecSet [1] 8413.77 Event begin: VecMAXPY [1] 8413.77 Event end: VecMAXPY [1] 8413.77 Event begin: VecCopy [1] 8413.77 Event end: VecCopy [1] 8413.77 Event begin: VecAXPY [1] 8413.77 Event end: VecAXPY [1] 8413.77 Event begin: MatMult [1] 8413.77 Event begin: VecScatterBegin [1] 8413.77 Event begin: SFPack [1] 8413.77 Event end: SFPack [1] 8413.77 Event end: VecScatterBegin [1] 8413.77 Event begin: VecScatterEnd [1] 8413.77 Event begin: SFUnpack [1] 8413.77 Event end: SFUnpack [1] 8413.77 Event end: VecScatterEnd [1] 8413.77 Event end: MatMult [1] 8413.77 Event begin: VecAYPX [1] 8413.77 Event end: VecAYPX [1] 8413.77 Event begin: VecNorm [1] 8413.77 Event end: VecNorm [1] 8413.77 Event begin: PCApply [1] 8413.77 Event begin: VecSet [1] 8413.77 Event end: VecSet [1] 8413.77 Event begin: MatSolve [1] 8413.77 Event end: MatSolve [1] 8413.77 Event end: PCApply [1] 8413.77 Event begin: MatMult [1] 8413.77 Event begin: VecScatterBegin [1] 8413.77 Event begin: SFPack [1] 8413.77 Event end: SFPack [1] 8413.77 Event end: VecScatterBegin [1] 8413.77 Event begin: VecScatterEnd [1] 8413.77 Event begin: SFUnpack [1] 8413.77 Event end: SFUnpack [1] 8413.77 Event end: VecScatterEnd [1] 8413.77 Event end: MatMult [1] 8413.77 Event begin: VecMDot [1] 8413.77 Event end: VecMDot [1] 8413.77 Event begin: VecMAXPY [1] 8413.77 Event end: VecMAXPY [1] 8413.77 Event begin: VecNorm [1] 8413.77 Event end: VecNorm [1] 8413.77 Event begin: VecScale [1] 8413.77 Event end: VecScale [1] 8413.77 Event begin: VecSet [1] 8413.77 Event end: VecSet [1] 8413.77 Event begin: VecMAXPY [1] 8413.77 Event end: VecMAXPY [1] 8413.77 Event begin: VecCopy [1] 8413.77 Event end: VecCopy [1] 8413.77 Event begin: VecAXPY [1] 8413.77 Event end: VecAXPY [1] 8413.77 Event begin: MatMult [1] 8413.77 Event begin: VecScatterBegin [1] 8413.77 Event begin: SFPack [1] 8413.77 Event end: SFPack [1] 8413.77 Event end: VecScatterBegin [1] 8413.77 Event begin: VecScatterEnd [1] 8413.77 Event begin: SFUnpack [1] 8413.77 Event end: SFUnpack [1] 8413.77 Event end: VecScatterEnd [1] 8413.77 Event end: MatMult [1] 8413.77 Event begin: VecAYPX [1] 8413.77 Event end: VecAYPX [1] 8413.77 Event begin: VecNorm [1] 8413.77 Event end: VecNorm [1] 8413.77 Event begin: PCApply [1] 8413.77 Event begin: VecSet [1] 8413.77 Event end: VecSet [1] 8413.77 Event begin: MatSolve [1] 8413.77 Event end: MatSolve [1] 8413.77 Event end: PCApply [1] 8413.77 Event begin: MatMult [1] 8413.77 Event begin: VecScatterBegin [1] 8413.77 Event begin: SFPack [1] 8413.77 Event end: SFPack [1] 8413.77 Event end: VecScatterBegin [1] 8413.77 Event begin: VecScatterEnd [1] 8413.77 Event begin: SFUnpack [1] 8413.77 Event 
end: SFUnpack [1] 8413.77 Event end: VecScatterEnd [1] 8413.77 Event end: MatMult [1] 8413.77 Event begin: VecMDot [1] 8413.77 Event end: VecMDot [1] 8413.77 Event begin: VecMAXPY [1] 8413.77 Event end: VecMAXPY [1] 8413.77 Event begin: VecNorm [1] 8413.77 Event end: VecNorm [1] 8413.77 Event begin: VecScale [1] 8413.77 Event end: VecScale [1] 8413.77 Event begin: VecSet [1] 8413.77 Event end: VecSet [1] 8413.77 Event begin: VecMAXPY [1] 8413.77 Event end: VecMAXPY [1] 8413.77 Event begin: VecCopy [1] 8413.77 Event end: VecCopy [1] 8413.77 Event begin: VecAXPY [1] 8413.77 Event end: VecAXPY [1] 8413.77 Event begin: MatMult [1] 8413.77 Event begin: VecScatterBegin [1] 8413.77 Event begin: SFPack [1] 8413.77 Event end: SFPack [1] 8413.77 Event end: VecScatterBegin [1] 8413.77 Event begin: VecScatterEnd [1] 8413.77 Event begin: SFUnpack [1] 8413.77 Event end: SFUnpack [1] 8413.77 Event end: VecScatterEnd [1] 8413.77 Event end: MatMult [1] 8413.77 Event begin: VecAYPX [1] 8413.77 Event end: VecAYPX [1] 8413.77 Event begin: VecNorm [1] 8413.77 Event end: VecNorm [1] 8413.77 Event begin: PCApply [1] 8413.77 Event begin: VecSet [1] 8413.77 Event end: VecSet [1] 8413.77 Event begin: MatSolve [1] 8413.77 Event end: MatSolve [1] 8413.77 Event end: PCApply [1] 8413.77 Event begin: MatMult [1] 8413.77 Event begin: VecScatterBegin [1] 8413.77 Event begin: SFPack [1] 8413.77 Event end: SFPack [1] 8413.77 Event end: VecScatterBegin [1] 8413.77 Event begin: VecScatterEnd [1] 8413.77 Event begin: SFUnpack [1] 8413.77 Event end: SFUnpack [1] 8413.77 Event end: VecScatterEnd [1] 8413.78 Event end: MatMult [1] 8413.78 Event begin: VecMDot [1] 8413.78 Event end: VecMDot [1] 8413.78 Event begin: VecMAXPY [1] 8413.78 Event end: VecMAXPY [1] 8413.78 Event begin: VecNorm [1] 8413.78 Event end: VecNorm [1] 8413.78 Event begin: VecScale [1] 8413.78 Event end: VecScale [1] 8413.78 Event begin: VecSet [1] 8413.78 Event end: VecSet [1] 8413.78 Event begin: VecMAXPY [1] 8413.78 Event end: VecMAXPY [1] 8413.78 Event begin: VecCopy [1] 8413.78 Event end: VecCopy [1] 8413.78 Event begin: VecAXPY [1] 8413.78 Event end: VecAXPY [1] 8413.78 Event begin: MatMult [1] 8413.78 Event begin: VecScatterBegin [1] 8413.78 Event begin: SFPack [1] 8413.78 Event end: SFPack [1] 8413.78 Event end: VecScatterBegin [1] 8413.78 Event begin: VecScatterEnd [1] 8413.78 Event begin: SFUnpack [1] 8413.78 Event end: SFUnpack [1] 8413.78 Event end: VecScatterEnd [1] 8413.78 Event end: MatMult [1] 8413.78 Event begin: VecAYPX [1] 8413.78 Event end: VecAYPX [1] 8413.78 Event begin: VecNorm [1] 8413.78 Event end: VecNorm [1] 8413.78 Event begin: PCApply [1] 8413.78 Event begin: VecSet [1] 8413.78 Event end: VecSet [1] 8413.78 Event begin: MatSolve [1] 8413.78 Event end: MatSolve [1] 8413.78 Event end: PCApply [1] 8413.78 Event begin: MatMult [1] 8413.78 Event begin: VecScatterBegin [1] 8413.78 Event begin: SFPack [1] 8413.78 Event end: SFPack [1] 8413.78 Event end: VecScatterBegin [1] 8413.78 Event begin: VecScatterEnd [1] 8413.78 Event begin: SFUnpack [1] 8413.78 Event end: SFUnpack [1] 8413.78 Event end: VecScatterEnd [1] 8413.78 Event end: MatMult [1] 8413.78 Event begin: VecMDot [1] 8413.78 Event end: VecMDot [1] 8413.78 Event begin: VecMAXPY [1] 8413.78 Event end: VecMAXPY [1] 8413.78 Event begin: VecNorm [1] 8413.78 Event end: VecNorm [1] 8413.78 Event begin: VecScale [1] 8413.78 Event end: VecScale [1] 8413.78 Event begin: VecSet [1] 8413.78 Event end: VecSet [1] 8413.78 Event begin: VecMAXPY [1] 8413.78 Event end: VecMAXPY [1] 8413.78 Event 
begin: VecCopy [1] 8413.78 Event end: VecCopy
[PETSc event trace output: matching "Event begin"/"Event end" pairs for PCApply, VecSet, MatSolve, MatMult, VecScatterBegin, SFPack, VecScatterEnd, SFUnpack, VecMDot, VecMAXPY, VecNorm, VecScale, VecCopy, VecAXPY, and VecAYPX, repeated identically for each GMRES iteration on ranks 0-3; only the KSP monitor lines carry distinct information]
 9 KSP unpreconditioned resid norm 4.240869626332e-01 true resid norm -nan ||r(i)||/||b|| -nan
10 KSP unpreconditioned resid norm 3.545861066058e-01 true resid norm -nan ||r(i)||/||b|| -nan
11 KSP unpreconditioned resid norm 2.796829035763e-01 true resid norm -nan ||r(i)||/||b|| -nan
12 KSP unpreconditioned resid norm 2.415853009738e-01 true resid norm -nan ||r(i)||/||b|| -nan
13 KSP unpreconditioned resid norm 1.933876548152e-01 true resid norm -nan ||r(i)||/||b|| -nan
14 KSP unpreconditioned resid norm 1.820288341790e-01 true resid norm -nan ||r(i)||/||b|| -nan
15 KSP unpreconditioned resid norm 1.657259630262e-01 true resid norm -nan ||r(i)||/||b|| -nan
16 KSP unpreconditioned resid norm 1.563463774015e-01 true resid norm -nan ||r(i)||/||b|| -nan
17 KSP unpreconditioned resid norm 1.272726944551e-01 true resid norm -nan ||r(i)||/||b|| -nan
18 KSP unpreconditioned resid norm 1.137797050714e-01 true resid norm -nan ||r(i)||/||b|| -nan
19 KSP unpreconditioned resid norm 8.582334823227e-02 true resid norm -nan ||r(i)||/||b|| -nan
[0] 8413.84 Event begin: PCApply [0] 8413.84 Event begin: VecSet [0] 8413.84
Event end: VecSet [0] 8413.84 Event begin: MatSolve [0] 8413.84 Event end: MatSolve [0] 8413.84 Event end: PCApply [0] 8413.84 Event begin: MatMult [0] 8413.84 Event begin: VecScatterBegin [0] 8413.84 Event begin: SFPack [0] 8413.84 Event end: SFPack [0] 8413.84 Event end: VecScatterBegin [0] 8413.84 Event begin: VecScatterEnd [0] 8413.84 Event begin: SFUnpack [0] 8413.84 Event end: SFUnpack [0] 8413.84 Event end: VecScatterEnd [0] 8413.84 Event end: MatMult [0] 8413.84 Event begin: VecMDot [0] 8413.84 Event end: VecMDot [0] 8413.84 Event begin: VecMAXPY [0] 8413.84 Event end: VecMAXPY [0] 8413.84 Event begin: VecNorm [0] 8413.84 Event end: VecNorm [0] 8413.84 Event begin: VecScale [0] 8413.84 Event end: VecScale [0] 8413.84 Event begin: VecSet [0] 8413.84 Event end: VecSet [0] 8413.84 Event begin: VecMAXPY [0] 8413.84 Event end: VecMAXPY [0] 8413.84 Event begin: VecCopy [0] 8413.84 Event end: VecCopy [0] 8413.84 Event begin: VecAXPY [0] 8413.84 Event end: VecAXPY [0] 8413.84 Event begin: MatMult [0] 8413.84 Event begin: VecScatterBegin [0] 8413.84 Event begin: SFPack [0] 8413.84 Event end: SFPack [0] 8413.84 Event end: VecScatterBegin [0] 8413.84 Event begin: VecScatterEnd [0] 8413.84 Event begin: SFUnpack [0] 8413.84 Event end: SFUnpack [0] 8413.84 Event end: VecScatterEnd [0] 8413.84 Event end: MatMult [0] 8413.84 Event begin: VecAYPX [0] 8413.84 Event end: VecAYPX [0] 8413.84 Event begin: VecNorm [0] 8413.84 Event end: VecNorm 20 KSP unpreconditioned resid norm 7.628931303919e-02 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.84 Event begin: PCApply [0] 8413.84 Event begin: VecSet [0] 8413.84 Event end: VecSet [0] 8413.84 Event begin: MatSolve [0] 8413.84 Event end: MatSolve [0] 8413.84 Event end: PCApply [0] 8413.84 Event begin: MatMult [0] 8413.84 Event begin: VecScatterBegin [0] 8413.84 Event begin: SFPack [0] 8413.84 Event end: SFPack [0] 8413.84 Event end: VecScatterBegin [0] 8413.84 Event begin: VecScatterEnd [0] 8413.84 Event begin: SFUnpack [0] 8413.84 Event end: SFUnpack [0] 8413.84 Event end: VecScatterEnd [0] 8413.84 Event end: MatMult [0] 8413.84 Event begin: VecMDot [0] 8413.84 Event end: VecMDot [0] 8413.84 Event begin: VecMAXPY [0] 8413.84 Event end: VecMAXPY [0] 8413.84 Event begin: VecNorm [0] 8413.84 Event end: VecNorm [0] 8413.84 Event begin: VecScale [0] 8413.84 Event end: VecScale [0] 8413.84 Event begin: VecSet [0] 8413.84 Event end: VecSet [0] 8413.84 Event begin: VecMAXPY [0] 8413.84 Event end: VecMAXPY [0] 8413.84 Event begin: VecCopy [0] 8413.84 Event end: VecCopy [0] 8413.84 Event begin: VecAXPY [0] 8413.84 Event end: VecAXPY [0] 8413.84 Event begin: MatMult [0] 8413.84 Event begin: VecScatterBegin [0] 8413.84 Event begin: SFPack [0] 8413.84 Event end: SFPack [0] 8413.84 Event end: VecScatterBegin [0] 8413.84 Event begin: VecScatterEnd [0] 8413.84 Event begin: SFUnpack [0] 8413.84 Event end: SFUnpack [0] 8413.84 Event end: VecScatterEnd [0] 8413.84 Event end: MatMult [0] 8413.84 Event begin: VecAYPX [0] 8413.84 Event end: VecAYPX [0] 8413.84 Event begin: VecNorm [0] 8413.84 Event end: VecNorm 21 KSP unpreconditioned resid norm 5.901409162525e-02 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.84 Event begin: PCApply [0] 8413.84 Event begin: VecSet [0] 8413.84 Event end: VecSet [0] 8413.84 Event begin: MatSolve [0] 8413.84 Event end: MatSolve [0] 8413.84 Event end: PCApply [0] 8413.84 Event begin: MatMult [0] 8413.84 Event begin: VecScatterBegin [0] 8413.84 Event begin: SFPack [0] 8413.84 Event end: SFPack [0] 8413.84 Event end: VecScatterBegin [0] 
8413.84 Event begin: VecScatterEnd [0] 8413.84 Event begin: SFUnpack [0] 8413.84 Event end: SFUnpack [0] 8413.84 Event end: VecScatterEnd [0] 8413.84 Event end: MatMult [0] 8413.85 Event begin: VecMDot [0] 8413.85 Event end: VecMDot [0] 8413.85 Event begin: VecMAXPY [0] 8413.85 Event end: VecMAXPY [0] 8413.85 Event begin: VecNorm [0] 8413.85 Event end: VecNorm [0] 8413.85 Event begin: VecScale [0] 8413.85 Event end: VecScale [0] 8413.85 Event begin: VecSet [0] 8413.85 Event end: VecSet [0] 8413.85 Event begin: VecMAXPY [0] 8413.85 Event end: VecMAXPY [0] 8413.85 Event begin: VecCopy [0] 8413.85 Event end: VecCopy [0] 8413.85 Event begin: VecAXPY [0] 8413.85 Event end: VecAXPY [0] 8413.85 Event begin: MatMult [0] 8413.85 Event begin: VecScatterBegin [0] 8413.85 Event begin: SFPack [0] 8413.85 Event end: SFPack [0] 8413.85 Event end: VecScatterBegin [0] 8413.85 Event begin: VecScatterEnd [0] 8413.85 Event begin: SFUnpack [0] 8413.85 Event end: SFUnpack [0] 8413.85 Event end: VecScatterEnd [0] 8413.85 Event end: MatMult [0] 8413.85 Event begin: VecAYPX [0] 8413.85 Event end: VecAYPX [0] 8413.85 Event begin: VecNorm [0] 8413.85 Event end: VecNorm 22 KSP unpreconditioned resid norm 5.496261661066e-02 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.85 Event begin: PCApply [0] 8413.85 Event begin: VecSet [0] 8413.85 Event end: VecSet [0] 8413.85 Event begin: MatSolve [0] 8413.85 Event end: MatSolve [0] 8413.85 Event end: PCApply [0] 8413.85 Event begin: MatMult [0] 8413.85 Event begin: VecScatterBegin [0] 8413.85 Event begin: SFPack [0] 8413.85 Event end: SFPack [0] 8413.85 Event end: VecScatterBegin [0] 8413.85 Event begin: VecScatterEnd [0] 8413.85 Event begin: SFUnpack [0] 8413.85 Event end: SFUnpack [0] 8413.85 Event end: VecScatterEnd [0] 8413.85 Event end: MatMult [0] 8413.85 Event begin: VecMDot [0] 8413.85 Event end: VecMDot [0] 8413.85 Event begin: VecMAXPY [0] 8413.85 Event end: VecMAXPY [0] 8413.85 Event begin: VecNorm [0] 8413.85 Event end: VecNorm [0] 8413.85 Event begin: VecScale [0] 8413.85 Event end: VecScale [0] 8413.85 Event begin: VecSet [0] 8413.85 Event end: VecSet [0] 8413.85 Event begin: VecMAXPY [0] 8413.85 Event end: VecMAXPY [0] 8413.85 Event begin: VecCopy [0] 8413.85 Event end: VecCopy [0] 8413.85 Event begin: VecAXPY [0] 8413.85 Event end: VecAXPY [0] 8413.85 Event begin: MatMult [0] 8413.85 Event begin: VecScatterBegin [0] 8413.85 Event begin: SFPack [0] 8413.85 Event end: SFPack [0] 8413.85 Event end: VecScatterBegin [0] 8413.85 Event begin: VecScatterEnd [0] 8413.85 Event begin: SFUnpack [0] 8413.85 Event end: SFUnpack [0] 8413.85 Event end: VecScatterEnd [0] 8413.85 Event end: MatMult [0] 8413.85 Event begin: VecAYPX [0] 8413.85 Event end: VecAYPX [0] 8413.85 Event begin: VecNorm [0] 8413.85 Event end: VecNorm 23 KSP unpreconditioned resid norm 4.367682975355e-02 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.85 Event begin: PCApply [0] 8413.85 Event begin: VecSet [0] 8413.85 Event end: VecSet [0] 8413.85 Event begin: MatSolve [0] 8413.85 Event end: MatSolve [0] 8413.85 Event end: PCApply [0] 8413.85 Event begin: MatMult [0] 8413.85 Event begin: VecScatterBegin [0] 8413.85 Event begin: SFPack [0] 8413.85 Event end: SFPack [0] 8413.85 Event end: VecScatterBegin [0] 8413.85 Event begin: VecScatterEnd [0] 8413.85 Event begin: SFUnpack [0] 8413.85 Event end: SFUnpack [0] 8413.85 Event end: VecScatterEnd [0] 8413.85 Event end: MatMult [0] 8413.85 Event begin: VecMDot [0] 8413.85 Event end: VecMDot [0] 8413.85 Event begin: VecMAXPY [0] 8413.85 Event end: VecMAXPY 
[0] 8413.85 Event begin: VecNorm [0] 8413.85 Event end: VecNorm [1] 8413.79 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecMDot [1] 8413.81 Event end: VecMDot [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecNorm [1] 8413.81 Event end: VecNorm [1] 8413.81 Event begin: VecScale [1] 8413.81 Event end: VecScale [1] 8413.81 Event begin: VecSet [1] 8413.81 Event end: VecSet [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecCopy [1] 8413.81 Event end: VecCopy [1] 8413.81 Event begin: VecAXPY [1] 8413.81 Event end: VecAXPY [1] 8413.81 Event begin: MatMult [1] 8413.81 Event begin: VecScatterBegin [1] 8413.81 Event begin: SFPack [1] 8413.81 Event end: SFPack [1] 8413.81 Event end: VecScatterBegin [1] 8413.81 Event begin: VecScatterEnd [1] 8413.81 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecAYPX [1] 8413.81 Event end: VecAYPX [1] 8413.81 Event begin: VecNorm [1] 8413.81 Event end: VecNorm [1] 8413.81 Event begin: PCApply [1] 8413.81 Event begin: VecSet [1] 8413.81 Event end: VecSet [1] 8413.81 Event begin: MatSolve [1] 8413.81 Event end: MatSolve [1] 8413.81 Event end: PCApply [1] 8413.81 Event begin: MatMult [1] 8413.81 Event begin: VecScatterBegin [1] 8413.81 Event begin: SFPack [1] 8413.81 Event end: SFPack [1] 8413.81 Event end: VecScatterBegin [1] 8413.81 Event begin: VecScatterEnd [1] 8413.81 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecMDot [1] 8413.81 Event end: VecMDot [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecNorm [1] 8413.81 Event end: VecNorm [1] 8413.81 Event begin: VecScale [1] 8413.81 Event end: VecScale [1] 8413.81 Event begin: VecSet [1] 8413.81 Event end: VecSet [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecCopy [1] 8413.81 Event end: VecCopy [1] 8413.81 Event begin: VecAXPY [1] 8413.81 Event end: VecAXPY [1] 8413.81 Event begin: MatMult [1] 8413.81 Event begin: VecScatterBegin [1] 8413.81 Event begin: SFPack [1] 8413.81 Event end: SFPack [1] 8413.81 Event end: VecScatterBegin [1] 8413.81 Event begin: VecScatterEnd [1] 8413.81 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecAYPX [1] 8413.81 Event end: VecAYPX [1] 8413.81 Event begin: VecNorm [1] 8413.81 Event end: VecNorm [1] 8413.81 Event begin: PCApply [1] 8413.81 Event begin: VecSet [1] 8413.81 Event end: VecSet [1] 8413.81 Event begin: MatSolve [1] 8413.81 Event end: MatSolve [1] 8413.81 Event end: PCApply [1] 8413.81 Event begin: MatMult [1] 8413.81 Event begin: VecScatterBegin [1] 8413.81 Event begin: SFPack [1] 8413.81 Event end: SFPack [1] 8413.81 Event end: VecScatterBegin [1] 8413.81 Event begin: VecScatterEnd [1] 8413.81 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecMDot [1] 8413.81 Event end: VecMDot [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecNorm [1] 8413.81 Event end: VecNorm [1] 8413.81 Event begin: VecScale [1] 8413.81 Event end: VecScale [1] 8413.81 Event begin: VecSet 
[1] 8413.81 Event end: VecSet [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecCopy [1] 8413.81 Event end: VecCopy [1] 8413.81 Event begin: VecAXPY [1] 8413.81 Event end: VecAXPY [1] 8413.81 Event begin: MatMult [1] 8413.81 Event begin: VecScatterBegin [1] 8413.81 Event begin: SFPack [1] 8413.81 Event end: SFPack [1] 8413.81 Event end: VecScatterBegin [1] 8413.81 Event begin: VecScatterEnd [1] 8413.81 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecAYPX [1] 8413.81 Event end: VecAYPX [1] 8413.81 Event begin: VecNorm [1] 8413.81 Event end: VecNorm [1] 8413.81 Event begin: PCApply [1] 8413.81 Event begin: VecSet [1] 8413.81 Event end: VecSet [1] 8413.81 Event begin: MatSolve [1] 8413.81 Event end: MatSolve [1] 8413.81 Event end: PCApply [1] 8413.81 Event begin: MatMult [1] 8413.81 Event begin: VecScatterBegin [1] 8413.81 Event begin: SFPack [1] 8413.81 Event end: SFPack [1] 8413.81 Event end: VecScatterBegin [1] 8413.81 Event begin: VecScatterEnd [1] 8413.81 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecMDot [1] 8413.81 Event end: VecMDot [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecNorm [1] 8413.81 Event end: VecNorm [1] 8413.81 Event begin: VecScale [1] 8413.81 Event end: VecScale [1] 8413.81 Event begin: VecSet [1] 8413.81 Event end: VecSet [1] 8413.81 Event begin: VecMAXPY [1] 8413.81 Event end: VecMAXPY [1] 8413.81 Event begin: VecCopy [1] 8413.81 Event end: VecCopy [1] 8413.81 Event begin: VecAXPY [1] 8413.81 Event end: VecAXPY [1] 8413.81 Event begin: MatMult [1] 8413.81 Event begin: VecScatterBegin [1] 8413.81 Event begin: SFPack [1] 8413.81 Event end: SFPack [1] 8413.81 Event end: VecScatterBegin [1] 8413.81 Event begin: VecScatterEnd [1] 8413.81 Event begin: SFUnpack [1] 8413.81 Event end: SFUnpack [1] 8413.81 Event end: VecScatterEnd [1] 8413.81 Event end: MatMult [1] 8413.81 Event begin: VecAYPX [1] 8413.81 Event end: VecAYPX [1] 8413.81 Event begin: VecNorm [1] 8413.84 Event end: VecNorm [1] 8413.84 Event begin: PCApply [1] 8413.84 Event begin: VecSet [1] 8413.84 Event end: VecSet [1] 8413.84 Event begin: MatSolve [1] 8413.84 Event end: MatSolve [1] 8413.84 Event end: PCApply [1] 8413.84 Event begin: MatMult [1] 8413.84 Event begin: VecScatterBegin [1] 8413.84 Event begin: SFPack [1] 8413.84 Event end: SFPack [1] 8413.84 Event end: VecScatterBegin [1] 8413.84 Event begin: VecScatterEnd [1] 8413.84 Event begin: SFUnpack [1] 8413.84 Event end: SFUnpack [1] 8413.84 Event end: VecScatterEnd [1] 8413.84 Event end: MatMult [1] 8413.84 Event begin: VecMDot [1] 8413.84 Event end: VecMDot [1] 8413.84 Event begin: VecMAXPY [1] 8413.84 Event end: VecMAXPY [1] 8413.84 Event begin: VecNorm [1] 8413.84 Event end: VecNorm [1] 8413.84 Event begin: VecScale [1] 8413.84 Event end: VecScale [1] 8413.84 Event begin: VecSet [1] 8413.84 Event end: VecSet [1] 8413.84 Event begin: VecMAXPY [1] 8413.84 Event end: VecMAXPY [1] 8413.84 Event begin: VecCopy [1] 8413.84 Event end: VecCopy [1] 8413.84 Event begin: VecAXPY [1] 8413.84 Event end: VecAXPY [1] 8413.84 Event begin: MatMult [1] 8413.84 Event begin: VecScatterBegin [1] 8413.84 Event begin: SFPack [1] 8413.84 Event end: SFPack [1] 8413.84 Event end: VecScatterBegin [1] 8413.84 Event begin: VecScatterEnd [1] 8413.84 Event begin: SFUnpack [1] 8413.84 Event 
end: SFUnpack [1] 8413.84 Event end: VecScatterEnd [1] 8413.84 Event end: MatMult [1] 8413.84 Event begin: VecAYPX [1] 8413.84 Event end: VecAYPX [1] 8413.84 Event begin: VecNorm [1] 8413.84 Event end: VecNorm [1] 8413.84 Event begin: PCApply [1] 8413.84 Event begin: VecSet [1] 8413.84 Event end: VecSet [1] 8413.84 Event begin: MatSolve [1] 8413.84 Event end: MatSolve [1] 8413.84 Event end: PCApply [1] 8413.84 Event begin: MatMult [1] 8413.84 Event begin: VecScatterBegin [1] 8413.84 Event begin: SFPack [1] 8413.84 Event end: SFPack [1] 8413.84 Event end: VecScatterBegin [1] 8413.84 Event begin: VecScatterEnd [1] 8413.84 Event begin: SFUnpack [1] 8413.84 Event end: SFUnpack [1] 8413.84 Event end: VecScatterEnd [1] 8413.84 Event end: MatMult [1] 8413.84 Event begin: VecMDot [1] 8413.84 Event end: VecMDot [1] 8413.84 Event begin: VecMAXPY [1] 8413.84 Event end: VecMAXPY [1] 8413.84 Event begin: VecNorm [1] 8413.84 Event end: VecNorm [1] 8413.84 Event begin: VecScale [1] 8413.84 Event end: VecScale [1] 8413.84 Event begin: VecSet [1] 8413.84 Event end: VecSet [1] 8413.84 Event begin: VecMAXPY [1] 8413.84 Event end: VecMAXPY [1] 8413.84 Event begin: VecCopy [1] 8413.84 Event end: VecCopy [1] 8413.84 Event begin: VecAXPY [1] 8413.84 Event end: VecAXPY [1] 8413.84 Event begin: MatMult [1] 8413.84 Event begin: VecScatterBegin [1] 8413.84 Event begin: SFPack [1] 8413.84 Event end: SFPack [1] 8413.84 Event end: VecScatterBegin [1] 8413.84 Event begin: VecScatterEnd [1] 8413.84 Event begin: SFUnpack [1] 8413.84 Event end: SFUnpack [1] 8413.84 Event end: VecScatterEnd [1] 8413.84 Event end: MatMult [1] 8413.84 Event begin: VecAYPX [1] 8413.84 Event end: VecAYPX [1] 8413.84 Event begin: VecNorm [1] 8413.84 Event end: VecNorm [1] 8413.84 Event begin: PCApply [1] 8413.84 Event begin: VecSet [1] 8413.84 Event end: VecSet [1] 8413.84 Event begin: MatSolve [1] 8413.84 Event end: MatSolve [1] 8413.84 Event end: PCApply [1] 8413.84 Event begin: MatMult [1] 8413.84 Event begin: VecScatterBegin [1] 8413.84 Event begin: SFPack [1] 8413.84 Event end: SFPack [1] 8413.84 Event end: VecScatterBegin [1] 8413.84 Event begin: VecScatterEnd [1] 8413.84 Event begin: SFUnpack [1] 8413.84 Event end: SFUnpack [1] 8413.84 Event end: VecScatterEnd [1] 8413.84 Event end: MatMult [1] 8413.84 Event begin: VecMDot [1] 8413.84 Event end: VecMDot [1] 8413.84 Event begin: VecMAXPY [1] 8413.84 Event end: VecMAXPY [1] 8413.84 Event begin: VecNorm [1] 8413.84 Event end: VecNorm [1] 8413.84 Event begin: VecScale [1] 8413.84 Event end: VecScale [1] 8413.84 Event begin: VecSet [1] 8413.84 Event end: VecSet [1] 8413.84 Event begin: VecMAXPY [1] 8413.84 Event end: VecMAXPY [1] 8413.84 Event begin: VecCopy [1] 8413.84 Event end: VecCopy [1] 8413.84 Event begin: VecAXPY [1] 8413.84 Event end: VecAXPY [1] 8413.84 Event begin: MatMult [1] 8413.84 Event begin: VecScatterBegin [1] 8413.84 Event begin: SFPack [1] 8413.84 Event end: SFPack [1] 8413.84 Event end: VecScatterBegin [1] 8413.84 Event begin: VecScatterEnd [1] 8413.84 Event begin: SFUnpack [1] 8413.84 Event end: SFUnpack [1] 8413.84 Event end: VecScatterEnd [1] 8413.84 Event end: MatMult [1] 8413.84 Event begin: VecAYPX [1] 8413.84 Event end: VecAYPX [1] 8413.84 Event begin: VecNorm [1] 8413.84 Event end: VecNorm [1] 8413.84 Event begin: PCApply [1] 8413.84 Event begin: VecSet [1] 8413.84 Event end: VecSet [1] 8413.84 Event begin: MatSolve [1] 8413.84 Event end: MatSolve [1] 8413.84 Event end: PCApply [1] 8413.84 Event begin: MatMult [1] 8413.84 Event begin: VecScatterBegin [1] 8413.84 
Event begin: SFPack [1] 8413.84 Event end: SFPack [1] 8413.84 Event end: VecScatterBegin [1] 8413.84 Event begin: VecScatterEnd [1] 8413.84 Event begin: SFUnpack [1] 8413.84 Event end: SFUnpack [1] 8413.84 Event end: VecScatterEnd [1] 8413.84 Event end: MatMult [1] 8413.84 Event begin: VecMDot [1] 8413.85 Event end: VecMDot [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: VecScale [1] 8413.85 Event end: VecScale [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecCopy [1] 8413.85 Event end: VecCopy [1] 8413.85 Event begin: VecAXPY [1] 8413.85 Event end: VecAXPY [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecAYPX [1] 8413.85 Event end: VecAYPX [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: PCApply [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: MatSolve [1] 8413.85 Event end: MatSolve [1] 8413.85 Event end: PCApply [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecMDot [1] 8413.85 Event end: VecMDot [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: VecScale [1] 8413.85 Event end: VecScale [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecCopy [1] 8413.85 Event end: VecCopy [1] 8413.85 Event begin: VecAXPY [1] 8413.85 Event end: VecAXPY [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecAYPX [1] 8413.85 Event end: VecAYPX [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: PCApply [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: MatSolve [1] 8413.85 Event end: MatSolve [1] 8413.85 Event end: PCApply [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecMDot [1] 8413.85 Event end: VecMDot [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: VecScale 
[1] 8413.85 Event end: VecScale [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecCopy [1] 8413.85 Event end: VecCopy [1] 8413.85 Event begin: VecAXPY [1] 8413.85 Event end: VecAXPY [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecAYPX [1] 8413.85 Event end: VecAYPX [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: PCApply [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: MatSolve [1] 8413.85 Event end: MatSolve [1] 8413.85 Event end: PCApply [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecMDot [1] 8413.85 Event end: VecMDot [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: VecScale [1] 8413.85 Event end: VecScale [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecCopy [1] 8413.85 Event end: VecCopy [1] 8413.85 Event begin: VecAXPY [1] 8413.85 Event end: VecAXPY [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecAYPX [1] 8413.85 Event end: VecAYPX [1] 8413.85 Event begin: VecNorm [1] 8413.85 Event end: VecNorm [1] 8413.85 Event begin: PCApply [1] 8413.85 Event begin: VecSet [1] 8413.85 Event end: VecSet [1] 8413.85 Event begin: MatSolve [1] 8413.85 Event end: MatSolve [1] 8413.85 Event end: PCApply [1] 8413.85 Event begin: MatMult [1] 8413.85 Event begin: VecScatterBegin [1] 8413.85 Event begin: SFPack [1] 8413.85 Event end: SFPack [1] 8413.85 Event end: VecScatterBegin [1] 8413.85 Event begin: VecScatterEnd [1] 8413.85 Event begin: SFUnpack [1] 8413.85 Event end: SFUnpack [1] 8413.85 Event end: VecScatterEnd [1] 8413.85 Event end: MatMult [1] 8413.85 Event begin: VecMDot [1] 8413.85 Event end: VecMDot [1] 8413.85 Event begin: VecMAXPY [1] 8413.85 Event end: VecMAXPY [1] 8413.85 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: VecScale [1] 8413.86 Event end: VecScale [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecCopy [1] 8413.86 Event end: VecCopy [1] 8413.86 Event begin: VecAXPY [1] 8413.86 Event end: VecAXPY [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: 
VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecAYPX [1] 8413.86 Event end: VecAYPX [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: PCApply [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: MatSolve [1] 8413.86 Event end: MatSolve [1] 8413.86 Event end: PCApply [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecMDot [1] 8413.86 Event end: VecMDot [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: VecScale [1] 8413.86 Event end: VecScale [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecCopy [1] 8413.86 Event end: VecCopy [1] 8413.86 Event begin: VecAXPY [1] 8413.86 Event end: VecAXPY [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecAYPX [1] 8413.86 Event end: VecAYPX [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: PCApply [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: MatSolve [1] 8413.86 Event end: MatSolve [1] 8413.86 Event end: PCApply [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecMDot [1] 8413.86 Event end: VecMDot [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: VecScale [1] 8413.86 Event end: VecScale [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecCopy [1] 8413.86 Event end: VecCopy [1] 8413.86 Event begin: VecAXPY [1] 8413.86 Event end: VecAXPY [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecAYPX [1] 8413.86 Event end: VecAYPX [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: PCApply [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: MatSolve [1] 8413.86 Event end: MatSolve [1] 8413.86 Event end: PCApply [1] 8413.86 Event 
begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecMDot [1] 8413.86 Event end: VecMDot [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: VecScale [1] 8413.86 Event end: VecScale [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecCopy [1] 8413.86 Event end: VecCopy [1] 8413.86 Event begin: VecAXPY [1] 8413.86 Event end: VecAXPY [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecAYPX [1] 8413.86 Event end: VecAYPX [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: PCApply [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: MatSolve [1] 8413.86 Event end: MatSolve [1] 8413.86 Event end: PCApply [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecMDot [1] 8413.86 Event end: VecMDot [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: VecScale [1] 8413.86 Event end: VecScale [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: VecMAXPY [1] 8413.86 Event end: VecMAXPY [1] 8413.86 Event begin: VecCopy [1] 8413.86 Event end: VecCopy [1] 8413.86 Event begin: VecAXPY [1] 8413.86 Event end: VecAXPY [1] 8413.86 Event begin: MatMult [1] 8413.86 Event begin: VecScatterBegin [1] 8413.86 Event begin: SFPack [1] 8413.86 Event end: SFPack [1] 8413.86 Event end: VecScatterBegin [1] 8413.86 Event begin: VecScatterEnd [1] 8413.86 Event begin: SFUnpack [1] 8413.86 Event end: SFUnpack [1] 8413.86 Event end: VecScatterEnd [1] 8413.86 Event end: MatMult [1] 8413.86 Event begin: VecAYPX [1] 8413.86 Event end: VecAYPX [1] 8413.86 Event begin: VecNorm [1] 8413.86 Event end: VecNorm [1] 8413.86 Event begin: PCApply [1] 8413.86 Event begin: VecSet [1] 8413.86 Event end: VecSet [1] 8413.86 Event begin: MatSolve [2] 8413.81 Event end: MatMult [2] 8413.81 Event begin: VecMDot [2] 8413.81 Event end: VecMDot [2] 8413.81 Event begin: VecMAXPY [2] 8413.81 Event end: VecMAXPY [2] 8413.81 Event begin: VecNorm [2] 8413.81 Event end: VecNorm [2] 8413.81 Event begin: VecScale [2] 8413.81 Event end: VecScale [2] 8413.81 Event begin: VecSet [2] 8413.81 Event end: VecSet [2] 8413.81 Event begin: VecMAXPY [2] 8413.81 Event end: VecMAXPY [2] 8413.81 Event begin: VecCopy [2] 8413.81 Event end: VecCopy [2] 8413.81 Event begin: VecAXPY [2] 8413.81 Event end: VecAXPY [2] 8413.81 Event begin: 
MatMult [2] 8413.81 Event begin: VecScatterBegin [2] 8413.81 Event begin: SFPack [2] 8413.81 Event end: SFPack [2] 8413.81 Event end: VecScatterBegin [2] 8413.81 Event begin: VecScatterEnd [2] 8413.81 Event begin: SFUnpack [2] 8413.81 Event end: SFUnpack [2] 8413.81 Event end: VecScatterEnd [2] 8413.81 Event end: MatMult [2] 8413.81 Event begin: VecAYPX [2] 8413.81 Event end: VecAYPX [2] 8413.81 Event begin: VecNorm [2] 8413.81 Event end: VecNorm [2] 8413.81 Event begin: PCApply [2] 8413.81 Event begin: VecSet [2] 8413.81 Event end: VecSet [2] 8413.81 Event begin: MatSolve [2] 8413.81 Event end: MatSolve [2] 8413.81 Event end: PCApply [2] 8413.81 Event begin: MatMult [2] 8413.81 Event begin: VecScatterBegin [2] 8413.81 Event begin: SFPack [2] 8413.81 Event end: SFPack [2] 8413.81 Event end: VecScatterBegin [2] 8413.81 Event begin: VecScatterEnd [2] 8413.81 Event begin: SFUnpack [2] 8413.81 Event end: SFUnpack [2] 8413.81 Event end: VecScatterEnd [2] 8413.81 Event end: MatMult [2] 8413.81 Event begin: VecMDot [2] 8413.81 Event end: VecMDot [2] 8413.81 Event begin: VecMAXPY [2] 8413.81 Event end: VecMAXPY [2] 8413.81 Event begin: VecNorm [2] 8413.81 Event end: VecNorm [2] 8413.81 Event begin: VecScale [2] 8413.81 Event end: VecScale [2] 8413.81 Event begin: VecSet [2] 8413.81 Event end: VecSet [2] 8413.81 Event begin: VecMAXPY [2] 8413.81 Event end: VecMAXPY [2] 8413.81 Event begin: VecCopy [2] 8413.81 Event end: VecCopy [2] 8413.81 Event begin: VecAXPY [2] 8413.81 Event end: VecAXPY [2] 8413.81 Event begin: MatMult [2] 8413.81 Event begin: VecScatterBegin [2] 8413.81 Event begin: SFPack [2] 8413.81 Event end: SFPack [2] 8413.81 Event end: VecScatterBegin [2] 8413.81 Event begin: VecScatterEnd [2] 8413.81 Event begin: SFUnpack [2] 8413.81 Event end: SFUnpack [2] 8413.81 Event end: VecScatterEnd [2] 8413.81 Event end: MatMult [2] 8413.81 Event begin: VecAYPX [2] 8413.81 Event end: VecAYPX [2] 8413.81 Event begin: VecNorm [2] 8413.81 Event end: VecNorm [2] 8413.81 Event begin: PCApply [2] 8413.81 Event begin: VecSet [2] 8413.81 Event end: VecSet [2] 8413.81 Event begin: MatSolve [2] 8413.81 Event end: MatSolve [2] 8413.81 Event end: PCApply [2] 8413.81 Event begin: MatMult [2] 8413.81 Event begin: VecScatterBegin [2] 8413.81 Event begin: SFPack [2] 8413.81 Event end: SFPack [2] 8413.81 Event end: VecScatterBegin [2] 8413.81 Event begin: VecScatterEnd [2] 8413.81 Event begin: SFUnpack [2] 8413.81 Event end: SFUnpack [2] 8413.81 Event end: VecScatterEnd [2] 8413.81 Event end: MatMult [2] 8413.81 Event begin: VecMDot [2] 8413.81 Event end: VecMDot [2] 8413.81 Event begin: VecMAXPY [2] 8413.81 Event end: VecMAXPY [2] 8413.81 Event begin: VecNorm [2] 8413.81 Event end: VecNorm [2] 8413.81 Event begin: VecScale [2] 8413.81 Event end: VecScale [2] 8413.81 Event begin: VecSet [2] 8413.81 Event end: VecSet [2] 8413.81 Event begin: VecMAXPY [2] 8413.81 Event end: VecMAXPY [2] 8413.81 Event begin: VecCopy [2] 8413.81 Event end: VecCopy [2] 8413.81 Event begin: VecAXPY [2] 8413.81 Event end: VecAXPY [2] 8413.81 Event begin: MatMult [2] 8413.81 Event begin: VecScatterBegin [2] 8413.81 Event begin: SFPack [2] 8413.81 Event end: SFPack [2] 8413.81 Event end: VecScatterBegin [2] 8413.81 Event begin: VecScatterEnd [2] 8413.81 Event begin: SFUnpack [2] 8413.81 Event end: SFUnpack [2] 8413.81 Event end: VecScatterEnd [2] 8413.81 Event end: MatMult [2] 8413.81 Event begin: VecAYPX [2] 8413.81 Event end: VecAYPX [2] 8413.81 Event begin: VecNorm [2] 8413.84 Event end: VecNorm [2] 8413.84 Event begin: PCApply [2] 
8413.84 Event begin: VecSet [2] 8413.84 Event end: VecSet [2] 8413.84 Event begin: MatSolve [2] 8413.84 Event end: MatSolve [2] 8413.84 Event end: PCApply [2] 8413.84 Event begin: MatMult [2] 8413.84 Event begin: VecScatterBegin [2] 8413.84 Event begin: SFPack [2] 8413.84 Event end: SFPack [2] 8413.84 Event end: VecScatterBegin [2] 8413.84 Event begin: VecScatterEnd [2] 8413.84 Event begin: SFUnpack [2] 8413.84 Event end: SFUnpack [2] 8413.84 Event end: VecScatterEnd [2] 8413.84 Event end: MatMult [2] 8413.84 Event begin: VecMDot [2] 8413.84 Event end: VecMDot [2] 8413.84 Event begin: VecMAXPY [2] 8413.84 Event end: VecMAXPY [2] 8413.84 Event begin: VecNorm [2] 8413.84 Event end: VecNorm [2] 8413.84 Event begin: VecScale [2] 8413.84 Event end: VecScale [2] 8413.84 Event begin: VecSet [2] 8413.84 Event end: VecSet [2] 8413.84 Event begin: VecMAXPY [2] 8413.84 Event end: VecMAXPY [2] 8413.84 Event begin: VecCopy [2] 8413.84 Event end: VecCopy [2] 8413.84 Event begin: VecAXPY [2] 8413.84 Event end: VecAXPY [2] 8413.84 Event begin: MatMult [2] 8413.84 Event begin: VecScatterBegin [2] 8413.84 Event begin: SFPack [2] 8413.84 Event end: SFPack [2] 8413.84 Event end: VecScatterBegin [2] 8413.84 Event begin: VecScatterEnd [2] 8413.84 Event begin: SFUnpack [2] 8413.84 Event end: SFUnpack [2] 8413.84 Event end: VecScatterEnd [2] 8413.84 Event end: MatMult [2] 8413.84 Event begin: VecAYPX [2] 8413.84 Event end: VecAYPX [2] 8413.84 Event begin: VecNorm [2] 8413.84 Event end: VecNorm [2] 8413.84 Event begin: PCApply [2] 8413.84 Event begin: VecSet [2] 8413.84 Event end: VecSet [2] 8413.84 Event begin: MatSolve [2] 8413.84 Event end: MatSolve [2] 8413.84 Event end: PCApply [2] 8413.84 Event begin: MatMult [2] 8413.84 Event begin: VecScatterBegin [2] 8413.84 Event begin: SFPack [2] 8413.84 Event end: SFPack [2] 8413.84 Event end: VecScatterBegin [2] 8413.84 Event begin: VecScatterEnd [2] 8413.84 Event begin: SFUnpack [2] 8413.84 Event end: SFUnpack [2] 8413.84 Event end: VecScatterEnd [2] 8413.84 Event end: MatMult [2] 8413.84 Event begin: VecMDot [2] 8413.84 Event end: VecMDot [2] 8413.84 Event begin: VecMAXPY [2] 8413.84 Event end: VecMAXPY [2] 8413.84 Event begin: VecNorm [2] 8413.84 Event end: VecNorm [2] 8413.84 Event begin: VecScale [2] 8413.84 Event end: VecScale [2] 8413.84 Event begin: VecSet [2] 8413.84 Event end: VecSet [2] 8413.84 Event begin: VecMAXPY [2] 8413.84 Event end: VecMAXPY [2] 8413.84 Event begin: VecCopy [2] 8413.84 Event end: VecCopy [2] 8413.84 Event begin: VecAXPY [2] 8413.84 Event end: VecAXPY [2] 8413.84 Event begin: MatMult [2] 8413.84 Event begin: VecScatterBegin [2] 8413.84 Event begin: SFPack [2] 8413.84 Event end: SFPack [2] 8413.84 Event end: VecScatterBegin [2] 8413.84 Event begin: VecScatterEnd [2] 8413.84 Event begin: SFUnpack [2] 8413.84 Event end: SFUnpack [2] 8413.84 Event end: VecScatterEnd [2] 8413.84 Event end: MatMult [2] 8413.84 Event begin: VecAYPX [2] 8413.84 Event end: VecAYPX [2] 8413.84 Event begin: VecNorm [2] 8413.84 Event end: VecNorm [2] 8413.84 Event begin: PCApply [2] 8413.84 Event begin: VecSet [2] 8413.84 Event end: VecSet [2] 8413.84 Event begin: MatSolve [2] 8413.84 Event end: MatSolve [2] 8413.84 Event end: PCApply [2] 8413.84 Event begin: MatMult [2] 8413.84 Event begin: VecScatterBegin [2] 8413.84 Event begin: SFPack [2] 8413.84 Event end: SFPack [2] 8413.84 Event end: VecScatterBegin [2] 8413.84 Event begin: VecScatterEnd [2] 8413.84 Event begin: SFUnpack [2] 8413.84 Event end: SFUnpack [2] 8413.84 Event end: VecScatterEnd [2] 8413.84 Event 
end: MatMult [2] 8413.84 Event begin: VecMDot [2] 8413.84 Event end: VecMDot [2] 8413.84 Event begin: VecMAXPY [2] 8413.84 Event end: VecMAXPY [2] 8413.84 Event begin: VecNorm [2] 8413.84 Event end: VecNorm [2] 8413.84 Event begin: VecScale [2] 8413.84 Event end: VecScale [2] 8413.84 Event begin: VecSet [2] 8413.84 Event end: VecSet [2] 8413.84 Event begin: VecMAXPY [2] 8413.84 Event end: VecMAXPY [2] 8413.84 Event begin: VecCopy [2] 8413.84 Event end: VecCopy [2] 8413.84 Event begin: VecAXPY [2] 8413.84 Event end: VecAXPY [2] 8413.84 Event begin: MatMult [2] 8413.84 Event begin: VecScatterBegin [2] 8413.84 Event begin: SFPack [2] 8413.84 Event end: SFPack [2] 8413.84 Event end: VecScatterBegin [2] 8413.84 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecAYPX [2] 8413.85 Event end: VecAYPX [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: PCApply [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: MatSolve [2] 8413.85 Event end: MatSolve [2] 8413.85 Event end: PCApply [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecMDot [2] 8413.85 Event end: VecMDot [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: VecScale [2] 8413.85 Event end: VecScale [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecCopy [2] 8413.85 Event end: VecCopy [2] 8413.85 Event begin: VecAXPY [2] 8413.85 Event end: VecAXPY [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecAYPX [2] 8413.85 Event end: VecAYPX [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: PCApply [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: MatSolve [2] 8413.85 Event end: MatSolve [2] 8413.85 Event end: PCApply [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecMDot [2] 8413.85 Event end: VecMDot [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: VecScale [2] 8413.85 Event end: VecScale [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecCopy [2] 8413.85 Event end: VecCopy [2] 8413.85 Event begin: 
VecAXPY [2] 8413.85 Event end: VecAXPY [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecAYPX [2] 8413.85 Event end: VecAYPX [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: PCApply [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: MatSolve [2] 8413.85 Event end: MatSolve [2] 8413.85 Event end: PCApply [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecMDot [2] 8413.85 Event end: VecMDot [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: VecScale [2] 8413.85 Event end: VecScale [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecCopy [2] 8413.85 Event end: VecCopy [2] 8413.85 Event begin: VecAXPY [2] 8413.85 Event end: VecAXPY [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecAYPX [2] 8413.85 Event end: VecAYPX [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: PCApply [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: MatSolve [2] 8413.85 Event end: MatSolve [2] 8413.85 Event end: PCApply [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecMDot [2] 8413.85 Event end: VecMDot [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecNorm [2] 8413.85 Event end: VecNorm [2] 8413.85 Event begin: VecScale [2] 8413.85 Event end: VecScale [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: VecMAXPY [2] 8413.85 Event end: VecMAXPY [2] 8413.85 Event begin: VecCopy [2] 8413.85 Event end: VecCopy [2] 8413.85 Event begin: VecAXPY [2] 8413.85 Event end: VecAXPY [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecAYPX [2] 8413.85 Event end: VecAYPX [2] 8413.85 Event begin: VecNorm [2] 
8413.85 Event end: VecNorm [2] 8413.85 Event begin: PCApply [2] 8413.85 Event begin: VecSet [2] 8413.85 Event end: VecSet [2] 8413.85 Event begin: MatSolve [2] 8413.85 Event end: MatSolve [2] 8413.85 Event end: PCApply [2] 8413.85 Event begin: MatMult [2] 8413.85 Event begin: VecScatterBegin [2] 8413.85 Event begin: SFPack [2] 8413.85 Event end: SFPack [2] 8413.85 Event end: VecScatterBegin [2] 8413.85 Event begin: VecScatterEnd [2] 8413.85 Event begin: SFUnpack [2] 8413.85 Event end: SFUnpack [2] 8413.85 Event end: VecScatterEnd [2] 8413.85 Event end: MatMult [2] 8413.85 Event begin: VecMDot [2] 8413.86 Event end: VecMDot [2] 8413.86 Event begin: VecMAXPY [2] 8413.86 Event end: VecMAXPY [2] 8413.86 Event begin: VecNorm [2] 8413.86 Event end: VecNorm [2] 8413.86 Event begin: VecScale [2] 8413.86 Event end: VecScale [2] 8413.86 Event begin: VecSet [2] 8413.86 Event end: VecSet [2] 8413.86 Event begin: VecMAXPY [2] 8413.86 Event end: VecMAXPY [2] 8413.86 Event begin: VecCopy [2] 8413.86 Event end: VecCopy [2] 8413.86 Event begin: VecAXPY [2] 8413.86 Event end: VecAXPY [2] 8413.86 Event begin: MatMult [2] 8413.86 Event begin: VecScatterBegin [2] 8413.86 Event begin: SFPack [2] 8413.86 Event end: SFPack [2] 8413.86 Event end: VecScatterBegin [2] 8413.86 Event begin: VecScatterEnd [2] 8413.86 Event begin: SFUnpack [2] 8413.86 Event end: SFUnpack [2] 8413.86 Event end: VecScatterEnd [2] 8413.86 Event end: MatMult [2] 8413.86 Event begin: VecAYPX [2] 8413.86 Event end: VecAYPX [2] 8413.86 Event begin: VecNorm [2] 8413.86 Event end: VecNorm [2] 8413.86 Event begin: PCApply [2] 8413.86 Event begin: VecSet [2] 8413.86 Event end: VecSet [2] 8413.86 Event begin: MatSolve [2] 8413.86 Event end: MatSolve [2] 8413.86 Event end: PCApply [2] 8413.86 Event begin: MatMult [2] 8413.86 Event begin: VecScatterBegin [2] 8413.86 Event begin: SFPack [2] 8413.86 Event end: SFPack [2] 8413.86 Event end: VecScatterBegin [2] 8413.86 Event begin: VecScatterEnd [2] 8413.86 Event begin: SFUnpack [2] 8413.86 Event end: SFUnpack [2] 8413.86 Event end: VecScatterEnd [2] 8413.86 Event end: MatMult [2] 8413.86 Event begin: VecMDot [2] 8413.86 Event end: VecMDot [2] 8413.86 Event begin: VecMAXPY [2] 8413.86 Event end: VecMAXPY [2] 8413.86 Event begin: VecNorm [2] 8413.86 Event end: VecNorm [2] 8413.86 Event begin: VecScale [2] 8413.86 Event end: VecScale [2] 8413.86 Event begin: VecSet [2] 8413.86 Event end: VecSet [2] 8413.86 Event begin: VecMAXPY [2] 8413.86 Event end: VecMAXPY [2] 8413.86 Event begin: VecCopy [2] 8413.86 Event end: VecCopy [2] 8413.86 Event begin: VecAXPY [2] 8413.86 Event end: VecAXPY [2] 8413.86 Event begin: MatMult [2] 8413.86 Event begin: VecScatterBegin [2] 8413.86 Event begin: SFPack [2] 8413.86 Event end: SFPack [2] 8413.86 Event end: VecScatterBegin [2] 8413.86 Event begin: VecScatterEnd [2] 8413.86 Event begin: SFUnpack [2] 8413.86 Event end: SFUnpack [2] 8413.86 Event end: VecScatterEnd [2] 8413.86 Event end: MatMult [2] 8413.86 Event begin: VecAYPX [2] 8413.86 Event end: VecAYPX [2] 8413.86 Event begin: VecNorm [2] 8413.86 Event end: VecNorm [2] 8413.86 Event begin: PCApply [2] 8413.86 Event begin: VecSet [2] 8413.86 Event end: VecSet [2] 8413.86 Event begin: MatSolve [2] 8413.86 Event end: MatSolve [2] 8413.86 Event end: PCApply [2] 8413.86 Event begin: MatMult [2] 8413.86 Event begin: VecScatterBegin [2] 8413.86 Event begin: SFPack [2] 8413.86 Event end: SFPack [2] 8413.86 Event end: VecScatterBegin [2] 8413.86 Event begin: VecScatterEnd [2] 8413.86 Event begin: SFUnpack [2] 8413.86 Event end: 
[PETSc event-trace output trimmed for readability: ranks [0]-[3] log matching "Event begin:" / "Event end:" pairs for MatMult, VecScatterBegin/VecScatterEnd, SFPack/SFUnpack, VecMDot, VecMAXPY, VecNorm, VecScale, VecSet, VecCopy, VecAXPY, VecAYPX, PCApply, and MatSolve once per Krylov iteration, at t ~ 8413.8-8414.0 s. The KSP monitor lines recovered from this portion of the trace are:]

 24 KSP unpreconditioned resid norm 3.767769229151e-02 true resid norm -nan ||r(i)||/||b|| -nan
 25 KSP unpreconditioned resid norm 2.758466247011e-02 true resid norm -nan ||r(i)||/||b|| -nan
 26 KSP unpreconditioned resid norm 2.401067727732e-02 true resid norm -nan ||r(i)||/||b|| -nan
 27 KSP unpreconditioned resid norm 1.918365058572e-02 true resid norm -nan ||r(i)||/||b|| -nan
 28 KSP unpreconditioned resid norm 1.796891339145e-02 true resid norm -nan ||r(i)||/||b|| -nan
 29 KSP unpreconditioned resid norm 1.646774974855e-02 true resid norm -nan ||r(i)||/||b|| -nan
 30 KSP unpreconditioned resid norm 1.581043134782e-02 true resid norm -nan ||r(i)||/||b|| -nan
 31 KSP unpreconditioned resid norm 1.451402400457e-02 true resid norm -nan ||r(i)||/||b|| -nan
 32 KSP unpreconditioned resid norm 1.365718641872e-02 true resid norm -nan ||r(i)||/||b|| -nan
SFPack [2] 8413.9 Event end: SFPack [2] 8413.9 Event end: VecScatterBegin [2] 8413.9 Event begin: VecScatterEnd [2] 8413.9 Event begin: SFUnpack [2] 8413.9 Event end: SFUnpack [2] 8413.9 Event end: VecScatterEnd [2] 8413.9 Event end: MatMult [2] 8413.9 Event begin: VecMDot [2] 8413.9 Event end: VecMDot [2] 8413.9 Event begin: VecMAXPY [2] 8413.9 Event end: VecMAXPY [2] 8413.9 Event begin: VecNorm [2] 8413.9 Event end: VecNorm [2] 8413.9 Event begin: VecScale [2] 8413.9 Event end: VecScale [2] 8413.9 Event begin: VecSet [2] 8413.9 Event end: VecSet [2] 8413.9 Event begin: VecMAXPY [2] 8413.9 Event end: VecMAXPY [2] 8413.9 Event begin: VecCopy [2] 8413.9 Event end: VecCopy [2] 8413.9 Event begin: VecAXPY [2] 8413.9 Event end: VecAXPY [2] 8413.9 Event begin: MatMult [2] 8413.9 Event begin: VecScatterBegin [2] 8413.9 Event begin: SFPack [2] 8413.9 Event end: SFPack [2] 8413.9 Event end: VecScatterBegin [2] 8413.9 Event begin: VecScatterEnd [2] 8413.9 Event begin: SFUnpack [2] 8413.9 Event end: SFUnpack [2] 8413.9 Event end: VecScatterEnd [2] 8413.9 Event end: MatMult [2] 8413.9 Event begin: VecAYPX [2] 8413.9 Event end: VecAYPX [2] 8413.9 Event begin: VecNorm [2] 8413.9 Event end: VecNorm [2] 8413.9 Event begin: PCApply [2] 8413.9 Event begin: VecSet [2] 8413.9 Event end: VecSet [2] 8413.9 Event begin: MatSolve [2] 8413.9 Event end: MatSolve [2] 8413.9 Event end: PCApply [2] 8413.9 Event begin: MatMult [2] 8413.9 Event begin: VecScatterBegin [2] 8413.9 Event begin: SFPack [2] 8413.9 Event end: SFPack [2] 8413.9 Event end: VecScatterBegin [2] 8413.9 Event begin: VecScatterEnd [2] 8413.9 Event begin: SFUnpack [2] 8413.9 Event end: SFUnpack [2] 8413.9 Event end: VecScatterEnd [2] 8413.9 Event end: MatMult [2] 8413.9 Event begin: VecMDot [2] 8413.9 Event end: VecMDot [2] 8413.9 Event begin: VecMAXPY [2] 8413.9 Event end: VecMAXPY [2] 8413.9 Event begin: VecNorm [2] 8413.9 Event end: VecNorm [2] 8413.9 Event begin: VecScale [2] 8413.9 Event end: VecScale [2] 8413.9 Event begin: VecSet [2] 8413.9 Event end: VecSet [2] 8413.9 Event begin: VecMAXPY [2] 8413.9 Event end: VecMAXPY [2] 8413.9 Event begin: VecCopy [2] 8413.9 Event end: VecCopy [2] 8413.9 Event begin: VecAXPY [2] 8413.9 Event end: VecAXPY [2] 8413.9 Event begin: MatMult [2] 8413.9 Event begin: VecScatterBegin [2] 8413.9 Event begin: SFPack [2] 8413.9 Event end: SFPack [2] 8413.9 Event end: VecScatterBegin [2] 8413.9 Event begin: VecScatterEnd [2] 8413.94 Event begin: SFUnpack [2] 8413.94 Event end: SFUnpack [2] 8413.94 Event end: VecScatterEnd [2] 8413.94 Event end: MatMult [2] 8413.94 Event begin: VecAYPX [2] 8413.94 Event end: VecAYPX [2] 8413.94 Event begin: VecNorm [2] 8413.94 Event end: VecNorm [2] 8413.94 Event begin: PCApply [2] 8413.94 Event begin: VecSet [2] 8413.94 Event end: VecSet [2] 8413.94 Event begin: MatSolve [2] 8413.94 Event end: MatSolve [2] 8413.94 Event end: PCApply [2] 8413.94 Event begin: MatMult [2] 8413.94 Event begin: VecScatterBegin [2] 8413.94 Event begin: SFPack [2] 8413.94 Event end: SFPack [2] 8413.94 Event end: VecScatterBegin [2] 8413.94 Event begin: VecScatterEnd [2] 8413.94 Event begin: SFUnpack [2] 8413.94 Event end: SFUnpack [2] 8413.94 Event end: VecScatterEnd [2] 8413.94 Event end: MatMult [2] 8413.94 Event begin: VecMDot [2] 8413.94 Event end: VecMDot [2] 8413.94 Event begin: VecMAXPY [2] 8413.94 Event end: VecMAXPY [2] 8413.94 Event begin: VecNorm [2] 8413.94 Event end: VecNorm [2] 8413.94 Event begin: VecScale [2] 8413.94 Event end: VecScale [2] 8413.94 Event begin: VecSet [2] 8413.94 Event end: 
VecSet [2] 8413.94 Event begin: VecMAXPY [2] 8413.95 Event end: VecMAXPY [2] 8413.95 Event begin: VecCopy [2] 8413.95 Event end: VecCopy [2] 8413.95 Event begin: VecAXPY [2] 8413.95 Event end: VecAXPY [2] 8413.95 Event begin: MatMult [2] 8413.95 Event begin: VecScatterBegin [2] 8413.95 Event begin: SFPack [2] 8413.95 Event end: SFPack [2] 8413.95 Event end: VecScatterBegin [2] 8413.95 Event begin: VecScatterEnd [2] 8413.95 Event begin: SFUnpack [2] 8413.95 Event end: SFUnpack [2] 8413.95 Event end: VecScatterEnd [2] 8413.95 Event end: MatMult [2] 8413.95 Event begin: VecAYPX [2] 8413.95 Event end: VecAYPX [2] 8413.95 Event begin: VecNorm [2] 8413.95 Event end: VecNorm [2] 8413.95 Event begin: PCApply [2] 8413.95 Event begin: VecSet [2] 8413.95 Event end: VecSet [2] 8413.95 Event begin: MatSolve [2] 8413.95 Event end: MatSolve [2] 8413.95 Event end: PCApply [2] 8413.95 Event begin: MatMult [2] 8413.95 Event begin: VecScatterBegin [2] 8413.95 Event begin: SFPack [2] 8413.95 Event end: SFPack [2] 8413.95 Event end: VecScatterBegin [2] 8413.95 Event begin: VecScatterEnd [2] 8413.95 Event begin: SFUnpack [2] 8413.95 Event end: SFUnpack [2] 8413.95 Event end: VecScatterEnd [2] 8413.95 Event end: MatMult [2] 8413.95 Event begin: VecMDot [2] 8413.95 Event end: VecMDot [2] 8413.95 Event begin: VecMAXPY [2] 8413.95 Event end: VecMAXPY [2] 8413.95 Event begin: VecNorm [2] 8413.95 Event end: VecNorm [2] 8413.95 Event begin: VecScale [2] 8413.95 Event end: VecScale [2] 8413.95 Event begin: VecSet [2] 8413.95 Event end: VecSet [2] 8413.95 Event begin: VecMAXPY [2] 8413.95 Event end: VecMAXPY [2] 8413.95 Event begin: VecCopy [2] 8413.95 Event end: VecCopy [2] 8413.95 Event begin: VecAXPY [2] 8413.95 Event end: VecAXPY [2] 8413.95 Event begin: MatMult [2] 8413.95 Event begin: VecScatterBegin [2] 8413.95 Event begin: SFPack [2] 8413.95 Event end: SFPack [2] 8413.95 Event end: VecScatterBegin [2] 8413.95 Event begin: VecScatterEnd [2] 8413.95 Event begin: SFUnpack [2] 8413.95 Event end: SFUnpack [2] 8413.95 Event end: VecScatterEnd [2] 8413.95 Event end: MatMult [2] 8413.95 Event begin: VecAYPX [2] 8413.95 Event end: VecAYPX [2] 8413.95 Event begin: VecNorm [2] 8413.95 Event end: VecNorm [2] 8413.95 Event begin: PCApply [2] 8413.95 Event begin: VecSet [2] 8413.95 Event end: VecSet [2] 8413.95 Event begin: MatSolve [2] 8413.95 Event end: MatSolve [2] 8413.95 Event end: PCApply [2] 8413.95 Event begin: MatMult [2] 8413.95 Event begin: VecScatterBegin [2] 8413.95 Event begin: SFPack [2] 8413.95 Event end: SFPack [2] 8413.95 Event end: VecScatterBegin [2] 8413.95 Event begin: VecScatterEnd [2] 8413.95 Event begin: SFUnpack [2] 8413.95 Event end: SFUnpack [2] 8413.95 Event end: VecScatterEnd [2] 8413.95 Event end: MatMult [2] 8413.95 Event begin: VecMDot [2] 8413.95 Event end: VecMDot [2] 8413.95 Event begin: VecMAXPY [2] 8413.95 Event end: VecMAXPY [2] 8413.95 Event begin: VecNorm [2] 8413.95 Event end: VecNorm [2] 8413.95 Event begin: VecScale [2] 8413.95 Event end: VecScale [2] 8413.95 Event begin: VecSet [2] 8413.95 Event end: VecSet [2] 8413.95 Event begin: VecMAXPY [2] 8413.95 Event end: VecMAXPY [2] 8413.95 Event begin: VecCopy [2] 8413.95 Event end: VecCopy [2] 8413.95 Event begin: VecAXPY [2] 8413.95 Event end: VecAXPY [2] 8413.95 Event begin: MatMult [2] 8413.95 Event begin: VecScatterBegin [2] 8413.95 Event begin: SFPack [2] 8413.95 Event end: SFPack [2] 8413.95 Event end: VecScatterBegin [2] 8413.95 Event begin: VecScatterEnd [2] 8413.95 Event begin: SFUnpack [2] 8413.95 Event end: SFUnpack [2] 
8413.95 Event end: VecScatterEnd [2] 8413.95 Event end: MatMult [2] 8413.95 Event begin: VecAYPX [2] 8413.95 Event end: VecAYPX [2] 8413.95 Event begin: VecNorm [2] 8413.95 Event end: VecNorm [2] 8413.95 Event begin: PCApply [2] 8413.95 Event begin: VecSet [2] 8413.95 Event end: VecSet [2] 8413.95 Event begin: MatSolve [2] 8413.95 Event end: MatSolve [2] 8413.95 Event end: PCApply [2] 8413.95 Event begin: MatMult [2] 8413.95 Event begin: VecScatterBegin [2] 8413.95 Event begin: SFPack [2] 8413.95 Event end: SFPack [2] 8413.95 Event end: VecScatterBegin [2] 8413.95 Event begin: VecScatterEnd [2] 8413.95 Event begin: SFUnpack [2] 8413.95 Event end: SFUnpack [2] 8413.95 Event end: VecScatterEnd [2] 8413.95 Event end: MatMult [2] 8413.95 Event begin: VecMDot [2] 8413.95 Event end: VecMDot [2] 8413.95 Event begin: VecMAXPY [2] 8413.95 Event end: VecMAXPY [2] 8413.95 Event begin: VecNorm [2] 8413.95 Event end: VecNorm [2] 8413.95 Event begin: VecScale [2] 8413.95 Event end: VecScale [2] 8413.95 Event begin: VecSet [2] 8413.95 Event end: VecSet [2] 8413.95 Event begin: VecMAXPY [2] 8413.95 Event end: VecMAXPY [2] 8413.95 Event begin: VecCopy [2] 8413.95 Event end: VecCopy [2] 8413.95 Event begin: VecAXPY [2] 8413.95 Event end: VecAXPY [2] 8413.95 Event begin: MatMult [2] 8413.95 Event begin: VecScatterBegin [2] 8413.95 Event begin: SFPack [2] 8413.95 Event end: SFPack [2] 8413.95 Event end: VecScatterBegin [2] 8413.95 Event begin: VecScatterEnd [2] 8413.95 Event begin: SFUnpack [2] 8413.95 Event end: SFUnpack [2] 8413.95 Event end: VecScatterEnd [2] 8413.95 Event end: MatMult [2] 8413.95 Event begin: VecAYPX [2] 8413.95 Event end: VecAYPX [2] 8413.95 Event begin: VecNorm [2] 8413.95 Event end: VecNorm [2] 8413.95 Event begin: PCApply [2] 8413.95 Event begin: VecSet [2] 8413.95 Event end: VecSet [2] 8413.95 Event begin: MatSolve [2] 8413.96 Event end: MatSolve [2] 8413.96 Event end: PCApply [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecMDot [2] 8413.96 Event end: VecMDot [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: VecScale [2] 8413.96 Event end: VecScale [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecCopy [2] 8413.96 Event end: VecCopy [2] 8413.96 Event begin: VecAXPY [2] 8413.96 Event end: VecAXPY [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecAYPX [2] 8413.96 Event end: VecAYPX [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: PCApply [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: MatSolve [2] 8413.96 Event end: MatSolve [2] 8413.96 Event end: PCApply [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack 
[2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecMDot [2] 8413.96 Event end: VecMDot [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: VecScale [2] 8413.96 Event end: VecScale [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecCopy [2] 8413.96 Event end: VecCopy [2] 8413.96 Event begin: VecAXPY [2] 8413.96 Event end: VecAXPY [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecAYPX [2] 8413.96 Event end: VecAYPX [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: PCApply [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: MatSolve [2] 8413.96 Event end: MatSolve [2] 8413.96 Event end: PCApply [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecMDot [2] 8413.96 Event end: VecMDot [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: VecScale [2] 8413.96 Event end: VecScale [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecCopy [2] 8413.96 Event end: VecCopy [2] 8413.96 Event begin: VecAXPY [2] 8413.96 Event end: VecAXPY [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecAYPX [2] 8413.96 Event end: VecAYPX [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: PCApply [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: MatSolve [2] 8413.96 Event end: MatSolve [2] 8413.96 Event end: PCApply [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecMDot [2] 8413.96 Event end: VecMDot [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: VecScale [2] 8413.96 Event 
end: VecScale [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecCopy [2] 8413.96 Event end: VecCopy [2] 8413.96 Event begin: VecAXPY [2] 8413.96 Event end: VecAXPY [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecAYPX [2] 8413.96 Event end: VecAYPX [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: PCApply [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: MatSolve [2] 8413.96 Event end: MatSolve [2] 8413.96 Event end: PCApply [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecMDot [2] 8413.96 Event end: VecMDot [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: VecScale [2] 8413.96 Event end: VecScale [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecCopy [2] 8413.96 Event end: VecCopy [2] 8413.96 Event begin: VecAXPY [2] 8413.96 Event end: VecAXPY [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecAYPX [2] 8413.96 Event end: VecAYPX [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: PCApply [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: MatSolve [2] 8413.96 Event end: MatSolve [2] 8413.96 Event end: PCApply [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecMDot [2] 8413.96 Event end: VecMDot [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: VecScale [2] 8413.96 Event end: VecScale [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: VecMAXPY [2] 8413.96 Event end: VecMAXPY [2] 8413.96 Event begin: VecCopy [2] 8413.96 Event end: VecCopy [2] 8413.96 Event begin: VecAXPY [2] 8413.96 Event end: VecAXPY [2] 8413.96 Event begin: MatMult [2] 8413.96 Event begin: VecScatterBegin [2] 8413.96 Event begin: SFPack [2] 8413.96 Event end: SFPack [2] 8413.96 Event end: VecScatterBegin [2] 8413.96 Event begin: VecScatterEnd [2] 
8413.96 Event begin: SFUnpack [2] 8413.96 Event end: SFUnpack [2] 8413.96 Event end: VecScatterEnd [2] 8413.96 Event end: MatMult [2] 8413.96 Event begin: VecAYPX [2] 8413.96 Event end: VecAYPX [2] 8413.96 Event begin: VecNorm [2] 8413.96 Event end: VecNorm [2] 8413.96 Event begin: PCApply [2] 8413.96 Event begin: VecSet [2] 8413.96 Event end: VecSet [2] 8413.96 Event begin: MatSolve [2] 8413.96 Event end: MatSolve [2] 8413.97 Event end: PCApply [2] 8413.97 Event begin: MatMult [2] 8413.97 Event begin: VecScatterBegin [2] 8413.97 Event begin: SFPack [2] 8413.97 Event end: SFPack [2] 8413.97 Event end: VecScatterBegin [2] 8413.97 Event begin: VecScatterEnd [2] 8413.97 Event begin: SFUnpack [2] 8413.97 Event end: SFUnpack [2] 8413.97 Event end: VecScatterEnd [2] 8413.97 Event end: MatMult [2] 8413.97 Event begin: VecMDot [2] 8413.97 Event end: VecMDot [2] 8413.97 Event begin: VecMAXPY [2] 8413.97 Event end: VecMAXPY [2] 8413.97 Event begin: VecNorm [2] 8413.97 Event end: VecNorm [2] 8413.97 Event begin: VecScale [2] 8413.97 Event end: VecScale [2] 8413.97 Event begin: VecSet [2] 8413.97 Event end: VecSet [2] 8413.97 Event begin: VecMAXPY [2] 8413.97 Event end: VecMAXPY [2] 8413.97 Event begin: VecCopy [2] 8413.97 Event end: VecCopy [2] 8413.97 Event begin: VecAXPY [2] 8413.97 Event end: VecAXPY [2] 8413.97 Event begin: MatMult [2] 8413.97 Event begin: VecScatterBegin [2] 8413.97 Event begin: SFPack [2] 8413.97 Event end: SFPack [2] 8413.97 Event end: VecScatterBegin [2] 8413.97 Event begin: VecScatterEnd [2] 8413.97 Event begin: SFUnpack [2] 8413.97 Event end: SFUnpack [2] 8413.97 Event end: VecScatterEnd [2] 8413.97 Event end: MatMult [2] 8413.97 Event begin: VecAYPX [2] 8413.97 Event end: VecAYPX [2] 8413.97 Event begin: VecNorm [2] 8413.97 Event end: VecNorm [2] 8413.97 Event begin: PCApply [2] 8413.97 Event begin: VecSet [2] 8413.97 Event end: VecSet [2] 8413.97 Event begin: MatSolve [2] 8413.97 Event end: MatSolve [2] 8413.97 Event end: PCApply [2] 8413.97 Event begin: MatMult [2] 8413.97 Event begin: VecScatterBegin [2] 8413.97 Event begin: SFPack [2] 8413.97 Event end: SFPack [2] 8413.97 Event end: VecScatterBegin [2] 8413.97 Event begin: VecScatterEnd [2] 8413.97 Event begin: SFUnpack [2] 8413.97 Event end: SFUnpack [2] 8413.97 Event end: VecScatterEnd [2] 8413.97 Event end: MatMult [2] 8413.97 Event begin: VecMDot [2] 8413.97 Event end: VecMDot [2] 8413.97 Event begin: VecMAXPY [2] 8413.97 Event end: VecMAXPY [2] 8413.97 Event begin: VecNorm [2] 8413.97 Event end: VecNorm [2] 8413.97 Event begin: VecScale [2] 8413.97 Event end: VecScale [2] 8413.97 Event begin: VecSet [2] 8413.97 Event end: VecSet [2] 8413.97 Event begin: VecMAXPY [2] 8413.97 Event end: VecMAXPY [2] 8413.97 Event begin: VecCopy [2] 8413.97 Event end: VecCopy [2] 8413.97 Event begin: VecAXPY [2] 8413.97 Event end: VecAXPY [2] 8413.97 Event begin: MatMult [2] 8413.97 Event begin: VecScatterBegin [2] 8413.97 Event begin: SFPack [2] 8413.97 Event end: SFPack [2] 8413.97 Event end: VecScatterBegin [2] 8413.97 Event begin: VecScatterEnd [2] 8413.97 Event begin: SFUnpack [2] 8413.97 Event end: SFUnpack [2] 8413.97 Event end: VecScatterEnd [2] 8413.97 Event end: MatMult [2] 8413.97 Event begin: VecAYPX [2] 8413.97 Event end: VecAYPX [2] 8413.97 Event begin: VecNorm [2] 8414.03 Event end: VecNorm [2] 8414.03 Event begin: PCApply [2] 8414.03 Event begin: VecSet [2] 8414.03 Event end: VecSet [2] 8414.03 Event begin: MatSolve [2] 8414.03 Event end: MatSolve [2] 8414.03 Event end: PCApply [2] 8414.03 Event begin: MatMult [2] 
8414.03 Event begin: VecScatterBegin [2] 8414.03 Event begin: SFPack [2] 8414.03 Event end: SFPack [2] 8414.03 Event end: VecScatterBegin [2] 8414.03 Event begin: VecScatterEnd [2] 8414.03 Event begin: SFUnpack [2] 8414.03 Event end: SFUnpack [2] 8414.03 Event end: VecScatterEnd [2] 8414.03 Event end: MatMult [2] 8414.03 Event begin: VecMDot [2] 8414.03 Event end: VecMDot [2] 8414.03 Event begin: VecMAXPY [2] 8414.03 Event end: VecMAXPY [2] 8414.03 Event begin: VecNorm [2] 8414.03 Event end: VecNorm [2] 8414.03 Event begin: VecScale [2] 8414.03 Event end: VecScale [2] 8414.03 Event begin: VecSet [2] 8414.03 Event end: VecSet [2] 8414.03 Event begin: VecMAXPY [2] 8414.03 Event end: VecMAXPY [2] 8414.03 Event begin: VecCopy [2] 8414.03 Event end: VecCopy [2] 8414.03 Event begin: VecAXPY [2] 8414.03 Event end: VecAXPY [2] 8414.03 Event begin: MatMult [2] 8414.03 Event begin: VecScatterBegin [2] 8414.03 Event begin: SFPack [2] 8414.03 Event end: SFPack [2] 8414.03 Event end: VecScatterBegin [2] 8414.03 Event begin: VecScatterEnd [2] 8414.03 Event begin: SFUnpack [2] 8414.03 Event end: SFUnpack [2] 8414.03 Event end: VecScatterEnd [2] 8414.03 Event end: MatMult [2] 8414.03 Event begin: VecAYPX [2] 8414.03 Event end: VecAYPX [2] 8414.03 Event begin: VecNorm [2] 8414.03 Event end: VecNorm [3] 8413.9 Event begin: MatMult [3] 8413.94 Event begin: VecScatterBegin [3] 8413.94 Event begin: SFPack [3] 8413.94 Event end: SFPack [3] 8413.94 Event end: VecScatterBegin [3] 8413.94 Event begin: VecScatterEnd [3] 8413.94 Event begin: SFUnpack [3] 8413.94 Event end: SFUnpack [3] 8413.94 Event end: VecScatterEnd [3] 8413.94 Event end: MatMult [3] 8413.94 Event begin: VecAYPX [3] 8413.94 Event end: VecAYPX [3] 8413.94 Event begin: VecNorm [3] 8413.94 Event end: VecNorm [3] 8413.94 Event begin: PCApply [3] 8413.94 Event begin: VecSet [3] 8413.94 Event end: VecSet [3] 8413.94 Event begin: MatSolve [3] 8413.94 Event end: MatSolve [3] 8413.94 Event end: PCApply [3] 8413.94 Event begin: MatMult [3] 8413.94 Event begin: VecScatterBegin [3] 8413.94 Event begin: SFPack [3] 8413.94 Event end: SFPack [3] 8413.94 Event end: VecScatterBegin [3] 8413.94 Event begin: VecScatterEnd [3] 8413.94 Event begin: SFUnpack [3] 8413.94 Event end: SFUnpack [3] 8413.94 Event end: VecScatterEnd [3] 8413.94 Event end: MatMult [3] 8413.94 Event begin: VecMDot [3] 8413.94 Event end: VecMDot [3] 8413.94 Event begin: VecMAXPY [3] 8413.94 Event end: VecMAXPY [3] 8413.94 Event begin: VecNorm [3] 8413.94 Event end: VecNorm [3] 8413.94 Event begin: VecScale [3] 8413.94 Event end: VecScale [3] 8413.94 Event begin: VecSet [3] 8413.94 Event end: VecSet [3] 8413.94 Event begin: VecMAXPY [3] 8413.94 Event end: VecMAXPY [3] 8413.94 Event begin: VecCopy [3] 8413.94 Event end: VecCopy [3] 8413.94 Event begin: VecAXPY [3] 8413.94 Event end: VecAXPY [3] 8413.94 Event begin: MatMult [3] 8413.94 Event begin: VecScatterBegin [3] 8413.94 Event begin: SFPack [3] 8413.94 Event end: SFPack [3] 8413.94 Event end: VecScatterBegin [3] 8413.94 Event begin: VecScatterEnd [3] 8413.94 Event begin: SFUnpack [3] 8413.94 Event end: SFUnpack [3] 8413.94 Event end: VecScatterEnd [3] 8413.94 Event end: MatMult [3] 8413.94 Event begin: VecAYPX [3] 8413.94 Event end: VecAYPX [3] 8413.94 Event begin: VecNorm [3] 8413.94 Event end: VecNorm [3] 8413.94 Event begin: PCApply [3] 8413.94 Event begin: VecSet [3] 8413.94 Event end: VecSet [3] 8413.94 Event begin: MatSolve [3] 8413.94 Event end: MatSolve [3] 8413.94 Event end: PCApply [3] 8413.94 Event begin: MatMult [3] 8413.94 Event 
begin: VecScatterBegin [3] 8413.94 Event begin: SFPack [3] 8413.94 Event end: SFPack [3] 8413.94 Event end: VecScatterBegin [3] 8413.94 Event begin: VecScatterEnd [3] 8413.94 Event begin: SFUnpack [3] 8413.94 Event end: SFUnpack [3] 8413.94 Event end: VecScatterEnd [3] 8413.94 Event end: MatMult [3] 8413.95 Event begin: VecMDot [3] 8413.95 Event end: VecMDot [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm [3] 8413.95 Event begin: VecScale [3] 8413.95 Event end: VecScale [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecCopy [3] 8413.95 Event end: VecCopy [3] 8413.95 Event begin: VecAXPY [3] 8413.95 Event end: VecAXPY [3] 8413.95 Event begin: MatMult [3] 8413.95 Event begin: VecScatterBegin [3] 8413.95 Event begin: SFPack [3] 8413.95 Event end: SFPack [3] 8413.95 Event end: VecScatterBegin [3] 8413.95 Event begin: VecScatterEnd [3] 8413.95 Event begin: SFUnpack [3] 8413.95 Event end: SFUnpack [3] 8413.95 Event end: VecScatterEnd [3] 8413.95 Event end: MatMult [3] 8413.95 Event begin: VecAYPX [3] 8413.95 Event end: VecAYPX [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm [3] 8413.95 Event begin: PCApply [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: MatSolve [3] 8413.95 Event end: MatSolve [3] 8413.95 Event end: PCApply [3] 8413.95 Event begin: MatMult [3] 8413.95 Event begin: VecScatterBegin [3] 8413.95 Event begin: SFPack [3] 8413.95 Event end: SFPack [3] 8413.95 Event end: VecScatterBegin [3] 8413.95 Event begin: VecScatterEnd [3] 8413.95 Event begin: SFUnpack [3] 8413.95 Event end: SFUnpack [3] 8413.95 Event end: VecScatterEnd [3] 8413.95 Event end: MatMult [3] 8413.95 Event begin: VecMDot [3] 8413.95 Event end: VecMDot [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm [3] 8413.95 Event begin: VecScale [3] 8413.95 Event end: VecScale [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecCopy [3] 8413.95 Event end: VecCopy [3] 8413.95 Event begin: VecAXPY [3] 8413.95 Event end: VecAXPY [3] 8413.95 Event begin: MatMult [3] 8413.95 Event begin: VecScatterBegin [3] 8413.95 Event begin: SFPack [3] 8413.95 Event end: SFPack [3] 8413.95 Event end: VecScatterBegin [3] 8413.95 Event begin: VecScatterEnd [3] 8413.95 Event begin: SFUnpack [3] 8413.95 Event end: SFUnpack [3] 8413.95 Event end: VecScatterEnd [3] 8413.95 Event end: MatMult [3] 8413.95 Event begin: VecAYPX [3] 8413.95 Event end: VecAYPX [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm [3] 8413.95 Event begin: PCApply [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: MatSolve [3] 8413.95 Event end: MatSolve [3] 8413.95 Event end: PCApply [3] 8413.95 Event begin: MatMult [3] 8413.95 Event begin: VecScatterBegin [3] 8413.95 Event begin: SFPack [3] 8413.95 Event end: SFPack [3] 8413.95 Event end: VecScatterBegin [3] 8413.95 Event begin: VecScatterEnd [3] 8413.95 Event begin: SFUnpack [3] 8413.95 Event end: SFUnpack [3] 8413.95 Event end: VecScatterEnd [3] 8413.95 Event end: MatMult [3] 8413.95 Event begin: VecMDot [3] 8413.95 Event end: VecMDot [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm 
[3] 8413.95 Event begin: VecScale [3] 8413.95 Event end: VecScale [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecCopy [3] 8413.95 Event end: VecCopy [3] 8413.95 Event begin: VecAXPY [3] 8413.95 Event end: VecAXPY [3] 8413.95 Event begin: MatMult [3] 8413.95 Event begin: VecScatterBegin [3] 8413.95 Event begin: SFPack [3] 8413.95 Event end: SFPack [3] 8413.95 Event end: VecScatterBegin [3] 8413.95 Event begin: VecScatterEnd [3] 8413.95 Event begin: SFUnpack [3] 8413.95 Event end: SFUnpack [3] 8413.95 Event end: VecScatterEnd [3] 8413.95 Event end: MatMult [3] 8413.95 Event begin: VecAYPX [3] 8413.95 Event end: VecAYPX [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm [3] 8413.95 Event begin: PCApply [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: MatSolve [3] 8413.95 Event end: MatSolve [3] 8413.95 Event end: PCApply [3] 8413.95 Event begin: MatMult [3] 8413.95 Event begin: VecScatterBegin [3] 8413.95 Event begin: SFPack [3] 8413.95 Event end: SFPack [3] 8413.95 Event end: VecScatterBegin [3] 8413.95 Event begin: VecScatterEnd [3] 8413.95 Event begin: SFUnpack [3] 8413.95 Event end: SFUnpack [3] 8413.95 Event end: VecScatterEnd [3] 8413.95 Event end: MatMult [3] 8413.95 Event begin: VecMDot [3] 8413.95 Event end: VecMDot [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm [3] 8413.95 Event begin: VecScale [3] 8413.95 Event end: VecScale [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: VecMAXPY [3] 8413.95 Event end: VecMAXPY [3] 8413.95 Event begin: VecCopy [3] 8413.95 Event end: VecCopy [3] 8413.95 Event begin: VecAXPY [3] 8413.95 Event end: VecAXPY [3] 8413.95 Event begin: MatMult [3] 8413.95 Event begin: VecScatterBegin [3] 8413.95 Event begin: SFPack [3] 8413.95 Event end: SFPack [3] 8413.95 Event end: VecScatterBegin [3] 8413.95 Event begin: VecScatterEnd [3] 8413.95 Event begin: SFUnpack [3] 8413.95 Event end: SFUnpack [3] 8413.95 Event end: VecScatterEnd [3] 8413.95 Event end: MatMult [3] 8413.95 Event begin: VecAYPX [3] 8413.95 Event end: VecAYPX [3] 8413.95 Event begin: VecNorm [3] 8413.95 Event end: VecNorm [3] 8413.95 Event begin: PCApply [3] 8413.95 Event begin: VecSet [3] 8413.95 Event end: VecSet [3] 8413.95 Event begin: MatSolve [3] 8413.96 Event end: MatSolve [3] 8413.96 Event end: PCApply [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecMDot [3] 8413.96 Event end: VecMDot [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: VecScale [3] 8413.96 Event end: VecScale [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecCopy [3] 8413.96 Event end: VecCopy [3] 8413.96 Event begin: VecAXPY [3] 8413.96 Event end: VecAXPY [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: 
VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecAYPX [3] 8413.96 Event end: VecAYPX [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: PCApply [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: MatSolve [3] 8413.96 Event end: MatSolve [3] 8413.96 Event end: PCApply [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecMDot [3] 8413.96 Event end: VecMDot [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: VecScale [3] 8413.96 Event end: VecScale [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecCopy [3] 8413.96 Event end: VecCopy [3] 8413.96 Event begin: VecAXPY [3] 8413.96 Event end: VecAXPY [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecAYPX [3] 8413.96 Event end: VecAYPX [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: PCApply [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: MatSolve [3] 8413.96 Event end: MatSolve [3] 8413.96 Event end: PCApply [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecMDot [3] 8413.96 Event end: VecMDot [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: VecScale [3] 8413.96 Event end: VecScale [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecCopy [3] 8413.96 Event end: VecCopy [3] 8413.96 Event begin: VecAXPY [3] 8413.96 Event end: VecAXPY [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecAYPX [3] 8413.96 Event end: VecAYPX [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: PCApply [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: MatSolve [3] 8413.96 Event end: MatSolve [3] 8413.96 
Event end: PCApply [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecMDot [3] 8413.96 Event end: VecMDot [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: VecScale [3] 8413.96 Event end: VecScale [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecCopy [3] 8413.96 Event end: VecCopy [3] 8413.96 Event begin: VecAXPY [3] 8413.96 Event end: VecAXPY [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecAYPX [3] 8413.96 Event end: VecAYPX [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: PCApply [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: MatSolve [3] 8413.96 Event end: MatSolve [3] 8413.96 Event end: PCApply [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecMDot [3] 8413.96 Event end: VecMDot [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: VecScale [3] 8413.96 Event end: VecScale [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecCopy [3] 8413.96 Event end: VecCopy [3] 8413.96 Event begin: VecAXPY [3] 8413.96 Event end: VecAXPY [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecAYPX [3] 8413.96 Event end: VecAYPX [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: PCApply [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: MatSolve [3] 8413.96 Event end: MatSolve [3] 8413.96 Event end: PCApply [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecMDot [3] 8413.96 Event end: VecMDot [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: 
VecMAXPY [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: VecScale [3] 8413.96 Event end: VecScale [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: VecMAXPY [3] 8413.96 Event end: VecMAXPY [3] 8413.96 Event begin: VecCopy [3] 8413.96 Event end: VecCopy [3] 8413.96 Event begin: VecAXPY [3] 8413.96 Event end: VecAXPY [3] 8413.96 Event begin: MatMult [3] 8413.96 Event begin: VecScatterBegin [3] 8413.96 Event begin: SFPack [3] 8413.96 Event end: SFPack [3] 8413.96 Event end: VecScatterBegin [3] 8413.96 Event begin: VecScatterEnd [3] 8413.96 Event begin: SFUnpack [3] 8413.96 Event end: SFUnpack [3] 8413.96 Event end: VecScatterEnd [3] 8413.96 Event end: MatMult [3] 8413.96 Event begin: VecAYPX [3] 8413.96 Event end: VecAYPX [3] 8413.96 Event begin: VecNorm [3] 8413.96 Event end: VecNorm [3] 8413.96 Event begin: PCApply [3] 8413.96 Event begin: VecSet [3] 8413.96 Event end: VecSet [3] 8413.96 Event begin: MatSolve [3] 8413.97 Event end: MatSolve [3] 8413.97 Event end: PCApply [3] 8413.97 Event begin: MatMult [3] 8413.97 Event begin: VecScatterBegin [3] 8413.97 Event begin: SFPack [3] 8413.97 Event end: SFPack [3] 8413.97 Event end: VecScatterBegin [3] 8413.97 Event begin: VecScatterEnd [3] 8413.97 Event begin: SFUnpack [3] 8413.97 Event end: SFUnpack [3] 8413.97 Event end: VecScatterEnd [3] 8413.97 Event end: MatMult [3] 8413.97 Event begin: VecMDot [3] 8413.97 Event end: VecMDot [3] 8413.97 Event begin: VecMAXPY [3] 8413.97 Event end: VecMAXPY [3] 8413.97 Event begin: VecNorm [3] 8413.97 Event end: VecNorm [3] 8413.97 Event begin: VecScale [3] 8413.97 Event end: VecScale [3] 8413.97 Event begin: VecSet [3] 8413.97 Event end: VecSet [3] 8413.97 Event begin: VecMAXPY [3] 8413.97 Event end: VecMAXPY [3] 8413.97 Event begin: VecCopy [3] 8413.97 Event end: VecCopy [3] 8413.97 Event begin: VecAXPY [3] 8413.97 Event end: VecAXPY [3] 8413.97 Event begin: MatMult [3] 8413.97 Event begin: VecScatterBegin [3] 8413.97 Event begin: SFPack [3] 8413.97 Event end: SFPack [3] 8413.97 Event end: VecScatterBegin [3] 8413.97 Event begin: VecScatterEnd [3] 8413.97 Event begin: SFUnpack [3] 8413.97 Event end: SFUnpack [3] 8413.97 Event end: VecScatterEnd [3] 8413.97 Event end: MatMult [3] 8413.97 Event begin: VecAYPX [3] 8413.97 Event end: VecAYPX [3] 8413.97 Event begin: VecNorm [3] 8414.03 Event end: VecNorm [3] 8414.03 Event begin: PCApply [3] 8414.03 Event begin: VecSet [3] 8414.03 Event end: VecSet [3] 8414.03 Event begin: MatSolve [3] 8414.03 Event end: MatSolve [3] 8414.03 Event end: PCApply [3] 8414.03 Event begin: MatMult [3] 8414.03 Event begin: VecScatterBegin [3] 8414.03 Event begin: SFPack [3] 8414.03 Event end: SFPack [3] 8414.03 Event end: VecScatterBegin [3] 8414.03 Event begin: VecScatterEnd [3] 8414.03 Event begin: SFUnpack [3] 8414.03 Event end: SFUnpack [3] 8414.03 Event end: VecScatterEnd [3] 8414.03 Event end: MatMult [3] 8414.03 Event begin: VecMDot [3] 8414.03 Event end: VecMDot [3] 8414.03 Event begin: VecMAXPY [3] 8414.03 Event end: VecMAXPY [3] 8414.03 Event begin: VecNorm [3] 8414.03 Event end: VecNorm [3] 8414.03 Event begin: VecScale [3] 8414.03 Event end: VecScale [3] 8414.03 Event begin: VecSet [3] 8414.03 Event end: VecSet [3] 8414.03 Event begin: VecMAXPY [3] 8414.03 Event end: VecMAXPY [3] 8414.03 Event begin: VecCopy [3] 8414.03 Event end: VecCopy [3] 8414.03 Event begin: VecAXPY [3] 8414.03 Event end: VecAXPY [3] 8414.03 Event begin: MatMult [3] 8414.03 Event begin: VecScatterBegin [3] 8414.03 Event 
begin: SFPack [3] 8414.03 Event end: SFPack [3] 8414.03 Event end: VecScatterBegin [3] 8414.03 Event begin: VecScatterEnd [3] 8414.03 Event begin: SFUnpack [3] 8414.03 Event end: SFUnpack [3] 8414.03 Event end: VecScatterEnd [3] 8414.03 Event end: MatMult [3] 8414.03 Event begin: VecAYPX [3] 8414.03 Event end: VecAYPX [3] 8414.03 Event begin: VecNorm [3] 8414.03 Event end: VecNorm [3] 8414.03 Event begin: PCApply [3] 8414.03 Event begin: VecSet [3] 8414.03 Event end: VecSet [3] 8414.03 Event begin: MatSolve [3] 8414.03 Event end: MatSolve [3] 8414.03 Event end: PCApply [3] 8414.03 Event begin: MatMult [3] 8414.03 Event begin: VecScatterBegin [3] 8414.03 Event begin: SFPack [3] 8414.03 Event end: SFPack [3] 8414.03 Event end: VecScatterBegin [3] 8414.03 Event begin: VecScatterEnd [3] 8414.03 Event begin: SFUnpack [3] 8414.03 Event end: SFUnpack [3] 8414.03 Event end: VecScatterEnd [3] 8414.03 Event end: MatMult [3] 8414.03 Event begin: VecMDot [3] 8414.03 Event end: VecMDot [3] 8414.03 Event begin: VecMAXPY [3] 8414.03 Event end: VecMAXPY [3] 8414.03 Event begin: VecNorm [3] 8414.03 Event end: VecNorm [3] 8414.03 Event begin: VecScale [3] 8414.03 Event end: VecScale [3] 8414.03 Event begin: VecSet [3] 8414.03 Event end: VecSet [3] 8414.03 Event begin: VecMAXPY [3] 8414.03 Event end: VecMAXPY [3] 8414.03 Event begin: VecCopy [3] 8414.03 Event end: VecCopy [3] 8414.03 Event begin: VecAXPY [3] 8414.03 Event end: VecAXPY [3] 8414.03 Event begin: MatMult [3] 8414.03 Event begin: VecScatterBegin [3] 8414.03 Event begin: SFPack [3] 8414.03 Event end: SFPack [3] 8414.03 Event end: VecScatterBegin [3] 8414.03 Event begin: VecScatterEnd [3] 8414.03 Event begin: SFUnpack [3] 8414.03 Event end: SFUnpack [3] 8414.03 Event end: VecScatterEnd [3] 8414.03 Event end: MatMult [3] 8414.03 Event begin: VecAYPX [3] 8414.03 Event end: VecAYPX [3] 8414.03 Event begin: VecNorm [3] 8414.03 Event end: VecNorm [3] 8414.03 Event begin: PCApply [3] 8414.03 Event begin: VecSet [3] 8414.03 Event end: VecSet [3] 8414.03 Event begin: MatSolve [3] 8414.03 Event end: MatSolve [3] 8414.03 Event end: PCApply [3] 8414.03 Event begin: MatMult [3] 8414.03 Event begin: VecScatterBegin [3] 8414.03 Event begin: SFPack [3] 8414.03 Event end: SFPack [3] 8414.03 Event end: VecScatterBegin [3] 8414.03 Event begin: VecScatterEnd [3] 8414.03 Event begin: SFUnpack [3] 8414.03 Event end: SFUnpack [3] 8414.03 Event end: VecScatterEnd [3] 8414.03 Event end: MatMult [3] 8414.03 Event begin: VecMDot [3] 8414.03 Event end: VecMDot [3] 8414.03 Event begin: VecMAXPY [3] 8414.03 Event end: VecMAXPY [3] 8414.03 Event begin: VecNorm [3] 8414.03 Event end: VecNorm [3] 8414.03 Event begin: VecScale [3] 8414.03 Event end: VecScale [3] 8414.03 Event begin: VecSet [3] 8414.03 Event end: VecSet [3] 8414.03 Event begin: VecMAXPY [3] 8414.03 Event end: VecMAXPY [3] 8414.03 Event begin: VecCopy [3] 8414.03 Event end: VecCopy [3] 8414.03 Event begin: VecAXPY [3] 8414.03 Event end: VecAXPY [3] 8414.03 Event begin: MatMult [3] 8414.03 Event begin: VecScatterBegin [3] 8414.03 Event begin: SFPack [3] 8414.03 Event end: SFPack [3] 8414.03 Event end: VecScatterBegin [3] 8414.03 Event begin: VecScatterEnd [3] 8414.03 Event begin: SFUnpack [3] 8414.03 Event end: SFUnpack [3] 8414.03 Event end: VecScatterEnd [3] 8414.03 Event end: MatMult [3] 8414.03 Event begin: VecAYPX [3] 8414.03 Event end: VecAYPX [3] 8414.03 Event begin: VecNorm [3] 8414.03 Event end: VecNorm [3] 8414.03 Event begin: PCApply [3] 8414.03 Event begin: VecSet [3] 8414.03 Event end: VecSet [3] 8414.03 
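In the monitor lines kept below, the unpreconditioned residual norm is still decreasing while the true residual norm is printed as -nan, i.e. the recomputed true residual b - A*x evidently contains NaNs. As a rough illustration only (nothing here is from the original log; the function name and its attachment are assumptions on my part), a custom KSP monitor along the following lines could rebuild the true residual at every iteration and flag the first NaN explicitly:

#include <petscksp.h>

/* Hypothetical monitor sketch: recompute the true residual r = b - A x_k
   at every iteration and report a NaN norm as soon as it appears.       */
static PetscErrorCode MonitorTrueResNaN(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
{
  Vec            r;
  PetscReal      truenorm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPBuildResidual(ksp, NULL, NULL, &r);CHKERRQ(ierr); /* r = b - A x_k */
  ierr = VecNorm(r, NORM_2, &truenorm);CHKERRQ(ierr);
  ierr = VecDestroy(&r);CHKERRQ(ierr);
  ierr = PetscPrintf(PetscObjectComm((PetscObject)ksp),
                     "%D KSP resid norm %g true resid norm %g\n",
                     it, (double)rnorm, (double)truenorm);CHKERRQ(ierr);
  if (PetscIsNanReal(truenorm)) {
    ierr = PetscPrintf(PetscObjectComm((PetscObject)ksp),
                       "  true residual is NaN at iteration %D\n", it);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

It would be attached before KSPSolve() with KSPMonitorSet(ksp, MonitorTrueResNaN, NULL, NULL); the stock -ksp_monitor_true_residual option prints the "true resid norm" lines seen in this log.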
[... rank [3]'s trace ends and rank [0] resumes; between the KSP monitor lines below, rank [0] emits the same repeating begin/end event pattern as above, trimmed here so the monitor output stays readable ...]
 33 KSP unpreconditioned resid norm 1.221813418384e-02 true resid norm -nan ||r(i)||/||b|| -nan
 34 KSP unpreconditioned resid norm 1.170503502672e-02 true resid norm -nan ||r(i)||/||b|| -nan
 35 KSP unpreconditioned resid norm 1.112117727571e-02 true resid norm -nan ||r(i)||/||b|| -nan
 36 KSP unpreconditioned resid norm 1.041365991480e-02 true resid norm -nan ||r(i)||/||b|| -nan
 37 KSP unpreconditioned resid norm 8.898425200683e-03 true resid norm -nan ||r(i)||/||b|| -nan
 38 KSP unpreconditioned resid norm 7.828444318032e-03 true resid norm -nan ||r(i)||/||b|| -nan
 39 KSP unpreconditioned resid norm 6.804847890303e-03 true resid norm -nan ||r(i)||/||b|| -nan
 40 KSP unpreconditioned resid norm 5.932483484656e-03 true resid norm -nan ||r(i)||/||b|| -nan
 41 KSP unpreconditioned resid norm 5.038615399194e-03 true resid norm -nan ||r(i)||/||b|| -nan
[0] 8413.96 Event begin: VecMDot [0] 8413.96 Event end:
VecMDot [0] 8413.96 Event begin: VecMAXPY [0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecNorm [0] 8413.96 Event end: VecNorm [0] 8413.96 Event begin: VecScale [0] 8413.96 Event end: VecScale [0] 8413.96 Event begin: VecSet [0] 8413.96 Event end: VecSet [0] 8413.96 Event begin: VecMAXPY [0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecCopy [0] 8413.96 Event end: VecCopy [0] 8413.96 Event begin: VecAXPY [0] 8413.96 Event end: VecAXPY [0] 8413.96 Event begin: MatMult [0] 8413.96 Event begin: VecScatterBegin [0] 8413.96 Event begin: SFPack [0] 8413.96 Event end: SFPack [0] 8413.96 Event end: VecScatterBegin [0] 8413.96 Event begin: VecScatterEnd [0] 8413.96 Event begin: SFUnpack [0] 8413.96 Event end: SFUnpack [0] 8413.96 Event end: VecScatterEnd [0] 8413.96 Event end: MatMult [0] 8413.96 Event begin: VecAYPX [0] 8413.96 Event end: VecAYPX [0] 8413.96 Event begin: VecNorm [0] 8413.96 Event end: VecNorm 42 KSP unpreconditioned resid norm 4.351876224962e-03 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.96 Event begin: PCApply [0] 8413.96 Event begin: VecSet [0] 8413.96 Event end: VecSet [0] 8413.96 Event begin: MatSolve [0] 8413.96 Event end: MatSolve [0] 8413.96 Event end: PCApply [0] 8413.96 Event begin: MatMult [0] 8413.96 Event begin: VecScatterBegin [0] 8413.96 Event begin: SFPack [0] 8413.96 Event end: SFPack [0] 8413.96 Event end: VecScatterBegin [0] 8413.96 Event begin: VecScatterEnd [0] 8413.96 Event begin: SFUnpack [0] 8413.96 Event end: SFUnpack [0] 8413.96 Event end: VecScatterEnd [0] 8413.96 Event end: MatMult [0] 8413.96 Event begin: VecMDot [0] 8413.96 Event end: VecMDot [0] 8413.96 Event begin: VecMAXPY [0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecNorm [0] 8413.96 Event end: VecNorm [0] 8413.96 Event begin: VecScale [0] 8413.96 Event end: VecScale [0] 8413.96 Event begin: VecSet [0] 8413.96 Event end: VecSet [0] 8413.96 Event begin: VecMAXPY [0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecCopy [0] 8413.96 Event end: VecCopy [0] 8413.96 Event begin: VecAXPY [0] 8413.96 Event end: VecAXPY [0] 8413.96 Event begin: MatMult [0] 8413.96 Event begin: VecScatterBegin [0] 8413.96 Event begin: SFPack [0] 8413.96 Event end: SFPack [0] 8413.96 Event end: VecScatterBegin [0] 8413.96 Event begin: VecScatterEnd [0] 8413.96 Event begin: SFUnpack [0] 8413.96 Event end: SFUnpack [0] 8413.96 Event end: VecScatterEnd [0] 8413.96 Event end: MatMult [0] 8413.96 Event begin: VecAYPX [0] 8413.96 Event end: VecAYPX [0] 8413.96 Event begin: VecNorm [0] 8413.96 Event end: VecNorm 43 KSP unpreconditioned resid norm 3.340922526921e-03 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.96 Event begin: PCApply [0] 8413.96 Event begin: VecSet [0] 8413.96 Event end: VecSet [0] 8413.96 Event begin: MatSolve [0] 8413.96 Event end: MatSolve [0] 8413.96 Event end: PCApply [0] 8413.96 Event begin: MatMult [0] 8413.96 Event begin: VecScatterBegin [0] 8413.96 Event begin: SFPack [0] 8413.96 Event end: SFPack [0] 8413.96 Event end: VecScatterBegin [0] 8413.96 Event begin: VecScatterEnd [0] 8413.96 Event begin: SFUnpack [0] 8413.96 Event end: SFUnpack [0] 8413.96 Event end: VecScatterEnd [0] 8413.96 Event end: MatMult [0] 8413.96 Event begin: VecMDot [0] 8413.96 Event end: VecMDot [0] 8413.96 Event begin: VecMAXPY [0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecNorm [0] 8413.96 Event end: VecNorm [0] 8413.96 Event begin: VecScale [0] 8413.96 Event end: VecScale [0] 8413.96 Event begin: VecSet [0] 8413.96 Event end: VecSet [0] 8413.96 Event begin: VecMAXPY 
[0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecCopy [0] 8413.96 Event end: VecCopy [0] 8413.96 Event begin: VecAXPY [0] 8413.96 Event end: VecAXPY [0] 8413.96 Event begin: MatMult [0] 8413.96 Event begin: VecScatterBegin [0] 8413.96 Event begin: SFPack [0] 8413.96 Event end: SFPack [0] 8413.96 Event end: VecScatterBegin [0] 8413.96 Event begin: VecScatterEnd [0] 8413.96 Event begin: SFUnpack [0] 8413.96 Event end: SFUnpack [0] 8413.96 Event end: VecScatterEnd [0] 8413.96 Event end: MatMult [0] 8413.96 Event begin: VecAYPX [0] 8413.96 Event end: VecAYPX [0] 8413.96 Event begin: VecNorm [0] 8413.96 Event end: VecNorm 44 KSP unpreconditioned resid norm 2.489477739180e-03 true resid norm -nan ||r(i)||/||b|| -nan [0] 8413.96 Event begin: PCApply [0] 8413.96 Event begin: VecSet [0] 8413.96 Event end: VecSet [0] 8413.96 Event begin: MatSolve [0] 8413.96 Event end: MatSolve [0] 8413.96 Event end: PCApply [0] 8413.96 Event begin: MatMult [0] 8413.96 Event begin: VecScatterBegin [0] 8413.96 Event begin: SFPack [0] 8413.96 Event end: SFPack [0] 8413.96 Event end: VecScatterBegin [0] 8413.96 Event begin: VecScatterEnd [0] 8413.96 Event begin: SFUnpack [0] 8413.96 Event end: SFUnpack [0] 8413.96 Event end: VecScatterEnd [0] 8413.96 Event end: MatMult [0] 8413.96 Event begin: VecMDot [0] 8413.96 Event end: VecMDot [0] 8413.96 Event begin: VecMAXPY [0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecNorm [0] 8413.96 Event end: VecNorm [0] 8413.96 Event begin: VecScale [0] 8413.96 Event end: VecScale [0] 8413.96 Event begin: VecSet [0] 8413.96 Event end: VecSet [0] 8413.96 Event begin: VecMAXPY [0] 8413.96 Event end: VecMAXPY [0] 8413.96 Event begin: VecCopy [0] 8413.96 Event end: VecCopy [0] 8413.96 Event begin: VecAXPY [0] 8413.97 Event end: VecAXPY [0] 8413.97 Event begin: MatMult [0] 8413.97 Event begin: VecScatterBegin [0] 8413.97 Event begin: SFPack [0] 8413.97 Event end: SFPack [0] 8413.97 Event end: VecScatterBegin [0] 8413.97 Event begin: VecScatterEnd [0] 8413.97 Event begin: SFUnpack [0] 8413.97 Event end: SFUnpack [0] 8413.97 Event end: VecScatterEnd [0] 8413.97 Event end: MatMult [0] 8413.97 Event begin: VecAYPX [0] 8413.97 Event end: VecAYPX [0] 8413.97 Event begin: VecNorm [0] 8414.03 Event end: VecNorm 45 KSP unpreconditioned resid norm 1.982019091986e-03 true resid norm -nan ||r(i)||/||b|| -nan [0] 8414.03 Event begin: PCApply [0] 8414.03 Event begin: VecSet [0] 8414.03 Event end: VecSet [0] 8414.03 Event begin: MatSolve [0] 8414.03 Event end: MatSolve [0] 8414.03 Event end: PCApply [0] 8414.03 Event begin: MatMult [0] 8414.03 Event begin: VecScatterBegin [0] 8414.03 Event begin: SFPack [0] 8414.03 Event end: SFPack [0] 8414.03 Event end: VecScatterBegin [0] 8414.03 Event begin: VecScatterEnd [0] 8414.03 Event begin: SFUnpack [0] 8414.03 Event end: SFUnpack [0] 8414.03 Event end: VecScatterEnd [0] 8414.03 Event end: MatMult [0] 8414.03 Event begin: VecMDot [0] 8414.03 Event end: VecMDot [0] 8414.03 Event begin: VecMAXPY [0] 8414.03 Event end: VecMAXPY [0] 8414.03 Event begin: VecNorm [0] 8414.03 Event end: VecNorm [0] 8414.03 Event begin: VecScale [0] 8414.03 Event end: VecScale [0] 8414.03 Event begin: VecSet [0] 8414.03 Event end: VecSet [0] 8414.03 Event begin: VecMAXPY [0] 8414.03 Event end: VecMAXPY [0] 8414.03 Event begin: VecCopy [0] 8414.03 Event end: VecCopy [0] 8414.03 Event begin: VecAXPY [0] 8414.03 Event end: VecAXPY [0] 8414.03 Event begin: MatMult [0] 8414.03 Event begin: VecScatterBegin [0] 8414.03 Event begin: SFPack [0] 8414.03 Event end: SFPack [0] 
8414.03 Event end: VecScatterBegin [0] 8414.03 Event begin: VecScatterEnd [0] 8414.03 Event begin: SFUnpack [0] 8414.03 Event end: SFUnpack [0] 8414.03 Event end: VecScatterEnd [0] 8414.03 Event end: MatMult [0] 8414.03 Event begin: VecAYPX [0] 8414.03 Event end: VecAYPX [0] 8414.03 Event begin: VecNorm [0] 8414.03 Event end: VecNorm 46 KSP unpreconditioned resid norm 1.542792020160e-03 true resid norm -nan ||r(i)||/||b|| -nan [0] 8414.03 Event begin: PCApply [0] 8414.03 Event begin: VecSet [0] 8414.03 Event end: VecSet [0] 8414.03 Event begin: MatSolve [0] 8414.03 Event end: MatSolve [0] 8414.03 Event end: PCApply [0] 8414.03 Event begin: MatMult [0] 8414.03 Event begin: VecScatterBegin [0] 8414.03 Event begin: SFPack [0] 8414.03 Event end: SFPack [0] 8414.03 Event end: VecScatterBegin [0] 8414.03 Event begin: VecScatterEnd [0] 8414.03 Event begin: SFUnpack [0] 8414.03 Event end: SFUnpack [0] 8414.03 Event end: VecScatterEnd [0] 8414.03 Event end: MatMult [0] 8414.03 Event begin: VecMDot [0] 8414.03 Event end: VecMDot [0] 8414.03 Event begin: VecMAXPY [0] 8414.03 Event end: VecMAXPY [0] 8414.03 Event begin: VecNorm [0] 8414.03 Event end: VecNorm [0] 8414.03 Event begin: VecScale [0] 8414.03 Event end: VecScale [0] 8414.03 Event begin: VecSet [0] 8414.03 Event end: VecSet [0] 8414.03 Event begin: VecMAXPY [0] 8414.03 Event end: VecMAXPY [0] 8414.03 Event begin: VecCopy [0] 8414.03 Event end: VecCopy [0] 8414.03 Event begin: VecAXPY [0] 8414.03 Event end: VecAXPY [0] 8414.03 Event begin: MatMult [0] 8414.03 Event begin: VecScatterBegin [0] 8414.03 Event begin: SFPack [0] 8414.03 Event end: SFPack [0] 8414.03 Event end: VecScatterBegin [0] 8414.03 Event begin: VecScatterEnd [0] 8414.03 Event begin: SFUnpack [0] 8414.03 Event end: SFUnpack [0] 8414.03 Event end: VecScatterEnd [0] 8414.03 Event end: MatMult [0] 8414.03 Event begin: VecAYPX [0] 8414.03 Event end: VecAYPX [0] 8414.03 Event begin: VecNorm [0] 8414.03 Event end: VecNorm 47 KSP unpreconditioned resid norm 1.040829825286e-03 true resid norm -nan ||r(i)||/||b|| -nan [0] 8414.03 Event begin: PCApply [0] 8414.03 Event begin: VecSet [0] 8414.03 Event end: VecSet [0] 8414.03 Event begin: MatSolve [0] 8414.03 Event end: MatSolve [0] 8414.03 Event end: PCApply [0] 8414.03 Event begin: MatMult [0] 8414.03 Event begin: VecScatterBegin [0] 8414.03 Event begin: SFPack [0] 8414.03 Event end: SFPack [0] 8414.03 Event end: VecScatterBegin [0] 8414.03 Event begin: VecScatterEnd [0] 8414.03 Event begin: SFUnpack [0] 8414.03 Event end: SFUnpack [0] 8414.03 Event end: VecScatterEnd [0] 8414.03 Event end: MatMult [0] 8414.03 Event begin: VecMDot [0] 8414.03 Event end: VecMDot [0] 8414.03 Event begin: VecMAXPY [0] 8414.03 Event end: VecMAXPY [0] 8414.03 Event begin: VecNorm [0] 8414.03 Event end: VecNorm [0] 8414.03 Event begin: VecScale [0] 8414.03 Event end: VecScale [0] 8414.03 Event begin: VecSet [0] 8414.03 Event end: VecSet [0] 8414.03 Event begin: VecMAXPY [0] 8414.03 Event end: VecMAXPY [0] 8414.03 Event begin: VecCopy [0] 8414.03 Event end: VecCopy [0] 8414.03 Event begin: VecAXPY [0] 8414.03 Event end: VecAXPY [0] 8414.03 Event begin: MatMult [0] 8414.03 Event begin: VecScatterBegin [0] 8414.03 Event begin: SFPack [0] 8414.03 Event end: SFPack [0] 8414.03 Event end: VecScatterBegin [0] 8414.03 Event begin: VecScatterEnd [0] 8414.03 Event begin: SFUnpack [0] 8414.03 Event end: SFUnpack [0] 8414.03 Event end: VecScatterEnd [0] 8414.03 Event end: MatMult [0] 8414.03 Event begin: VecAYPX [0] 8414.03 Event end: VecAYPX [0] 8414.03 Event begin: 
VecNorm [0] 8414.03 Event end: VecNorm 48 KSP unpreconditioned resid norm 7.073558685429e-04 true resid norm -nan ||r(i)||/||b|| -nan [0] 8414.03 Event begin: PCApply [0] 8414.03 Event begin: VecSet [0] 8414.03 Event end: VecSet [0] 8414.03 Event begin: MatSolve [0] 8414.03 Event end: MatSolve [0] 8414.03 Event end: PCApply [0] 8414.03 Event begin: MatMult [0] 8414.03 Event begin: VecScatterBegin [0] 8414.03 Event begin: SFPack [0] 8414.03 Event end: SFPack [0] 8414.03 Event end: VecScatterBegin [0] 8414.03 Event begin: VecScatterEnd [0] 8414.03 Event begin: SFUnpack [0] 8414.03 Event end: SFUnpack [0] 8414.03 Event end: VecScatterEnd [0] 8414.03 Event end: MatMult [1] 8413.97 Event end: VecAYPX [1] 8414.03 Event begin: VecNorm [1] 8414.03 Event end: VecNorm [1] 8414.03 Event begin: PCApply [1] 8414.03 Event begin: VecSet [1] 8414.03 Event end: VecSet [1] 8414.03 Event begin: MatSolve [1] 8414.03 Event end: MatSolve [1] 8414.03 Event end: PCApply [1] 8414.03 Event begin: MatMult [1] 8414.03 Event begin: VecScatterBegin [1] 8414.03 Event begin: SFPack [1] 8414.03 Event end: SFPack [1] 8414.03 Event end: VecScatterBegin [1] 8414.03 Event begin: VecScatterEnd [1] 8414.03 Event begin: SFUnpack [1] 8414.03 Event end: SFUnpack [1] 8414.03 Event end: VecScatterEnd [1] 8414.03 Event end: MatMult [1] 8414.03 Event begin: VecMDot [1] 8414.03 Event end: VecMDot [1] 8414.03 Event begin: VecMAXPY [1] 8414.03 Event end: VecMAXPY [1] 8414.03 Event begin: VecNorm [1] 8414.03 Event end: VecNorm [1] 8414.03 Event begin: VecScale [1] 8414.03 Event end: VecScale [1] 8414.03 Event begin: VecSet [1] 8414.03 Event end: VecSet [1] 8414.03 Event begin: VecMAXPY [1] 8414.03 Event end: VecMAXPY [1] 8414.03 Event begin: VecCopy [1] 8414.03 Event end: VecCopy [1] 8414.03 Event begin: VecAXPY [1] 8414.03 Event end: VecAXPY [1] 8414.03 Event begin: MatMult [1] 8414.03 Event begin: VecScatterBegin [1] 8414.03 Event begin: SFPack [1] 8414.03 Event end: SFPack [1] 8414.03 Event end: VecScatterBegin [1] 8414.03 Event begin: VecScatterEnd [1] 8414.03 Event begin: SFUnpack [1] 8414.03 Event end: SFUnpack [1] 8414.03 Event end: VecScatterEnd [1] 8414.03 Event end: MatMult [1] 8414.03 Event begin: VecAYPX [1] 8414.03 Event end: VecAYPX [1] 8414.03 Event begin: VecNorm [1] 8414.03 Event end: VecNorm [1] 8414.03 Event begin: PCApply [1] 8414.03 Event begin: VecSet [1] 8414.03 Event end: VecSet [1] 8414.03 Event begin: MatSolve [1] 8414.03 Event end: MatSolve [1] 8414.03 Event end: PCApply [1] 8414.03 Event begin: MatMult [1] 8414.03 Event begin: VecScatterBegin [1] 8414.03 Event begin: SFPack [1] 8414.03 Event end: SFPack [1] 8414.03 Event end: VecScatterBegin [1] 8414.03 Event begin: VecScatterEnd [1] 8414.03 Event begin: SFUnpack [1] 8414.03 Event end: SFUnpack [1] 8414.03 Event end: VecScatterEnd [1] 8414.03 Event end: MatMult [1] 8414.03 Event begin: VecMDot [1] 8414.03 Event end: VecMDot [1] 8414.03 Event begin: VecMAXPY [1] 8414.03 Event end: VecMAXPY [1] 8414.03 Event begin: VecNorm [1] 8414.03 Event end: VecNorm [1] 8414.03 Event begin: VecScale [1] 8414.03 Event end: VecScale [1] 8414.03 Event begin: VecSet [1] 8414.03 Event end: VecSet [1] 8414.03 Event begin: VecMAXPY [1] 8414.03 Event end: VecMAXPY [1] 8414.03 Event begin: VecCopy [1] 8414.03 Event end: VecCopy [1] 8414.03 Event begin: VecAXPY [1] 8414.03 Event end: VecAXPY [1] 8414.03 Event begin: MatMult [1] 8414.03 Event begin: VecScatterBegin [1] 8414.03 Event begin: SFPack [1] 8414.03 Event end: SFPack [1] 8414.03 Event end: VecScatterBegin [1] 8414.03 Event 
begin: VecScatterEnd [1] 8414.03 Event begin: SFUnpack [1] 8414.03 Event end: SFUnpack [1] 8414.03 Event end: VecScatterEnd [1] 8414.03 Event end: MatMult [1] 8414.03 Event begin: VecAYPX [1] 8414.03 Event end: VecAYPX [1] 8414.03 Event begin: VecNorm [1] 8414.03 Event end: VecNorm [1] 8414.03 Event begin: PCApply [1] 8414.03 Event begin: VecSet [1] 8414.03 Event end: VecSet [1] 8414.03 Event begin: MatSolve [1] 8414.03 Event end: MatSolve [1] 8414.03 Event end: PCApply [1] 8414.03 Event begin: MatMult [1] 8414.03 Event begin: VecScatterBegin [1] 8414.03 Event begin: SFPack [1] 8414.03 Event end: SFPack [1] 8414.03 Event end: VecScatterBegin [1] 8414.03 Event begin: VecScatterEnd [1] 8414.03 Event begin: SFUnpack [1] 8414.03 Event end: SFUnpack [1] 8414.03 Event end: VecScatterEnd [1] 8414.03 Event end: MatMult [1] 8414.03 Event begin: VecMDot [1] 8414.03 Event end: VecMDot [1] 8414.03 Event begin: VecMAXPY [1] 8414.03 Event end: VecMAXPY [1] 8414.03 Event begin: VecNorm [1] 8414.03 Event end: VecNorm [1] 8414.03 Event begin: VecScale [1] 8414.03 Event end: VecScale [1] 8414.03 Event begin: VecSet [1] 8414.03 Event end: VecSet [1] 8414.03 Event begin: VecMAXPY [1] 8414.03 Event end: VecMAXPY [1] 8414.03 Event begin: VecCopy [1] 8414.03 Event end: VecCopy [1] 8414.03 Event begin: VecAXPY [1] 8414.03 Event end: VecAXPY [1] 8414.03 Event begin: MatMult [1] 8414.03 Event begin: VecScatterBegin [1] 8414.03 Event begin: SFPack [1] 8414.03 Event end: SFPack [1] 8414.03 Event end: VecScatterBegin [1] 8414.03 Event begin: VecScatterEnd [1] 8414.03 Event begin: SFUnpack [1] 8414.03 Event end: SFUnpack [1] 8414.03 Event end: VecScatterEnd [1] 8414.03 Event end: MatMult [1] 8414.03 Event begin: VecAYPX [1] 8414.03 Event end: VecAYPX [1] 8414.03 Event begin: VecNorm [1] 8414.03 Event end: VecNorm [1] 8414.03 Event begin: PCApply [1] 8414.03 Event begin: VecSet [1] 8414.03 Event end: VecSet [1] 8414.03 Event begin: MatSolve [1] 8414.03 Event end: MatSolve [1] 8414.03 Event end: PCApply [1] 8414.03 Event begin: MatMult [1] 8414.03 Event begin: VecScatterBegin [1] 8414.03 Event begin: SFPack [1] 8414.03 Event end: SFPack [1] 8414.03 Event end: VecScatterBegin [1] 8414.03 Event begin: VecScatterEnd [1] 8414.03 Event begin: SFUnpack [1] 8414.03 Event end: SFUnpack [1] 8414.03 Event end: VecScatterEnd [1] 8414.03 Event end: MatMult [1] 8414.03 Event begin: VecMDot [1] 8414.07 Event end: VecMDot [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: VecNorm [1] 8414.07 Event end: VecNorm [1] 8414.07 Event begin: VecScale [1] 8414.07 Event end: VecScale [1] 8414.07 Event begin: VecSet [1] 8414.07 Event end: VecSet [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: VecCopy [1] 8414.07 Event end: VecCopy [1] 8414.07 Event begin: VecAXPY [1] 8414.07 Event end: VecAXPY [1] 8414.07 Event begin: MatMult [1] 8414.07 Event begin: VecScatterBegin [1] 8414.07 Event begin: SFPack [1] 8414.07 Event end: SFPack [1] 8414.07 Event end: VecScatterBegin [1] 8414.07 Event begin: VecScatterEnd [1] 8414.07 Event begin: SFUnpack [1] 8414.07 Event end: SFUnpack [1] 8414.07 Event end: VecScatterEnd [1] 8414.07 Event end: MatMult [1] 8414.07 Event begin: VecAYPX [1] 8414.07 Event end: VecAYPX [1] 8414.07 Event begin: VecNorm [1] 8414.07 Event end: VecNorm [1] 8414.07 Event begin: PCApply [1] 8414.07 Event begin: VecSet [1] 8414.07 Event end: VecSet [1] 8414.07 Event begin: MatSolve [1] 8414.07 Event end: MatSolve [1] 8414.07 Event end: PCApply [1] 8414.07 
Event begin: MatMult [1] 8414.07 Event begin: VecScatterBegin [1] 8414.07 Event begin: SFPack [1] 8414.07 Event end: SFPack [1] 8414.07 Event end: VecScatterBegin [1] 8414.07 Event begin: VecScatterEnd [1] 8414.07 Event begin: SFUnpack [1] 8414.07 Event end: SFUnpack [1] 8414.07 Event end: VecScatterEnd [1] 8414.07 Event end: MatMult [1] 8414.07 Event begin: VecMDot [1] 8414.07 Event end: VecMDot [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: VecNorm [1] 8414.07 Event end: VecNorm [1] 8414.07 Event begin: VecScale [1] 8414.07 Event end: VecScale [1] 8414.07 Event begin: VecSet [1] 8414.07 Event end: VecSet [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: VecCopy [1] 8414.07 Event end: VecCopy [1] 8414.07 Event begin: VecAXPY [1] 8414.07 Event end: VecAXPY [1] 8414.07 Event begin: MatMult [1] 8414.07 Event begin: VecScatterBegin [1] 8414.07 Event begin: SFPack [1] 8414.07 Event end: SFPack [1] 8414.07 Event end: VecScatterBegin [1] 8414.07 Event begin: VecScatterEnd [1] 8414.07 Event begin: SFUnpack [1] 8414.07 Event end: SFUnpack [1] 8414.07 Event end: VecScatterEnd [1] 8414.07 Event end: MatMult [1] 8414.07 Event begin: VecAYPX [1] 8414.07 Event end: VecAYPX [1] 8414.07 Event begin: VecNorm [1] 8414.07 Event end: VecNorm [1] 8414.07 Event begin: PCApply [1] 8414.07 Event begin: VecSet [1] 8414.07 Event end: VecSet [1] 8414.07 Event begin: MatSolve [1] 8414.07 Event end: MatSolve [1] 8414.07 Event end: PCApply [1] 8414.07 Event begin: MatMult [1] 8414.07 Event begin: VecScatterBegin [1] 8414.07 Event begin: SFPack [1] 8414.07 Event end: SFPack [1] 8414.07 Event end: VecScatterBegin [1] 8414.07 Event begin: VecScatterEnd [1] 8414.07 Event begin: SFUnpack [1] 8414.07 Event end: SFUnpack [1] 8414.07 Event end: VecScatterEnd [1] 8414.07 Event end: MatMult [1] 8414.07 Event begin: VecMDot [1] 8414.07 Event end: VecMDot [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: VecNorm [1] 8414.07 Event end: VecNorm [1] 8414.07 Event begin: VecScale [1] 8414.07 Event end: VecScale [1] 8414.07 Event begin: VecSet [1] 8414.07 Event end: VecSet [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: VecCopy [1] 8414.07 Event end: VecCopy [1] 8414.07 Event begin: VecAXPY [1] 8414.07 Event end: VecAXPY [1] 8414.07 Event begin: MatMult [1] 8414.07 Event begin: VecScatterBegin [1] 8414.07 Event begin: SFPack [1] 8414.07 Event end: SFPack [1] 8414.07 Event end: VecScatterBegin [1] 8414.07 Event begin: VecScatterEnd [1] 8414.07 Event begin: SFUnpack [1] 8414.07 Event end: SFUnpack [1] 8414.07 Event end: VecScatterEnd [1] 8414.07 Event end: MatMult [1] 8414.07 Event begin: VecAYPX [1] 8414.07 Event end: VecAYPX [1] 8414.07 Event begin: VecNorm [1] 8414.07 Event end: VecNorm [1] 8414.07 Event begin: PCApply [1] 8414.07 Event begin: VecSet [1] 8414.07 Event end: VecSet [1] 8414.07 Event begin: MatSolve [1] 8414.07 Event end: MatSolve [1] 8414.07 Event end: PCApply [1] 8414.07 Event begin: MatMult [1] 8414.07 Event begin: VecScatterBegin [1] 8414.07 Event begin: SFPack [1] 8414.07 Event end: SFPack [1] 8414.07 Event end: VecScatterBegin [1] 8414.07 Event begin: VecScatterEnd [1] 8414.07 Event begin: SFUnpack [1] 8414.07 Event end: SFUnpack [1] 8414.07 Event end: VecScatterEnd [1] 8414.07 Event end: MatMult [1] 8414.07 Event begin: VecMDot [1] 8414.07 Event end: VecMDot [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: 
VecNorm [1] 8414.07 Event end: VecNorm [1] 8414.07 Event begin: VecScale [1] 8414.07 Event end: VecScale [1] 8414.07 Event begin: VecSet [1] 8414.07 Event end: VecSet [1] 8414.07 Event begin: VecMAXPY [1] 8414.07 Event end: VecMAXPY [1] 8414.07 Event begin: VecCopy [1] 8414.07 Event end: VecCopy [1] 8414.07 Event begin: VecAXPY [1] 8414.07 Event end: VecAXPY [1] 8414.07 Event begin: MatMult [1] 8414.07 Event begin: VecScatterBegin [1] 8414.07 Event begin: SFPack [1] 8414.08 Event end: SFPack [1] 8414.08 Event end: VecScatterBegin [1] 8414.08 Event begin: VecScatterEnd [1] 8414.08 Event begin: SFUnpack [1] 8414.08 Event end: SFUnpack [1] 8414.08 Event end: VecScatterEnd [1] 8414.08 Event end: MatMult [1] 8414.08 Event begin: VecAYPX [1] 8414.08 Event end: VecAYPX [1] 8414.08 Event begin: VecNorm [1] 8414.08 Event end: VecNorm [1] 8414.08 Event begin: PCApply [1] 8414.08 Event begin: VecSet [1] 8414.08 Event end: VecSet [1] 8414.08 Event begin: MatSolve [1] 8414.08 Event end: MatSolve [1] 8414.08 Event end: PCApply [1] 8414.08 Event begin: MatMult [1] 8414.08 Event begin: VecScatterBegin [1] 8414.08 Event begin: SFPack [1] 8414.08 Event end: SFPack [1] 8414.08 Event end: VecScatterBegin [1] 8414.08 Event begin: VecScatterEnd [1] 8414.08 Event begin: SFUnpack [1] 8414.08 Event end: SFUnpack [1] 8414.08 Event end: VecScatterEnd [1] 8414.08 Event end: MatMult [1] 8414.08 Event begin: VecMDot [1] 8414.08 Event end: VecMDot [1] 8414.08 Event begin: VecMAXPY [1] 8414.08 Event end: VecMAXPY [1] 8414.08 Event begin: VecNorm [1] 8414.08 Event end: VecNorm [1] 8414.08 Event begin: VecScale [1] 8414.08 Event end: VecScale [1] 8414.08 Event begin: VecSet [1] 8414.08 Event end: VecSet [1] 8414.08 Event begin: VecMAXPY [1] 8414.08 Event end: VecMAXPY [1] 8414.08 Event begin: VecCopy [1] 8414.08 Event end: VecCopy [1] 8414.08 Event begin: VecAXPY [1] 8414.08 Event end: VecAXPY [1] 8414.08 Event begin: MatMult [1] 8414.08 Event begin: VecScatterBegin [1] 8414.08 Event begin: SFPack [1] 8414.08 Event end: SFPack [1] 8414.08 Event end: VecScatterBegin [1] 8414.08 Event begin: VecScatterEnd [1] 8414.08 Event begin: SFUnpack [1] 8414.08 Event end: SFUnpack [1] 8414.08 Event end: VecScatterEnd [1] 8414.08 Event end: MatMult [1] 8414.08 Event begin: VecAYPX [1] 8414.08 Event end: VecAYPX [1] 8414.08 Event begin: VecNorm [1] 8414.08 Event end: VecNorm [1] 8414.08 Event begin: VecSet [1] 8414.08 Event end: VecSet [1] 8414.08 Event begin: VecMAXPY [1] 8414.08 Event end: VecMAXPY [1] 8414.08 Event begin: VecAXPY [1] 8414.08 Event end: VecAXPY [1] 8414.08 Event end: KSPSolve [1] 8414.08 Event begin: MatView [1] 8414.08 Event end: MatView [1] 8414.08 Event begin: VecSet [1] 8414.08 Event end: VecSet [1] 8414.08 Event begin: VecScatterBegin [1] 8414.08 Event begin: SFSetUp [1] 8414.08 Event end: SFSetUp [1] 8414.08 Event begin: SFPack [1] 8414.08 Event end: SFPack [1] 8414.08 Event end: VecScatterBegin [1] 8414.08 Event begin: VecScatterEnd [1] 8414.08 Event begin: SFUnpack [1] 8414.08 Event end: SFUnpack [1] 8414.08 Event end: VecScatterEnd [1] 8414.08 Event begin: MatZeroEntries [1] 8414.08 Event end: MatZeroEntries [1] 8414.08 Event begin: VecSet [1] 8414.08 Event end: VecSet [1] 8414.09 Event begin: MatAssemblyBegin [1] 8414.09 Event begin: BuildTwoSidedF [1] 8414.09 Event begin: BuildTwoSided [1] 8414.1 Event end: BuildTwoSided [1] 8414.1 Event end: BuildTwoSidedF [1] 8414.1 Event end: MatAssemblyBegin [1] 8414.1 Event begin: MatAssemblyEnd [1] 8414.1 Event end: MatAssemblyEnd [1] 8414.1 Event begin: 
VecAssemblyBegin [1] 8414.1 Event begin: BuildTwoSidedF [1] 8414.1 Event begin: BuildTwoSided [1] 8414.1 Event end: BuildTwoSided [1] 8414.1 Event end: BuildTwoSidedF [1] 8414.1 Event end: VecAssemblyBegin [1] 8414.1 Event begin: VecAssemblyEnd [1] 8414.1 Event end: VecAssemblyEnd [1] 8414.1 Event begin: MatView [1] 8414.1 Event end: MatView [1] 8414.1 Event begin: SFSetGraph [1] 8414.1 Event end: SFSetGraph [1] 8414.1 Event begin: SFSetUp [1] 8414.1 Event begin: BuildTwoSided [1] 8414.1 Event end: BuildTwoSided [1] 8414.1 Event end: SFSetUp [1] 8414.1 Event begin: SFReduceBegin [1] 8414.1 Event begin: SFPack [1] 8414.1 Event end: SFPack [1] 8414.1 Event end: SFReduceBegin [1] 8414.1 Event begin: SFReduceEnd [1] 8414.1 Event begin: SFUnpack [1] 8414.1 Event end: SFUnpack [1] 8414.1 Event end: SFReduceEnd [1] 8414.1 Event begin: VecSet [1] 8414.1 Event end: VecSet [1] 8414.1 Event begin: VecScatterBegin [1] 8414.1 Event begin: SFPack [1] 8414.1 Event end: SFPack [1] 8414.1 Event end: VecScatterBegin [1] 8414.1 Event begin: VecScatterEnd [1] 8414.1 Event begin: SFUnpack [1] 8414.1 Event end: SFUnpack [1] 8414.1 Event end: VecScatterEnd [1] 8414.1 Event begin: VecScatterBegin [1] 8414.1 Event begin: SFPack [1] 8414.1 Event end: SFPack [1] 8414.1 Event end: VecScatterBegin [1] 8414.1 Event begin: VecScatterEnd [1] 8414.1 Event begin: SFUnpack [1] 8414.1 Event end: SFUnpack [1] 8414.1 Event end: VecScatterEnd [1] 8414.1 Event begin: KSPSetUp [1] 8414.1 Event end: KSPSetUp [1] 8414.1 Event begin: PCSetUp [1] 8414.1 Event end: PCSetUp [1] 8414.1 Event begin: VecNorm [1] 8414.1 Event end: VecNorm [1] 8414.1 Event begin: PCSetUpOnBlocks [1] 8414.1 Event begin: KSPSetUp [1] 8414.1 Event end: KSPSetUp [1] 8414.1 Event begin: PCSetUp [1] 8414.1 Event begin: MatGetOrdering [1] 8414.1 Event begin: MatGetRowIJ [1] 8414.1 Event end: MatGetRowIJ [1] 8414.1 Event end: MatGetOrdering [1] 8414.1 Event begin: MatLUFactorSym [1] 8414.1 Event end: MatLUFactorSym [1] 8414.1 Event begin: MatLUFactorNum [1] 8414.11 Event end: MatLUFactorNum [1] 8414.11 Event end: PCSetUp [1] 8414.11 Event end: PCSetUpOnBlocks [1] 8414.11 Event begin: KSPSolve [1] 8414.11 Event begin: VecSet [1] 8414.11 Event end: VecSet [1] 8414.11 Event begin: VecCopy [1] 8414.11 Event end: VecCopy [1] 8414.11 Event begin: VecNorm [1] 8414.12 Event end: VecNorm [1] 8414.12 Event end: KSPSolve [1] 8414.12 Event begin: MatView [1] 8414.12 Event end: MatView [1] 8414.12 Event begin: KSPSetUp [1] 8414.12 Event end: KSPSetUp [1] 8414.12 Event begin: PCSetUp [2] 8414.03 Event begin: PCApply [2] 8414.03 Event begin: VecSet [2] 8414.03 Event end: VecSet [2] 8414.03 Event begin: MatSolve [2] 8414.03 Event end: MatSolve [2] 8414.03 Event end: PCApply [2] 8414.03 Event begin: MatMult [2] 8414.03 Event begin: VecScatterBegin [2] 8414.03 Event begin: SFPack [2] 8414.03 Event end: SFPack [2] 8414.03 Event end: VecScatterBegin [2] 8414.03 Event begin: VecScatterEnd [2] 8414.03 Event begin: SFUnpack [2] 8414.03 Event end: SFUnpack [2] 8414.03 Event end: VecScatterEnd [2] 8414.03 Event end: MatMult [2] 8414.03 Event begin: VecMDot [2] 8414.03 Event end: VecMDot [2] 8414.03 Event begin: VecMAXPY [2] 8414.03 Event end: VecMAXPY [2] 8414.03 Event begin: VecNorm [2] 8414.03 Event end: VecNorm [2] 8414.03 Event begin: VecScale [2] 8414.03 Event end: VecScale [2] 8414.03 Event begin: VecSet [2] 8414.03 Event end: VecSet [2] 8414.03 Event begin: VecMAXPY [2] 8414.03 Event end: VecMAXPY [2] 8414.03 Event begin: VecCopy [2] 8414.03 Event end: VecCopy [2] 8414.03 Event 
begin: VecAXPY [2] 8414.03 Event end: VecAXPY [2] 8414.03 Event begin: MatMult [2] 8414.03 Event begin: VecScatterBegin [2] 8414.03 Event begin: SFPack [2] 8414.03 Event end: SFPack [2] 8414.03 Event end: VecScatterBegin [2] 8414.03 Event begin: VecScatterEnd [2] 8414.03 Event begin: SFUnpack [2] 8414.03 Event end: SFUnpack [2] 8414.03 Event end: VecScatterEnd [2] 8414.03 Event end: MatMult [2] 8414.03 Event begin: VecAYPX [2] 8414.03 Event end: VecAYPX [2] 8414.03 Event begin: VecNorm [2] 8414.03 Event end: VecNorm [2] 8414.03 Event begin: PCApply [2] 8414.03 Event begin: VecSet [2] 8414.03 Event end: VecSet [2] 8414.03 Event begin: MatSolve [2] 8414.03 Event end: MatSolve [2] 8414.03 Event end: PCApply [2] 8414.03 Event begin: MatMult [2] 8414.03 Event begin: VecScatterBegin [2] 8414.03 Event begin: SFPack [2] 8414.03 Event end: SFPack [2] 8414.03 Event end: VecScatterBegin [2] 8414.03 Event begin: VecScatterEnd [2] 8414.03 Event begin: SFUnpack [2] 8414.03 Event end: SFUnpack [2] 8414.03 Event end: VecScatterEnd [2] 8414.03 Event end: MatMult [2] 8414.03 Event begin: VecMDot [2] 8414.03 Event end: VecMDot [2] 8414.03 Event begin: VecMAXPY [2] 8414.03 Event end: VecMAXPY [2] 8414.03 Event begin: VecNorm [2] 8414.03 Event end: VecNorm [2] 8414.03 Event begin: VecScale [2] 8414.03 Event end: VecScale [2] 8414.03 Event begin: VecSet [2] 8414.03 Event end: VecSet [2] 8414.03 Event begin: VecMAXPY [2] 8414.03 Event end: VecMAXPY [2] 8414.03 Event begin: VecCopy [2] 8414.03 Event end: VecCopy [2] 8414.03 Event begin: VecAXPY [2] 8414.03 Event end: VecAXPY [2] 8414.03 Event begin: MatMult [2] 8414.03 Event begin: VecScatterBegin [2] 8414.03 Event begin: SFPack [2] 8414.03 Event end: SFPack [2] 8414.03 Event end: VecScatterBegin [2] 8414.03 Event begin: VecScatterEnd [2] 8414.03 Event begin: SFUnpack [2] 8414.03 Event end: SFUnpack [2] 8414.03 Event end: VecScatterEnd [2] 8414.03 Event end: MatMult [2] 8414.03 Event begin: VecAYPX [2] 8414.03 Event end: VecAYPX [2] 8414.03 Event begin: VecNorm [2] 8414.03 Event end: VecNorm [2] 8414.03 Event begin: PCApply [2] 8414.03 Event begin: VecSet [2] 8414.03 Event end: VecSet [2] 8414.03 Event begin: MatSolve [2] 8414.03 Event end: MatSolve [2] 8414.03 Event end: PCApply [2] 8414.03 Event begin: MatMult [2] 8414.03 Event begin: VecScatterBegin [2] 8414.03 Event begin: SFPack [2] 8414.03 Event end: SFPack [2] 8414.03 Event end: VecScatterBegin [2] 8414.03 Event begin: VecScatterEnd [2] 8414.03 Event begin: SFUnpack [2] 8414.03 Event end: SFUnpack [2] 8414.03 Event end: VecScatterEnd [2] 8414.03 Event end: MatMult [2] 8414.03 Event begin: VecMDot [2] 8414.07 Event end: VecMDot [2] 8414.07 Event begin: VecMAXPY [2] 8414.07 Event end: VecMAXPY [2] 8414.07 Event begin: VecNorm [2] 8414.07 Event end: VecNorm [2] 8414.07 Event begin: VecScale [2] 8414.07 Event end: VecScale [2] 8414.07 Event begin: VecSet [2] 8414.07 Event end: VecSet [2] 8414.07 Event begin: VecMAXPY [2] 8414.07 Event end: VecMAXPY [2] 8414.07 Event begin: VecCopy [2] 8414.07 Event end: VecCopy [2] 8414.07 Event begin: VecAXPY [2] 8414.07 Event end: VecAXPY [2] 8414.07 Event begin: MatMult [2] 8414.07 Event begin: VecScatterBegin [2] 8414.07 Event begin: SFPack [2] 8414.07 Event end: SFPack [2] 8414.07 Event end: VecScatterBegin [2] 8414.07 Event begin: VecScatterEnd [2] 8414.07 Event begin: SFUnpack [2] 8414.07 Event end: SFUnpack [2] 8414.07 Event end: VecScatterEnd [2] 8414.07 Event end: MatMult [2] 8414.07 Event begin: VecAYPX [2] 8414.07 Event end: VecAYPX [2] 8414.07 Event begin: VecNorm 
[2] 8414.07 Event end: VecNorm [2] 8414.07 Event begin: PCApply [2] 8414.07 Event begin: VecSet [2] 8414.07 Event end: VecSet [2] 8414.07 Event begin: MatSolve [2] 8414.07 Event end: MatSolve [2] 8414.07 Event end: PCApply [2] 8414.07 Event begin: MatMult [2] 8414.07 Event begin: VecScatterBegin [2] 8414.07 Event begin: SFPack [2] 8414.07 Event end: SFPack [2] 8414.07 Event end: VecScatterBegin [2] 8414.07 Event begin: VecScatterEnd [2] 8414.07 Event begin: SFUnpack [2] 8414.07 Event end: SFUnpack [2] 8414.07 Event end: VecScatterEnd [2] 8414.07 Event end: MatMult [2] 8414.07 Event begin: VecMDot [2] 8414.07 Event end: VecMDot [2] 8414.07 Event begin: VecMAXPY [2] 8414.07 Event end: VecMAXPY [2] 8414.07 Event begin: VecNorm [2] 8414.07 Event end: VecNorm [2] 8414.07 Event begin: VecScale [2] 8414.07 Event end: VecScale [2] 8414.07 Event begin: VecSet [2] 8414.07 Event end: VecSet [2] 8414.07 Event begin: VecMAXPY [2] 8414.07 Event end: VecMAXPY [2] 8414.07 Event begin: VecCopy [2] 8414.07 Event end: VecCopy [2] 8414.07 Event begin: VecAXPY [2] 8414.07 Event end: VecAXPY [2] 8414.07 Event begin: MatMult [2] 8414.07 Event begin: VecScatterBegin [2] 8414.07 Event begin: SFPack [2] 8414.07 Event end: SFPack [2] 8414.07 Event end: VecScatterBegin [2] 8414.07 Event begin: VecScatterEnd [2] 8414.07 Event begin: SFUnpack [2] 8414.07 Event end: SFUnpack [2] 8414.07 Event end: VecScatterEnd [2] 8414.07 Event end: MatMult [2] 8414.07 Event begin: VecAYPX [2] 8414.07 Event end: VecAYPX [2] 8414.07 Event begin: VecNorm [2] 8414.07 Event end: VecNorm [2] 8414.07 Event begin: PCApply [2] 8414.07 Event begin: VecSet [2] 8414.07 Event end: VecSet [2] 8414.07 Event begin: MatSolve [2] 8414.07 Event end: MatSolve [2] 8414.07 Event end: PCApply [2] 8414.07 Event begin: MatMult [2] 8414.07 Event begin: VecScatterBegin [2] 8414.07 Event begin: SFPack [2] 8414.07 Event end: SFPack [2] 8414.07 Event end: VecScatterBegin [2] 8414.07 Event begin: VecScatterEnd [2] 8414.07 Event begin: SFUnpack [2] 8414.07 Event end: SFUnpack [2] 8414.07 Event end: VecScatterEnd [2] 8414.07 Event end: MatMult [2] 8414.07 Event begin: VecMDot [2] 8414.07 Event end: VecMDot [2] 8414.07 Event begin: VecMAXPY [2] 8414.07 Event end: VecMAXPY [2] 8414.07 Event begin: VecNorm [2] 8414.07 Event end: VecNorm [2] 8414.07 Event begin: VecScale [2] 8414.07 Event end: VecScale [2] 8414.07 Event begin: VecSet [2] 8414.07 Event end: VecSet [2] 8414.07 Event begin: VecMAXPY [2] 8414.07 Event end: VecMAXPY [2] 8414.07 Event begin: VecCopy [2] 8414.07 Event end: VecCopy [2] 8414.07 Event begin: VecAXPY [2] 8414.07 Event end: VecAXPY [2] 8414.07 Event begin: MatMult [2] 8414.07 Event begin: VecScatterBegin [2] 8414.07 Event begin: SFPack [2] 8414.07 Event end: SFPack [2] 8414.07 Event end: VecScatterBegin [2] 8414.07 Event begin: VecScatterEnd [2] 8414.07 Event begin: SFUnpack [2] 8414.07 Event end: SFUnpack [2] 8414.07 Event end: VecScatterEnd [2] 8414.07 Event end: MatMult [2] 8414.07 Event begin: VecAYPX [2] 8414.07 Event end: VecAYPX [2] 8414.07 Event begin: VecNorm [2] 8414.07 Event end: VecNorm [2] 8414.07 Event begin: PCApply [2] 8414.07 Event begin: VecSet [2] 8414.07 Event end: VecSet [2] 8414.07 Event begin: MatSolve [2] 8414.07 Event end: MatSolve [2] 8414.07 Event end: PCApply [2] 8414.07 Event begin: MatMult [2] 8414.07 Event begin: VecScatterBegin [2] 8414.07 Event begin: SFPack [2] 8414.07 Event end: SFPack [2] 8414.07 Event end: VecScatterBegin [2] 8414.07 Event begin: VecScatterEnd [2] 8414.07 Event begin: SFUnpack [2] 8414.07 Event 
end: SFUnpack [2] 8414.07 Event end: VecScatterEnd [2] 8414.07 Event end: MatMult [2] 8414.07 Event begin: VecMDot [2] 8414.08 Event end: VecMDot [2] 8414.08 Event begin: VecMAXPY [2] 8414.08 Event end: VecMAXPY [2] 8414.08 Event begin: VecNorm [2] 8414.08 Event end: VecNorm [2] 8414.08 Event begin: VecScale [2] 8414.08 Event end: VecScale [2] 8414.08 Event begin: VecSet [2] 8414.08 Event end: VecSet [2] 8414.08 Event begin: VecMAXPY [2] 8414.08 Event end: VecMAXPY [2] 8414.08 Event begin: VecCopy [2] 8414.08 Event end: VecCopy [2] 8414.08 Event begin: VecAXPY [2] 8414.08 Event end: VecAXPY [2] 8414.08 Event begin: MatMult [2] 8414.08 Event begin: VecScatterBegin [2] 8414.08 Event begin: SFPack [2] 8414.08 Event end: SFPack [2] 8414.08 Event end: VecScatterBegin [2] 8414.08 Event begin: VecScatterEnd [2] 8414.08 Event begin: SFUnpack [2] 8414.08 Event end: SFUnpack [2] 8414.08 Event end: VecScatterEnd [2] 8414.08 Event end: MatMult [2] 8414.08 Event begin: VecAYPX [2] 8414.08 Event end: VecAYPX [2] 8414.08 Event begin: VecNorm [2] 8414.08 Event end: VecNorm [2] 8414.08 Event begin: PCApply [2] 8414.08 Event begin: VecSet [2] 8414.08 Event end: VecSet [2] 8414.08 Event begin: MatSolve [2] 8414.08 Event end: MatSolve [2] 8414.08 Event end: PCApply [2] 8414.08 Event begin: MatMult [2] 8414.08 Event begin: VecScatterBegin [2] 8414.08 Event begin: SFPack [2] 8414.08 Event end: SFPack [2] 8414.08 Event end: VecScatterBegin [2] 8414.08 Event begin: VecScatterEnd [2] 8414.08 Event begin: SFUnpack [2] 8414.08 Event end: SFUnpack [2] 8414.08 Event end: VecScatterEnd [2] 8414.08 Event end: MatMult [2] 8414.08 Event begin: VecMDot [2] 8414.08 Event end: VecMDot [2] 8414.08 Event begin: VecMAXPY [2] 8414.08 Event end: VecMAXPY [2] 8414.08 Event begin: VecNorm [2] 8414.08 Event end: VecNorm [2] 8414.08 Event begin: VecScale [2] 8414.08 Event end: VecScale [2] 8414.08 Event begin: VecSet [2] 8414.08 Event end: VecSet [2] 8414.08 Event begin: VecMAXPY [2] 8414.08 Event end: VecMAXPY [2] 8414.08 Event begin: VecCopy [2] 8414.08 Event end: VecCopy [2] 8414.08 Event begin: VecAXPY [2] 8414.08 Event end: VecAXPY [2] 8414.08 Event begin: MatMult [2] 8414.08 Event begin: VecScatterBegin [2] 8414.08 Event begin: SFPack [2] 8414.08 Event end: SFPack [2] 8414.08 Event end: VecScatterBegin [2] 8414.08 Event begin: VecScatterEnd [2] 8414.08 Event begin: SFUnpack [2] 8414.08 Event end: SFUnpack [2] 8414.08 Event end: VecScatterEnd [2] 8414.08 Event end: MatMult [2] 8414.08 Event begin: VecAYPX [2] 8414.08 Event end: VecAYPX [2] 8414.08 Event begin: VecNorm [2] 8414.08 Event end: VecNorm [2] 8414.08 Event begin: VecSet [2] 8414.08 Event end: VecSet [2] 8414.08 Event begin: VecMAXPY [2] 8414.08 Event end: VecMAXPY [2] 8414.08 Event begin: VecAXPY [2] 8414.08 Event end: VecAXPY [2] 8414.08 Event end: KSPSolve [2] 8414.08 Event begin: MatView [2] 8414.08 Event end: MatView [2] 8414.08 Event begin: VecSet [2] 8414.08 Event end: VecSet [2] 8414.08 Event begin: VecScatterBegin [2] 8414.08 Event begin: SFSetUp [2] 8414.08 Event end: SFSetUp [2] 8414.08 Event begin: SFPack [2] 8414.08 Event end: SFPack [2] 8414.08 Event end: VecScatterBegin [2] 8414.08 Event begin: VecScatterEnd [2] 8414.08 Event begin: SFUnpack [2] 8414.08 Event end: SFUnpack [2] 8414.08 Event end: VecScatterEnd [2] 8414.08 Event begin: MatZeroEntries [2] 8414.08 Event end: MatZeroEntries [2] 8414.08 Event begin: VecSet [2] 8414.08 Event end: VecSet [2] 8414.09 Event begin: MatAssemblyBegin [2] 8414.09 Event begin: BuildTwoSidedF [2] 8414.09 Event begin: 
BuildTwoSided [2] 8414.1 Event end: BuildTwoSided [2] 8414.1 Event end: BuildTwoSidedF [2] 8414.1 Event end: MatAssemblyBegin [2] 8414.1 Event begin: MatAssemblyEnd [2] 8414.1 Event end: MatAssemblyEnd [2] 8414.1 Event begin: VecAssemblyBegin [2] 8414.1 Event begin: BuildTwoSidedF [2] 8414.1 Event begin: BuildTwoSided [2] 8414.1 Event end: BuildTwoSided [2] 8414.1 Event end: BuildTwoSidedF [2] 8414.1 Event end: VecAssemblyBegin [2] 8414.1 Event begin: VecAssemblyEnd [2] 8414.1 Event end: VecAssemblyEnd [2] 8414.1 Event begin: MatView [2] 8414.1 Event end: MatView [2] 8414.1 Event begin: SFSetGraph [2] 8414.1 Event end: SFSetGraph [2] 8414.1 Event begin: SFSetUp [2] 8414.1 Event begin: BuildTwoSided [2] 8414.1 Event end: BuildTwoSided [2] 8414.1 Event end: SFSetUp [2] 8414.1 Event begin: SFReduceBegin [2] 8414.1 Event begin: SFPack [2] 8414.1 Event end: SFPack [2] 8414.1 Event end: SFReduceBegin [2] 8414.1 Event begin: SFReduceEnd [2] 8414.1 Event begin: SFUnpack [2] 8414.1 Event end: SFUnpack [2] 8414.1 Event end: SFReduceEnd [2] 8414.1 Event begin: VecSet [2] 8414.1 Event end: VecSet [2] 8414.1 Event begin: VecScatterBegin [2] 8414.1 Event begin: SFPack [2] 8414.1 Event end: SFPack [2] 8414.1 Event end: VecScatterBegin [2] 8414.1 Event begin: VecScatterEnd [2] 8414.1 Event begin: SFUnpack [2] 8414.1 Event end: SFUnpack [2] 8414.1 Event end: VecScatterEnd [2] 8414.1 Event begin: VecScatterBegin [2] 8414.1 Event begin: SFPack [2] 8414.1 Event end: SFPack [2] 8414.1 Event end: VecScatterBegin [2] 8414.1 Event begin: VecScatterEnd [2] 8414.1 Event begin: SFUnpack [2] 8414.1 Event end: SFUnpack [2] 8414.1 Event end: VecScatterEnd [2] 8414.1 Event begin: KSPSetUp [2] 8414.1 Event end: KSPSetUp [2] 8414.1 Event begin: PCSetUp [2] 8414.1 Event end: PCSetUp [2] 8414.1 Event begin: VecNorm [2] 8414.1 Event end: VecNorm [2] 8414.1 Event begin: PCSetUpOnBlocks [2] 8414.1 Event begin: KSPSetUp [2] 8414.1 Event end: KSPSetUp [2] 8414.1 Event begin: PCSetUp [2] 8414.1 Event begin: MatGetOrdering [2] 8414.1 Event begin: MatGetRowIJ [2] 8414.1 Event end: MatGetRowIJ [2] 8414.1 Event end: MatGetOrdering [2] 8414.1 Event begin: MatLUFactorSym [2] 8414.1 Event end: MatLUFactorSym [2] 8414.1 Event begin: MatLUFactorNum [2] 8414.11 Event end: MatLUFactorNum [2] 8414.11 Event end: PCSetUp [2] 8414.11 Event end: PCSetUpOnBlocks [2] 8414.11 Event begin: KSPSolve [2] 8414.11 Event begin: VecSet [2] 8414.11 Event end: VecSet [2] 8414.11 Event begin: VecCopy [2] 8414.11 Event end: VecCopy [2] 8414.11 Event begin: VecNorm [2] 8414.12 Event end: VecNorm [2] 8414.12 Event end: KSPSolve [2] 8414.12 Event begin: MatView [2] 8414.12 Event end: MatView [2] 8414.12 Event begin: KSPSetUp [2] 8414.12 Event end: KSPSetUp [2] 8414.12 Event begin: PCSetUp [0] 8414.03 Event begin: VecMDot [0] 8414.07 Event end: VecMDot [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm [0] 8414.07 Event begin: VecScale [0] 8414.07 Event end: VecScale [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecCopy [0] 8414.07 Event end: VecCopy [0] 8414.07 Event begin: VecAXPY [0] 8414.07 Event end: VecAXPY [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: 
SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecAYPX [0] 8414.07 Event end: VecAYPX [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm 49 KSP unpreconditioned resid norm 4.323260656510e-04 true resid norm -nan ||r(i)||/||b|| -nan [0] 8414.07 Event begin: PCApply [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: MatSolve [0] 8414.07 Event end: MatSolve [0] 8414.07 Event end: PCApply [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecMDot [0] 8414.07 Event end: VecMDot [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm [0] 8414.07 Event begin: VecScale [0] 8414.07 Event end: VecScale [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecCopy [0] 8414.07 Event end: VecCopy [0] 8414.07 Event begin: VecAXPY [0] 8414.07 Event end: VecAXPY [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecAYPX [0] 8414.07 Event end: VecAYPX [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm 50 KSP unpreconditioned resid norm 3.113754193220e-04 true resid norm -nan ||r(i)||/||b|| -nan [0] 8414.07 Event begin: PCApply [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: MatSolve [0] 8414.07 Event end: MatSolve [0] 8414.07 Event end: PCApply [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecMDot [0] 8414.07 Event end: VecMDot [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm [0] 8414.07 Event begin: VecScale [0] 8414.07 Event end: VecScale [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecCopy [0] 8414.07 Event end: VecCopy [0] 8414.07 Event begin: VecAXPY [0] 8414.07 Event end: VecAXPY [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecAYPX [0] 8414.07 Event end: VecAYPX [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm 51 KSP unpreconditioned resid norm 1.976589856877e-04 true resid norm -nan ||r(i)||/||b|| -nan 
[0] 8414.07 Event begin: PCApply [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: MatSolve [0] 8414.07 Event end: MatSolve [0] 8414.07 Event end: PCApply [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecMDot [0] 8414.07 Event end: VecMDot [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm [0] 8414.07 Event begin: VecScale [0] 8414.07 Event end: VecScale [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.07 Event begin: VecCopy [0] 8414.07 Event end: VecCopy [0] 8414.07 Event begin: VecAXPY [0] 8414.07 Event end: VecAXPY [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecAYPX [0] 8414.07 Event end: VecAYPX [0] 8414.07 Event begin: VecNorm [0] 8414.07 Event end: VecNorm 52 KSP unpreconditioned resid norm 1.525265024153e-04 true resid norm -nan ||r(i)||/||b|| -nan [0] 8414.07 Event begin: PCApply [0] 8414.07 Event begin: VecSet [0] 8414.07 Event end: VecSet [0] 8414.07 Event begin: MatSolve [0] 8414.07 Event end: MatSolve [0] 8414.07 Event end: PCApply [0] 8414.07 Event begin: MatMult [0] 8414.07 Event begin: VecScatterBegin [0] 8414.07 Event begin: SFPack [0] 8414.07 Event end: SFPack [0] 8414.07 Event end: VecScatterBegin [0] 8414.07 Event begin: VecScatterEnd [0] 8414.07 Event begin: SFUnpack [0] 8414.07 Event end: SFUnpack [0] 8414.07 Event end: VecScatterEnd [0] 8414.07 Event end: MatMult [0] 8414.07 Event begin: VecMDot [0] 8414.07 Event end: VecMDot [0] 8414.07 Event begin: VecMAXPY [0] 8414.07 Event end: VecMAXPY [0] 8414.08 Event begin: VecNorm [0] 8414.08 Event end: VecNorm [0] 8414.08 Event begin: VecScale [0] 8414.08 Event end: VecScale [0] KSPConvergedDefault(): Linear solver has converged. 
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 8.897908325511e-05 is less than relative tolerance 1.000000000000e-08 times initial right hand side norm 1.466597558465e+04 at iteration 53
 53 KSP unpreconditioned resid norm 8.897908325511e-05 true resid norm -nan ||r(i)||/||b|| -nan
Linear solve converged due to CONVERGED_RTOL iterations 53
KSP Object: 4 MPI processes
  type: fgmres
    restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    happy breakdown tolerance 1e-30
  maximum iterations=500, initial guess is zero
  tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
  right preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
  type: bjacobi
    number of blocks = 4
    Local solver information for first block is in the following KSP and PC objects on rank 0:
    Use -ksp_view ::ascii_info_detail to display information for all blocks
  KSP Object: (sub_) 1 MPI processes
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: (sub_) 1 MPI processes
    type: lu
      out-of-place factorization
      tolerance for zero pivot 2.22045e-14
      matrix ordering: nd
      factor fill ratio given 5., needed 3.8053
        Factored matrix follows:
          Mat Object: 1 MPI processes
            type: seqaij
            rows=1089, cols=1089
            package used to perform factorization: petsc
            total: nonzeros=77571, allocated nonzeros=77571
              using I-node routines: found 363 nodes, limit used is 5
    linear system matrix = precond matrix:
    Mat Object: (sub_) 1 MPI processes
      type: seqaij
      rows=1089, cols=1089
      total: nonzeros=20385, allocated nonzeros=20385
      total number of mallocs used during MatSetValues calls=0
        using I-node routines: found 363 nodes, limit used is 5
  linear system matrix = precond matrix:
  Mat Object: 4 MPI processes
    type: mpiaij
    rows=4353, cols=4353
    total: nonzeros=88389, allocated nonzeros=88389
    total number of mallocs used during MatSetValues calls=0
      using I-node (on process 0) routines: found 363 nodes, limit used is 5

Solver converged within 53 iterations. Elapsed time: 1.699531
Newton iteration: 0 - L2 Position Norm: INF - L2 Pressure Norm: INF
Memory used by each processor: 36.250000 Mb
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.png Type: image/png Size: 13586 bytes Desc: not available URL: From bsmith at petsc.dev Fri Feb 25 15:16:23 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 25 Feb 2022 16:16:23 -0500 Subject: [petsc-users] Questions regarding nested field split In-Reply-To: References: Message-ID: For parallel layouts generally each MPI rank (parallel process) will have roughly the same number of indices for each field. So you would not end up with entire fields on one or a small number of processes. This is done by using interlaced (check the docs) storage of variables rather by having all of the first type of variable, followed by all the second type of variable etc. For parallel domain decomposition based preconditioners one usually uses PCBJACOBI or PCASM which can automatically split the unknowns by rank. Barry > On Feb 25, 2022, at 2:38 PM, Sundar Namala wrote: > > Hi, I am currently using fieldsplit and I am creating the fields using ISCreateGeneral.programming is being carried out in FORTRAN. I have a couple of questions regarding fieldsplit in parallel. > > Do we need to create the index list of all the fields separately for each processor? > > For example, say I have 3 fields and the indices for field_0 is 0-99, field_1 is 100-299 and field_2 is 300-349. In case of 2 processors do I have to specify the indices for the first processor as field_0 is 0-99, field_1 is 100-174 and field_2 is null. On the second processor field_0 is null, field_1 is 175-299 and field_2 is 300-349. > > my second question is if the indices need to be listed separately how do you assign the null index list using ISCreateGeneral. > > Thanks, > Sundar. From bsmith at petsc.dev Fri Feb 25 15:22:23 2022 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 25 Feb 2022 16:22:23 -0500 Subject: [petsc-users] [KSP] PETSc not reporting a KSP fail when true residual is NaN In-Reply-To: References: Message-ID: <850B6DA1-9FB8-4139-ADDF-B32F118A5EA3@petsc.dev> Hmm, this is going to be tricky to debug why it the Inf/Nan is not found when it should be. In a debugger you can catch/trap floating point exceptions (how to do this depends on your debugger) and then step through the code after that to see why PETSc KSP is not properly noting the Inf/Nan and returning. This may be cumbersome to do if you don't know PETSc well. Is your code easy to build, would be willing to share it to me so I can run it and debug directly? If you know how to make docker images or something you might be able to give it to me easily. Barry > On Feb 25, 2022, at 3:59 PM, Giovane Avancini wrote: > > Mark, Matthew and Barry, > > Thank you all for the quick responses. > > Others might have a better idea, but you could run with '-info :ksp' and see if you see any messages like "Linear solver has created a not a number (NaN) as the residual norm, declaring divergence \n" > You could also run with -log_trace and see if it is using KSPConvergedDefault. I'm not sure if this is the method used given your parameters, but I think it is. > Mark, I ran with both options. I didn't get any messages like "linear solver has created a not a number..." when using -info: ksp. When turning on -log_trace, I could verify that it is using KSPConvergedDefault but what does it mean exactly? When FGMRES converges with the true residual being NaN, I get the following message: [0] KSPConvergedDefault(): Linear solver has converged. 
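On Sundar's nested field split question above (how to give ISCreateGeneral() a "null" index list on a rank that owns no rows of a field): the call is collective, so every rank makes it, and a rank with nothing to contribute simply passes a local length of 0, in which case the array argument is never read. The sketch below is not from the thread; it is written in C for brevity (the Fortran call is analogous) and reuses the rank/row numbers from Sundar's example.

  #include <petsc.h>

  int main(int argc, char **argv)
  {
    IS             is_field2;
    PetscMPIInt    rank;
    PetscInt       n = 0, *idx = NULL, i;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
    MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
    if (rank == 1) {                 /* pretend only rank 1 owns field_2 (rows 300..349) */
      n    = 50;
      ierr = PetscMalloc1(n, &idx);CHKERRQ(ierr);
      for (i = 0; i < n; i++) idx[i] = 300 + i;
    }
    /* On rank 0 this is the "null" list: local length 0, idx == NULL is fine */
    ierr = ISCreateGeneral(PETSC_COMM_WORLD, n, idx, PETSC_COPY_VALUES, &is_field2);CHKERRQ(ierr);
    ierr = PetscFree(idx);CHKERRQ(ierr);
    ierr = ISView(is_field2, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    /* this IS would then be passed to PCFieldSplitSetIS(pc, "2", is_field2) */
    ierr = ISDestroy(&is_field2);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }

In Fortran, with the local length equal to 0 the contents of the index array are never referenced, so a dummy length-1 integer array should be enough.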
Residual norm 8.897908325511e-05 is less than relative tolerance 1.000000000000e-08 times initial right hand side norm 1.466597558465e+04 at iteration 53. No information about NaN whatsoever. > > We check for NaN or Inf, for example, in KSPCheckDot(). if you have the KSP set to error (https://petsc.org/main/docs/manualpages/KSP/KSPSetErrorIfNotConverged.html ) > then we throw an error, but the return codes do not seem to be checked in your implementation. If not, then we set the flag for divergence. > Matthew, I do not check the return code in this case because I don't want PETSc to stop if an error occurs during the solving step. I just want to know that it didn't converge and treat this error inside my code. The problem is that the flag for divergence is not always being set when FGMRES is not converging. I was just wondering why it was set during time step 921 and why not for time step 922 as well. > > Thanks for the complete report. It looks like we may be missing a check in our FGMRES implementation that allows the iteration to continue after a NaN/Inf. > > I will explain how we handle the checking and then attach a patch that you can apply to see if it resolves the problem. Whenever our KSP solvers compute a norm we > check after that calculation to verify that the norm is not an Inf or Nan. This is an inexpensive global check across all MPI ranks because immediately after the norm computation all ranks that share the KSP have the same value. If the norm is a Inf or Nan we "short-circuit" the KSP solve and return immediately with an appropriate not converged code. A quick eye-ball inspection of the FGMRES code found a missing check. > > You can apply the attached patch file in the PETSC_DIR with > > patch -p1 < fgmres.patch > make libs > > then rerun your code and see if it now handles the Inf/NaN correctly. If so we'll patch our release branch with the fix. > Thank you for checking this, Barry. I applied the patch exactly the way you instructed, however, the problem is still happening. Is there a way to check if the patch was in fact applied? You can see in the attached screenshot the terminal information. > > Kind regards, > > Giovane > > Em sex., 25 de fev. de 2022 ?s 13:48, Barry Smith > escreveu: > > Giovane, > > Thanks for the complete report. It looks like we may be missing a check in our FGMRES implementation that allows the iteration to continue after a NaN/Inf. > > I will explain how we handle the checking and then attach a patch that you can apply to see if it resolves the problem. Whenever our KSP solvers compute a norm we > check after that calculation to verify that the norm is not an Inf or Nan. This is an inexpensive global check across all MPI ranks because immediately after the norm computation all ranks that share the KSP have the same value. If the norm is a Inf or Nan we "short-circuit" the KSP solve and return immediately with an appropriate not converged code. A quick eye-ball inspection of the FGMRES code found a missing check. > > You can apply the attached patch file in the PETSC_DIR with > > patch -p1 < fgmres.patch > make libs > > then rerun your code and see if it now handles the Inf/NaN correctly. If so we'll patch our release branch with the fix. > > Barry > > > >> Giovane > > >> On Feb 25, 2022, at 11:06 AM, Giovane Avancini via petsc-users > wrote: >> >> Dear PETSc users, >> >> I'm working on an inhouse code that solves the Navier-Stokes equation in a Lagrangian fashion for free surface flows. 
Because of the large distortions and pressure gradients, it is quite common to encounter some issues with iterative solvers for some time steps, and because of that, I implemented a function that changes the solver type based on the flag KSPConvergedReason. If this flag is negative after a call to KSPSolve, I solve the same linear system again using a direct method. >> >> The problem is that, sometimes, KSP keeps converging even though the residual is NaN, and because of that, I'm not able to identify the problem and change the solver, which leads to a solution vector equals to INF and obviously the code ends up crashing. Is it normal to observe this kind of behaviour? >> >> Please find attached the log produced with the options -ksp_monitor_lg_residualnorm -ksp_log -ksp_view -ksp_monitor_true_residual -ksp_converged_reason and the function that changes the solver. I'm currently using FGMRES and BJACOBI preconditioner with LU for each block. The problem still happens with ILU for example. We can see in the log file that for the time step 921, the true residual is NaN and within just one iteration, the solver fails and it gives the reason DIVERGED_PC_FAILED. I simply changed the solver to MUMPS and it converged for that time step. However, when solving time step 922 we can see that FGMRES converges while the true residual is NaN. Why is that possible? I would appreciate it if someone could clarify this issue to me. >> >> Kind regards, >> Giovane >> >> >> >> -- >> Giovane Avancini >> Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o Carlos, USP >> >> PhD researcher in Structural Engineering - School of Engineering of S?o Carlos. USP >> > > > > -- > Giovane Avancini > Doutorando em Engenharia de Estruturas - Escola de Engenharia de S?o Carlos, USP > > PhD researcher in Structural Engineering - School of Engineering of S?o Carlos. USP > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Feb 25 18:03:59 2022 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 25 Feb 2022 19:03:59 -0500 Subject: [petsc-users] [KSP] PETSc not reporting a KSP fail when true residual is NaN In-Reply-To: References: Message-ID: On Fri, Feb 25, 2022 at 4:00 PM Giovane Avancini via petsc-users < petsc-users at mcs.anl.gov> wrote: > Mark, Matthew and Barry, > > Thank you all for the quick responses. > > Others might have a better idea, but you could run with '-info :ksp' and > see if you see any messages like "Linear solver has created a not a number > (NaN) as the residual norm, declaring divergence \n" > You could also run with -log_trace and see if it is > using KSPConvergedDefault. I'm not sure if this is the method used given > your parameters, but I think it is. > > Mark, I ran with both options. I didn't get any messages like "linear > solver has created a not a number..." when using -info: ksp. When turning > on -log_trace, I could verify that it is using KSPConvergedDefault but what > does it mean exactly? > It was not clear to me where the check was done and as Barry said it is done in the FGMRES iteration, and not in the separate common function KSPConvergedDefault, and that is where the bug was. (-log_trace prints a message when you enter and exit a method so it is a quick way to see where you are) So nevermind. -------------- next part -------------- An HTML attachment was scrubbed... 
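On the FGMRES/NaN thread above: the fallback pattern Giovane describes (iterative solve first, direct solve if the KSP reports failure) can be sketched as below. None of this is taken from Giovane's code; the function name is made up and MUMPS is just one choice of direct solver.

  #include <petscksp.h>

  /* Sketch: try the current (iterative) configuration; if KSP reports any
     negative reason, rebuild the solver as an LU factorization with MUMPS
     and solve the same system again.                                      */
  PetscErrorCode SolveWithFallback(KSP ksp, Vec b, Vec x)
  {
    KSPConvergedReason reason;
    PC                 pc;
    PetscErrorCode     ierr;

    PetscFunctionBeginUser;
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
    if (reason < 0) {   /* any KSP_DIVERGED_*, e.g. DIVERGED_PC_FAILED */
      ierr = PetscPrintf(PETSC_COMM_WORLD, "Iterative solve failed (%s), retrying with a direct solver\n", KSPConvergedReasons[reason]);CHKERRQ(ierr);
      ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);    /* apply the preconditioner exactly once */
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
      ierr = PCFactorSetMatSolverType(pc, MATSOLVERMUMPS);CHKERRQ(ierr);
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    }
    PetscFunctionReturn(0);
  }

Of course, this pattern only helps when the reason actually comes back negative, which is exactly what the missing NaN/Inf check in the FGMRES norm computation prevented; that is what Barry's patch adds. As for verifying that the patch really went in: git status (or git diff) in PETSC_DIR should show the FGMRES source file (presumably src/ksp/ksp/impls/gmres/fgmres/fgmres.c) as modified, and make libs must have rebuilt the library the application actually links against (if PETSc was installed under a prefix, the install step is needed too).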
URL: From bojan.niceno.scientist at gmail.com Sun Feb 27 01:36:17 2022 From: bojan.niceno.scientist at gmail.com (Bojan Niceno) Date: Sun, 27 Feb 2022 08:36:17 +0100 Subject: [petsc-users] Solving a Singular System with PETSc Message-ID: Dear all, I have coupled PETSc with my computational fluid dynamics (CFD) solver for incompressible flows where the most computationally intensive part is a solution of the linear system for pressure - which is singular. A simple call to PETSc solvers resulted in divergence, as expected, but things work when I set the null space for the pressure matrix as demonstrated in src/ksp/ksp/tutorials/ex29.c: MatNullSpace nullspace; ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,&nullspace);CHKERRQ(ierr); ierr = MatSetNullSpace(J,nullspace);CHKERRQ(ierr); ierr = MatNullSpaceDestroy(&nullspace);CHKERRQ(ierr); However, the effect of setting the null space as described above, has almost the same effect (convergence history is almost the same) as if when I multiply each diagonal of the system matrix with (1.0 + 1.0e-6), i.e., desingularize the matrix by making it slightly diagonally dominant. I prefer the former solution as the latter one seems a bit like an ad-hoc patch and I am not sure how general it is, but I wonder, from a mathematical point of view, is it the same thing? Any thoughts on that? Cheers, Bojan Niceno -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Feb 27 02:21:07 2022 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 27 Feb 2022 09:21:07 +0100 Subject: [petsc-users] Solving a Singular System with PETSc In-Reply-To: References: Message-ID: <38DF5330-6E4B-44E6-B261-8848D8BEADB0@dsic.upv.es> In both cases, it is like you are solving a nonsingular system with a matrix B. With MatNullSpace, B=A-e*e' where e=ones(n,1) normalized, and with your approach it is B=A+sigma*I with sigma=1e-6. The first approach shifts the zero eigenvalue, while in the second approach all eigenvalues are shifted. Jose > El 27 feb 2022, a las 8:36, Bojan Niceno escribi?: > > Dear all, > > I have coupled PETSc with my computational fluid dynamics (CFD) solver for incompressible flows where the most computationally intensive part is a solution of the linear system for pressure - which is singular. > > A simple call to PETSc solvers resulted in divergence, as expected, but things work when I set the null space for the pressure matrix as demonstrated in src/ksp/ksp/tutorials/ex29.c: > MatNullSpace nullspace; > ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,&nullspace);CHKERRQ(ierr); > ierr = MatSetNullSpace(J,nullspace);CHKERRQ(ierr); > ierr = MatNullSpaceDestroy(&nullspace);CHKERRQ(ierr); > > However, the effect of setting the null space as described above, has almost the same effect (convergence history is almost the same) as if when I multiply each diagonal of the system matrix with (1.0 + 1.0e-6), i.e., desingularize the matrix by making it slightly diagonally dominant. > > I prefer the former solution as the latter one seems a bit like an ad-hoc patch and I am not sure how general it is, but I wonder, from a mathematical point of view, is it the same thing? Any thoughts on that? > > > Cheers, > > Bojan Niceno From jroman at dsic.upv.es Sun Feb 27 02:25:44 2022 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Sun, 27 Feb 2022 09:25:44 +0100 Subject: [petsc-users] Solving a Singular System with PETSc In-Reply-To: <38DF5330-6E4B-44E6-B261-8848D8BEADB0@dsic.upv.es> References: <38DF5330-6E4B-44E6-B261-8848D8BEADB0@dsic.upv.es> Message-ID: A correction: it is B=A+sigma*I when you *add* 1e-6 to the diagonal entries. but if you "multiply each diagonal of the system matrix with (1.0 + 1.0e-6)" you are doing a different thing. > El 27 feb 2022, a las 9:21, Jose E. Roman escribi?: > > In both cases, it is like you are solving a nonsingular system with a matrix B. With MatNullSpace, B=A-e*e' where e=ones(n,1) normalized, and with your approach it is B=A+sigma*I with sigma=1e-6. The first approach shifts the zero eigenvalue, while in the second approach all eigenvalues are shifted. > > Jose > > >> El 27 feb 2022, a las 8:36, Bojan Niceno escribi?: >> >> Dear all, >> >> I have coupled PETSc with my computational fluid dynamics (CFD) solver for incompressible flows where the most computationally intensive part is a solution of the linear system for pressure - which is singular. >> >> A simple call to PETSc solvers resulted in divergence, as expected, but things work when I set the null space for the pressure matrix as demonstrated in src/ksp/ksp/tutorials/ex29.c: >> MatNullSpace nullspace; >> ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,&nullspace);CHKERRQ(ierr); >> ierr = MatSetNullSpace(J,nullspace);CHKERRQ(ierr); >> ierr = MatNullSpaceDestroy(&nullspace);CHKERRQ(ierr); >> >> However, the effect of setting the null space as described above, has almost the same effect (convergence history is almost the same) as if when I multiply each diagonal of the system matrix with (1.0 + 1.0e-6), i.e., desingularize the matrix by making it slightly diagonally dominant. >> >> I prefer the former solution as the latter one seems a bit like an ad-hoc patch and I am not sure how general it is, but I wonder, from a mathematical point of view, is it the same thing? Any thoughts on that? >> >> >> Cheers, >> >> Bojan Niceno > From yc17470 at connect.um.edu.mo Sun Feb 27 08:50:51 2022 From: yc17470 at connect.um.edu.mo (Gong Yujie) Date: Sun, 27 Feb 2022 14:50:51 +0000 Subject: [petsc-users] Question about the performance of KSP solver Message-ID: Hi, I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. 
Some detail of the log is shown below: 16-cores: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 32-cores: ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 Hope you can help me! Best Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Feb 27 09:16:09 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 27 Feb 2022 08:16:09 -0700 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: References: Message-ID: <87sfs4732e.fsf@jedbrown.org> This is pretty typical. 
You see the factorization time is significantly better (because their more compute-limited) but MatMult and MatSolve are about the same because they are limited by memory bandwidth. On most modern architectures, the bandwidth is saturated with 16 cores or so. https://petsc.org/release/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup If you haven't yet, I recommend trying to use AMG for this problem. You should call MatSetNearNullSpace() to set the rigid body modes and then use -pc_type gamg or (with external packages -pc_type ml and -pc_type hypre). The iteration count should be much less and solves reasonably fast. If you're interested in using different data structures, our experience is that we can solve similar problem sizes using Q2 elements in a few seconds (2-10) on a single node. Gong Yujie writes: > Hi, > > I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. Some detail of the log is shown below: > > 16-cores: > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 > MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 > MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 > MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 > KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 > PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 > PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 > PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 > PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 > > 32-cores: > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 > MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 > MatLUFactorNum 1 
1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 > MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 > KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 > PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 > PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 > PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 > PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 > > Hope you can help me! > > Best Regards, > Yujie From mfadams at lbl.gov Sun Feb 27 09:23:13 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 27 Feb 2022 10:23:13 -0500 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: References: Message-ID: First, you probably want -ksp_type cg ILU sucks, you can try 'gamg' and I would configure it with hypre and try that also. Now, you are getting very little increase in flop rate on MatMult (8K -->8.6K). That is the problem * You have a "bad" network * Your problem is too small (with respect to your network) to get speedup after 16 porcs * Bad partitioning * You have a fair amount of load imbalance (MatMult 671 1.0 4.7602e+01 *2.0* ), and it gets worse at 32 procs. That can be from bad partitioning or a bad network. With a decent network you probably want to keep at least 25K equations per processor. This depends on these other issues, but it is a start. With a good network you should be able to get down to 10K/proc It is best to start with a simple model problem like a cube, with cube shaped subdomains is ideal, and isolate issues. Mark On Sun, Feb 27, 2022 at 9:51 AM Gong Yujie wrote: > Hi, > > I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) > to solve an elasticity problem. First, I use 16 cores to test the > computation time, then use 32 cores to run the same code with the same > parameters. But I just get about 10% speed up. From the log file I found > that the computation time of KSPSolve() and MatSolve() just decrease a > little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when > configure it. The matrix size is about 7*10^6. 
Some detail of the log is > shown below: > > 16-cores: > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 > 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 > MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 > MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 > MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 > 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 > KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 > 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 > PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 > 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 > PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 > PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 > 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 > PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 > > 32-cores: > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 > 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 > MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 > MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 > 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 > MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 > 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 > KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 > 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 > PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 > 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 > PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 > 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 > PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 > 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 > PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 > > Hope you can help me! > > Best Regards, > Yujie > -------------- next part -------------- An HTML attachment was scrubbed... 
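To make Jed's and Mark's AMG suggestion concrete for this elasticity problem, the sketch below attaches the rigid-body modes as the near-null space before handing the matrix to the solver; GAMG (and ml/hypre) use that information to build much better coarse spaces. This is an illustration only: the function name is invented, and the coordinate vector is assumed to have been created with block size 3 and to hold the (x,y,z) of each locally owned node in the same ordering as the displacement unknowns.

  #include <petscksp.h>

  /* Sketch: give the AMG preconditioner the six rigid-body modes. */
  PetscErrorCode SetupElasticityKSP(KSP ksp, Mat A, Vec coords)
  {
    MatNullSpace   nearnull;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatNullSpaceCreateRigidBody(coords, &nearnull);CHKERRQ(ierr);
    ierr = MatSetNearNullSpace(A, nearnull);CHKERRQ(ierr);
    ierr = MatNullSpaceDestroy(&nearnull);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

Run with something like -ksp_type cg -pc_type gamg -ksp_converged_reason (or -pc_type hypre / -pc_type ml where those packages are configured). The iteration count should drop well below what GMRES/ASM/ILU(2) needs here, so far fewer flops end up in the bandwidth-bound MatSolve.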
URL: From mfadams at lbl.gov Sun Feb 27 09:24:16 2022 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 27 Feb 2022 10:24:16 -0500 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: <87sfs4732e.fsf@jedbrown.org> References: <87sfs4732e.fsf@jedbrown.org> Message-ID: Ah, "cores". Jed is right if these are cores on one socket. On Sun, Feb 27, 2022 at 10:16 AM Jed Brown wrote: > This is pretty typical. You see the factorization time is significantly > better (because their more compute-limited) but MatMult and MatSolve are > about the same because they are limited by memory bandwidth. On most modern > architectures, the bandwidth is saturated with 16 cores or so. > > > https://petsc.org/release/faq/#what-kind-of-parallel-computers-or-clusters-are-needed-to-use-petsc-or-why-do-i-get-little-speedup > > If you haven't yet, I recommend trying to use AMG for this problem. You > should call MatSetNearNullSpace() to set the rigid body modes and then use > -pc_type gamg or (with external packages -pc_type ml and -pc_type hypre). > The iteration count should be much less and solves reasonably fast. > > If you're interested in using different data structures, our experience is > that we can solve similar problem sizes using Q2 elements in a few seconds > (2-10) on a single node. > > Gong Yujie writes: > > > Hi, > > > > I'm using the GMRES with ASM preconditioner with sub-domain solver > ILU(2) to solve an elasticity problem. First, I use 16 cores to test the > computation time, then use 32 cores to run the same code with the same > parameters. But I just get about 10% speed up. From the log file I found > that the computation time of KSPSolve() and MatSolve() just decrease a > little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when > configure it. The matrix size is about 7*10^6. 
Some detail of the log is > shown below: > > > > 16-cores: > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 > 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 > > MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 > > MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 > > MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > > KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 > 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 > > KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 > 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 > > PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 > 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 > > PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 > > PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 > 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 > > PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 > > > > 32-cores: > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 > 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 > > MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 > > MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 > 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 > > MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > > KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 > 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 > > KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 > 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 > > PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 > 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 > > PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 > 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 > > PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 > 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 > > PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 > 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 > > > > Hope you can help me! > > > > Best Regards, > > Yujie > -------------- next part -------------- An HTML attachment was scrubbed... 
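A quick way to check Jed's memory-bandwidth explanation on a given machine is the MPI STREAMS benchmark that ships with PETSc (described in the FAQ entry linked above). A typical invocation, with the process count only an example to adapt, is

  cd $PETSC_DIR
  make streams NPMAX=32

which reports the achievable memory bandwidth as the number of MPI processes grows. If the reported rate stops increasing around 16 processes on one socket, then the nearly flat MatMult/MatSolve timings in the log above are expected, and additional ranks only pay off when they are spread over more sockets or nodes (which also matches Mark's point about cores on one socket).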
URL: From knepley at gmail.com Sun Feb 27 10:48:50 2022 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 27 Feb 2022 11:48:50 -0500 Subject: [petsc-users] Solving a Singular System with PETSc In-Reply-To: References: Message-ID: On Sun, Feb 27, 2022 at 2:36 AM Bojan Niceno < bojan.niceno.scientist at gmail.com> wrote: > Dear all, > > I have coupled PETSc with my computational fluid dynamics (CFD) solver for > incompressible flows where the most computationally intensive part is a > solution of the linear system for pressure - which is singular. > > A simple call to PETSc solvers resulted in divergence, as expected, but > things work when I set the null space for the pressure matrix as > demonstrated in src/ksp/ksp/tutorials/ex29.c: > MatNullSpace nullspace; > ierr = > MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,&nullspace);CHKERRQ(ierr); > ierr = MatSetNullSpace(J,nullspace);CHKERRQ(ierr); > ierr = MatNullSpaceDestroy(&nullspace);CHKERRQ(ierr); > > However, the effect of setting the null space as described above, has > almost the same effect (convergence history is almost the same) as if when > I multiply each diagonal of the system matrix with (1.0 + 1.0e-6), i.e., > desingularize the matrix by making it slightly diagonally dominant. > > I prefer the former solution as the latter one seems a bit like an ad-hoc > patch and I am not sure how general it is, but I wonder, from a > mathematical point of view, is it the same thing? Any thoughts on that? > I will give a slightly different explanation than Jose. When you set a nullspace N, it tells us what space the solution must be orthogonal to N^T x = 0 We use this to project out these components at each step of the iterative method. At the end, we get a solution A x = b which _also_ satisfies the uniqueness condition N^T x = 0 When you perturb the matrix, you the the solution to a _different_ linear system (A + sigma I) x = b It is close, but not the same, and there is no guarantee (unless you know something about A) that this is close to the other solution in a normwise sense. Thanks, Matt > Cheers, > > Bojan Niceno > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Feb 27 13:20:26 2022 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 27 Feb 2022 14:20:26 -0500 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: References: Message-ID: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> We should think about have -log_view automatically running streams on subsets of ranks and using the resulting information to provide guidance to users on interpretating the -log_view output instead of expecting users to run streams themselves on their system and then figuring out what to do. > On Feb 27, 2022, at 9:50 AM, Gong Yujie wrote: > > Hi, > > I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. 
Some detail of the log is shown below: > > 16-cores: > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 > MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 > MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 > MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 > KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 > PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 > PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 > PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 > PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 > > 32-cores: > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 > MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 > MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 > MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 > KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 > PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 > PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 > PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 > PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 > > Hope you can help me! > > Best Regards, > Yujie -------------- next part -------------- An HTML attachment was scrubbed... 
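Back on Bojan's singular pressure system: one detail worth adding to Jose's and Matt's answers is the right-hand side. Strictly, A x = b only has a solution when b is orthogonal to the null space, and discretization or round-off can leave a small inconsistent component in b; MatNullSpaceRemove() projects it out explicitly. A sketch, where A, b, x and ksp stand for the assembled pressure matrix, right-hand side, solution and solver from Bojan's code (the names are placeholders):

  MatNullSpace nullspace;
  ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, NULL, &nullspace);CHKERRQ(ierr);
  ierr = MatSetNullSpace(A, nullspace);CHKERRQ(ierr);     /* KSP projects the constant out of the iterates */
  ierr = MatNullSpaceRemove(nullspace, b);CHKERRQ(ierr);  /* make the right-hand side consistent           */
  ierr = MatNullSpaceDestroy(&nullspace);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

This keeps the operator itself untouched, which is the advantage over the diagonal perturbation: as Matt explains above, the perturbed system has a genuinely different solution, while the projected solve returns the solution of A x = b that also satisfies the uniqueness condition.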
URL: From s_g at berkeley.edu Sun Feb 27 14:03:43 2022 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Sun, 27 Feb 2022 12:03:43 -0800 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: <87sfs4732e.fsf@jedbrown.org> References: <87sfs4732e.fsf@jedbrown.org> Message-ID: <237f27f2-7006-2974-b9b0-32704b4c9333@berkeley.edu> Hi Jed, ? Do you have a reference for this? -sanjay On 2/27/22 7:16 AM, Jed Brown wrote: > If you're interested in using different data structures, our experience is that we can solve similar problem sizes using Q2 elements in a few seconds (2-10) on a single node. From jed at jedbrown.org Sun Feb 27 14:39:00 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 27 Feb 2022 13:39:00 -0700 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: <237f27f2-7006-2974-b9b0-32704b4c9333@berkeley.edu> References: <87sfs4732e.fsf@jedbrown.org> <237f27f2-7006-2974-b9b0-32704b4c9333@berkeley.edu> Message-ID: <87mtic6o4b.fsf@jedbrown.org> Hi, Sanjay. The method uses matrix-free p-multigrid with AMG coarse solves. Attached is one of the models we're using as a test problem (hyperelastic model on an extruded Schwarz-P surface). Docs for the code we're using. https://ratel.micromorph.org/ https://gitlab.com/micromorph/ratel https://doi.org/10.21105/joss.02945 (about libCEED's data structures) We'd love to test these methods out on more problems. We're working on improving integration with PETSc -- the goal is that users of PETSc will be able to write libCEED qfunctions and run with these data structures (which are optimized for both CPU and GPU) with almost no other code modifications. -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot from 2022-02-27 13-23-50.png Type: image/png Size: 2436118 bytes Desc: not available URL: -------------- next part -------------- Sanjay Govindjee writes: > Hi Jed, > ? Do you have a reference for this? > -sanjay > > On 2/27/22 7:16 AM, Jed Brown wrote: >> If you're interested in using different data structures, our experience is that we can solve similar problem sizes using Q2 elements in a few seconds (2-10) on a single node. From jed at jedbrown.org Sun Feb 27 14:41:21 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 27 Feb 2022 13:41:21 -0700 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> References: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> Message-ID: <87k0dg6o0e.fsf@jedbrown.org> Probably not implied by -log_view alone, but -streams_view or some such doing it automatically would save having to context switch elsewhere to obtain that data. Barry Smith writes: > We should think about have -log_view automatically running streams on subsets of ranks and using the resulting information to provide guidance to users on interpretating the -log_view output instead of expecting users to run streams themselves on their system and then figuring out what to do. > >> On Feb 27, 2022, at 9:50 AM, Gong Yujie wrote: >> >> Hi, >> >> I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. 
The matrix size is about 7*10^6. Some detail of the log is shown below: >> >> 16-cores: >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 >> MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 >> MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 >> MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 >> KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 >> PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 >> PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 >> PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 >> PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 >> >> 32-cores: >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 >> MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 >> MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 >> MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 >> KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 >> PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 >> PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 >> PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 >> PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 >> >> Hope you can help me! 
>> >> Best Regards, >> Yujie From bsmith at petsc.dev Sun Feb 27 15:12:33 2022 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 27 Feb 2022 16:12:33 -0500 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: <87k0dg6o0e.fsf@jedbrown.org> References: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> <87k0dg6o0e.fsf@jedbrown.org> Message-ID: At PetscLogView() the code could see how long the run was, if it was greater than n seconds it could automatically run a few levels of streams (taking presumably well less than a few seconds) and adjust suitable the output. If the user runs, for example, 10min they surely don't mind .5 seconds to get more useful information. > On Feb 27, 2022, at 3:41 PM, Jed Brown wrote: > > Probably not implied by -log_view alone, but -streams_view or some such doing it automatically would save having to context switch elsewhere to obtain that data. > > Barry Smith writes: > >> We should think about have -log_view automatically running streams on subsets of ranks and using the resulting information to provide guidance to users on interpretating the -log_view output instead of expecting users to run streams themselves on their system and then figuring out what to do. >> >>> On Feb 27, 2022, at 9:50 AM, Gong Yujie wrote: >>> >>> Hi, >>> >>> I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. 
Some detail of the log is shown below: >>> >>> 16-cores: >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 >>> MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 >>> MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 >>> MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>> KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 >>> KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 >>> PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 >>> PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 >>> PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 >>> PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 >>> >>> 32-cores: >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 >>> MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 >>> MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 >>> MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>> KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 >>> KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 >>> PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 >>> PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 >>> PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 >>> PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 >>> >>> Hope you can help me! 
>>> >>> Best Regards, >>> Yujie From jed at jedbrown.org Sun Feb 27 15:24:49 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 27 Feb 2022 14:24:49 -0700 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: References: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> <87k0dg6o0e.fsf@jedbrown.org> Message-ID: <87ee3o6lzy.fsf@jedbrown.org> I assume this would be running VecWAXPY on CPU (and GPU) with some empty ranks? I'd be mildly concerned about allocating GPU memory because a crash here would be really bad. Barry Smith writes: > At PetscLogView() the code could see how long the run was, if it was greater than n seconds it could automatically run a few levels of streams (taking presumably well less than a few seconds) and adjust suitable the output. If the user runs, for example, 10min they surely don't mind .5 seconds to get more useful information. > > > >> On Feb 27, 2022, at 3:41 PM, Jed Brown wrote: >> >> Probably not implied by -log_view alone, but -streams_view or some such doing it automatically would save having to context switch elsewhere to obtain that data. >> >> Barry Smith writes: >> >>> We should think about have -log_view automatically running streams on subsets of ranks and using the resulting information to provide guidance to users on interpretating the -log_view output instead of expecting users to run streams themselves on their system and then figuring out what to do. >>> >>>> On Feb 27, 2022, at 9:50 AM, Gong Yujie wrote: >>>> >>>> Hi, >>>> >>>> I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. 
Some detail of the log is shown below: >>>> >>>> 16-cores: >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 >>>> MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 >>>> MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 >>>> MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>> KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 >>>> KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 >>>> PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 >>>> PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 >>>> PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 >>>> PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 >>>> >>>> 32-cores: >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 >>>> MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 >>>> MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 >>>> MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>> KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 >>>> KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 >>>> PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 >>>> PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 >>>> PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 >>>> PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 >>>> >>>> Hope you can help me! 
>>>> >>>> Best Regards, >>>> Yujie From bsmith at petsc.dev Sun Feb 27 15:31:04 2022 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 27 Feb 2022 16:31:04 -0500 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: <87ee3o6lzy.fsf@jedbrown.org> References: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> <87k0dg6o0e.fsf@jedbrown.org> <87ee3o6lzy.fsf@jedbrown.org> Message-ID: This would be after the user code is complete, PETSc memory has all been freed and we can put a signal catch around the code to prevent such crashes. > On Feb 27, 2022, at 4:24 PM, Jed Brown wrote: > > I assume this would be running VecWAXPY on CPU (and GPU) with some empty ranks? I'd be mildly concerned about allocating GPU memory because a crash here would be really bad. > > Barry Smith writes: > >> At PetscLogView() the code could see how long the run was, if it was greater than n seconds it could automatically run a few levels of streams (taking presumably well less than a few seconds) and adjust suitable the output. If the user runs, for example, 10min they surely don't mind .5 seconds to get more useful information. >> >> >> >>> On Feb 27, 2022, at 3:41 PM, Jed Brown wrote: >>> >>> Probably not implied by -log_view alone, but -streams_view or some such doing it automatically would save having to context switch elsewhere to obtain that data. >>> >>> Barry Smith writes: >>> >>>> We should think about have -log_view automatically running streams on subsets of ranks and using the resulting information to provide guidance to users on interpretating the -log_view output instead of expecting users to run streams themselves on their system and then figuring out what to do. >>>> >>>>> On Feb 27, 2022, at 9:50 AM, Gong Yujie wrote: >>>>> >>>>> Hi, >>>>> >>>>> I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. 
Some detail of the log is shown below: >>>>> >>>>> 16-cores: >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 >>>>> MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 >>>>> MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 >>>>> MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>>> KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 >>>>> KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 >>>>> PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 >>>>> PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 >>>>> PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 >>>>> PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 >>>>> >>>>> 32-cores: >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>> MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 >>>>> MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 >>>>> MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 >>>>> MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>>> KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>> KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 >>>>> KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 >>>>> PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 >>>>> PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 >>>>> PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 >>>>> PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 >>>>> >>>>> Hope you can help me! 
>>>>> >>>>> Best Regards, >>>>> Yujie From jed at jedbrown.org Sun Feb 27 15:36:15 2022 From: jed at jedbrown.org (Jed Brown) Date: Sun, 27 Feb 2022 14:36:15 -0700 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: References: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> <87k0dg6o0e.fsf@jedbrown.org> <87ee3o6lzy.fsf@jedbrown.org> Message-ID: <87bkys6lgw.fsf@jedbrown.org> That's sounds okay, just need to be able to guarantee that no system errors can prevent us from finishing writing the -log_view. Barry Smith writes: > This would be after the user code is complete, PETSc memory has all been freed and we can put a signal catch around the code to prevent such crashes. > >> On Feb 27, 2022, at 4:24 PM, Jed Brown wrote: >> >> I assume this would be running VecWAXPY on CPU (and GPU) with some empty ranks? I'd be mildly concerned about allocating GPU memory because a crash here would be really bad. >> >> Barry Smith writes: >> >>> At PetscLogView() the code could see how long the run was, if it was greater than n seconds it could automatically run a few levels of streams (taking presumably well less than a few seconds) and adjust suitable the output. If the user runs, for example, 10min they surely don't mind .5 seconds to get more useful information. >>> >>> >>> >>>> On Feb 27, 2022, at 3:41 PM, Jed Brown wrote: >>>> >>>> Probably not implied by -log_view alone, but -streams_view or some such doing it automatically would save having to context switch elsewhere to obtain that data. >>>> >>>> Barry Smith writes: >>>> >>>>> We should think about have -log_view automatically running streams on subsets of ranks and using the resulting information to provide guidance to users on interpretating the -log_view output instead of expecting users to run streams themselves on their system and then figuring out what to do. >>>>> >>>>>> On Feb 27, 2022, at 9:50 AM, Gong Yujie wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. 
Some detail of the log is shown below: >>>>>> >>>>>> 16-cores: >>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>> MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 >>>>>> MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 >>>>>> MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 >>>>>> MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>>>> KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>>> KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 >>>>>> KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 >>>>>> PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 >>>>>> PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 >>>>>> PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 >>>>>> PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 >>>>>> >>>>>> 32-cores: >>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>> MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 >>>>>> MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 >>>>>> MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 >>>>>> MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>>>> KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>>> KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 >>>>>> KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 >>>>>> PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 >>>>>> PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 >>>>>> PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 >>>>>> PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 >>>>>> >>>>>> Hope you can help me! 
>>>>>> >>>>>> Best Regards, >>>>>> Yujie From bsmith at petsc.dev Sun Feb 27 17:03:46 2022 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 27 Feb 2022 18:03:46 -0500 Subject: [petsc-users] Question about the performance of KSP solver In-Reply-To: <87bkys6lgw.fsf@jedbrown.org> References: <30431EF8-9FC6-429C-9159-18907304FEA9@petsc.dev> <87k0dg6o0e.fsf@jedbrown.org> <87ee3o6lzy.fsf@jedbrown.org> <87bkys6lgw.fsf@jedbrown.org> Message-ID: Could even write the "normal" logview information before gathering the data to ensure no early crash. > On Feb 27, 2022, at 4:36 PM, Jed Brown wrote: > > That's sounds okay, just need to be able to guarantee that no system errors can prevent us from finishing writing the -log_view. > > Barry Smith writes: > >> This would be after the user code is complete, PETSc memory has all been freed and we can put a signal catch around the code to prevent such crashes. >> >>> On Feb 27, 2022, at 4:24 PM, Jed Brown wrote: >>> >>> I assume this would be running VecWAXPY on CPU (and GPU) with some empty ranks? I'd be mildly concerned about allocating GPU memory because a crash here would be really bad. >>> >>> Barry Smith writes: >>> >>>> At PetscLogView() the code could see how long the run was, if it was greater than n seconds it could automatically run a few levels of streams (taking presumably well less than a few seconds) and adjust suitable the output. If the user runs, for example, 10min they surely don't mind .5 seconds to get more useful information. >>>> >>>> >>>> >>>>> On Feb 27, 2022, at 3:41 PM, Jed Brown wrote: >>>>> >>>>> Probably not implied by -log_view alone, but -streams_view or some such doing it automatically would save having to context switch elsewhere to obtain that data. >>>>> >>>>> Barry Smith writes: >>>>> >>>>>> We should think about have -log_view automatically running streams on subsets of ranks and using the resulting information to provide guidance to users on interpretating the -log_view output instead of expecting users to run streams themselves on their system and then figuring out what to do. >>>>>> >>>>>>> On Feb 27, 2022, at 9:50 AM, Gong Yujie wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I'm using the GMRES with ASM preconditioner with sub-domain solver ILU(2) to solve an elasticity problem. First, I use 16 cores to test the computation time, then use 32 cores to run the same code with the same parameters. But I just get about 10% speed up. From the log file I found that the computation time of KSPSolve() and MatSolve() just decrease a little bit. My PETSc version is 3.16.0 and use --with-debugging=0 when configure it. The matrix size is about 7*10^6. 
Some detail of the log is shown below: >>>>>>> >>>>>>> 16-cores: >>>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>>> MatMult 664 1.0 5.0794e+01 1.6 2.70e+10 1.1 7.1e+04 4.8e+04 1.0e+00 7 13 49 20 0 7 13 49 20 0 8010 >>>>>>> MatSolve 663 1.0 1.9868e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10932 >>>>>>> MatLUFactorNum 1 1.0 6.1501e+00 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 35056 >>>>>>> MatILUFactorSym 1 1.0 1.5566e+01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>>>>> KSPSetUp 2 1.0 5.9627e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> KSPSolve 1 1.0 2.5168e+02 1.0 1.90e+11 1.1 1.4e+05 4.8e+04 1.3e+03 44 93 98 40 89 44 93 98 40 90 11437 >>>>>>> KSPGMRESOrthog 641 1.0 1.8980e+01 1.7 1.82e+10 1.1 0.0e+00 0.0e+00 6.4e+02 3 9 0 0 43 3 9 0 0 44 14578 >>>>>>> PCSetUp 2 1.0 2.2480e+01 1.1 1.40e+10 1.1 5.3e+02 6.5e+05 7.0e+00 4 7 0 2 0 4 7 0 2 0 9591 >>>>>>> PCSetUpOnBlocks 1 1.0 2.1555e+01 1.1 1.40e+10 1.1 0.0e+00 0.0e+00 0.0e+00 3 7 0 0 0 3 7 0 0 0 10002 >>>>>>> PCApply 663 1.0 2.0296e+02 1.1 1.43e+11 1.1 7.0e+04 4.8e+04 1.0e+00 33 70 49 20 0 33 70 49 20 0 10701 >>>>>>> PCApplyOnBlocks 663 1.0 1.9908e+02 1.1 1.43e+11 1.1 0.0e+00 0.0e+00 0.0e+00 33 70 0 0 0 33 70 0 0 0 10910 >>>>>>> >>>>>>> 32-cores: >>>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>>> Event Count Time (sec) Flop --- Global --- --- Stage ---- Total >>>>>>> Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>>>>> ------------------------------------------------------------------------------------------------------------------------ >>>>>>> MatMult 671 1.0 4.7602e+01 2.0 1.39e+10 1.1 1.7e+05 2.8e+04 1.0e+00 7 13 49 23 0 7 13 49 23 0 8637 >>>>>>> MatSolve 670 1.0 1.7800e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12544 >>>>>>> MatLUFactorNum 1 1.0 3.5714e+00 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 60743 >>>>>>> MatILUFactorSym 1 1.0 8.4088e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >>>>>>> KSPSetUp 2 1.0 3.8060e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>>>>> KSPSolve 1 1.0 2.1680e+02 1.0 9.95e+10 1.1 3.5e+05 2.8e+04 1.3e+03 44 93 98 47 89 44 93 98 47 90 13592 >>>>>>> KSPGMRESOrthog 648 1.0 1.6999e+01 2.0 9.39e+09 1.1 0.0e+00 0.0e+00 6.5e+02 2 9 0 0 43 2 9 0 0 44 16450 >>>>>>> PCSetUp 2 1.0 1.2439e+01 1.1 7.16e+09 1.1 1.3e+03 3.7e+05 7.0e+00 2 7 0 2 0 2 7 0 2 0 17440 >>>>>>> PCSetUpOnBlocks 1 1.0 1.1876e+01 1.1 7.16e+09 1.1 0.0e+00 0.0e+00 0.0e+00 2 7 0 0 0 2 7 0 0 0 18267 >>>>>>> PCApply 670 1.0 1.8235e+02 1.1 7.56e+10 1.1 1.7e+05 2.7e+04 1.0e+00 34 71 49 23 0 34 71 49 23 0 12245 >>>>>>> PCApplyOnBlocks 670 1.0 1.7838e+02 1.1 7.56e+10 1.1 0.0e+00 0.0e+00 0.0e+00 33 71 0 0 0 33 71 0 0 0 12517 >>>>>>> >>>>>>> Hope you can help me! 
>>>>>>> >>>>>>> Best Regards, >>>>>>> Yujie From jeremy at seamplex.com Mon Feb 28 06:39:47 2022 From: jeremy at seamplex.com (Jeremy Theler) Date: Mon, 28 Feb 2022 09:39:47 -0300 Subject: [petsc-users] HDF5 timestepping in PETSc 3.16 In-Reply-To: References: <18b4e68c-2524-932d-9aa4-c1a28ea44158@auckland.ac.nz> Message-ID: <142a5d4bb503ba94a9c02feda85b227ab0ccdb5c.camel@seamplex.com> On Thu, 2021-10-21 at 13:04 -0400, Matthew Knepley wrote: > On Tue, Oct 19, 2021 at 6:12 AM Matthew Knepley > wrote: > > On Mon, Oct 18, 2021 at 10:35 PM Adrian Croucher > > wrote: > > > Any response on this? > > > > > > This is a bit of a showstopper for me - I can't upgrade to PETSc > > > 3.16 if > > > it does not allow my users to read their HDF5 files created using > > > earlier versions of PETSc. > > > > > > So far I can't see a workaround. Possibly the timestepping > > > functions > > > need some kind of optional parameter to specify what the default > > > timestepping attribute should be, if it's not present in the file > > > > I think you are right. We should always write the attribute, but > > have it be false. We should > > interpret a missing attribute as an old file. > > > Okay, I think I have it. Can you look at this branch? > > ??https://gitlab.com/petsc/petsc/-/merge_requests/4483 > > There is now an option that lets you set the default timestepping > behavior > > ??-viewer_hdf5_default_timestepping > > I think that is what you want. I'd like to rely on PetscViewerHDF5SetDefaultTimestepping() to provide backwards compatibility as well. This branch has been merged into master back in November but never made it to the stable v3.16 releases. Can you guys check please? Thanks -- jeremy theler www.seamplex.com From knepley at gmail.com Mon Feb 28 07:32:27 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 28 Feb 2022 08:32:27 -0500 Subject: [petsc-users] HDF5 timestepping in PETSc 3.16 In-Reply-To: <142a5d4bb503ba94a9c02feda85b227ab0ccdb5c.camel@seamplex.com> References: <18b4e68c-2524-932d-9aa4-c1a28ea44158@auckland.ac.nz> <142a5d4bb503ba94a9c02feda85b227ab0ccdb5c.camel@seamplex.com> Message-ID: On Mon, Feb 28, 2022 at 7:40 AM Jeremy Theler wrote: > On Thu, 2021-10-21 at 13:04 -0400, Matthew Knepley wrote: > > On Tue, Oct 19, 2021 at 6:12 AM Matthew Knepley > > wrote: > > > On Mon, Oct 18, 2021 at 10:35 PM Adrian Croucher > > > wrote: > > > > Any response on this? > > > > > > > > This is a bit of a showstopper for me - I can't upgrade to PETSc > > > > 3.16 if > > > > it does not allow my users to read their HDF5 files created using > > > > earlier versions of PETSc. > > > > > > > > So far I can't see a workaround. Possibly the timestepping > > > > functions > > > > need some kind of optional parameter to specify what the default > > > > timestepping attribute should be, if it's not present in the file > > > > > > I think you are right. We should always write the attribute, but > > > have it be false. We should > > > interpret a missing attribute as an old file. > > > > > Okay, I think I have it. Can you look at this branch? > > > > https://gitlab.com/petsc/petsc/-/merge_requests/4483 > > > > There is now an option that lets you set the default timestepping > > behavior > > > > -viewer_hdf5_default_timestepping > > > > I think that is what you want. > > I'd like to rely on PetscViewerHDF5SetDefaultTimestepping() to provide > backwards compatibility as well. This branch has been merged into > master back in November but never made it to the stable v3.16 releases. 
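A minimal sketch of the intended usage when reading one of these older files, assuming the fix from the merge request above (PetscViewerHDF5SetDefaultTimestepping() is only in main/3.17 at this point). The function name LoadOldField, the file name, and the loaded Vec are made up for illustration, and the same default can be set from the command line with -viewer_hdf5_default_timestepping:

#include <petscvec.h>
#include <petscviewerhdf5.h>

/* Sketch: files written by older PETSc carry no timestepping attribute, so tell
   the viewer how to treat them before loading. Names here are assumptions. */
static PetscErrorCode LoadOldField(MPI_Comm comm, const char *fname, Vec x)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscViewerHDF5Open(comm, fname, FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = PetscViewerHDF5SetDefaultTimestepping(viewer, PETSC_TRUE);CHKERRQ(ierr); /* missing attribute => treat as timestepped */
  ierr = VecLoad(x, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}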
> > Can you guys check please? > Yes, it is in main. Since it was an interface change, it does not go in a point release for 3.16, but rather in the 3.17 release. We do a feature release about every 6 months. Since 3.16 was released on Sept 29 2021, I estimate 3.17 in March/April. Thanks, Matt > Thanks > -- > jeremy theler > www.seamplex.com > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Mon Feb 28 11:42:58 2022 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 28 Feb 2022 09:42:58 -0800 Subject: [petsc-users] Fortran HDF5 Cannot be found in PETSc-3.16 Message-ID: <6067abba-3438-ea33-9772-81170bee2304@gmail.com> Hi All, Does anyone encounter the problem when HDF5 related fortran code cannot be compiled in PETSc-3.16 because the 'use hdf5' cannot find the required file? Compared to HDF5-1.12.0 and earlier versions, some object files (e.g., hdf5.mod, hdf5.o) are missing in HDF5-1.12.1 in PETSc-3.16. I checked the makefile in hdf5 folder in externalpackage, there are some difference which I guess might cause the problem. In PETSc-3.15, HDF5-1.12.0 # Make sure that these variables are exported to the Makefiles F9XMODEXT = mod F9XMODFLAG = -I F9XSUFFIXFLAG = In PETSc-3.16, HDF5-1.12.1 # Make sure that these variables are exported to the Makefiles F9XMODEXT = F9XMODFLAG = F9XSUFFIXFLAG The configuration I use in PETSc-3.16 is the same as PETSc-3.15. ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes --download-cmake --with-debugging=0 COPTFLAGS="-O2 -march=native -mtune=native" CXXOPTFLAGS="-O2 -march=native -mtune=native" FOPTFLAGS="-O2 -march=native -mtune=native" Is this a bug or something wrong in my PETSc configuration? Thanks, Danyang From bsmith at petsc.dev Mon Feb 28 11:59:56 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 28 Feb 2022 12:59:56 -0500 Subject: [petsc-users] Fortran HDF5 Cannot be found in PETSc-3.16 In-Reply-To: <6067abba-3438-ea33-9772-81170bee2304@gmail.com> References: <6067abba-3438-ea33-9772-81170bee2304@gmail.com> Message-ID: You need the additional configure option --download-hdf5-fortran-bindings Please make sure you have the latest 3.16.4 Barry > On Feb 28, 2022, at 12:42 PM, Danyang Su wrote: > > Hi All, > > Does anyone encounter the problem when HDF5 related fortran code cannot be compiled in PETSc-3.16 because the 'use hdf5' cannot find the required file? > > Compared to HDF5-1.12.0 and earlier versions, some object files (e.g., hdf5.mod, hdf5.o) are missing in HDF5-1.12.1 in PETSc-3.16. I checked the makefile in hdf5 folder in externalpackage, there are some difference which I guess might cause the problem. > > In PETSc-3.15, HDF5-1.12.0 > > # Make sure that these variables are exported to the Makefiles > F9XMODEXT = mod > F9XMODFLAG = -I > F9XSUFFIXFLAG = > > In PETSc-3.16, HDF5-1.12.1 > > # Make sure that these variables are exported to the Makefiles > F9XMODEXT = > F9XMODFLAG = > F9XSUFFIXFLAG > > The configuration I use in PETSc-3.16 is the same as PETSc-3.15. 
> > ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes --download-cmake --with-debugging=0 COPTFLAGS="-O2 -march=native -mtune=native" CXXOPTFLAGS="-O2 -march=native -mtune=native" FOPTFLAGS="-O2 -march=native -mtune=native" > > Is this a bug or something wrong in my PETSc configuration? > > Thanks, > > Danyang > From FERRANJ2 at my.erau.edu Mon Feb 28 13:14:50 2022 From: FERRANJ2 at my.erau.edu (Ferrand, Jesus A.) Date: Mon, 28 Feb 2022 19:14:50 +0000 Subject: [petsc-users] PetscSections and DMPlex's Message-ID: Dear PETSc team: I am working on an FEA code that solves the elasticity equation using P1 elements. I want to write Vecs to ParaView in .vtk format. I can output the displacements by defining a PetscSection that assigned dof over the vertices of my mesh. I can export this Vec using PetscViewerVTKAddField() and DMPlexVTKWriteAll(). Now I want to export strains and stresses using the same VTK functions, but these Vecs need a different PetscSection (one that assigns dof over the cells). My concern is that to output displacements I attached the PetscSection to the DMPlex using DMSetLocalSection(). I'm not sure if I can define another local section for my cell-based strains and stresses and set it as a local section . My questions are: * Could I perhaps do this with DMSetGlobalSection()? * Can a DM hold multiple PetscSections (say multiple local sections and multiple global sections)? * If so, how can select individual ones to properly output Vecs with DMPlexVTKWriteAll()? * Does the local PetscSection depend on the Global one or vice versa? * Can I avoid having to define a PetscSection altogether for my cell-based Vecs if I use DMCreateGlobalVector()? * I got this idea from the source code of DMPlexCreateRankField(), which I successfully use to visualize my mesh partition. Sorry for the mouthful. Sincerely: J.A. Ferrand Embry-Riddle Aeronautical University - Daytona Beach FL M.Sc. Aerospace Engineering | May 2022 B.Sc. Aerospace Engineering B.Sc. Computational Mathematics Sigma Gamma Tau Tau Beta Pi Honors Program Phone: (386)-843-1829 Email(s): ferranj2 at my.erau.edu jesus.ferrand at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Mon Feb 28 13:22:01 2022 From: sam.guo at cd-adapco.com (Sam Guo) Date: Mon, 28 Feb 2022 11:22:01 -0800 Subject: [petsc-users] set MUMPS OOC_TMPDIR Message-ID: Dear PETSc dev team, I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead of statically using PetscOptionsString ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), NULL);). It seems there is such an API, am I correct? Thanks, Sam -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Mon Feb 28 13:24:14 2022 From: sam.guo at cd-adapco.com (Sam Guo) Date: Mon, 28 Feb 2022 11:24:14 -0800 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: Typo correction: It seems there is no such an API. 
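For reference, the options-database route that Barry describes later in this thread (set the option before the PC is set up) can be sketched as follows; the KSP ksp, the vectors b and x, and the /scratch paths are assumptions, while -mat_mumps_icntl_22 and -mat_mumps_ooc_tmpdir are the existing MUMPS options:

/* Sketch: choose a different MUMPS out-of-core directory for each factorization
   by resetting the option before the corresponding PCSetUp() runs (here it runs
   inside the first KSPSolve() of each solver). */
ierr = PetscOptionsSetValue(NULL, "-mat_mumps_icntl_22", "1");CHKERRQ(ierr);            /* turn on MUMPS out-of-core */
ierr = PetscOptionsSetValue(NULL, "-mat_mumps_ooc_tmpdir", "/scratch/run1");CHKERRQ(ierr);
ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);   /* first MUMPS factorization writes its files under /scratch/run1 */

ierr = PetscOptionsSetValue(NULL, "-mat_mumps_ooc_tmpdir", "/scratch/run2");CHKERRQ(ierr);
/* a later KSP/PC that has not been set up yet picks up /scratch/run2 at its PCSetUp() */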
On Mon, Feb 28, 2022 at 11:22 AM Sam Guo wrote: > Dear PETSc dev team, > I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead > of statically using PetscOptionsString > > ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", > mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), > NULL);). It seems there is such an API, am I correct? > > Thanks, > Sam > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 28 13:43:37 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 28 Feb 2022 14:43:37 -0500 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: No we do not have that right now. We would be happy to accept a contribution with a get/set function interface if you need that. Thanks, Matt On Mon, Feb 28, 2022 at 2:24 PM Sam Guo wrote: > Typo correction: It seems there is no such an API. > > On Mon, Feb 28, 2022 at 11:22 AM Sam Guo wrote: > >> Dear PETSc dev team, >> I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead >> of statically using PetscOptionsString >> >> ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", >> mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), >> NULL);). It seems there is such an API, am I correct? >> >> Thanks, >> Sam >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 28 14:03:24 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 28 Feb 2022 15:03:24 -0500 Subject: [petsc-users] PetscSections and DMPlex's In-Reply-To: References: Message-ID: On Mon, Feb 28, 2022 at 2:15 PM Ferrand, Jesus A. wrote: > Dear PETSc team: > > I am working on an FEA code that solves the elasticity equation using P1 > elements. I want to write Vecs to ParaView in .vtk format. I can output the > displacements by defining a PetscSection that assigned dof over the > vertices of my mesh. I can export this Vec using PetscViewerVTKAddField() > and DMPlexVTKWriteAll(). Now I want to export strains and stresses using > the same VTK functions, but these Vecs need a different PetscSection (one > that assigns dof over the cells). My concern is that to output > displacements I attached the PetscSection to the DMPlex using > DMSetLocalSection(). I'm not sure if I can define another local section for > my cell-based strains and stresses and set it as a local section . > 1. I think this is possible, and will try to lay it out below. However, VTK is not a great format, so I will offer another, more flexible solution, afterwards. The way to do this is to call DMClone(dm, &sdm) to get another DM with the same topology. Then set the Section to be your cellwise section for strains. Then you just output that vector into the same VTK file. Since the topology is the same, this will work. 2. We have converted from VTK to HDF5 for Paraview. It is _much_ more flexible. You have the same two DMs, and you output one of them to the HDF5 file suing DMView(), and all the fields using VecView(). 
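A minimal sketch of steps 1. and 2. in C; the function name WriteFields, the field names, and the six-component cell layout are assumptions, and the stress values themselves still have to be computed from the displacement solution:

#include <petscdmplex.h>
#include <petscviewerhdf5.h>

/* Sketch: clone the mesh DM, attach a cellwise PetscSection for stresses, and
   write the mesh plus both fields into a single HDF5 file. */
static PetscErrorCode WriteFields(DM dm, Vec displacement)
{
  DM             dmCell;
  PetscSection   sec;
  Vec            stress;
  PetscViewer    viewer;
  PetscInt       cStart, cEnd, c;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMClone(dm, &dmCell);CHKERRQ(ierr);                                /* same topology, its own section */
  ierr = DMPlexGetHeightStratum(dmCell, 0, &cStart, &cEnd);CHKERRQ(ierr);   /* height 0 = cells */
  ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dmCell), &sec);CHKERRQ(ierr);
  ierr = PetscSectionSetChart(sec, cStart, cEnd);CHKERRQ(ierr);
  for (c = cStart; c < cEnd; ++c) {ierr = PetscSectionSetDof(sec, c, 6);CHKERRQ(ierr);}  /* 6 stress components per cell (assumed) */
  ierr = PetscSectionSetUp(sec);CHKERRQ(ierr);
  ierr = DMSetLocalSection(dmCell, sec);CHKERRQ(ierr);
  ierr = PetscSectionDestroy(&sec);CHKERRQ(ierr);

  ierr = DMCreateGlobalVector(dmCell, &stress);CHKERRQ(ierr);
  ierr = PetscObjectSetName((PetscObject)stress, "stress");CHKERRQ(ierr);
  /* ... compute the cellwise stresses from the displacement here ... */

  ierr = PetscObjectSetName((PetscObject)displacement, "displacement");CHKERRQ(ierr);
  ierr = PetscViewerHDF5Open(PetscObjectComm((PetscObject)dm), "sol.h5", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = DMView(dm, viewer);CHKERRQ(ierr);              /* mesh */
  ierr = VecView(displacement, viewer);CHKERRQ(ierr);   /* vertex-based field */
  ierr = VecView(stress, viewer);CHKERRQ(ierr);         /* cell-based field */
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);

  ierr = VecDestroy(&stress);CHKERRQ(ierr);
  ierr = DMDestroy(&dmCell);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}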
I usually use DMViewFromOptions() and VecViewFromOptions() and then something like -dm_view hdf5:sol.h5 -disp_vec_view hdf5:sol.h5::append -strain_vec_view hdf5:sol.h5::append and then process that file to make an XDMF file sol.xmf $PETSC_DIR/lib/petc/bin/petsc_gen_xdmf.py sol.h5 All the fields will show up in Paraview when you load sol.xmf. This approach handles multiple timesteps, multiple meshes, mesh change, and many other things. THanks, Matt > My questions are: > > - Could I perhaps do this with DMSetGlobalSection()? > - Can a DM hold multiple PetscSections (say multiple local sections > and multiple global sections)? > - If so, how can select individual ones to properly output Vecs > with DMPlexVTKWriteAll()? > - Does the local PetscSection depend on the Global one or vice > versa? > - Can I avoid having to define a PetscSection altogether for my > cell-based Vecs if I use DMCreateGlobalVector()? > - I got this idea from the source code of DMPlexCreateRankField(), > which I successfully use to visualize my mesh partition. > > Sorry for the mouthful. > > Sincerely: > > *J.A. Ferrand* > > Embry-Riddle Aeronautical University - Daytona Beach FL > > M.Sc. Aerospace Engineering | May 2022 > > B.Sc. Aerospace Engineering > > B.Sc. Computational Mathematics > > > > Sigma Gamma Tau > > Tau Beta Pi > > Honors Program > > > > *Phone:* (386)-843-1829 > > *Email(s):* ferranj2 at my.erau.edu > > jesus.ferrand at gmail.com > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Mon Feb 28 14:09:49 2022 From: sam.guo at cd-adapco.com (Sam Guo) Date: Mon, 28 Feb 2022 12:09:49 -0800 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: Hi Matt, I love to have a get/set function interface. Thanks, Sam On Mon, Feb 28, 2022 at 11:43 AM Matthew Knepley wrote: > No we do not have that right now. We would be happy to accept a > contribution with a get/set function interface if you need that. > > Thanks, > > Matt > > On Mon, Feb 28, 2022 at 2:24 PM Sam Guo wrote: > >> Typo correction: It seems there is no such an API. >> >> On Mon, Feb 28, 2022 at 11:22 AM Sam Guo wrote: >> >>> Dear PETSc dev team, >>> I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead >>> of statically using PetscOptionsString >>> >>> ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", >>> mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), >>> NULL);). It seems there is such an API, am I correct? >>> >>> Thanks, >>> Sam >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Mon Feb 28 14:11:51 2022 From: danyang.su at gmail.com (Danyang Su) Date: Mon, 28 Feb 2022 12:11:51 -0800 Subject: [petsc-users] Fortran HDF5 Cannot be found in PETSc-3.16 In-Reply-To: References: <6067abba-3438-ea33-9772-81170bee2304@gmail.com> Message-ID: Thanks, Barry. It works now. 
Danyang On 2022-02-28 9:59 a.m., Barry Smith wrote: > You need the additional configure option --download-hdf5-fortran-bindings Please make sure you have the latest 3.16.4 > > Barry > > >> On Feb 28, 2022, at 12:42 PM, Danyang Su wrote: >> >> Hi All, >> >> Does anyone encounter the problem when HDF5 related fortran code cannot be compiled in PETSc-3.16 because the 'use hdf5' cannot find the required file? >> >> Compared to HDF5-1.12.0 and earlier versions, some object files (e.g., hdf5.mod, hdf5.o) are missing in HDF5-1.12.1 in PETSc-3.16. I checked the makefile in hdf5 folder in externalpackage, there are some difference which I guess might cause the problem. >> >> In PETSc-3.15, HDF5-1.12.0 >> >> # Make sure that these variables are exported to the Makefiles >> F9XMODEXT = mod >> F9XMODFLAG = -I >> F9XSUFFIXFLAG = >> >> In PETSc-3.16, HDF5-1.12.1 >> >> # Make sure that these variables are exported to the Makefiles >> F9XMODEXT = >> F9XMODFLAG = >> F9XSUFFIXFLAG >> >> The configuration I use in PETSc-3.16 is the same as PETSc-3.15. >> >> ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mumps --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-mpich --download-hypre --download-superlu_dist --download-hdf5=yes --download-cmake --with-debugging=0 COPTFLAGS="-O2 -march=native -mtune=native" CXXOPTFLAGS="-O2 -march=native -mtune=native" FOPTFLAGS="-O2 -march=native -mtune=native" >> >> Is this a bug or something wrong in my PETSc configuration? >> >> Thanks, >> >> Danyang >> From knepley at gmail.com Mon Feb 28 14:14:18 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 28 Feb 2022 15:14:18 -0500 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: On Mon, Feb 28, 2022 at 3:10 PM Sam Guo wrote: > Hi Matt, > I love to have a get/set function interface. > We can help you write the code, or you can create an Issue and we will write it as soon as we can. Thanks, MAtt > Thanks, > Sam > > On Mon, Feb 28, 2022 at 11:43 AM Matthew Knepley > wrote: > >> No we do not have that right now. We would be happy to accept a >> contribution with a get/set function interface if you need that. >> >> Thanks, >> >> Matt >> >> On Mon, Feb 28, 2022 at 2:24 PM Sam Guo wrote: >> >>> Typo correction: It seems there is no such an API. >>> >>> On Mon, Feb 28, 2022 at 11:22 AM Sam Guo wrote: >>> >>>> Dear PETSc dev team, >>>> I would like to set MUMPS OOC_TMPDIR programmatically on fly >>>> (instead of statically using PetscOptionsString >>>> >>>> ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", >>>> mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), >>>> NULL);). It seems there is such an API, am I correct? >>>> >>>> Thanks, >>>> Sam >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sam.guo at cd-adapco.com Mon Feb 28 14:19:40 2022 From: sam.guo at cd-adapco.com (Sam Guo) Date: Mon, 28 Feb 2022 12:19:40 -0800 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: Great. Can you send me the patch so I can apply it to my local branch? Then I will create an issue for the next release. How do I create an issue? When is the next release? On Mon, Feb 28, 2022 at 12:14 PM Matthew Knepley wrote: > On Mon, Feb 28, 2022 at 3:10 PM Sam Guo wrote: > >> Hi Matt, >> I love to have a get/set function interface. >> > > We can help you write the code, or you can create an Issue and we will > write it as soon as we can. > > Thanks, > > MAtt > > >> Thanks, >> Sam >> >> On Mon, Feb 28, 2022 at 11:43 AM Matthew Knepley >> wrote: >> >>> No we do not have that right now. We would be happy to accept a >>> contribution with a get/set function interface if you need that. >>> >>> Thanks, >>> >>> Matt >>> >>> On Mon, Feb 28, 2022 at 2:24 PM Sam Guo wrote: >>> >>>> Typo correction: It seems there is no such an API. >>>> >>>> On Mon, Feb 28, 2022 at 11:22 AM Sam Guo wrote: >>>> >>>>> Dear PETSc dev team, >>>>> I would like to set MUMPS OOC_TMPDIR programmatically on fly >>>>> (instead of statically using PetscOptionsString >>>>> >>>>> ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", >>>>> mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), >>>>> NULL);). It seems there is such an API, am I correct? >>>>> >>>>> Thanks, >>>>> Sam >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 28 14:29:09 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 28 Feb 2022 15:29:09 -0500 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: On Mon, Feb 28, 2022 at 3:19 PM Sam Guo wrote: > Great. Can you send me the patch so I can apply it to my local branch? > Then I will create an issue for the next release. How do I create an issue? > When is the next release? > You make the issue so we remember to write the code: https://gitlab.com/petsc/petsc/-/issues?sort=created_date&state=opened Thanks, Matt > On Mon, Feb 28, 2022 at 12:14 PM Matthew Knepley > wrote: > >> On Mon, Feb 28, 2022 at 3:10 PM Sam Guo wrote: >> >>> Hi Matt, >>> I love to have a get/set function interface. >>> >> >> We can help you write the code, or you can create an Issue and we will >> write it as soon as we can. >> >> Thanks, >> >> MAtt >> >> >>> Thanks, >>> Sam >>> >>> On Mon, Feb 28, 2022 at 11:43 AM Matthew Knepley >>> wrote: >>> >>>> No we do not have that right now. We would be happy to accept a >>>> contribution with a get/set function interface if you need that. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Mon, Feb 28, 2022 at 2:24 PM Sam Guo wrote: >>>> >>>>> Typo correction: It seems there is no such an API. 
>>>>> >>>>> On Mon, Feb 28, 2022 at 11:22 AM Sam Guo >>>>> wrote: >>>>> >>>>>> Dear PETSc dev team, >>>>>> I would like to set MUMPS OOC_TMPDIR programmatically on fly >>>>>> (instead of statically using PetscOptionsString >>>>>> >>>>>> ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", >>>>>> mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), >>>>>> NULL);). It seems there is such an API, am I correct? >>>>>> >>>>>> Thanks, >>>>>> Sam >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Mon Feb 28 14:42:05 2022 From: sam.guo at cd-adapco.com (Sam Guo) Date: Mon, 28 Feb 2022 12:42:05 -0800 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: Done. Thanks. On Mon, Feb 28, 2022 at 12:29 PM Matthew Knepley wrote: > On Mon, Feb 28, 2022 at 3:19 PM Sam Guo wrote: > >> Great. Can you send me the patch so I can apply it to my local branch? >> Then I will create an issue for the next release. How do I create an issue? >> When is the next release? >> > > You make the issue so we remember to write the code: > > https://gitlab.com/petsc/petsc/-/issues?sort=created_date&state=opened > > Thanks, > > Matt > > >> On Mon, Feb 28, 2022 at 12:14 PM Matthew Knepley >> wrote: >> >>> On Mon, Feb 28, 2022 at 3:10 PM Sam Guo wrote: >>> >>>> Hi Matt, >>>> I love to have a get/set function interface. >>>> >>> >>> We can help you write the code, or you can create an Issue and we will >>> write it as soon as we can. >>> >>> Thanks, >>> >>> MAtt >>> >>> >>>> Thanks, >>>> Sam >>>> >>>> On Mon, Feb 28, 2022 at 11:43 AM Matthew Knepley >>>> wrote: >>>> >>>>> No we do not have that right now. We would be happy to accept a >>>>> contribution with a get/set function interface if you need that. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On Mon, Feb 28, 2022 at 2:24 PM Sam Guo wrote: >>>>> >>>>>> Typo correction: It seems there is no such an API. >>>>>> >>>>>> On Mon, Feb 28, 2022 at 11:22 AM Sam Guo >>>>>> wrote: >>>>>> >>>>>>> Dear PETSc dev team, >>>>>>> I would like to set MUMPS OOC_TMPDIR programmatically on fly >>>>>>> (instead of statically using PetscOptionsString >>>>>>> >>>>>>> ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", >>>>>>> mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), >>>>>>> NULL);). It seems there is such an API, am I correct? >>>>>>> >>>>>>> Thanks, >>>>>>> Sam >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. 
>>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Feb 28 15:34:26 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 28 Feb 2022 16:34:26 -0500 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: Message-ID: <997F651C-BB2E-43FE-82D9-47353F5D5F46@petsc.dev> You can call PetscOptionsSetValue("-mat_mumps_ooc_tmpdir","directory"); anytime before the KSP/PCSetUp() is called. If you have multiple uses of MUMPs solvers you can call this between each use so that a different directory is used for different MUMPS usage. Barry > On Feb 28, 2022, at 2:22 PM, Sam Guo wrote: > > Dear PETSc dev team, > I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead of statically using PetscOptionsString ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), NULL);). It seems there is such an API, am I correct? > > Thanks, > Sam -------------- next part -------------- An HTML attachment was scrubbed... URL: From bantingl at myumanitoba.ca Mon Feb 28 17:54:09 2022 From: bantingl at myumanitoba.ca (Lucas Banting) Date: Mon, 28 Feb 2022 23:54:09 +0000 Subject: [petsc-users] Preconditioner for LSQR Message-ID: Hello, I have an MPIDENSE matrix of size about 200,000 x 200, using KSPLSQR on my machine a solution takes about 15 s. I typically run with six to eight processors. I have to solve the system several times, typically 4-30, and was looking for recommendations on reusable preconditioners to use with KSPLSQR to increase speed. Would it make the most sense to use PCCHOLESKY on the smaller system A^T * A? Thanks, Lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 28 18:29:18 2022 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 28 Feb 2022 19:29:18 -0500 Subject: [petsc-users] Preconditioner for LSQR In-Reply-To: References: Message-ID: On Mon, Feb 28, 2022 at 6:54 PM Lucas Banting wrote: > Hello, > > I have an MPIDENSE matrix of size about 200,000 x 200, using KSPLSQR on my > machine a solution takes about 15 s. I typically run with six to eight > processors. > I have to solve the system several times, typically 4-30, and was looking > for recommendations on reusable preconditioners to use with KSPLSQR to > increase speed. > > Would it make the most sense to use PCCHOLESKY on the smaller system A^T * > A? > Yes. However, if you only have 200 columns, it might be even better to just use TSQR. There is an implementation of this in SLEPc. I am Cc'ing Jose to see what would be the easiest way to call it for a least-squares solution. Thanks, Matt > Thanks, > Lucas > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Mon Feb 28 18:34:01 2022 From: sam.guo at cd-adapco.com (Sam Guo) Date: Mon, 28 Feb 2022 16:34:01 -0800 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: <997F651C-BB2E-43FE-82D9-47353F5D5F46@petsc.dev> References: <997F651C-BB2E-43FE-82D9-47353F5D5F46@petsc.dev> Message-ID: Hi Barry, I can only call PetscOptionsSetValue before PetscInitialize() but I call PetscInitialize only once at the beginning of my program. I want to change MUMPS OOC_TMPDIR dynamically on the fly. Thanks, Sam On Mon, Feb 28, 2022 at 1:34 PM Barry Smith wrote: > > You can call PetscOptionsSetValue("-mat_mumps_ooc_tmpdir","directory"); > anytime before the KSP/PCSetUp() is called. If you have multiple uses of > MUMPs solvers you can call this between each use so that a different > directory is used for different MUMPS usage. > > Barry > > > > On Feb 28, 2022, at 2:22 PM, Sam Guo wrote: > > Dear PETSc dev team, > I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead > of statically using PetscOptionsString > > ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", > mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), > NULL);). It seems there is such an API, am I correct? > > Thanks, > Sam > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Feb 28 18:38:10 2022 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 28 Feb 2022 19:38:10 -0500 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: References: <997F651C-BB2E-43FE-82D9-47353F5D5F46@petsc.dev> Message-ID: <4C042735-2FAC-45DF-A0C4-463974AABC68@petsc.dev> You should be able to call PetscOptionsSetValue() anytime you want, as I said between different uses of MUMPS you can call it to use different directories. Perhaps this confused you? Note: This function can be called BEFORE PetscInitialize() It is one of the very few functions that can be called before PetscInitialize() but it does not NEED to be called before PetscInitialize(). Barry > On Feb 28, 2022, at 7:34 PM, Sam Guo wrote: > > Hi Barry, > I can only call PetscOptionsSetValue before PetscInitialize() but I call PetscInitialize only once at the beginning of my program. I want to change MUMPS OOC_TMPDIR dynamically on the fly. > > Thanks, > Sam > > On Mon, Feb 28, 2022 at 1:34 PM Barry Smith > wrote: > > You can call PetscOptionsSetValue("-mat_mumps_ooc_tmpdir","directory"); anytime before the KSP/PCSetUp() is called. If you have multiple uses of MUMPs solvers you can call this between each use so that a different directory is used for different MUMPS usage. > > Barry > > > >> On Feb 28, 2022, at 2:22 PM, Sam Guo > wrote: >> >> Dear PETSc dev team, >> I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead of statically using PetscOptionsString ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), NULL);). It seems there is such an API, am I correct? >> >> Thanks, >> Sam > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sam.guo at cd-adapco.com Mon Feb 28 18:39:54 2022 From: sam.guo at cd-adapco.com (Sam Guo) Date: Mon, 28 Feb 2022 16:39:54 -0800 Subject: [petsc-users] set MUMPS OOC_TMPDIR In-Reply-To: <4C042735-2FAC-45DF-A0C4-463974AABC68@petsc.dev> References: <997F651C-BB2E-43FE-82D9-47353F5D5F46@petsc.dev> <4C042735-2FAC-45DF-A0C4-463974AABC68@petsc.dev> Message-ID: Yes, " This function can be called BEFORE PetscInitialize()" confused me. Thanks for the clarification. On Mon, Feb 28, 2022 at 4:38 PM Barry Smith wrote: > > You should be able to call PetscOptionsSetValue() anytime you want, as I > said between different uses of MUMPS you can call it to use different > directories. > > Perhaps this confused you? > > Note: > This function can be called BEFORE PetscInitialize() > > It is one of the very few functions that can be called before > PetscInitialize() but it does not NEED to be called before > PetscInitialize(). > > Barry > > > On Feb 28, 2022, at 7:34 PM, Sam Guo wrote: > > Hi Barry, > I can only call PetscOptionsSetValue before PetscInitialize() but I > call PetscInitialize only once at the beginning of my program. I want to > change MUMPS OOC_TMPDIR dynamically on the fly. > > Thanks, > Sam > > On Mon, Feb 28, 2022 at 1:34 PM Barry Smith wrote: > >> >> You can call PetscOptionsSetValue("-mat_mumps_ooc_tmpdir","directory"); >> anytime before the KSP/PCSetUp() is called. If you have multiple uses of >> MUMPs solvers you can call this between each use so that a different >> directory is used for different MUMPS usage. >> >> Barry >> >> >> >> On Feb 28, 2022, at 2:22 PM, Sam Guo wrote: >> >> Dear PETSc dev team, >> I would like to set MUMPS OOC_TMPDIR programmatically on fly (instead >> of statically using PetscOptionsString >> >> ("-mat_mumps_ooc_tmpdir", "out of core directory", "None", >> mumps->id.ooc_tmpdir, mumps->id.ooc_tmpdir, sizeof(mumps->id.ooc_tmpdir), >> NULL);). It seems there is such an API, am I correct? >> >> Thanks, >> Sam >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Mon Feb 28 22:30:29 2022 From: jed at jedbrown.org (Jed Brown) Date: Mon, 28 Feb 2022 21:30:29 -0700 Subject: [petsc-users] Preconditioner for LSQR In-Reply-To: References: Message-ID: <87lexu5m6y.fsf@jedbrown.org> This is a small problem for which direct Householder QR may be fast enough (depending on the rest of your application). For multi-node, you can use TSQR (backward stable like Householder) or Cholesky (unstable). julia> A = rand(200000, 200); julia> @time Q, R = qr(A); 0.866989 seconds (14 allocations: 305.591 MiB) julia> @time begin C = A' * A; cholesky(C); end; 0.300977 seconds (8 allocations: 625.250 KiB) Lucas Banting writes: > Hello, > > I have an MPIDENSE matrix of size about 200,000 x 200, using KSPLSQR on my machine a solution takes about 15 s. I typically run with six to eight processors. > I have to solve the system several times, typically 4-30, and was looking for recommendations on reusable preconditioners to use with KSPLSQR to increase speed. > > Would it make the most sense to use PCCHOLESKY on the smaller system A^T * A? > > Thanks, > Lucas
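One clarifying note on the "(unstable)" qualifier above: forming the Gram matrix squares the condition number,

    \kappa_2(A^T A) = \kappa_2(A)^2,

so the attainable accuracy of the Cholesky/normal-equations route is governed by \kappa_2(A)^2, whereas Householder QR and TSQR are backward stable. With only 200 columns the 200 x 200 Gram matrix is cheap to form and factor, so the Cholesky route is attractive when A is well conditioned; QR is the safer default.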