From bastian.loehrer at tu-dresden.de Thu Aug 1 06:07:12 2019 From: bastian.loehrer at tu-dresden.de (Bastian Löhrer) Date: Thu, 1 Aug 2019 13:07:12 +0200 Subject: [petsc-users] When building PETSc with --prefix, reference to temporary build directory remains Message-ID: <3c2f7063-330e-67a9-c5fe-db6694a6f153@tu-dresden.de> Dear all, I'm struggling to compile PETSc 3.3-p6 on a cluster where it is to be provided in a read-only folder. My scenario is the following: PETSc shall end up in a folder into which I can write from a login node but which is read-only on compute nodes: I'll call it /readonly/ below. So, using a compute node, I need to compile PETSc in a different location, which I'll call /temporary/ I have read numerous instructions on the web and here are the steps that I came up with: 1. on a compute node: unpack the PETSc source to /temporary/ and navigate there. 2. configure: ./configure \ --prefix=/readonly/ \ --with-gnu-compilers=0 \ --with-vendor-compilers=intel \ --with-large-file-io=1 \ --CFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ --CXXFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi -lmpicxx" \ --FFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ --LDFLAGS="-L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ COPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ CXXOPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ FOPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ --with-blas-lapack-dir="${MKLROOT}/lib/intel64" \ --download-hypre \ --with-debugging=no 3. make all 4. on a login node: make install 5. From now on set PETSC_DIR=/readonly PETSC_ARCH='' step 4 moves the compiled PETSc to /readonly/ and it works, but when I compile a program with it the following line pops up in the linking command: -Wl,-rpath,/temporary/-Xlinker This is a problem when the drive on which /temporary/ is placed is not reachable which is the case right now due to technical issues. This causes the linking process to get stuck. The folder /temporary/ is to be deleted anyway so I do not see why it should be referenced here. Am I missing something? - Bastian -------------- next part -------------- An HTML attachment was scrubbed... URL: From yang.bo at ntu.edu.sg Thu Aug 1 08:58:37 2019 From: yang.bo at ntu.edu.sg (Yang Bo (Asst Prof)) Date: Thu, 1 Aug 2019 13:58:37 +0000 Subject: [petsc-users] ST solver error Message-ID: <77DC4A95-EDD3-4F28-83C5-A5FB910F6B2C@ntu.edu.sg> Hi everyone, I am trying to use the Shift-and-invert spectral transformations for my diagonalisation code. While there is no problem running the code with STSetType(st,STSHIFT); I receive the following errors when using STSetType(st,STSINVERT): [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for possible LU and Cholesky solvers [0]PETSC ERROR: Could not locate a solver package. Perhaps you must ./configure with --download- [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.10.2, Oct, 09, 2018 [0]PETSC ERROR: ./main on a arch-linux2-c-debug named yangbo-ThinkStation-P720 by yangbo Thu Aug 1 17:30:34 2019 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" [0]PETSC ERROR: #1 MatGetFactor() line 4415 in /home/yangbo/petsc-3.10.2/src/mat/interface/matrix.c [0]PETSC ERROR: #2 PCSetUp_LU() line 93 in /home/yangbo/petsc-3.10.2/src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: #3 PCSetUp() line 932 in /home/yangbo/petsc-3.10.2/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #4 KSPSetUp() line 391 in /home/yangbo/petsc-3.10.2/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #5 STSetUp_Cayley() line 192 in /home/yangbo/slepc-3.10.1/src/sys/classes/st/impls/cayley/cayley.c [0]PETSC ERROR: #6 STSetUp() line 271 in /home/yangbo/slepc-3.10.1/src/sys/classes/st/interface/stsolve.c [0]PETSC ERROR: #7 EPSSetUp() line 263 in /home/yangbo/slepc-3.10.1/src/eps/interface/epssetup.c [0]PETSC ERROR: #8 EPSSolve() line 135 in /home/yangbo/slepc-3.10.1/src/eps/interface/epssolve.c [0]PETSC ERROR: #9 main() line 331 in main.cpp [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -e 7 [0]PETSC ERROR: -f d [0]PETSC ERROR: -nev 20 [0]PETSC ERROR: -o 19 [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 92) - process 0 May I know if I need to install the solver package for this to work? Thank you very much! Best regards, Yang Bo -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Aug 1 09:01:39 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 1 Aug 2019 16:01:39 +0200 Subject: [petsc-users] ST solver error In-Reply-To: <77DC4A95-EDD3-4F28-83C5-A5FB910F6B2C@ntu.edu.sg> References: <77DC4A95-EDD3-4F28-83C5-A5FB910F6B2C@ntu.edu.sg> Message-ID: See FAQ #10 http://slepc.upv.es/documentation/faq.htm > El 1 ago 2019, a las 15:58, Yang Bo (Asst Prof) via petsc-users escribi?: > > Hi everyone, > > I am trying to use the Shift-and-invert spectral transformations for my diagonalisation code. While there is no problem running the code with > > STSetType(st,STSHIFT); > > I receive the following errors when using STSetType(st,STSINVERT): > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for possible LU and Cholesky solvers > [0]PETSC ERROR: Could not locate a solver package. Perhaps you must ./configure with --download- > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.10.2, Oct, 09, 2018 > [0]PETSC ERROR: ./main on a arch-linux2-c-debug named yangbo-ThinkStation-P720 by yangbo Thu Aug 1 17:30:34 2019 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" > [0]PETSC ERROR: #1 MatGetFactor() line 4415 in /home/yangbo/petsc-3.10.2/src/mat/interface/matrix.c > [0]PETSC ERROR: #2 PCSetUp_LU() line 93 in /home/yangbo/petsc-3.10.2/src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: #3 PCSetUp() line 932 in /home/yangbo/petsc-3.10.2/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #4 KSPSetUp() line 391 in /home/yangbo/petsc-3.10.2/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #5 STSetUp_Cayley() line 192 in /home/yangbo/slepc-3.10.1/src/sys/classes/st/impls/cayley/cayley.c > [0]PETSC ERROR: #6 STSetUp() line 271 in /home/yangbo/slepc-3.10.1/src/sys/classes/st/interface/stsolve.c > [0]PETSC ERROR: #7 EPSSetUp() line 263 in /home/yangbo/slepc-3.10.1/src/eps/interface/epssetup.c > [0]PETSC ERROR: #8 EPSSolve() line 135 in /home/yangbo/slepc-3.10.1/src/eps/interface/epssolve.c > [0]PETSC ERROR: #9 main() line 331 in main.cpp > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -e 7 > [0]PETSC ERROR: -f d > [0]PETSC ERROR: -nev 20 > [0]PETSC ERROR: -o 19 > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_WORLD, 92) - process 0 > > > May I know if I need to install the solver package for this to work? > > Thank you very much! > > Best regards, > > Yang Bo From bsmith at mcs.anl.gov Thu Aug 1 09:35:43 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 1 Aug 2019 14:35:43 +0000 Subject: [petsc-users] When building PETSc with --prefix, reference to temporary build directory remains In-Reply-To: <3c2f7063-330e-67a9-c5fe-db6694a6f153@tu-dresden.de> References: <3c2f7063-330e-67a9-c5fe-db6694a6f153@tu-dresden.de> Message-ID: Please consider upgrading to the latest PETSc 3.11 it has many new features and fewer bugs etc. > 5. From now on set PETSC_DIR=/readonly PETSC_ARCH='' > step 4 moves the compiled PETSc to /readonly/ and it works, but when I compile a program with it the following line pops up in the linking command: > -Wl,-rpath,/temporary/-Xlinker How are you compiling the program? If you are using the PETSc make facilities the easiest fix would be to remove this offend -Wl,-rpath,/temporary/-Xlinker which isn't needed. It is likely in /readonly/lib/petsc/conf/petscvariables I may have gotten the file or location wrong. You can use find in /readonly to locate any use of -Wl,-rpath,/temporary/-Xlinker and remove them. Barry > On Aug 1, 2019, at 6:07 AM, Bastian L?hrer via petsc-users wrote: > > Dear all, > > I'm struggling to compile PETSc 3.3-p6 on a cluster where it is to be provided in a read-only folder. > > My scenario is the following: > PETSc shall end up in a folder into which I can write from a login node but which is read-only on compute nodes: I'll call it /readonly/ below. > So, using a compute node, I need to compile PETSc in a different location, which I'll call /temporary/ > I have read numerous instructions on the web and here are the steps that I came up with: > > 1. on a compute node: unpack the PETSc source to /temporary/ and navigate there. > 2. 
configure: > ./configure \ > --prefix=/readonly/ \ > --with-gnu-compilers=0 \ > --with-vendor-compilers=intel \ > --with-large-file-io=1 \ > --CFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ > --CXXFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi -lmpicxx" \ > --FFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ > --LDFLAGS="-L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ > COPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ > CXXOPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ > FOPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ > --with-blas-lapack-dir="${MKLROOT}/lib/intel64" \ > --download-hypre \ > --with-debugging=no > > 3. make all > 4. on a login node: make install > 5. From now on set PETSC_DIR=/readonly PETSC_ARCH='' > step 4 moves the compiled PETSc to /readonly/ and it works, but when I compile a program with it the following line pops up in the linking command: > -Wl,-rpath,/temporary/-Xlinker > > This is a problem when the drive on which /temporary/ is placed is not reachable which is the case right now due to technical issues. This causes the linking process to get stuck. > The folder /temporary/ is to be deleted anyway so I do not see why it should be referenced here. > > Am I missing something? > > - Bastian From balay at mcs.anl.gov Thu Aug 1 10:10:49 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 1 Aug 2019 15:10:49 +0000 Subject: [petsc-users] When building PETSc with --prefix, reference to temporary build directory remains In-Reply-To: References: <3c2f7063-330e-67a9-c5fe-db6694a6f153@tu-dresden.de> Message-ID: On Thu, 1 Aug 2019, Smith, Barry F. via petsc-users wrote: > > Please consider upgrading to the latest PETSc 3.11 it has many new features and fewer bugs etc. > > > 5. From now on set PETSC_DIR=/readonly PETSC_ARCH='' > > step 4 moves the compiled PETSc to /readonly/ and it works, but when I compile a program with it the following line pops up in the linking command: > > -Wl,-rpath,/temporary/-Xlinker > > > How are you compiling the program? > > If you are using the PETSc make facilities the easiest fix would be to remove this offend -Wl,-rpath,/temporary/-Xlinker which isn't needed. It is likely in /readonly/lib/petsc/conf/petscvariables > > I may have gotten the file or location wrong. You can use find in /readonly to locate any use of -Wl,-rpath,/temporary/-Xlinker and remove them. Actually its best to replace -Wl,-rpath,/temporary/-Xlinker to -Wl,-rpath,/readonly/-Xlinker or appropriate value. Likely 'prefix' install code for this version of petsc is buggy. Alternative is to do inplace install (i.e do not use --prefix - but have sources and build) in /readonly Satish > > Barry > > > > > > On Aug 1, 2019, at 6:07 AM, Bastian L?hrer via petsc-users wrote: > > > > Dear all, > > > > I'm struggling to compile PETSc 3.3-p6 on a cluster where it is to be provided in a read-only folder. > > > > My scenario is the following: > > PETSc shall end up in a folder into which I can write from a login node but which is read-only on compute nodes: I'll call it /readonly/ below. > > So, using a compute node, I need to compile PETSc in a different location, which I'll call /temporary/ > > I have read numerous instructions on the web and here are the steps that I came up with: > > > > 1. 
on a compute node: unpack the PETSc source to /temporary/ and navigate there. > > 2. configure: > > ./configure \ > > --prefix=/readonly/ \ > > --with-gnu-compilers=0 \ > > --with-vendor-compilers=intel \ > > --with-large-file-io=1 \ > > --CFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ > > --CXXFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi -lmpicxx" \ > > --FFLAGS="-fPIC -L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ > > --LDFLAGS="-L${I_MPI_ROOT}/intel64/lib -I${I_MPI_ROOT}/intel64/include -lmpi" \ > > COPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ > > CXXOPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ > > FOPTFLAGS="-O3 -axCORE-AVX2 -xSSE4.2 -fp-model consistent -fp-model source -fp-speculation=safe -ftz" \ > > --with-blas-lapack-dir="${MKLROOT}/lib/intel64" \ > > --download-hypre \ > > --with-debugging=no > > > > 3. make all > > 4. on a login node: make install > > 5. From now on set PETSC_DIR=/readonly PETSC_ARCH='' > > step 4 moves the compiled PETSc to /readonly/ and it works, but when I compile a program with it the following line pops up in the linking command: > > -Wl,-rpath,/temporary/-Xlinker > > > > This is a problem when the drive on which /temporary/ is placed is not reachable which is the case right now due to technical issues. This causes the linking process to get stuck. > > The folder /temporary/ is to be deleted anyway so I do not see why it should be referenced here. > > > > Am I missing something? > > > > - Bastian > > From d_mckinnell at aol.co.uk Thu Aug 1 10:59:26 2019 From: d_mckinnell at aol.co.uk (Daniel Mckinnell) Date: Thu, 1 Aug 2019 15:59:26 +0000 (UTC) Subject: [petsc-users] Refine DMPlex with a Refinement Function References: <546286456.174214.1564675166914.ref@mail.yahoo.com> Message-ID: <546286456.174214.1564675166914@mail.yahoo.com> Hi, I have been having some trouble trying to refine a DMPlex object using a Refinement Function. I have been working with reference to the code discussed here: https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2019-April/038341.html including the code from the GitHub directory mentioned. The error I have been getting is as follows: [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: No grid refiner of dimension 2 registered [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.11.2, unknown [0]PETSC ERROR: ./fpf on a arch-linux2-c-debug named cromars by daniel Thu Aug? 1 15:40:18 2019 [0]PETSC ERROR: Configure options ??download?f2cblaslapack=yes ??with?debugging=1 ??download?metis=yes ??download?parmetis=yes ??with?fortran?bindings=0 ??with?python=0 -download-openmpi=1 ??with?c?support ??with?clanguage=cxx [0]PETSC ERROR: #1 DMPlexRefine_Internal() line 215 in /home/daniel/petsc/src/dm/impls/plex/plexadapt.c [0]PETSC ERROR: #2 DMRefine_Plex() line 10381 in /home/daniel/petsc/src/dm/impls/plex/plexrefine.c [0]PETSC ERROR: #3 DMRefine() line 1881 in /home/daniel/petsc/src/dm/interface/dm.c Any help would be greatly appreciated. Thanks, Daniel -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: for_petsc_forum.cpp Type: text/x-c++src Size: 1691 bytes Desc: not available URL: From wence at gmx.li Thu Aug 1 11:31:34 2019 From: wence at gmx.li (Lawrence Mitchell) Date: Thu, 1 Aug 2019 17:31:34 +0100 Subject: Re: [petsc-users] Refine DMPlex with a Refinement Function In-Reply-To: <546286456.174214.1564675166914@mail.yahoo.com> References: <546286456.174214.1564675166914.ref@mail.yahoo.com> <546286456.174214.1564675166914@mail.yahoo.com> Message-ID: On Thu, 1 Aug 2019 at 16:59, Daniel Mckinnell via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I have been having some trouble trying to refine a DMPlex object using a > Refinement Function. I have been working with reference to the code > discussed here: > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2019-April/038341.html > including the code from the GitHub directory mentioned. The error I have > been getting is as follows: > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: No grid refiner of dimension 2 registered > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.11.2, unknown > [0]PETSC ERROR: ./fpf on a arch-linux2-c-debug named cromars by daniel Thu > Aug 1 15:40:18 2019 > [0]PETSC ERROR: Configure options --download-f2cblaslapack=yes > --with-debugging=1 --download-metis=yes --download-parmetis=yes > --with-fortran-bindings=0 --with-python=0 -download-openmpi=1 > --with-c-support --with-clanguage=cxx > [0]PETSC ERROR: #1 DMPlexRefine_Internal() line 215 in > /home/daniel/petsc/src/dm/impls/plex/plexadapt.c > [0]PETSC ERROR: #2 DMRefine_Plex() line 10381 in > /home/daniel/petsc/src/dm/impls/plex/plexrefine.c > [0]PETSC ERROR: #3 DMRefine() line 1881 in > /home/daniel/petsc/src/dm/interface/dm > I suspect you need to configure with --download-triangle? When you mark cells for refinement like this, petsc uses an external package to do the adaptation. I'm not sure what options are available as packages, but triangle I think is one. Cheers, Lawrence > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Aug 1 13:15:19 2019 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Thu, 1 Aug 2019 18:15:19 +0000 Subject: [petsc-users] Solving a sequence of linear systems stored on disk with MUMPS In-Reply-To: <087f6c3b-9231-bb90-462c-7ceab317b4ac@imperial.ac.uk> References: <58C382E5-26BB-4960-83A9-F82CE372E6AD@anl.gov> <45999109-e951-50c8-ed25-8814838ba58d@imperial.ac.uk> <087f6c3b-9231-bb90-462c-7ceab317b4ac@imperial.ac.uk> Message-ID: Thibaut : In the branch hzhang/add-ksp-tutorials-ex6/master, I added another example src/mat/examples/tests/ex28.c which creates A[k], k=0,...,4 with same data structure. Using a single symbolic factor F, it runs a loop with updated numerical values on A[k] and solve. Hong Hi Hong, Thanks very much for that example, I appreciate it. I'll test that in a few days and come back with questions if needed, Thibaut On 25/07/2019 21:25, hong at aspiritech.org wrote: Thibaut: I added an example (in the branch hzhang/add-ksp-tutorials-ex6/master) https://bitbucket.org/petsc/petsc/commits/cf847786fd804b3606d0281d404c4763f36fe475?at=hzhang/add-ksp-tutorials-ex6/master You can run it with mpiexec -n 2 ./ex6 -num_numfac 2 -pc_type lu -pc_factor_mat_solver_type mumps -ksp_monitor -log_view ... MatLUFactorSym 1 1.0 3.5911e-03 MatLUFactorNum 2 1.0 6.3920e-03 This shows the code does one symbolic factorization and two numeric factorizations.
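In outline, the pattern is the following (a minimal sketch only, not the actual ex6.c or ex28.c from the branch; the helper name SolveSequence and its arguments are invented for illustration, and it assumes each A_m reuses one assembled Mat A whose numerical values are overwritten in place between solves):

#include <petscmat.h>

PetscErrorCode SolveSequence(Mat A, Vec b, Vec x, PetscInt nmat)
{
  Mat            F;             /* holds both the symbolic and the numeric LU factor */
  MatFactorInfo  info;
  IS             rperm, cperm;  /* ordering used by the factorization */
  PetscInt       m;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatGetFactor(A, MATSOLVERMUMPS, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A, MATORDERINGNATURAL, &rperm, &cperm);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, rperm, cperm, &info);CHKERRQ(ierr);   /* done once */
  for (m = 0; m < nmat; ++m) {
    /* ... update the numerical values of A here, keeping the same nonzero
       pattern, e.g. MatSetValues() followed by MatAssemblyBegin/End() ... */
    ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);                /* repeated */
    ierr = MatSolve(F, b, x);CHKERRQ(ierr);
  }
  ierr = ISDestroy(&rperm);CHKERRQ(ierr);
  ierr = ISDestroy(&cperm);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

ex6.c exercises the same pattern through KSP/PC (-pc_type lu -pc_factor_mat_solver_type mumps), which is presumably why the -log_view output above reports one MatLUFactorSym and two MatLUFactorNum calls.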
For your convenience, the code ex6.c is attached below. Let me know if you have encounter any problems. Hong On Thu, Jul 25, 2019 at 1:30 PM Zhang, Hong via petsc-users > wrote: Thibaut: I'm writing a simple example using KSP directly -- will send to you soon. Hong Hi Hong, That sounds like a more reasonable approach, I had no idea the PETSc/MUMPS interface could provide such a level of control on the solve process. Therefore, after assembling the matrix A_0, I should do something along the lines of: MatGetFactor(A_0, MATSOLVERMUMPS, MAT_FACTOR_LU, F) MatLUFactorSymbolic(F, A_0, NULL, NULL, info) MatLUFactorNumeric(F, A_0, info) and then call MatSolve? However I don't understand, I thought F would remain the same during the whole process but it's an input parameter of MatSolve so I'd need one F_m for each A_m? Which is not what you mentioned (do one symbolic factorization only) On a side note, after preallocating and assembling the first matrix, should I create/assemble all the others with MatDuplicate(A_0, MAT_DO_NOT_COPY_VALUES, A_m) Calls to MatSetValues( ... ) MatAssemblyBegin(A_m, MAT_FINAL_ASSEMBLY) MatAssemblyEnd(A_m, MAT_FINAL_ASSEMBLY) Is that the recommended/most scalable way of duplicating a matrix + its non-zero structure? Thank you for your support and suggestions, Thibaut On 23/07/2019 18:38, Zhang, Hong wrote: Thibaut: Thanks for taking the time. I would typically run that on a small cluster node of 16 or 32 physical cores with 2 or 4 sockets. I use 16 or 32 MPI ranks and bind them to cores. The matrices would ALL have the same size and the same nonzero structure - it's just a few numerical values that would differ. You may do one symbolic factorization of A_m, use it in the m-i loop: - numeric factorization of A_m - solve A_m x_m,i = b_m,i in mumps, numeric factorization and solve are scalable. Repeated numeric factorization of A_m are likely faster than reading data files from the disc. Hong This is a good point you've raised as I don't think MUMPS is able to exploit that - I asked the question in their users list just to be sure. There are some options in SuperLU dist to reuse permutation arrays, but there's no I/O for that solver. And the native PETSc LU solver is not parallel? I'm using high-order finite differences so I'm suffering from a lot of fill-in, one of the reasons why storing factorizations in RAM is not viable. In comparison, I have almost unlimited disk space. I'm aware my need might seem counter-intuitive, but I'm really willing to sacrifice performance in the I/O part. My code is already heavily based on PETSc (preallocation, assembly for matrices/vectors) coupled with MUMPS I'm minimizing the loss of efficiency. Thibaut On 23/07/2019 17:13, Smith, Barry F. wrote: > What types of computing systems will you be doing the computations? Roughly how many MPI_ranks? > > Are the matrices all the same size? Do they have the same or different nonzero structures? Would it be possible to use the same symbolic representation for all of them and just have different numerical values? > > Clusters and large scale computing centers are notoriously terrible at IO; often IO is orders of magnitude slower than compute/memory making this type of workflow unrealistically slow. From a cost analysis point of view often just buying lots of memory might be the most efficacious approach. > > That said, what you suggest might require only a few lines of code (though determining where to put them is the tricky part) depending on the MUMPS interface for saving a filer to disk. 
What we would do is keep the PETSc wrapper that lives around the MUMPS matrix Mat_MUMPS but using the MUMPS API save the information in the DMUMPS_STRUC_C id; and then reload it when needed. > > The user level API could be something like > > MatMumpsSaveToDisk(Mat) and MatMumpsLoadFromDisk(Mat) they would just money with DMUMPS_STRUC_C id; item. > > > Barry > > >> On Jul 23, 2019, at 9:24 AM, Thibaut Appel via petsc-users > wrote: >> >> Dear PETSc users, >> >> I need to solve several linear systems successively, with LU factorization, as part of an iterative process in my Fortran application code. >> >> The process would solve M systems (A_m)(x_m,i) = (b_m,i) for m=1,M at each iteration i, but computing the LU factorization of A_m only once. >> The RHSs (b_m,i+1) are computed from all the different (x_m,i) and all depend upon each other. >> >> The way I envisage to perform that is to use MUMPS to compute, successively, each of the LU factorizations (m) in parallel and store the factors on disk, creating/assembling/destroying the matrices A_m on the go. >> Then whenever needed, read the factors in parallel to solve the systems. Since version 5.2, MUMPS has a save/restore feature that allows that, see http://mumps.enseeiht.fr/doc/userguide_5.2.1.pdf p.20, 24 and 58. >> >> In its current state, the PETSc/MUMPS interface does not incorporate that feature. I'm an advanced Fortran programmer but not in C so I don't think I would do an amazing job having a go inside src/mat/impls/aij/mpi/mumps/mumps.c. >> >> I was picturing something like creating as many KSP objects as linear systems to be solved, with some sort of flag to force the storage of LU factors on disk after the first call to KSPSolve. Then keep calling KSPSolve as many times as needed. >> >> Would you support such a feature? >> >> Thanks for your support, >> >> Thibaut -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_mckinnell at aol.co.uk Fri Aug 2 08:10:42 2019 From: d_mckinnell at aol.co.uk (Daniel Mckinnell) Date: Fri, 2 Aug 2019 13:10:42 +0000 (UTC) Subject: [petsc-users] Refine DMPlex with a Refinement Function In-Reply-To: References: <546286456.174214.1564675166914.ref@mail.yahoo.com> <546286456.174214.1564675166914@mail.yahoo.com> Message-ID: <1891944408.421506.1564751442771@mail.yahoo.com> Thank you a lot, that works well for Simplex cells when running on one processor. Do you know anything that works for Tensor cells and/or when running in parallel?Thanks again for all of your help.Daniel -----Original Message----- From: Lawrence Mitchell To: Daniel Mckinnell CC: petsc-users Sent: Thu, 1 Aug 2019 17:36 Subject: Re: [petsc-users] Refine DMPlex with a Refinement Function On Thu, 1 Aug 2019 at 16:59, Daniel Mckinnell via petsc-users wrote: Hi, I have been having some trouble trying to refine a DMPlex object using a Refinement Function. I have been working with reference to the code discussed here: https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2019-April/038341.html including the code from the GitHub directory mentioned. The error I have been getting is as follows: [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: No grid refiner of dimension 2 registered [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.11.2, unknown [0]PETSC ERROR: ./fpf on a arch-linux2-c-debug named cromars by daniel Thu Aug? 
1 15:40:18 2019 [0]PETSC ERROR: Configure options ??download?f2cblaslapack=yes ??with?debugging=1 ??download?metis=yes ??download?parmetis=yes ??with?fortran?bindings=0 ??with?python=0 -download-openmpi=1 ??with?c?support ??with?clanguage=cxx [0]PETSC ERROR: #1 DMPlexRefine_Internal() line 215 in /home/daniel/petsc/src/dm/impls/plex/plexadapt.c [0]PETSC ERROR: #2 DMRefine_Plex() line 10381 in /home/daniel/petsc/src/dm/impls/plex/plexrefine.c [0]PETSC ERROR: #3 DMRefine() line 1881 in /home/daniel/petsc/src/dm/interface/dm I suspect you need to configure with --download-triangle? When you mark cells for refinement like this, petsc uses and external package to do the adaptation.? I'm not sure what options are available as packages, but triangle I think is one.? Cheers, Lawrence -------------- next part -------------- An HTML attachment was scrubbed... URL: From t.appel17 at imperial.ac.uk Fri Aug 2 08:17:56 2019 From: t.appel17 at imperial.ac.uk (Thibaut Appel) Date: Fri, 2 Aug 2019 14:17:56 +0100 Subject: [petsc-users] Solving a sequence of linear systems stored on disk with MUMPS In-Reply-To: References: <58C382E5-26BB-4960-83A9-F82CE372E6AD@anl.gov> <45999109-e951-50c8-ed25-8814838ba58d@imperial.ac.uk> <087f6c3b-9231-bb90-462c-7ceab317b4ac@imperial.ac.uk> Message-ID: <13317120-3782-d4b4-2bab-00b6a314d2e1@imperial.ac.uk> Hi Hong, That's exactly what I was looking for that's perfect. This will be of significant help. Have a nice weekend, Thibaut On 01/08/2019 19:15, Zhang, Hong wrote: > Thibaut : > In the branch hzhang/add-ksp-tutorials-ex6/master, I added another example > src/mat/examples/tests/ex28.c > which creates A[k], k=0,...,4 with same data structure. > Using a single symbolic factor F, it runs a loop with updated > numerical values on A[k] and solve. > Hong > > Hi Hong, > > Thanks very much for that example I appreciate it. > > I'll test that in a few days and come back with questions if needed, > > > Thibaut > > On 25/07/2019 21:25, hong at aspiritech.org > wrote: >> Thibaut: >> I added an example (in the >> branch?hzhang/add-ksp-tutorials-ex6/master) >> https://bitbucket.org/petsc/petsc/commits/cf847786fd804b3606d0281d404c4763f36fe475?at=hzhang/add-ksp-tutorials-ex6/master >> >> You can run it with >> mpiexec -n 2 ./ex6 -num_numfac 2 -pc_type lu >> -pc_factor_mat_solver_type mumps -ksp_monitor -log_view >> ... >> MatLUFactorSym ? ? ? ? 1 1.0 3.5911e-03 >> MatLUFactorNum ? ? ? ? 2 1.0 6.3920e-03 >> >> This shows the code does one symbolic factorization and two >> numeric factorizations. >> For your convenience, the code ex6.c is attached below. >> Let?me know if you have encounter any problems. >> Hong >> >> >> On Thu, Jul 25, 2019 at 1:30 PM Zhang, Hong via petsc-users >> > wrote: >> >> Thibaut: >> I'm writing a simple example using KSP directly -- will send >> to you soon. >> Hong >> >> Hi Hong, >> >> That sounds like a more reasonable approach, I had no >> idea the PETSc/MUMPS interface could provide such a level >> of control on the solve process. Therefore, after >> assembling the matrix A_0, I should do something along >> the lines of: >> >> MatGetFactor(A_0, MATSOLVERMUMPS, MAT_FACTOR_LU, F) >> >> MatLUFactorSymbolic(F, A_0, NULL, NULL, info) >> >> MatLUFactorNumeric(F, A_0, info) >> >> and then call MatSolve? However I don't understand, I >> thought F would remain the same during the whole process >> but it's an input parameter of MatSolve so I'd need one >> F_m for each A_m? 
Which is not what you mentioned (do one >> symbolic factorization only) >> >> >> On a side note, after preallocating and assembling the >> first matrix, should I create/assemble all the others with >> >> MatDuplicate(A_0, MAT_DO_NOT_COPY_VALUES, A_m) >> >> Calls to MatSetValues( ... ) >> >> MatAssemblyBegin(A_m, MAT_FINAL_ASSEMBLY) >> MatAssemblyEnd(A_m, MAT_FINAL_ASSEMBLY) >> >> Is that the recommended/most scalable way of duplicating >> a matrix + its non-zero structure? >> >> >> Thank you for your support and suggestions, >> >> Thibaut >> >> >> On 23/07/2019 18:38, Zhang, Hong wrote: >>> Thibaut: >>> >>> Thanks for taking the time. I would typically run >>> that on a small >>> cluster node of 16 or 32 physical cores with 2 or 4 >>> sockets. I use 16 or >>> 32 MPI ranks and bind them to cores. >>> >>> The matrices would ALL have the same size and the >>> same nonzero structure >>> - it's just a few numerical values that would differ. >>> >>> You may do one symbolic factorization of A_m, use it in >>> the m-i loop: >>> - numeric factorization of A_m >>> - solve A_m x_m,i = b_m,i >>> in mumps, numeric factorization and solve are scalable. >>> Repeated numeric factorization of A_m are likely faster >>> than reading data files from the disc. >>> Hong >>> >>> >>> This is a good point you've raised as I don't think >>> MUMPS is able to >>> exploit that - I asked the question in their users >>> list just to be sure. >>> There are some options in SuperLU dist to reuse >>> permutation arrays, but >>> there's no I/O for that solver. And the native PETSc >>> LU solver is not >>> parallel? >>> >>> I'm using high-order finite differences so I'm >>> suffering from a lot of >>> fill-in, one of the reasons why storing >>> factorizations in RAM is not >>> viable. In comparison, I have almost unlimited disk >>> space. >>> >>> I'm aware my need might seem counter-intuitive, but >>> I'm really willing >>> to sacrifice performance in the I/O part. My code is >>> already heavily >>> based on PETSc (preallocation, assembly for >>> matrices/vectors) coupled >>> with MUMPS I'm minimizing the loss of efficiency. >>> >>> Thibaut >>> >>> On 23/07/2019 17:13, Smith, Barry F. wrote: >>> >? ? What types of computing systems will you be >>> doing the computations? Roughly how many MPI_ranks? >>> > >>> > Are the matrices all the same size? Do they have >>> the same or different nonzero structures? Would it >>> be possible to use the same symbolic representation >>> for all of them and just have different numerical >>> values? >>> > >>> >? ? Clusters and large scale computing centers are >>> notoriously terrible at IO; often IO is orders of >>> magnitude slower than compute/memory making this >>> type of workflow unrealistically slow. From a cost >>> analysis point of view often just buying lots of >>> memory might be the most? efficacious approach. >>> > >>> >? ? That said, what you suggest might require only >>> a few lines of code (though determining where to put >>> them is the tricky part) depending on the MUMPS >>> interface for saving a filer to disk. What we would >>> do is keep the PETSc wrapper that lives around the >>> MUMPS matrix Mat_MUMPS but using the MUMPS API save >>> the information in the DMUMPS_STRUC_C id; and then >>> reload it when needed. >>> > >>> >? ? The user level API could be something like >>> > >>> >? ? MatMumpsSaveToDisk(Mat) and >>> MatMumpsLoadFromDisk(Mat) they would just money with >>> DMUMPS_STRUC_C id; item. >>> > >>> > >>> >? ? 
Barry >>> > >>> > >>> >> On Jul 23, 2019, at 9:24 AM, Thibaut Appel via >>> petsc-users >> > wrote: >>> >> >>> >> Dear PETSc users, >>> >> >>> >> I need to solve several linear systems >>> successively, with LU factorization, as part of an >>> iterative process in my Fortran application code. >>> >> >>> >> The process would solve M systems (A_m)(x_m,i) = >>> (b_m,i) for m=1,M at each iteration i, but computing >>> the LU factorization of A_m only once. >>> >> The RHSs (b_m,i+1) are computed from all the >>> different (x_m,i) and all depend upon each other. >>> >> >>> >> The way I envisage to perform that is to use >>> MUMPS to compute, successively, each of the LU >>> factorizations (m) in parallel and store the factors >>> on disk, creating/assembling/destroying the matrices >>> A_m on the go. >>> >> Then whenever needed, read the factors in >>> parallel to solve the systems. Since version 5.2, >>> MUMPS has a save/restore feature that allows that, >>> see http://mumps.enseeiht.fr/doc/userguide_5.2.1.pdf >>> p.20, 24 and 58. >>> >> >>> >> In its current state, the PETSc/MUMPS interface >>> does not incorporate that feature. I'm an advanced >>> Fortran programmer but not in C so I don't think I >>> would do an amazing job having a go inside >>> src/mat/impls/aij/mpi/mumps/mumps.c. >>> >> >>> >> I was picturing something like creating as many >>> KSP objects as linear systems to be solved, with >>> some sort of flag to force the storage of LU factors >>> on disk after the first call to KSPSolve. Then keep >>> calling KSPSolve as many times as needed. >>> >> >>> >> Would you support such a feature? >>> >> >>> >> Thanks for your support, >>> >> >>> >> Thibaut >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 2 21:24:15 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 2 Aug 2019 22:24:15 -0400 Subject: [petsc-users] Refine DMPlex with a Refinement Function In-Reply-To: <1891944408.421506.1564751442771@mail.yahoo.com> References: <546286456.174214.1564675166914.ref@mail.yahoo.com> <546286456.174214.1564675166914@mail.yahoo.com> <1891944408.421506.1564751442771@mail.yahoo.com> Message-ID: On Fri, Aug 2, 2019 at 9:10 AM Daniel Mckinnell via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thank you a lot, that works well for Simplex cells when running on one > processor. Do you know anything that works for Tensor cells and/or when > running in parallel? > For simplex in parallel, we use Pragmatic, and for tensor cells in parallel we use p4est. Both of them operate through the very new DMAdapt() interface. We have some initial experience, but that part of PETSc is still unstable. Note that you can regularly refine simplices or hexes in parallel using just DMRefine(). Thanks, Matt > Thanks again for all of your help. > Daniel > > > -----Original Message----- > From: Lawrence Mitchell > To: Daniel Mckinnell > CC: petsc-users > Sent: Thu, 1 Aug 2019 17:36 > Subject: Re: [petsc-users] Refine DMPlex with a Refinement Function > > > > On Thu, 1 Aug 2019 at 16:59, Daniel Mckinnell via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi, > > I have been having some trouble trying to refine a DMPlex object using a > Refinement Function. I have been working with reference to the code > discussed here: > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2019-April/038341.html > including the code from the GitHub directory mentioned. 
The error I have > been getting is as follows: > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: No grid refiner of dimension 2 registered > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.11.2, unknown > [0]PETSC ERROR: ./fpf on a arch-linux2-c-debug named cromars by daniel Thu > Aug 1 15:40:18 2019 > [0]PETSC ERROR: Configure options --download-f2cblaslapack=yes > --with-debugging=1 --download-metis=yes --download-parmetis=yes > --with-fortran-bindings=0 --with-python=0 -download-openmpi=1 > --with-c-support --with-clanguage=cxx > [0]PETSC ERROR: #1 DMPlexRefine_Internal() line 215 in > /home/daniel/petsc/src/dm/impls/plex/plexadapt.c > [0]PETSC ERROR: #2 DMRefine_Plex() line 10381 in > /home/daniel/petsc/src/dm/impls/plex/plexrefine.c > [0]PETSC ERROR: #3 DMRefine() line 1881 in > /home/daniel/petsc/src/dm/interface/dm > > > I suspect you need to configure with --download-triangle? When you mark > cells for refinement like this, petsc uses an external package to do the > adaptation. > > I'm not sure what options are available as packages, but triangle I think > is one. > > Cheers, > > Lawrence > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Moritz.Huck at rwth-aachen.de Mon Aug 5 04:16:55 2019 From: Moritz.Huck at rwth-aachen.de (Huck, Moritz) Date: Mon, 5 Aug 2019 09:16:55 +0000 Subject: [petsc-users] Problem with TS and SNES VI Message-ID: Hi, I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). But TS seems not to respect this e.g. I have a state which is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes to -0.8 at some points. Are there some tolerances I have to set for VI or something like this? Best Regards, Moritz From shrirang.abhyankar at pnnl.gov Mon Aug 5 10:21:41 2019 From: shrirang.abhyankar at pnnl.gov (Abhyankar, Shrirang G) Date: Mon, 5 Aug 2019 15:21:41 +0000 Subject: [petsc-users] Problem with TS and SNES VI Message-ID: For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html A brief intro to TSEvent can be found here. Thanks, Shri From: petsc-users on behalf of "Huck, Moritz via petsc-users" Reply-To: "Huck, Moritz" Date: Monday, August 5, 2019 at 5:18 AM To: "petsc-users at mcs.anl.gov" Subject: [petsc-users] Problem with TS and SNES VI Hi, I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). But TS seems not to respect this e.g. I have a state which is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes to -0.8 at some points.
Are there some tolerances I have to set for VI or something like this? Best Regards, Moritz -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Mon Aug 5 10:38:56 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 5 Aug 2019 09:38:56 -0600 Subject: [petsc-users] snes converged reason for newton trust region? In-Reply-To: References: Message-ID: I have zero experience on the trust region Newton. I would like PETSc Team chime in. Fande, On Mon, Aug 5, 2019 at 9:34 AM Gary Hu wrote: > Hello Group, > > When I use -snes_type newtontr > > I sometimes get CONVERGED_TR_DELTA instead of CONVERGED_FNORM. > > My guess is that the trust region size keeps reducing until it becomes > smaller than the tolerance. > > Is there a way to force the algorithm to converge only due to FNORM? > > Thanks, > Gary > > -- > You received this message because you are subscribed to the Google Groups > "moose-users" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to moose-users+unsubscribe at googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/moose-users/f1aa52c1-b146-4703-91f4-a034cbe6be3b%40googlegroups.com > > . > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 5 11:36:09 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 5 Aug 2019 16:36:09 +0000 Subject: [petsc-users] snes converged reason for newton trust region? In-Reply-To: References: Message-ID: Yes, this is a bug. We previously declared a tiny trust region as converged but in fact it is not diverged. In the master branch it now comes up as a negative value in indicating diverging of the nonlinear solver. Sorry for the confusion. Barry > On Aug 5, 2019, at 10:38 AM, Fande Kong via petsc-users wrote: > > I have zero experience on the trust region Newton. I would like PETSc Team chime in. > > Fande, > > On Mon, Aug 5, 2019 at 9:34 AM Gary Hu wrote: > Hello Group, > > When I use -snes_type newtontr > > I sometimes get CONVERGED_TR_DELTA instead of CONVERGED_FNORM. > > My guess is that the trust region size keeps reducing until it becomes smaller than the tolerance. > > Is there a way to force the algorithm to converge only due to FNORM? > > Thanks, > Gary > > -- > You received this message because you are subscribed to the Google Groups "moose-users" group. > To unsubscribe from this group and stop receiving emails from it, send an email to moose-users+unsubscribe at googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/moose-users/f1aa52c1-b146-4703-91f4-a034cbe6be3b%40googlegroups.com. From knepley at gmail.com Mon Aug 5 12:09:52 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 5 Aug 2019 13:09:52 -0400 Subject: [petsc-users] snes converged reason for newton trust region? In-Reply-To: References: Message-ID: On Mon, Aug 5, 2019 at 12:36 PM Smith, Barry F. via petsc-users < petsc-users at mcs.anl.gov> wrote: > > Yes, this is a bug. We previously declared a tiny trust region as > converged but in fact it is not diverged. In the master branch it now > comes up as a negative value in indicating diverging of the nonlinear > solver. Sorry for the confusion. > Barry, you have to put in a deprecation entry for that change so that it does not break compiles. 
Thanks, Matt > Barry > > > > On Aug 5, 2019, at 10:38 AM, Fande Kong via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > I have zero experience on the trust region Newton. I would like PETSc > Team chime in. > > > > Fande, > > > > On Mon, Aug 5, 2019 at 9:34 AM Gary Hu wrote: > > Hello Group, > > > > When I use -snes_type newtontr > > > > I sometimes get CONVERGED_TR_DELTA instead of CONVERGED_FNORM. > > > > My guess is that the trust region size keeps reducing until it becomes > smaller than the tolerance. > > > > Is there a way to force the algorithm to converge only due to FNORM? > > > > Thanks, > > Gary > > > > -- > > You received this message because you are subscribed to the Google > Groups "moose-users" group. > > To unsubscribe from this group and stop receiving emails from it, send > an email to moose-users+unsubscribe at googlegroups.com. > > To view this discussion on the web visit > https://groups.google.com/d/msgid/moose-users/f1aa52c1-b146-4703-91f4-a034cbe6be3b%40googlegroups.com > . > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Moritz.Huck at rwth-aachen.de Tue Aug 6 02:12:01 2019 From: Moritz.Huck at rwth-aachen.de (Huck, Moritz) Date: Tue, 6 Aug 2019 07:12:01 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: References: Message-ID: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> Hi, I think I am missing something here. How would events help to constrain the states. Do you mean to use the event to "pause" to integration an adjust the state manually? Or are the events to enforce smaller timesteps when the state come close to the constraints? Thank you, Moritz ________________________________________ Von: Abhyankar, Shrirang G Gesendet: Montag, 5. August 2019 17:21:41 An: Huck, Moritz; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html A brief intro to TSEvent can be found here. Thanks, Shri From: petsc-users on behalf of "Huck, Moritz via petsc-users" Reply-To: "Huck, Moritz" Date: Monday, August 5, 2019 at 5:18 AM To: "petsc-users at mcs.anl.gov" Subject: [petsc-users] Problem with TS and SNES VI Hi, I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. Are there some tolerances I have to set for VI or something like this? 
Best Regards, Moritz From mlohry at gmail.com Tue Aug 6 07:43:03 2019 From: mlohry at gmail.com (Mark Lohry) Date: Tue, 6 Aug 2019 08:43:03 -0400 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts Message-ID: I'm running some larger cases than I have previously with a working code, and I'm running into failures I don't see on smaller cases. Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs successfully on half size case on 200 cores. 1) The first error output from petsc is "MPI_Allreduce() called in different locations". Is this a red herring, suggesting some process failed prior to this and processes have diverged? 2) I don't think I'm running out of memory -- globally at least. Slurm output shows e.g. Memory Utilized: 459.15 GB (estimated maximum) Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) I did try with and without --64-bit-indices. 3) The debug traces seem to vary, see below. I *think* the failure might be happening in the vicinity of a Coloring call. I'm using MatFDColoring like so: ISColoring iscoloring; MatFDColoring fdcoloring; MatColoring coloring; MatColoringCreate(ctx.JPre, &coloring); MatColoringSetType(coloring, MATCOLORINGGREEDY); // converges stalls badly without this on small cases, don't know why MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); // none of these worked. // MatColoringSetType(coloring, MATCOLORINGJP); // MatColoringSetType(coloring, MATCOLORINGSL); // MatColoringSetType(coloring, MATCOLORINGID); MatColoringSetFromOptions(coloring); MatColoringApply(coloring, &iscoloring); MatColoringDestroy(&coloring); MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); I have had issues in the past with getting a functional coloring setup for finite difference jacobians, and the above is the only configuration I've managed to get working successfully. Have there been any significant development changes to that area of code since v3.8.3? I'll try upgrading in the mean time and hope for the best. Any ideas? Thanks, Mark ************************************* mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: MPI_Allreduce() called in different locations (functions) on different processors [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Enum value must be same on all processes, argument # 2 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] PetscCommDuplicate line 130 /home/mlohry/build/external/petsc/src/sys/objects/tagm.c [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 /home/mlohry/build/external/petsc/src/sys/objects/inherit.c [0]PETSC ERROR: [0] DMCreate line 36 /home/mlohry/build/external/petsc/src/dm/interface/dm.c [0]PETSC ERROR: [0] DMShellCreate line 983 /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c [0]PETSC ERROR: [0] TSGetDM line 5287 /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSetIFunction line 1310 /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices [0]PETSC ERROR: #4 User provided function() line 0 in unknown file ************************************* mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #4 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c [0]PETSC ERROR: #5 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c [0]PETSC ERROR: [0] MatColoringApply line 357 /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c [0]PETSC ERROR: [0] VecSetSizes line 1308 /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #10 User provided function() line 0 in unknown file ************************* mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c [0]PETSC ERROR: #6 VecSetType() line 51 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c [0]PETSC ERROR: #7 VecCreateMPI() line 40 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Vec object's type is not set: Argument # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c ************************************** mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
[0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019
[0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun
[0]PETSC ERROR: #11 User provided function() line 0 in unknown file
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bsmith at mcs.anl.gov Tue Aug 6 07:55:59 2019
From: bsmith at mcs.anl.gov (Smith, Barry F.)
Date: Tue, 6 Aug 2019 12:55:59 +0000
Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts
In-Reply-To: 
References: 
Message-ID: 

  My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold.

  We urge you to upgrade.

  Regardless, for problems this large you likely need the ./configure option --with-64-bit-indices.

  We are adding more tests to nicely handle integer overflow, but it is not easy since it can occur in so many places.

  Hopefully this will resolve your problem with large process counts.

  Barry

> On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users wrote:
>
> I'm running some larger cases than I have previously with a working code, and I'm running into failures I don't see on smaller cases. Failures are on 400 cores, ~100M unknowns, 25B non-zero Jacobian entries. The same code runs successfully on a half-size case on 200 cores.
>
> 1) The first error output from PETSc is "MPI_Allreduce() called in different locations". Is this a red herring, suggesting some process failed prior to this and processes have diverged?
>
> 2) I don't think I'm running out of memory -- globally at least. Slurm output shows e.g.
> Memory Utilized: 459.15 GB (estimated maximum)
> Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node)
> I did try with and without --64-bit-indices.
>
> 3) The debug traces seem to vary, see below. I *think* the failure might be happening in the vicinity of a Coloring call. I'm using MatFDColoring like so:
>
>     ISColoring    iscoloring;
>     MatFDColoring fdcoloring;
>     MatColoring   coloring;
>
>     MatColoringCreate(ctx.JPre, &coloring);
>     MatColoringSetType(coloring, MATCOLORINGGREEDY);
>
>     // convergence stalls badly without this on small cases, don't know why
>     MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL);
>
>     // none of these worked:
>     // MatColoringSetType(coloring, MATCOLORINGJP);
>     // MatColoringSetType(coloring, MATCOLORINGSL);
>     // MatColoringSetType(coloring, MATCOLORINGID);
>     MatColoringSetFromOptions(coloring);
>
>     MatColoringApply(coloring, &iscoloring);
>     MatColoringDestroy(&coloring);
>     MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring);
>
> I have had issues in the past with getting a functional coloring setup for finite-difference Jacobians, and the above is the only configuration I've managed to get working successfully. Have there been any significant development changes to that area of code since v3.8.3? I'll try upgrading in the meantime and hope for the best.
>
> Any ideas?
>
> Thanks,
> Mark
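For reference, here is a minimal sketch of how a coloring built as in the snippet quoted above can be wired into a finite-difference Jacobian driven by SNES (for a TS-based code the inner SNES can be obtained with TSGetSNES). The names snes, J, FormFunction, and user are placeholders for the application's own objects, not anything taken from the code above:

    #include <petscsnes.h>

    /* Sketch only: J is the preallocated Jacobian/preconditioner matrix whose
       sparsity pattern is already set up; FormFunction/user are the
       application's residual callback and context. */
    PetscErrorCode SetupColoredFDJacobian(SNES snes, Mat J,
                                          PetscErrorCode (*FormFunction)(SNES, Vec, Vec, void *),
                                          void *user, MatFDColoring *fdcoloring)
    {
      PetscErrorCode ierr;
      MatColoring    coloring;
      ISColoring     iscoloring;

      PetscFunctionBeginUser;
      /* Color the sparsity pattern of J; finite-difference Jacobians need a
         distance-2 coloring. */
      ierr = MatColoringCreate(J, &coloring);CHKERRQ(ierr);
      ierr = MatColoringSetDistance(coloring, 2);CHKERRQ(ierr);
      ierr = MatColoringSetType(coloring, MATCOLORINGGREEDY);CHKERRQ(ierr);
      ierr = MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL);CHKERRQ(ierr);
      ierr = MatColoringSetFromOptions(coloring);CHKERRQ(ierr); /* -mat_coloring_type can override */
      ierr = MatColoringApply(coloring, &iscoloring);CHKERRQ(ierr);
      ierr = MatColoringDestroy(&coloring);CHKERRQ(ierr);

      /* Build the finite-difference Jacobian object from the coloring. */
      ierr = MatFDColoringCreate(J, iscoloring, fdcoloring);CHKERRQ(ierr);
      ierr = MatFDColoringSetFunction(*fdcoloring, (PetscErrorCode (*)(void))FormFunction, user);CHKERRQ(ierr);
      ierr = MatFDColoringSetFromOptions(*fdcoloring);CHKERRQ(ierr);
      ierr = MatFDColoringSetUp(J, iscoloring, *fdcoloring);CHKERRQ(ierr);
      ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);

      /* Let SNES compute the Jacobian by colored finite differences. */
      ierr = SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, *fdcoloring);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

MatFDColoringSetUp is needed once before the first Jacobian evaluation; SNESComputeJacobianDefaultColor then reuses the stored coloring on every subsequent nonlinear solve.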

From mlohry at gmail.com Tue Aug 6 08:19:57 2019
From: mlohry at gmail.com (Mark Lohry)
Date: Tue, 6 Aug 2019 09:19:57 -0400
Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts
In-Reply-To: 
References: 
Message-ID: 

>
> My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold.

Mine as well -- though in later tests I have the same issue when using --with-64-bit-indices. Ironically I had removed that flag at some point because the coloring / index set was using a serious chunk of total memory on medium-sized problems.

Questions on the PETSc internals there, though: are matrices indexed with two integers (i,j), so that the max matrix dimension is (int limit) x (int limit), or with a single integer, so that the max dimension is sqrt(int limit)?

Also, I was operating under the assumption that the 32-bit limit only constrains per-process problem sizes (25B over 400 processes gives 62M non-zeros per process); is that not right?

> We are adding more tests to nicely handle integer overflow but it is not
> easy since it can occur in so many places

Totally understood. I know the pain of only finding an overflow bug after days of waiting in a cluster queue for a big job.

> We urge you to upgrade.
I'll do that today and hope for the best.

On first tests with 3.11.3, I still have a couple of issues with the coloring code:

* I am still getting the nasty hangs with MATCOLORINGJP mentioned here:
  https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html
* MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a wrong Jacobian unless I also set MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL);
* MATCOLORINGMIS, which is mentioned in the documentation, doesn't seem to exist.

Thanks,
Mark

On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:

>
>   My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold.
>
>   We urge you to upgrade.
>
>   Regardless, for problems this large you likely need the ./configure option --with-64-bit-indices.
>
>   We are adding more tests to nicely handle integer overflow, but it is not easy since it can occur in so many places.
>
>   Hopefully this will resolve your problem with large process counts.
>
>   Barry
>
> > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users <petsc-users at mcs.anl.gov> wrote:
> >
> > I'm running some larger cases than I have previously with a working code, and I'm running into failures I don't see on smaller cases. Failures are on 400 cores, ~100M unknowns, 25B non-zero Jacobian entries. The same code runs successfully on a half-size case on 200 cores.
> >
> > 1) The first error output from PETSc is "MPI_Allreduce() called in different locations". Is this a red herring, suggesting some process failed prior to this and processes have diverged?
> >
> > 2) I don't think I'm running out of memory -- globally at least. Slurm output shows e.g.
> > Memory Utilized: 459.15 GB (estimated maximum)
> > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node)
> > I did try with and without --64-bit-indices.
> >
> > 3) The debug traces seem to vary, see below. I *think* the failure might be happening in the vicinity of a Coloring call. I'm using MatFDColoring like so:
> >
> >     ISColoring    iscoloring;
> >     MatFDColoring fdcoloring;
> >     MatColoring   coloring;
> >
> >     MatColoringCreate(ctx.JPre, &coloring);
> >     MatColoringSetType(coloring, MATCOLORINGGREEDY);
> >
> >     // convergence stalls badly without this on small cases, don't know why
> >     MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL);
> >
> >     // none of these worked:
> >     // MatColoringSetType(coloring, MATCOLORINGJP);
> >     // MatColoringSetType(coloring, MATCOLORINGSL);
> >     // MatColoringSetType(coloring, MATCOLORINGID);
> >     MatColoringSetFromOptions(coloring);
> >
> >     MatColoringApply(coloring, &iscoloring);
> >     MatColoringDestroy(&coloring);
> >     MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring);
> >
> > I have had issues in the past with getting a functional coloring setup for finite-difference Jacobians, and the above is the only configuration I've managed to get working successfully. Have there been any significant development changes to that area of code since v3.8.3? I'll try upgrading in the meantime and hope for the best.
> >
> > Any ideas?
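Regarding the greedy-coloring bullet above: one way to check a coloring on a small case, rather than in a production run, is to compare the colored finite-difference Jacobian against the brute-force one. A sketch, with snes, x, Jcolor, Jdense, and fdcoloring as placeholders; Jdense is best taken as a MATDENSE matrix with the same parallel layout on a tiny problem, and whether MatAXPY mixes the two formats efficiently is an assumption here, so this is only meant for small tests:

    #include <petscsnes.h>

    /* Sketch: compare the colored FD Jacobian with an uncolored brute-force one. */
    PetscErrorCode CompareColoredJacobian(SNES snes, Vec x, Mat Jcolor, Mat Jdense,
                                          MatFDColoring fdcoloring)
    {
      PetscErrorCode ierr;
      PetscReal      nrm_ref, nrm_diff;

      PetscFunctionBeginUser;
      /* Jacobian via coloring (what the production run uses). */
      ierr = SNESComputeJacobianDefaultColor(snes, x, Jcolor, Jcolor, fdcoloring);CHKERRQ(ierr);
      /* Reference Jacobian by differencing every column, no coloring. */
      ierr = SNESComputeJacobianDefault(snes, x, Jdense, Jdense, NULL);CHKERRQ(ierr);

      ierr = MatNorm(Jdense, NORM_FROBENIUS, &nrm_ref);CHKERRQ(ierr);
      ierr = MatAXPY(Jdense, -1.0, Jcolor, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); /* Jdense -= Jcolor */
      ierr = MatNorm(Jdense, NORM_FROBENIUS, &nrm_diff);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD, "relative difference ||Jfd - Jcolor|| / ||Jfd|| = %g\n",
                         (double)(nrm_diff / nrm_ref));CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

A large relative difference would point at the coloring (or the preallocated sparsity pattern it was computed from) rather than at the residual function itself.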
> > > > > > Thanks, > > Mark > > > > > > ************************************* > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (functions) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Enum value must be same on all processes, argument # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > [0]PETSC ERROR: is given. 
> > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 > /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 > /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > [0]PETSC ERROR: [0] DMCreate line 36 > /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > [0]PETSC ERROR: [0] DMShellCreate line 983 > /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > [0]PETSC ERROR: [0] TSGetDM line 5287 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > ************************************* > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > 
available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > [0]PETSC ERROR: is given. > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 > /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 > /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > [0]PETSC ERROR: [0] MatColoringApply line 357 > /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > [0]PETSC ERROR: [0] VecSetSizes line 1308 > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file > > > > > > > > ************************* > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: #6 VecSetType() line 51 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > ************************************** > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > [0]PETSC ERROR: is given. 
> > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 6 08:36:47 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 6 Aug 2019 13:36:47 +0000 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: References: Message-ID: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> There is also $ ./configure --help | grep color --with-is-color-value-type= char, short can store 256, 65536 colors current: short I can't imagine you have over 65 k colors but something to check > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > Mine as well -- though in later tests I have the same issue when using --with-64-bit-indices. Ironically I had removed that flag at some point because the coloring / index set was using a serious chunk of total memory on medium sized problems. 
Understood > > Questions on the petsc internals there though: Are matrices indexed with two integers (i,j) so the max matrix dimension is (int limit) x (int limit) or a single integer so the max dimension is sqrt(int limit)? > Also I was operating under the assumption the 32 bit limit should only constrain per-process problem sizes (25B over 400 processes giving 62M non-zeros per process), is that not right? It is mostly right but may not be right for everything in PETSc. For example I don't know about the MatFD code Since using a debugger is not practical for large code counts to find the point the two processes diverge you can try -log_trace or -log_trace filename in the second case it will generate one file per core called filename.%d note it will produce a lot of output Good luck > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > Totally understood. I know the pain of only finding an overflow bug after days of waiting in a cluster queue for a big job. > > We urge you to upgrade. > > I'll do that today and hope for the best. On first tests on 3.11.3, I still have a couple issues with the coloring code: > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned here: https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a wrong jacobian unless I also set MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist. > > Thanks, > Mark > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. wrote: > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > We urge you to upgrade. > > Regardless for problems this large you likely need the ./configure option --with-64-bit-indices > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > Hopefully this will resolve your problem with large process counts > > Barry > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users wrote: > > > > I'm running some larger cases than I have previously with a working code, and I'm running into failures I don't see on smaller cases. Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs successfully on half size case on 200 cores. > > > > 1) The first error output from petsc is "MPI_Allreduce() called in different locations". Is this a red herring, suggesting some process failed prior to this and processes have diverged? > > > > 2) I don't think I'm running out of memory -- globally at least. Slurm output shows e.g. > > Memory Utilized: 459.15 GB (estimated maximum) > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) > > I did try with and without --64-bit-indices. > > > > 3) The debug traces seem to vary, see below. I *think* the failure might be happening in the vicinity of a Coloring call. I'm using MatFDColoring like so: > > > > ISColoring iscoloring; > > MatFDColoring fdcoloring; > > MatColoring coloring; > > > > MatColoringCreate(ctx.JPre, &coloring); > > MatColoringSetType(coloring, MATCOLORINGGREEDY); > > > > // converges stalls badly without this on small cases, don't know why > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > > // none of these worked. 
> > // MatColoringSetType(coloring, MATCOLORINGJP); > > // MatColoringSetType(coloring, MATCOLORINGSL); > > // MatColoringSetType(coloring, MATCOLORINGID); > > MatColoringSetFromOptions(coloring); > > > > MatColoringApply(coloring, &iscoloring); > > MatColoringDestroy(&coloring); > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); > > > > I have had issues in the past with getting a functional coloring setup for finite difference jacobians, and the above is the only configuration I've managed to get working successfully. Have there been any significant development changes to that area of code since v3.8.3? I'll try upgrading in the mean time and hope for the best. > > > > > > > > Any ideas? > > > > > > Thanks, > > Mark > > > > > > ************************************* > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (functions) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Enum value must be same on all processes, argument # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > [0]PETSC ERROR: is given. > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > [0]PETSC ERROR: [0] DMCreate line 36 /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > [0]PETSC ERROR: [0] DMShellCreate line 983 /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > [0]PETSC ERROR: [0] TSGetDM line 5287 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > ************************************* > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > [0]PETSC ERROR: is given. 
> > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > [0]PETSC ERROR: [0] MatColoringApply line 357 /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > [0]PETSC ERROR: [0] VecSetSizes line 1308 /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file > > > > > > > > ************************* > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > [0]PETSC ERROR: #6 VecSetType() line 51 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > ************************************** > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > [0]PETSC ERROR: is given. 
> > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file > > > > > > > From bsmith at mcs.anl.gov Tue Aug 6 08:51:16 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 6 Aug 2019 13:51:16 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> Message-ID: Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? Is this within a stage or at the actual time-step after the stage? Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). This information can help us determine what approach you should take. Thanks Barry > On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users wrote: > > Hi, > I think I am missing something here. > How would events help to constrain the states. > Do you mean to use the event to "pause" to integration an adjust the state manually? 
> Or are the events to enforce smaller timesteps when the state come close to the constraints? > > Thank you, > Moritz > ________________________________________ > Von: Abhyankar, Shrirang G > Gesendet: Montag, 5. August 2019 17:21:41 > An: Huck, Moritz; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. > > An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html > > A brief intro to TSEvent can be found here. > > Thanks, > Shri > > > From: petsc-users on behalf of "Huck, Moritz via petsc-users" > Reply-To: "Huck, Moritz" > Date: Monday, August 5, 2019 at 5:18 AM > To: "petsc-users at mcs.anl.gov" > Subject: [petsc-users] Problem with TS and SNES VI > > Hi, > I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. > The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). > But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. > Are there some tolerances I have to set for VI or something like this? > > Best Regards, > Moritz > From Moritz.Huck at rwth-aachen.de Tue Aug 6 09:24:49 2019 From: Moritz.Huck at rwth-aachen.de (Huck, Moritz) Date: Tue, 6 Aug 2019 14:24:49 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de>, Message-ID: <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. Unphysical values are e.g. particle sizes below zero. My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. ________________________________________ Von: Smith, Barry F. Gesendet: Dienstag, 6. August 2019 15:51:16 An: Huck, Moritz Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? Is this within a stage or at the actual time-step after the stage? Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). This information can help us determine what approach you should take. Thanks Barry > On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users wrote: > > Hi, > I think I am missing something here. > How would events help to constrain the states. 
> Do you mean to use the event to "pause" to integration an adjust the state manually? > Or are the events to enforce smaller timesteps when the state come close to the constraints? > > Thank you, > Moritz > ________________________________________ > Von: Abhyankar, Shrirang G > Gesendet: Montag, 5. August 2019 17:21:41 > An: Huck, Moritz; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. > > An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html > > A brief intro to TSEvent can be found here. > > Thanks, > Shri > > > From: petsc-users on behalf of "Huck, Moritz via petsc-users" > Reply-To: "Huck, Moritz" > Date: Monday, August 5, 2019 at 5:18 AM > To: "petsc-users at mcs.anl.gov" > Subject: [petsc-users] Problem with TS and SNES VI > > Hi, > I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. > The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). > But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. > Are there some tolerances I have to set for VI or something like this? > > Best Regards, > Moritz > From bsmith at mcs.anl.gov Tue Aug 6 10:47:13 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 6 Aug 2019 15:47:13 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> Message-ID: Thanks, very useful. Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. Good luck and let us know how it goes Barry > On Aug 6, 2019, at 9:24 AM, Huck, Moritz wrote: > > At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. > Unphysical values are e.g. particle sizes below zero. > > My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. > > The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. 
> ________________________________________ > Von: Smith, Barry F. > Gesendet: Dienstag, 6. August 2019 15:51:16 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? > > Is this within a stage or at the actual time-step after the stage? > > Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? > > Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). > > This information can help us determine what approach you should take. > > Thanks > > Barry > > >> On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users wrote: >> >> Hi, >> I think I am missing something here. >> How would events help to constrain the states. >> Do you mean to use the event to "pause" to integration an adjust the state manually? >> Or are the events to enforce smaller timesteps when the state come close to the constraints? >> >> Thank you, >> Moritz >> ________________________________________ >> Von: Abhyankar, Shrirang G >> Gesendet: Montag, 5. August 2019 17:21:41 >> An: Huck, Moritz; petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >> For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. >> >> An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html >> >> A brief intro to TSEvent can be found here. >> >> Thanks, >> Shri >> >> >> From: petsc-users on behalf of "Huck, Moritz via petsc-users" >> Reply-To: "Huck, Moritz" >> Date: Monday, August 5, 2019 at 5:18 AM >> To: "petsc-users at mcs.anl.gov" >> Subject: [petsc-users] Problem with TS and SNES VI >> >> Hi, >> I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. >> The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). >> But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. >> Are there some tolerances I have to set for VI or something like this? >> >> Best Regards, >> Moritz >> > From Moritz.Huck at rwth-aachen.de Wed Aug 7 07:45:53 2019 From: Moritz.Huck at rwth-aachen.de (Huck, Moritz) Date: Wed, 7 Aug 2019 12:45:53 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de>, Message-ID: Thank you for your response. The sizes are only allowed to go down to a certain value. The non-physical values do also occur during the function evaluations (IFunction). I will try to implment your suggestions with SNESLineSearchSetPreCheck. This would mean I dont have to use SNESVISetVariableBounds at all, right? ________________________________________ Von: Smith, Barry F. 
Gesendet: Dienstag, 6. August 2019 17:47:13 An: Huck, Moritz Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI Thanks, very useful. Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. Good luck and let us know how it goes Barry > On Aug 6, 2019, at 9:24 AM, Huck, Moritz wrote: > > At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. > Unphysical values are e.g. particle sizes below zero. > > My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. > > The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. > ________________________________________ > Von: Smith, Barry F. > Gesendet: Dienstag, 6. August 2019 15:51:16 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? > > Is this within a stage or at the actual time-step after the stage? > > Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? > > Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). > > This information can help us determine what approach you should take. > > Thanks > > Barry > > >> On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users wrote: >> >> Hi, >> I think I am missing something here. >> How would events help to constrain the states. >> Do you mean to use the event to "pause" to integration an adjust the state manually? >> Or are the events to enforce smaller timesteps when the state come close to the constraints? >> >> Thank you, >> Moritz >> ________________________________________ >> Von: Abhyankar, Shrirang G >> Gesendet: Montag, 5. August 2019 17:21:41 >> An: Huck, Moritz; petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >> For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. 
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. >> >> An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html >> >> A brief intro to TSEvent can be found here. >> >> Thanks, >> Shri >> >> >> From: petsc-users on behalf of "Huck, Moritz via petsc-users" >> Reply-To: "Huck, Moritz" >> Date: Monday, August 5, 2019 at 5:18 AM >> To: "petsc-users at mcs.anl.gov" >> Subject: [petsc-users] Problem with TS and SNES VI >> >> Hi, >> I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. >> The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). >> But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. >> Are there some tolerances I have to set for VI or something like this? >> >> Best Regards, >> Moritz >> > From bsmith at mcs.anl.gov Wed Aug 7 09:38:37 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 7 Aug 2019 14:38:37 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> Message-ID: <46CDEF35-09E3-4053-B9ED-90A6DE37801C@mcs.anl.gov> > On Aug 7, 2019, at 7:45 AM, Huck, Moritz wrote: > > Thank you for your response. > The sizes are only allowed to go down to a certain value. > The non-physical values do also occur during the function evaluations (IFunction). > > I will try to implment your suggestions with SNESLineSearchSetPreCheck. This would mean I dont have to use SNESVISetVariableBounds at all, right? Yes Let us know how it goes. If the solutions to the underlying "analytic/continuum" model don't go to zero size then this approach has a good chance of working, since the non-physical values are just an artifact of the nonlinear solver algorithm. But if the "true" solutions do go to zero than likely this approach will fail because it is unable to find the zeros of the nonlinear function needed to evolve in time, and this makes sense since the discrete model should mimic the behavior of the "analytic/continuum" model. Barry > ________________________________________ > Von: Smith, Barry F. > Gesendet: Dienstag, 6. August 2019 17:47:13 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Thanks, very useful. > > Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? > > Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? > > If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use > SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. > > For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. 
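To put that last point in numbers (all of them invented purely for illustration): take a current iterate x = (5, 1.2, 3), a Newton step y = (1, 2, 1) and a lower bound of 0. The full update x - y = (4, -0.8, 2) violates the bound only in the second component. Scaling the whole step to stay feasible would force lambda <= 0.6 and cut the perfectly good first and third components to 60% of their length, whereas changing just the second entry of the step from 2 to 1.2 - epsilon lets the other components take the full Newton step.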
> > Good luck and let us know how it goes > > Barry > > > >> On Aug 6, 2019, at 9:24 AM, Huck, Moritz wrote: >> >> At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. >> Unphysical values are e.g. particle sizes below zero. >> >> My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. >> >> The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. >> ________________________________________ >> Von: Smith, Barry F. >> Gesendet: Dienstag, 6. August 2019 15:51:16 >> An: Huck, Moritz >> Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >> Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? >> >> Is this within a stage or at the actual time-step after the stage? >> >> Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? >> >> Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). >> >> This information can help us determine what approach you should take. >> >> Thanks >> >> Barry >> >> >>> On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users wrote: >>> >>> Hi, >>> I think I am missing something here. >>> How would events help to constrain the states. >>> Do you mean to use the event to "pause" to integration an adjust the state manually? >>> Or are the events to enforce smaller timesteps when the state come close to the constraints? >>> >>> Thank you, >>> Moritz >>> ________________________________________ >>> Von: Abhyankar, Shrirang G >>> Gesendet: Montag, 5. August 2019 17:21:41 >>> An: Huck, Moritz; petsc-users at mcs.anl.gov >>> Betreff: Re: [petsc-users] Problem with TS and SNES VI >>> >>> For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. >>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. >>> >>> An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html >>> >>> A brief intro to TSEvent can be found here. >>> >>> Thanks, >>> Shri >>> >>> >>> From: petsc-users on behalf of "Huck, Moritz via petsc-users" >>> Reply-To: "Huck, Moritz" >>> Date: Monday, August 5, 2019 at 5:18 AM >>> To: "petsc-users at mcs.anl.gov" >>> Subject: [petsc-users] Problem with TS and SNES VI >>> >>> Hi, >>> I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. >>> The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). >>> But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. >>> Are there some tolerances I have to set for VI or something like this? 
>>> Best Regards, >>> Moritz >>> >> > From d_mckinnell at aol.co.uk Thu Aug 8 10:59:00 2019 From: d_mckinnell at aol.co.uk (Daniel Mckinnell) Date: Thu, 8 Aug 2019 15:59:00 +0000 (UTC) Subject: [petsc-users] Questions about Field Data using DMPlex References: <1766324272.2176173.1565279940674.ref@mail.yahoo.com> Message-ID: <1766324272.2176173.1565279940674@mail.yahoo.com> Hi, I have a DMPlex mesh stored in a dm object with an associated PetscSection containing the information on the layout of field0 and field1; I can create a local Vec with the layout from the PetscSection and edit that data. I have a couple of questions about what I can do from this point: 1. Is there a way of attaching my field data to the dm object so that everything associated with my mesh is contained in a single object? 2. Is there a way of creating a Vec object with the data from just field0? 3. What is the best way to export my data to a VTK file? Everything I have tried so far has just created meaningless vector fields on the vertices. Any help would be greatly appreciated. Thanks, Daniel Mckinnell -------------- next part -------------- An HTML attachment was scrubbed... URL: From shrirang.abhyankar at pnnl.gov Thu Aug 8 12:16:12 2019 From: shrirang.abhyankar at pnnl.gov (Abhyankar, Shrirang G) Date: Thu, 8 Aug 2019 17:16:12 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> Message-ID: <5A251CF0-D34E-4823-B0CA-695CC21AC1B5@pnnl.gov> Moritz, I think your case will also work using TSEvent. I think your problem is similar, correct me if I am wrong, to my application where I need to constrain the states within some limits, lb \le x. I use events to handle this, where I use two event functions: (i) x - lb = 0 if x > lb, and (ii) \dot{x} = 0 if x = lb. The first event function is used to detect when x hits the limit lb. Once it hits the limit, the differential equation for x is changed to (x-lb = 0) in the model to hold x at limit lb. For releasing x, there is an event function on the derivative of x, \dot{x}, and x is released on detection of the condition \dot{x} > 0. This is done through the event function \dot{x} = 0 with a positive zero crossing. An example of how the above works is in the example src/ts/examples/tutorials/power_grid/stability_9bus/ex9bus.c. In this example, there is an event function that first checks whether the state VR has hit the upper limit VRMAX. Once it does so, the flag VRatmax is set by the post-event function. The event function is then switched to the \dot{VR} equation: if (!VRatmax[i]) { fvalue[2+2*i] = VRMAX[i] - VR; } else { fvalue[2+2*i] = (VR - KA[i]*RF + KA[i]*KF[i]*Efd/TF[i] - KA[i]*(Vref[i] - Vm))/TA[i]; } You can either try TSEvent or what Barry suggested, SNESLineSearchSetPreCheck(), or both. Thanks, Shri From: "Huck, Moritz" Date: Wednesday, August 7, 2019 at 8:46 AM To: "Smith, Barry F." Cc: "Abhyankar, Shrirang G" , "petsc-users at mcs.anl.gov" Subject: AW: [petsc-users] Problem with TS and SNES VI Thank you for your response. The sizes are only allowed to go down to a certain value. The non-physical values do also occur during the function evaluations (IFunction). I will try to implement your suggestions with SNESLineSearchSetPreCheck. This would mean I don't have to use SNESVISetVariableBounds at all, right? ________________________________________ Von: Smith, Barry F. > Gesendet: Dienstag, 6.
August 2019 17:47:13 An: Huck, Moritz Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI Thanks, very useful. Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. Good luck and let us know how it goes Barry On Aug 6, 2019, at 9:24 AM, Huck, Moritz > wrote: At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. Unphysical values are e.g. particle sizes below zero. My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. ________________________________________ Von: Smith, Barry F. > Gesendet: Dienstag, 6. August 2019 15:51:16 An: Huck, Moritz Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? Is this within a stage or at the actual time-step after the stage? Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). This information can help us determine what approach you should take. Thanks Barry On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users > wrote: Hi, I think I am missing something here. How would events help to constrain the states. Do you mean to use the event to "pause" to integration an adjust the state manually? Or are the events to enforce smaller timesteps when the state come close to the constraints? Thank you, Moritz ________________________________________ Von: Abhyankar, Shrirang G > Gesendet: Montag, 5. August 2019 17:21:41 An: Huck, Moritz; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. 
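[Editor's note] A compressed sketch of the two-event-function pattern described above, written for a single state x(t) with lower bound lb. This is not the ex9bus code: the context struct and the placeholder unconstrained dynamics xdot = a - b*x are made up for illustration.

#include <petscts.h>

typedef struct {
  PetscReal lb;        /* lower bound on x (hypothetical) */
  PetscReal a, b;      /* placeholder coefficients of the unconstrained xdot = a - b*x */
  PetscBool at_limit;  /* set by the post-event function while x is held at lb */
} EventCtx;

/* Event 0: x - lb crosses zero from above -> x has hit the bound.
   Event 1: the unconstrained xdot crosses zero from below -> release x. */
static PetscErrorCode EventFunction(TS ts, PetscReal t, Vec U, PetscScalar fvalue[], void *ctx)
{
  EventCtx          *user = (EventCtx*)ctx;
  const PetscScalar *u;
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(U, &u);CHKERRQ(ierr);
  if (!user->at_limit) {
    fvalue[0] = u[0] - user->lb;          /* watch for x reaching lb */
    fvalue[1] = 1.0;                      /* keep event 1 inactive   */
  } else {
    fvalue[0] = 1.0;                      /* keep event 0 inactive   */
    fvalue[1] = user->a - user->b*u[0];   /* unconstrained xdot      */
  }
  ierr = VecRestoreArrayRead(U, &u);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

static PetscErrorCode PostEventFunction(TS ts, PetscInt nevents, PetscInt event_list[], PetscReal t, Vec U, PetscBool forwardsolve, void *ctx)
{
  EventCtx *user = (EventCtx*)ctx;

  PetscFunctionBeginUser;
  if (nevents > 0) user->at_limit = (PetscBool)(event_list[0] == 0);
  /* the IFunction should also switch to the x - lb = 0 equation while at_limit is true */
  PetscFunctionReturn(0);
}

/* Registration, e.g. in main():
     PetscInt  direction[2] = {-1, +1};    // event 0: crossing from above, event 1: from below
     PetscBool terminate[2] = {PETSC_FALSE, PETSC_FALSE};
     TSSetEventHandler(ts, 2, direction, terminate, EventFunction, PostEventFunction, &user);
*/

Event 0 uses direction -1 to catch x hitting lb from above; event 1 uses direction +1 to release it once the unconstrained derivative turns positive, mirroring the VRatmax logic described above.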
An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html A brief intro to TSEvent can be found here. Thanks, Shri From: petsc-users > on behalf of "Huck, Moritz via petsc-users" > Reply-To: "Huck, Moritz" > Date: Monday, August 5, 2019 at 5:18 AM To: "petsc-users at mcs.anl.gov" > Subject: [petsc-users] Problem with TS and SNES VI Hi, I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. Are there some tolerances I have to set for VI or something like this? Best Regards, Moritz -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 8 13:44:03 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 8 Aug 2019 14:44:03 -0400 Subject: [petsc-users] Questions about Field Data using DMPlex In-Reply-To: <1766324272.2176173.1565279940674@mail.yahoo.com> References: <1766324272.2176173.1565279940674.ref@mail.yahoo.com> <1766324272.2176173.1565279940674@mail.yahoo.com> Message-ID: On Thu, Aug 8, 2019 at 11:59 AM Daniel Mckinnell via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > I have a DMPlex mesh stored in a dm object with an associated PetscSection > containing the information on the layout of field0 and field1, I can create > a local Vec with the layout from the PetscSection and edit that data. I > have a couple of questions about what I can do from this point: > > 1. Is there a way of attaching my field data to the dm object so that > everything associated with my mesh is contained in a single object? > You can use https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMGetNamedLocalVector.html We tend to discourage this organization, but it might work for you. If you use DMCreateLocalVector(), then VecGetDM() will get you back the DM. > 2. Is there a way of creating a Vec object with the data from just field0? > Yes, use DMCreateSubDM() selecting only field0, and then DMCreateLocalVector() > 3. What is the best way to export my data to a VTK file, everything I have > tried so far has just created meaningless vector fields on the vertices? > VTK files only understand vertex fields and cell fields. Is that what you have? Thanks, Matt > Any help would be greatly appreciated. > Thanks, > Daniel Mckinnell > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stevenbenbow at quintessa.org Fri Aug 9 09:25:47 2019 From: stevenbenbow at quintessa.org (Steve) Date: Fri, 9 Aug 2019 15:25:47 +0100 Subject: [petsc-users] Possibility of using out of date Jacobians with TS Message-ID: Hi, I'm experimenting with the use of PETSc to replace a DAE solver in an existing code that I use to solve stiff nonlinear problems.? I expect to use TSBDF in the final instance, and so am currently playing with it but applied to a simpler linear problem - just to get some experience with the SNES/KSP/PC controls before diving in to the hard problem. 
Below is some output from TSAdapt for the simple linear problem, using TSBDF and PCLU, so that the linear algebra solve in the newton loop is direct:

    TSAdapt basic bdf 0:2 step   0 accepted t=0          + 1.000e-03 dt=2.000e-03  wlte=2.51e-07  wltea=   -1 wlter=   -1
        TSResidual...
        TSJacobian... calculate
        TSResidual...
    TSAdapt basic bdf 0:2 step   1 accepted t=0.001      + 2.000e-03 dt=4.000e-03  wlte=2.83e-07  wltea=   -1 wlter=   -1
        TSResidual...
        TSJacobian... calculate
        TSResidual...
    TSAdapt basic bdf 0:2 step   2 accepted t=0.003      + 4.000e-03 dt=8.000e-03  wlte=1.22e-07  wltea=   -1 wlter=   -1
        TSResidual...
        TSJacobian... calculate
        TSResidual...

I have added the "TSResidual..." and "TSJacobian..." echoes so that I can see when PETSc is requesting residuals and Jacobians to be computed. (This is the Jacobian routine specified via TSSetIJacobian.) Regarding the above output, it appears that TS / SNES always requests a new (I)Jacobian at each new timestep (after the first residual is calculated). I can see mathematically why this would be the default choice, but had hoped that it might be possible for out-of-date Jacobians to be used until they become inefficient. My reason for wanting this is that the Jacobian calculations for the intended application are particularly expensive, but for small enough timesteps out-of-date Jacobians may be good enough, for a few steps. Is there any way of specifying that out-of-date (I)Jacobians can be tolerated (at the expense of increased Newton iterations, or smaller timesteps)? Alternatively would it make sense to include callbacks to TS / SNES from the Jacobian evaluation function to determine whether sufficiently few iterations have been used that it might be safe to return the previously calculated Jacobian (if I store a copy)? If so, is there any advice on how I should do this? NB. I see that there is an option for TSRHSJacobianSetReuse(), but this only applies to the RHS component of the DAE (the G(t,u) part, using the terminology from the manual), but I am not using this as ultimately I expect to be solving strongly nonlinear problems with no "slow" G(t,u) part. Any advice would be greatly appreciated. From bsmith at mcs.anl.gov Fri Aug 9 09:43:08 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 9 Aug 2019 14:43:08 +0000 Subject: [petsc-users] Possibility of using out of date Jacobians with TS In-Reply-To: References: Message-ID: Steve, There are two possibilities 1) completely under user control, when you are asked for a new Jacobian you can evaluate the current conditions and decide whether to generate a new one. For example get from SNES the number of iterations it required and if that is starting to go up then get a new one or check if the time-step is being cut because the nonlinear solver is becoming "too hard" and generate a new one. It is also possible to use -snes_mf_operator (or an inline version) that uses matrix-free to apply the Jacobian and the Jacobian you provide to compute the preconditioner. This allows you to keep the current Jacobian/preconditioner even longer before rebuilding. Here you can use the increase in the number of linear iterations to decide when to change the Jacobian. 2) let PETSc decide when to rebuild the Jacobian. This is more limited since it has no direct measure of how well the Jacobian is doing.
Some possibilities are -snes_lag_jacobian -snes_lag_jacobian_persists -snes_lag_preconditioner -snes_lag_preconditioner_persists This introduces yet another parameter under your control; you can lag the generation of the new preconditioner even when you get a new preconditioner (this makes sense only when you are not using -snes_mf_operator), So, at a high level, you have a great deal of freedom to control when you recreate the Jacobian (and preconditioner), will be problem dependent and the optimal value will depend on your problem and specific integrator. Final note, when you rebuild may also depend on how far you have integrated, when the nonlinear effects are strong you probably want to rebuild often but when the solution is close to linearly evolving less often. If generating the Jacobian/preconditioner is expensive relative to everything else a good combination is -snes_mf_operator and a pretty lagged generation of new Jacobians. Barry > On Aug 9, 2019, at 9:25 AM, Steve via petsc-users wrote: > > Hi, > > I'm experimenting with the use of PETSc to replace a DAE solver in an existing code that I use to solve stiff nonlinear problems. I expect to use TSBDF in the final instance, and so am currently playing with it but applied to a simpler linear problem - just to get some experience with the SNES/KSP/PC controls before diving in to the hard problem. > > Below is some output from TSAdapt for the simple linear problem, using TSBDF and PCLU, so that the linear algebra solve in the newton loop is direct: > > TSAdapt basic bdf 0:2 step 0 accepted t=0 + 1.000e-03 dt=2.000e-03 wlte=2.51e-07 wltea= -1 wlter= -1 > TSResidual... > TSJacobian... calculate > TSResidual... > TSAdapt basic bdf 0:2 step 1 accepted t=0.001 + 2.000e-03 dt=4.000e-03 wlte=2.83e-07 wltea= -1 wlter= -1 > TSResidual... > TSJacobian... calculate > TSResidual... > TSAdapt basic bdf 0:2 step 2 accepted t=0.003 + 4.000e-03 dt=8.000e-03 wlte=1.22e-07 wltea= -1 wlter= -1 > TSResidual... > TSJacobian... calculate > TSResidual... > > I have added the "TSResidual..." and "TSJacobian..." echoes so that I can see when PETSc is requesting residuals and Jacobians to be computed. (This is the Jacobian routine specified via TSSetIJacobian.) > > Regarding the above output, it appears that TS / SNES always requests a new (I)Jacobian at each new timestep (after the first residual is calculated). I can see mathematically why this would be the default choice, but had hoped that it might be possible for out-of-date Jacobians to be used until they become inefficient. My reason for wanting this is that the Jacobian calculations for the intended application are particularly expensive, but for small enough timesteps out-of-date Jacobians may be good enough, for a few steps. > > Is there any way of specifying that out-of-date (I)Jacobians can be tolerated (at the expense of increased Newton iterations, or smaller timesteps)? Alternatively would it make sense to include callbacks to TS / SNES from the Jacobian evaluation function to determine whether sufficiently few iterations have been used that it might be safe to return the previously calculated Jacobian (if I store a copy)? If so, is there any advice on how I should do this? > > NB. I see that there is an option for TSRHSJacobianSetReuse(), but this only applies to the RHS component of the DAE (the G(t,u) part, using the terminology from the manual), but I am not using this as ultimately I expect to be solving strongly nonlinear problems with no "slow" G(t,u) part. 
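[Editor's note] As a concrete illustration of option 2 above, the lagging can also be set programmatically on the SNES inside the TS; the lag values used here (5 and 2) are arbitrary placeholders.

SNES snes;

TSGetSNES(ts, &snes);
SNESSetLagJacobian(snes, 5);                    /* rebuild the Jacobian only every 5th Newton iteration */
SNESSetLagJacobianPersists(snes, PETSC_TRUE);   /* keep the lag counter across SNES solves, i.e. across time steps */
SNESSetLagPreconditioner(snes, 2);              /* refactor the preconditioner even less frequently */
SNESSetLagPreconditionerPersists(snes, PETSC_TRUE);

The equivalent options are -snes_lag_jacobian 5 -snes_lag_jacobian_persists -snes_lag_preconditioner 2 -snes_lag_preconditioner_persists, optionally combined with -snes_mf_operator as Barry suggests.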
> > Any advice would be greatly appreciated. > > From stevenbenbow at quintessa.org Fri Aug 9 10:11:00 2019 From: stevenbenbow at quintessa.org (Steve) Date: Fri, 9 Aug 2019 16:11:00 +0100 Subject: [petsc-users] Possibility of using out of date Jacobians with TS In-Reply-To: References: Message-ID: Thank you Barry, that's very helpful. I'll have a play with those various options and see how I get on. On 09/08/2019 15:43, Smith, Barry F. wrote: > Steve, > > There are two possibilities > > 1) completely under user control, when you are asked for a new Jacobian you can evaluate the current conditions and decide whether to generate a new one. For example get from SNES the number of iterations it required and if that is starting to go up then get a new one or check if the time-step is being cut because the nonlinear solver is becoming "too hard" and generate a new one. > > It is also possible to use -snes_mf_operator (or an inline version) that uses matrix-free to apply the Jacobian and the Jacobian you provide to compute the preconditioner. This allows you to keep the current Jacobian/preconditioner even longer before rebuilding. Here you can use the increase in the number of linear iterations to decide when to change the Jacobian. > > 2) let PETSc decide when to rebuild the Jacobian. This is more limited since it has no direct measure of how well the Jacobian is doing. Some possibilities are > -snes_lag_jacobian -snes_lag_jacobian_persists -snes_lag_preconditioner -snes_lag_preconditioner_persists This introduces yet another parameter under your control; you can lag the generation of the new preconditioner even when you get a new preconditioner (this makes sense only when you are not using -snes_mf_operator), > > So, at a high level, you have a great deal of freedom to control when you recreate the Jacobian (and preconditioner), will be problem dependent and the optimal value will depend on your problem and specific integrator. Final note, when you rebuild may also depend on how far you have integrated, when the nonlinear effects are strong you probably want to rebuild often but when the solution is close to linearly evolving less often. > > If generating the Jacobian/preconditioner is expensive relative to everything else a good combination is -snes_mf_operator and a pretty lagged generation of new Jacobians. > > Barry > > > > >> On Aug 9, 2019, at 9:25 AM, Steve via petsc-users wrote: >> >> Hi, >> >> I'm experimenting with the use of PETSc to replace a DAE solver in an existing code that I use to solve stiff nonlinear problems. I expect to use TSBDF in the final instance, and so am currently playing with it but applied to a simpler linear problem - just to get some experience with the SNES/KSP/PC controls before diving in to the hard problem. >> >> Below is some output from TSAdapt for the simple linear problem, using TSBDF and PCLU, so that the linear algebra solve in the newton loop is direct: >> >> TSAdapt basic bdf 0:2 step 0 accepted t=0 + 1.000e-03 dt=2.000e-03 wlte=2.51e-07 wltea= -1 wlter= -1 >> TSResidual... >> TSJacobian... calculate >> TSResidual... >> TSAdapt basic bdf 0:2 step 1 accepted t=0.001 + 2.000e-03 dt=4.000e-03 wlte=2.83e-07 wltea= -1 wlter= -1 >> TSResidual... >> TSJacobian... calculate >> TSResidual... >> TSAdapt basic bdf 0:2 step 2 accepted t=0.003 + 4.000e-03 dt=8.000e-03 wlte=1.22e-07 wltea= -1 wlter= -1 >> TSResidual... >> TSJacobian... calculate >> TSResidual... >> >> I have added the "TSResidual..." and "TSJacobian..." 
echoes so that I can see when PETSc is requesting residuals and Jacobians to be computed. (This is the Jacobian routine specified via TSSetIJacobian.) >> >> Regarding the above output, it appears that TS / SNES always requests a new (I)Jacobian at each new timestep (after the first residual is calculated). I can see mathematically why this would be the default choice, but had hoped that it might be possible for out-of-date Jacobians to be used until they become inefficient. My reason for wanting this is that the Jacobian calculations for the intended application are particularly expensive, but for small enough timesteps out-of-date Jacobians may be good enough, for a few steps. >> >> Is there any way of specifying that out-of-date (I)Jacobians can be tolerated (at the expense of increased Newton iterations, or smaller timesteps)? Alternatively would it make sense to include callbacks to TS / SNES from the Jacobian evaluation function to determine whether sufficiently few iterations have been used that it might be safe to return the previously calculated Jacobian (if I store a copy)? If so, is there any advice on how I should do this? >> >> NB. I see that there is an option for TSRHSJacobianSetReuse(), but this only applies to the RHS component of the DAE (the G(t,u) part, using the terminology from the manual), but I am not using this as ultimately I expect to be solving strongly nonlinear problems with no "slow" G(t,u) part. >> >> Any advice would be greatly appreciated. >> >> -- Dr Steven J Benbow Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK Tel: 01491 636246 DD: 01491 630051 Web: http://www.quintessa.org Quintessa Limited is an employee-owned company registered in England, Number 3716623. Registered office: Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK If you have received this e-mail in error, please notify privacy at quintessa.org and delete it from your system From mlohry at gmail.com Sat Aug 10 13:56:41 2019 From: mlohry at gmail.com (Mark Lohry) Date: Sat, 10 Aug 2019 14:56:41 -0400 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> Message-ID: Thanks Barry, been trying all of the above. I think I've homed in on it to an out-of-memory and/or integer overflow inside MatColoringApply. Which makes some sense since I only have a sequential coloring algorithm working... Is anyone out there using coloring in parallel? I still have the same previously mentioned issues with MATCOLORINGJP (on small problems takes upwards of 30 minutes to run) which as far as I can see is the only "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on less large problems, MATCOLORINGGREEDY works on less large problems if and only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are failing on larger problems. On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. wrote: > > There is also > > $ ./configure --help | grep color > --with-is-color-value-type= > char, short can store 256, 65536 colors current: short > > I can't imagine you have over 65 k colors but something to check > > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: > > > > My first guess is that the code is getting integer overflow somewhere. > 25 billion is well over the 2 billion that 32 bit integers can hold. 
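[Editor's note] A quick way to confirm which integer width a given PETSc build actually uses, relevant to the overflow discussion above (a small standalone check, not part of the thread's code):

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = PetscPrintf(PETSC_COMM_WORLD, "sizeof(PetscInt) = %d bytes, PETSC_MAX_INT = %lld\n",
                     (int)sizeof(PetscInt), (long long)PETSC_MAX_INT);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

With --with-64-bit-indices this reports 8 bytes; a 4-byte PetscInt tops out around 2.1e9 and cannot index 25 billion nonzeros.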
> > > > Mine as well -- though in later tests I have the same issue when using > --with-64-bit-indices. Ironically I had removed that flag at some point > because the coloring / index set was using a serious chunk of total memory > on medium sized problems. > > Understood > > > > > Questions on the petsc internals there though: Are matrices indexed with > two integers (i,j) so the max matrix dimension is (int limit) x (int limit) > or a single integer so the max dimension is sqrt(int limit)? > > Also I was operating under the assumption the 32 bit limit should only > constrain per-process problem sizes (25B over 400 processes giving 62M > non-zeros per process), is that not right? > > It is mostly right but may not be right for everything in PETSc. For > example I don't know about the MatFD code > > Since using a debugger is not practical for large code counts to find > the point the two processes diverge you can try > > -log_trace > > or > > -log_trace filename > > in the second case it will generate one file per core called filename.%d > note it will produce a lot of output > > Good luck > > > > > > > We are adding more tests to nicely handle integer overflow but it is > not easy since it can occur in so many places > > > > Totally understood. I know the pain of only finding an overflow bug > after days of waiting in a cluster queue for a big job. > > > > We urge you to upgrade. > > > > I'll do that today and hope for the best. On first tests on 3.11.3, I > still have a couple issues with the coloring code: > > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned here: > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a > wrong jacobian unless I also set MatColoringSetWeightType(coloring, > MAT_COLORING_WEIGHT_LEXICAL); > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist. > > > > Thanks, > > Mark > > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. > wrote: > > > > My first guess is that the code is getting integer overflow > somewhere. 25 billion is well over the 2 billion that 32 bit integers can > hold. > > > > We urge you to upgrade. > > > > Regardless for problems this large you likely need the ./configure > option --with-64-bit-indices > > > > We are adding more tests to nicely handle integer overflow but it is > not easy since it can occur in so many places > > > > Hopefully this will resolve your problem with large process counts > > > > Barry > > > > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > I'm running some larger cases than I have previously with a working > code, and I'm running into failures I don't see on smaller cases. Failures > are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs > successfully on half size case on 200 cores. > > > > > > 1) The first error output from petsc is "MPI_Allreduce() called in > different locations". Is this a red herring, suggesting some process failed > prior to this and processes have diverged? > > > > > > 2) I don't think I'm running out of memory -- globally at least. Slurm > output shows e.g. > > > Memory Utilized: 459.15 GB (estimated maximum) > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) > > > I did try with and without --64-bit-indices. > > > > > > 3) The debug traces seem to vary, see below. I *think* the failure > might be happening in the vicinity of a Coloring call. 
I'm using > MatFDColoring like so: > > > > > > ISColoring iscoloring; > > > MatFDColoring fdcoloring; > > > MatColoring coloring; > > > > > > MatColoringCreate(ctx.JPre, &coloring); > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); > > > > > > // converges stalls badly without this on small cases, don't know > why > > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > > > > // none of these worked. > > > // MatColoringSetType(coloring, MATCOLORINGJP); > > > // MatColoringSetType(coloring, MATCOLORINGSL); > > > // MatColoringSetType(coloring, MATCOLORINGID); > > > MatColoringSetFromOptions(coloring); > > > > > > MatColoringApply(coloring, &iscoloring); > > > MatColoringDestroy(&coloring); > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); > > > > > > I have had issues in the past with getting a functional coloring setup > for finite difference jacobians, and the above is the only configuration > I've managed to get working successfully. Have there been any significant > development changes to that area of code since v3.8.3? I'll try upgrading > in the mean time and hope for the best. > > > > > > > > > > > > Any ideas? > > > > > > > > > Thanks, > > > Mark > > > > > > > > > ************************************* > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (functions) on different processors > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Invalid argument > > > [0]PETSC ERROR: Enum value must be same on all processes, argument # 2 > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or > the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > [0]PETSC ERROR: is given. > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 > /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 > /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > > [0]PETSC ERROR: [0] DMCreate line 36 > /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > > [0]PETSC ERROR: [0] DMShellCreate line 983 > /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > > [0]PETSC ERROR: [0] TSGetDM line 5287 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > > > > ************************************* > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > [0]PETSC ERROR: is given. 
> > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 > /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 > /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > [0]PETSC ERROR: [0] MatColoringApply line 357 > /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file > > > > > > > > > > > > ************************* > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: #6 VecSetType() line 51 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Object is in wrong state > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > > > > > ************************************** > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > [0]PETSC ERROR: is given. 
> > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Aug 10 15:38:08 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 10 Aug 2019 20:38:08 +0000 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> Message-ID: <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> Mark, Would you be able to cook up an example (or examples) that demonstrate the problem (or problems) and how to run it? If you send it to us and we can reproduce the problem then we'll fix it. If need be you can send large matrices to petsc-maint at mcs.anl.gov don't send them to petsc-users since it will reject large files. Barry > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: > > Thanks Barry, been trying all of the above. I think I've homed in on it to an out-of-memory and/or integer overflow inside MatColoringApply. Which makes some sense since I only have a sequential coloring algorithm working... > > Is anyone out there using coloring in parallel? 
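[Editor's note] Along the lines of Barry's request below, a skeleton of the kind of standalone reproducer that could be sent in: it builds a distributed AIJ matrix with a simple tridiagonal pattern (a stand-in for the real Jacobian structure) and runs MatColoringApply on it, so the coloring can be exercised at scale in isolation. The size and sparsity pattern are placeholders.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  MatColoring    mc;
  ISColoring     iscoloring;
  PetscInt       i, rstart, rend, N = 1 << 20;   /* make N large to probe overflow */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = PetscOptionsGetInt(NULL, NULL, "-N", &N, NULL);CHKERRQ(ierr);

  ierr = MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, N, 3, NULL, 2, NULL, &A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {              /* tridiagonal pattern as a stand-in */
    PetscScalar v[3]    = {-1.0, 2.0, -1.0};
    PetscInt    cols[3] = {i - 1, i, i + 1};
    PetscInt    ncols = 3, *c = cols;
    PetscScalar *vv = v;
    if (i == 0)     { c = cols + 1; vv = v + 1; ncols = 2; }
    if (i == N - 1) { ncols--; }
    ierr = MatSetValues(A, 1, &i, ncols, c, vv, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatColoringCreate(A, &mc);CHKERRQ(ierr);
  ierr = MatColoringSetType(mc, MATCOLORINGGREEDY);CHKERRQ(ierr);   /* overridable via -mat_coloring_type */
  ierr = MatColoringSetFromOptions(mc);CHKERRQ(ierr);
  ierr = MatColoringApply(mc, &iscoloring);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "coloring done\n");CHKERRQ(ierr);

  ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);
  ierr = MatColoringDestroy(&mc);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Run with, e.g., mpiexec -n 400 ./color_test -N 100000000 -mat_coloring_type greedy to approximate the failing configuration.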
I still have the same previously mentioned issues with MATCOLORINGJP (on small problems takes upwards of 30 minutes to run) which as far as I can see is the only "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on less large problems, MATCOLORINGGREEDY works on less large problems if and only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are failing on larger problems. > > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. wrote: > > There is also > > $ ./configure --help | grep color > --with-is-color-value-type= > char, short can store 256, 65536 colors current: short > > I can't imagine you have over 65 k colors but something to check > > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: > > > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > > > Mine as well -- though in later tests I have the same issue when using --with-64-bit-indices. Ironically I had removed that flag at some point because the coloring / index set was using a serious chunk of total memory on medium sized problems. > > Understood > > > > > Questions on the petsc internals there though: Are matrices indexed with two integers (i,j) so the max matrix dimension is (int limit) x (int limit) or a single integer so the max dimension is sqrt(int limit)? > > Also I was operating under the assumption the 32 bit limit should only constrain per-process problem sizes (25B over 400 processes giving 62M non-zeros per process), is that not right? > > It is mostly right but may not be right for everything in PETSc. For example I don't know about the MatFD code > > Since using a debugger is not practical for large code counts to find the point the two processes diverge you can try > > -log_trace > > or > > -log_trace filename > > in the second case it will generate one file per core called filename.%d note it will produce a lot of output > > Good luck > > > > > > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > > > Totally understood. I know the pain of only finding an overflow bug after days of waiting in a cluster queue for a big job. > > > > We urge you to upgrade. > > > > I'll do that today and hope for the best. On first tests on 3.11.3, I still have a couple issues with the coloring code: > > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned here: https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a wrong jacobian unless I also set MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist. > > > > Thanks, > > Mark > > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. wrote: > > > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > > > We urge you to upgrade. 
> > > > Regardless for problems this large you likely need the ./configure option --with-64-bit-indices > > > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > > > Hopefully this will resolve your problem with large process counts > > > > Barry > > > > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users wrote: > > > > > > I'm running some larger cases than I have previously with a working code, and I'm running into failures I don't see on smaller cases. Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs successfully on half size case on 200 cores. > > > > > > 1) The first error output from petsc is "MPI_Allreduce() called in different locations". Is this a red herring, suggesting some process failed prior to this and processes have diverged? > > > > > > 2) I don't think I'm running out of memory -- globally at least. Slurm output shows e.g. > > > Memory Utilized: 459.15 GB (estimated maximum) > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) > > > I did try with and without --64-bit-indices. > > > > > > 3) The debug traces seem to vary, see below. I *think* the failure might be happening in the vicinity of a Coloring call. I'm using MatFDColoring like so: > > > > > > ISColoring iscoloring; > > > MatFDColoring fdcoloring; > > > MatColoring coloring; > > > > > > MatColoringCreate(ctx.JPre, &coloring); > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); > > > > > > // converges stalls badly without this on small cases, don't know why > > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > > > > // none of these worked. > > > // MatColoringSetType(coloring, MATCOLORINGJP); > > > // MatColoringSetType(coloring, MATCOLORINGSL); > > > // MatColoringSetType(coloring, MATCOLORINGID); > > > MatColoringSetFromOptions(coloring); > > > > > > MatColoringApply(coloring, &iscoloring); > > > MatColoringDestroy(&coloring); > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); > > > > > > I have had issues in the past with getting a functional coloring setup for finite difference jacobians, and the above is the only configuration I've managed to get working successfully. Have there been any significant development changes to that area of code since v3.8.3? I'll try upgrading in the mean time and hope for the best. > > > > > > > > > > > > Any ideas? > > > > > > > > > Thanks, > > > Mark > > > > > > > > > ************************************* > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (functions) on different processors > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Invalid argument > > > [0]PETSC ERROR: Enum value must be same on all processes, argument # 2 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > [0]PETSC ERROR: is given. 
> > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > > [0]PETSC ERROR: [0] DMCreate line 36 /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > > [0]PETSC ERROR: [0] DMShellCreate line 983 /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > > [0]PETSC ERROR: [0] TSGetDM line 5287 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > > > > ************************************* > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > [0]PETSC 
ERROR: INSTEAD the line number of the start of the function > > > [0]PETSC ERROR: is given. > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > [0]PETSC ERROR: [0] MatColoringApply line 357 /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file > > > > > > > > > > > > ************************* > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > [0]PETSC ERROR: #6 VecSetType() line 51 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Object is in wrong state > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > > > > > ************************************** > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > [0]PETSC ERROR: is given. 
> > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file > > > > > > > > > > > > From mlohry at gmail.com Sun Aug 11 08:49:22 2019 From: mlohry at gmail.com (Mark Lohry) Date: Sun, 11 Aug 2019 09:49:22 -0400 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> Message-ID: Hi Barry, I made a minimum example comparing the colorings on a very small case. You'll need to unzip the jacobian_sparsity.tgz to run it. https://github.com/mlohry/petsc_miscellany This is sparse block system with 50x50 block sizes, ~7,680 blocks. Comparing the coloring types sl, lf, jp, id, greedy, I get these timings wallclock, running with -np 16: SL: 1.5s LF: 1.3s JP: 29s ! ID: 1.4s greedy: 2s As far as I'm aware, JP is the only parallel coloring implemented? It is looking as though I'm simply running out of memory with the sequential methods (I should apologize to my cluster admin for chewing up 10TB and crashing...). On this small problem JP is taking 30 seconds wallclock, but that time grows exponentially with larger problems (last I tried it, I killed the job after 24 hours of spinning.) 
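For reference, a minimal sketch of the kind of timing harness behind such a comparison could look like the following. The function name, and the assumption that the Jacobian sparsity is already assembled into a Mat J, are illustrative only and are not taken from the repository above.

    #include <petscmat.h>
    #include <petsctime.h>

    /* Time one MatColoringApply on the sparsity pattern held in J. The algorithm is
       chosen at run time, e.g. -mat_coloring_type sl|lf|jp|id|greedy. */
    static PetscErrorCode TimeColoring(Mat J)
    {
      MatColoring    mc;
      ISColoring     iscoloring;
      MatFDColoring  fdcoloring;
      PetscLogDouble t0, t1;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = MatColoringCreate(J, &mc);CHKERRQ(ierr);
      ierr = MatColoringSetDistance(mc, 2);CHKERRQ(ierr);                            /* distance-2 coloring, as needed for FD Jacobians */
      ierr = MatColoringSetType(mc, MATCOLORINGGREEDY);CHKERRQ(ierr);                /* default; overridden from the options database */
      ierr = MatColoringSetWeightType(mc, MAT_COLORING_WEIGHT_LEXICAL);CHKERRQ(ierr);/* the lexical-weight workaround discussed in this thread */
      ierr = MatColoringSetFromOptions(mc);CHKERRQ(ierr);
      ierr = PetscTime(&t0);CHKERRQ(ierr);
      ierr = MatColoringApply(mc, &iscoloring);CHKERRQ(ierr);
      ierr = PetscTime(&t1);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD, "MatColoringApply: %g s\n", (double)(t1 - t0));CHKERRQ(ierr);
      ierr = MatFDColoringCreate(J, iscoloring, &fdcoloring);CHKERRQ(ierr);          /* would normally be kept for the FD Jacobian */
      ierr = MatFDColoringDestroy(&fdcoloring);CHKERRQ(ierr);
      ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);
      ierr = MatColoringDestroy(&mc);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

Run once per coloring type, e.g. mpiexec -np 16 ./harness -mat_coloring_type jp, to reproduce a table like the one above.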
Also as I mentioned, the "greedy" method appears to be producing an invalid coloring for me unless I also specify weights "lexical". But "-mat_coloring_test" doesn't complain. I'll have to make a different example to actually show it's an invalid coloring. Thanks, Mark On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. wrote: > > Mark, > > Would you be able to cook up an example (or examples) that demonstrate > the problem (or problems) and how to run it? If you send it to us and we > can reproduce the problem then we'll fix it. If need be you can send large > matrices to petsc-maint at mcs.anl.gov don't send them to petsc-users since > it will reject large files. > > Barry > > > > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: > > > > Thanks Barry, been trying all of the above. I think I've homed in on it > to an out-of-memory and/or integer overflow inside MatColoringApply. Which > makes some sense since I only have a sequential coloring algorithm > working... > > > > Is anyone out there using coloring in parallel? I still have the same > previously mentioned issues with MATCOLORINGJP (on small problems takes > upwards of 30 minutes to run) which as far as I can see is the only > "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on > less large problems, MATCOLORINGGREEDY works on less large problems if and > only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are > failing on larger problems. > > > > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. > wrote: > > > > There is also > > > > $ ./configure --help | grep color > > --with-is-color-value-type= > > char, short can store 256, 65536 colors current: short > > > > I can't imagine you have over 65 k colors but something to check > > > > > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: > > > > > > My first guess is that the code is getting integer overflow somewhere. > 25 billion is well over the 2 billion that 32 bit integers can hold. > > > > > > Mine as well -- though in later tests I have the same issue when using > --with-64-bit-indices. Ironically I had removed that flag at some point > because the coloring / index set was using a serious chunk of total memory > on medium sized problems. > > > > Understood > > > > > > > > Questions on the petsc internals there though: Are matrices indexed > with two integers (i,j) so the max matrix dimension is (int limit) x (int > limit) or a single integer so the max dimension is sqrt(int limit)? > > > Also I was operating under the assumption the 32 bit limit should only > constrain per-process problem sizes (25B over 400 processes giving 62M > non-zeros per process), is that not right? > > > > It is mostly right but may not be right for everything in PETSc. For > example I don't know about the MatFD code > > > > Since using a debugger is not practical for large code counts to find > the point the two processes diverge you can try > > > > -log_trace > > > > or > > > > -log_trace filename > > > > in the second case it will generate one file per core called > filename.%d note it will produce a lot of output > > > > Good luck > > > > > > > > > > > > We are adding more tests to nicely handle integer overflow but it > is not easy since it can occur in so many places > > > > > > Totally understood. I know the pain of only finding an overflow bug > after days of waiting in a cluster queue for a big job. > > > > > > We urge you to upgrade. > > > > > > I'll do that today and hope for the best. 
On first tests on 3.11.3, I > still have a couple issues with the coloring code: > > > > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned > here: > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html > > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a > wrong jacobian unless I also set MatColoringSetWeightType(coloring, > MAT_COLORING_WEIGHT_LEXICAL); > > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist. > > > > > > Thanks, > > > Mark > > > > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. > wrote: > > > > > > My first guess is that the code is getting integer overflow > somewhere. 25 billion is well over the 2 billion that 32 bit integers can > hold. > > > > > > We urge you to upgrade. > > > > > > Regardless for problems this large you likely need the ./configure > option --with-64-bit-indices > > > > > > We are adding more tests to nicely handle integer overflow but it > is not easy since it can occur in so many places > > > > > > Hopefully this will resolve your problem with large process counts > > > > > > Barry > > > > > > > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > > > I'm running some larger cases than I have previously with a working > code, and I'm running into failures I don't see on smaller cases. Failures > are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs > successfully on half size case on 200 cores. > > > > > > > > 1) The first error output from petsc is "MPI_Allreduce() called in > different locations". Is this a red herring, suggesting some process failed > prior to this and processes have diverged? > > > > > > > > 2) I don't think I'm running out of memory -- globally at least. > Slurm output shows e.g. > > > > Memory Utilized: 459.15 GB (estimated maximum) > > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) > > > > I did try with and without --64-bit-indices. > > > > > > > > 3) The debug traces seem to vary, see below. I *think* the failure > might be happening in the vicinity of a Coloring call. I'm using > MatFDColoring like so: > > > > > > > > ISColoring iscoloring; > > > > MatFDColoring fdcoloring; > > > > MatColoring coloring; > > > > > > > > MatColoringCreate(ctx.JPre, &coloring); > > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); > > > > > > > > // converges stalls badly without this on small cases, don't know > why > > > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > > > > > > // none of these worked. > > > > // MatColoringSetType(coloring, MATCOLORINGJP); > > > > // MatColoringSetType(coloring, MATCOLORINGSL); > > > > // MatColoringSetType(coloring, MATCOLORINGID); > > > > MatColoringSetFromOptions(coloring); > > > > > > > > MatColoringApply(coloring, &iscoloring); > > > > MatColoringDestroy(&coloring); > > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); > > > > > > > > I have had issues in the past with getting a functional coloring > setup for finite difference jacobians, and the above is the only > configuration I've managed to get working successfully. Have there been any > significant development changes to that area of code since v3.8.3? I'll try > upgrading in the mean time and hope for the best. > > > > > > > > > > > > > > > > Any ideas? 
> > > > > > > > > > > > Thanks, > > > > Mark > > > > > > > > > > > > ************************************* > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429773.out > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (functions) on different processors > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Invalid argument > > > > [0]PETSC ERROR: Enum value must be same on all processes, argument # > 2 > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or > the batch system) has told this process to end > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple > Mac OS X to find memory corruption errors > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > > [0]PETSC ERROR: is given. > > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 > /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 > /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > > > [0]PETSC ERROR: [0] DMCreate line 36 > /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > > > [0]PETSC ERROR: [0] DMShellCreate line 983 > /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > > > [0]PETSC ERROR: [0] TSGetDM line 5287 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Signal received > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by > mlohry Tue Aug 6 06:05:02 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > > > > > > > ************************************* > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429158.out > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple > Mac OS X to find memory corruption errors > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > > [0]PETSC ERROR: is given. 
> > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 > /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 > /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > [0]PETSC ERROR: [0] MatColoringApply line 357 > /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Signal received > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > > ************************* > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429134.out > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: #6 VecSetType() line 51 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Object is in wrong state > > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > > > > > > > > > ************************************** > > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429102.out > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code > lines) on different processors > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple > Mac OS X to find memory corruption errors > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > > [0]PETSC ERROR: is given. 
> > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Signal received > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by > mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Sun Aug 11 09:17:10 2019 From: mlohry at gmail.com (Mark Lohry) Date: Sun, 11 Aug 2019 10:17:10 -0400 Subject: [petsc-users] Possibility of using out of date Jacobians with TS In-Reply-To: References: Message-ID: Anecdotal: I've been *shocked* at how long I can let the -snes_lag_preconditioner go with -snes_mf_operator. I have it configured to only recompute the preconditioner whenever it hits my linear solver iteration limit, which pretty much never happens on unsteady problems. On Fri, Aug 9, 2019 at 11:11 AM Steve via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thank you Barry, that's very helpful. > > I'll have a play with those various options and see how I get on. > > > On 09/08/2019 15:43, Smith, Barry F. wrote: > > Steve, > > > > There are two possibilities > > > > 1) completely under user control, when you are asked for a new Jacobian > you can evaluate the current conditions and decide whether to generate a > new one. 
For example get from SNES the number of iterations it required and > if that is starting to go up then get a new one or check if the time-step > is being cut because the nonlinear solver is becoming "too hard" and > generate a new one. > > > > It is also possible to use -snes_mf_operator (or an inline version) > that uses matrix-free to apply the Jacobian and the Jacobian you provide to > compute the preconditioner. This allows you to keep the current > Jacobian/preconditioner even longer before rebuilding. Here you can use the > increase in the number of linear iterations to decide when to change the > Jacobian. > > > > 2) let PETSc decide when to rebuild the Jacobian. This is more limited > since it has no direct measure of how well the Jacobian is doing. Some > possibilities are > > -snes_lag_jacobian -snes_lag_jacobian_persists -snes_lag_preconditioner > -snes_lag_preconditioner_persists This introduces yet another parameter > under your control; you can lag the generation of the new preconditioner > even when you get a new preconditioner (this makes sense only when you are > not using -snes_mf_operator), > > > > So, at a high level, you have a great deal of freedom to control when > you recreate the Jacobian (and preconditioner), will be problem dependent > and the optimal value will depend on your problem and specific integrator. > Final note, when you rebuild may also depend on how far you have > integrated, when the nonlinear effects are strong you probably want to > rebuild often but when the solution is close to linearly evolving less > often. > > > > If generating the Jacobian/preconditioner is expensive relative to > everything else a good combination is -snes_mf_operator and a pretty lagged > generation of new Jacobians. > > > > Barry > > > > > > > > > >> On Aug 9, 2019, at 9:25 AM, Steve via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> > >> Hi, > >> > >> I'm experimenting with the use of PETSc to replace a DAE solver in an > existing code that I use to solve stiff nonlinear problems. I expect to > use TSBDF in the final instance, and so am currently playing with it but > applied to a simpler linear problem - just to get some experience with the > SNES/KSP/PC controls before diving in to the hard problem. > >> > >> Below is some output from TSAdapt for the simple linear problem, using > TSBDF and PCLU, so that the linear algebra solve in the newton loop is > direct: > >> > >> TSAdapt basic bdf 0:2 step 0 accepted t=0 + 1.000e-03 > dt=2.000e-03 wlte=2.51e-07 wltea= -1 wlter= -1 > >> TSResidual... > >> TSJacobian... calculate > >> TSResidual... > >> TSAdapt basic bdf 0:2 step 1 accepted t=0.001 + 2.000e-03 > dt=4.000e-03 wlte=2.83e-07 wltea= -1 wlter= -1 > >> TSResidual... > >> TSJacobian... calculate > >> TSResidual... > >> TSAdapt basic bdf 0:2 step 2 accepted t=0.003 + 4.000e-03 > dt=8.000e-03 wlte=1.22e-07 wltea= -1 wlter= -1 > >> TSResidual... > >> TSJacobian... calculate > >> TSResidual... > >> > >> I have added the "TSResidual..." and "TSJacobian..." echoes so that I > can see when PETSc is requesting residuals and Jacobians to be computed. > (This is the Jacobian routine specified via TSSetIJacobian.) > >> > >> Regarding the above output, it appears that TS / SNES always requests a > new (I)Jacobian at each new timestep (after the first residual is > calculated). I can see mathematically why this would be the default > choice, but had hoped that it might be possible for out-of-date Jacobians > to be used until they become inefficient. 
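A minimal sketch of the user-controlled lagging described in option 1 above: the AppCtx fields, the iteration threshold, and the function name are illustrative assumptions, not taken from any code in this thread, and the actual assembly of dF/dU + shift*dF/dUdot is application specific.

    #include <petscts.h>

    typedef struct {
      PetscInt  max_ksp_its;    /* rebuild when the last linear solve needed more than this */
      PetscBool have_jacobian;  /* has B been assembled at least once? */
    } AppCtx;

    static PetscErrorCode FormIJacobian(TS ts, PetscReal t, Vec U, Vec Udot, PetscReal shift,
                                        Mat A, Mat B, void *ctx)
    {
      AppCtx        *user = (AppCtx*)ctx;
      SNES           snes;
      KSP            ksp;
      PetscInt       lastits = 0;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
      ierr = SNESGetKSP(snes, &ksp);CHKERRQ(ierr);
      ierr = KSPGetIterationNumber(ksp, &lastits);CHKERRQ(ierr);  /* iterations of the most recent linear solve */
      if (!user->have_jacobian || lastits > user->max_ksp_its) {
        /* ... assemble dF/dU + shift*dF/dUdot into B here (application specific) ... */
        user->have_jacobian = PETSC_TRUE;
      }
      /* Otherwise B is left holding the lagged Jacobian, including its old shift.
         With -snes_mf_operator the true Jacobian action is applied matrix-free and
         the lagged B is only used to build the preconditioner. */
      ierr = MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      if (A != B) {
        ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
        ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      }
      PetscFunctionReturn(0);
    }

This would be registered with TSSetIJacobian(ts, A, B, FormIJacobian, &user) and, following the suggestion above, typically combined with -snes_mf_operator so that only the preconditioner is stale.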
My reason for wanting this is > that the Jacobian calculations for the intended application are > particularly expensive, but for small enough timesteps out-of-date > Jacobians may be good enough, for a few steps. > >> > >> Is there any way of specifying that out-of-date (I)Jacobians can be > tolerated (at the expense of increased Newton iterations, or smaller > timesteps)? Alternatively would it make sense to include callbacks to TS / > SNES from the Jacobian evaluation function to determine whether > sufficiently few iterations have been used that it might be safe to return > the previously calculated Jacobian (if I store a copy)? If so, is there > any advice on how I should do this? > >> > >> NB. I see that there is an option for TSRHSJacobianSetReuse(), but this > only applies to the RHS component of the DAE (the G(t,u) part, using the > terminology from the manual), but I am not using this as ultimately I > expect to be solving strongly nonlinear problems with no "slow" G(t,u) part. > >> > >> Any advice would be greatly appreciated. > >> > >> > -- > Dr Steven J Benbow > Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown Road, > Henley-on-Thames, Oxfordshire RG9 1HG, UK > Tel: 01491 636246 DD: 01491 630051 Web: http://www.quintessa.org > > Quintessa Limited is an employee-owned company registered in England, > Number 3716623. > Registered office: Quintessa Ltd, First Floor, West Wing, Videcom House, > Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK > > If you have received this e-mail in error, please notify > privacy at quintessa.org and delete it from your system > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Sun Aug 11 10:21:09 2019 From: mlohry at gmail.com (Mark Lohry) Date: Sun, 11 Aug 2019 11:21:09 -0400 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> Message-ID: On the very large case, there does appear to be some kind of overflow ending up with an attempt to allocate too much memory in MatFDColorCreate, even with --with-64-bit-indices. Full terminal output here: https://raw.githubusercontent.com/mlohry/petsc_miscellany/master/slurm-3451378.out In particular: PETSC ERROR: Memory requested 1036713571771129344 Log filename here: https://github.com/mlohry/petsc_miscellany/blob/master/petsclogfile.0 On Sun, Aug 11, 2019 at 9:49 AM Mark Lohry wrote: > Hi Barry, I made a minimum example comparing the colorings on a very small > case. You'll need to unzip the jacobian_sparsity.tgz to run it. > > https://github.com/mlohry/petsc_miscellany > > This is sparse block system with 50x50 block sizes, ~7,680 blocks. > Comparing the coloring types sl, lf, jp, id, greedy, I get these timings > wallclock, running with -np 16: > > SL: 1.5s > LF: 1.3s > JP: 29s ! > ID: 1.4s > greedy: 2s > > As far as I'm aware, JP is the only parallel coloring implemented? It is > looking as though I'm simply running out of memory with the sequential > methods (I should apologize to my cluster admin for chewing up 10TB and > crashing...). > > On this small problem JP is taking 30 seconds wallclock, but that time > grows exponentially with larger problems (last I tried it, I killed the job > after 24 hours of spinning.) > > Also as I mentioned, the "greedy" method appears to be producing an > invalid coloring for me unless I also specify weights "lexical". 
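One way to actually demonstrate a bad coloring on a small case is to build the Jacobian once through the coloring and once with the brute-force finite-difference routine, then compare the two. A rough sketch, where snes, x, Jpre and fdcoloring are placeholders for the existing objects, and which is only sensible on a small problem:

#include <petscsnes.h>

PetscErrorCode CheckColoredJacobian(SNES snes, Vec x, Mat Jpre, MatFDColoring fdcoloring)
{
  Mat            Jcolor, Jdense;
  PetscReal      nrm;
  PetscErrorCode ierr;

  ierr = MatDuplicate(Jpre,MAT_DO_NOT_COPY_VALUES,&Jcolor);CHKERRQ(ierr);
  ierr = MatDuplicate(Jpre,MAT_DO_NOT_COPY_VALUES,&Jdense);CHKERRQ(ierr);
  ierr = SNESComputeJacobianDefaultColor(snes,x,Jcolor,Jcolor,fdcoloring);CHKERRQ(ierr); /* FD Jacobian through the coloring */
  ierr = SNESComputeJacobianDefault(snes,x,Jdense,Jdense,NULL);CHKERRQ(ierr);            /* FD Jacobian one column at a time */
  ierr = MatAXPY(Jdense,-1.0,Jcolor,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = MatNorm(Jdense,NORM_FROBENIUS,&nrm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"||J_fd - J_colored||_F = %g\n",(double)nrm);CHKERRQ(ierr);
  ierr = MatDestroy(&Jcolor);CHKERRQ(ierr);
  ierr = MatDestroy(&Jdense);CHKERRQ(ierr);
  return 0;
}

A large norm points at the coloring (two structurally dependent columns sharing a color); a small one points elsewhere.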
But > "-mat_coloring_test" doesn't complain. I'll have to make a different > example to actually show it's an invalid coloring. > > Thanks, > Mark > > > > On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. > wrote: > >> >> Mark, >> >> Would you be able to cook up an example (or examples) that >> demonstrate the problem (or problems) and how to run it? If you send it to >> us and we can reproduce the problem then we'll fix it. If need be you can >> send large matrices to petsc-maint at mcs.anl.gov don't send them to >> petsc-users since it will reject large files. >> >> Barry >> >> >> > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: >> > >> > Thanks Barry, been trying all of the above. I think I've homed in on it >> to an out-of-memory and/or integer overflow inside MatColoringApply. Which >> makes some sense since I only have a sequential coloring algorithm >> working... >> > >> > Is anyone out there using coloring in parallel? I still have the same >> previously mentioned issues with MATCOLORINGJP (on small problems takes >> upwards of 30 minutes to run) which as far as I can see is the only >> "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on >> less large problems, MATCOLORINGGREEDY works on less large problems if and >> only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are >> failing on larger problems. >> > >> > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. >> wrote: >> > >> > There is also >> > >> > $ ./configure --help | grep color >> > --with-is-color-value-type= >> > char, short can store 256, 65536 colors current: short >> > >> > I can't imagine you have over 65 k colors but something to check >> > >> > >> > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: >> > > >> > > My first guess is that the code is getting integer overflow >> somewhere. 25 billion is well over the 2 billion that 32 bit integers can >> hold. >> > > >> > > Mine as well -- though in later tests I have the same issue when >> using --with-64-bit-indices. Ironically I had removed that flag at some >> point because the coloring / index set was using a serious chunk of total >> memory on medium sized problems. >> > >> > Understood >> > >> > > >> > > Questions on the petsc internals there though: Are matrices indexed >> with two integers (i,j) so the max matrix dimension is (int limit) x (int >> limit) or a single integer so the max dimension is sqrt(int limit)? >> > > Also I was operating under the assumption the 32 bit limit should >> only constrain per-process problem sizes (25B over 400 processes giving 62M >> non-zeros per process), is that not right? >> > >> > It is mostly right but may not be right for everything in PETSc. For >> example I don't know about the MatFD code >> > >> > Since using a debugger is not practical for large code counts to >> find the point the two processes diverge you can try >> > >> > -log_trace >> > >> > or >> > >> > -log_trace filename >> > >> > in the second case it will generate one file per core called >> filename.%d note it will produce a lot of output >> > >> > Good luck >> > >> > >> > >> > > >> > > We are adding more tests to nicely handle integer overflow but it >> is not easy since it can occur in so many places >> > > >> > > Totally understood. I know the pain of only finding an overflow bug >> after days of waiting in a cluster queue for a big job. >> > > >> > > We urge you to upgrade. >> > > >> > > I'll do that today and hope for the best. 
On first tests on 3.11.3, I >> still have a couple issues with the coloring code: >> > > >> > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned >> here: >> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html >> > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a >> wrong jacobian unless I also set MatColoringSetWeightType(coloring, >> MAT_COLORING_WEIGHT_LEXICAL); >> > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist. >> > > >> > > Thanks, >> > > Mark >> > > >> > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. >> wrote: >> > > >> > > My first guess is that the code is getting integer overflow >> somewhere. 25 billion is well over the 2 billion that 32 bit integers can >> hold. >> > > >> > > We urge you to upgrade. >> > > >> > > Regardless for problems this large you likely need the >> ./configure option --with-64-bit-indices >> > > >> > > We are adding more tests to nicely handle integer overflow but it >> is not easy since it can occur in so many places >> > > >> > > Hopefully this will resolve your problem with large process counts >> > > >> > > Barry >> > > >> > > >> > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > > > >> > > > I'm running some larger cases than I have previously with a working >> code, and I'm running into failures I don't see on smaller cases. Failures >> are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs >> successfully on half size case on 200 cores. >> > > > >> > > > 1) The first error output from petsc is "MPI_Allreduce() called in >> different locations". Is this a red herring, suggesting some process failed >> prior to this and processes have diverged? >> > > > >> > > > 2) I don't think I'm running out of memory -- globally at least. >> Slurm output shows e.g. >> > > > Memory Utilized: 459.15 GB (estimated maximum) >> > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) >> > > > I did try with and without --64-bit-indices. >> > > > >> > > > 3) The debug traces seem to vary, see below. I *think* the failure >> might be happening in the vicinity of a Coloring call. I'm using >> MatFDColoring like so: >> > > > >> > > > ISColoring iscoloring; >> > > > MatFDColoring fdcoloring; >> > > > MatColoring coloring; >> > > > >> > > > MatColoringCreate(ctx.JPre, &coloring); >> > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); >> > > > >> > > > // converges stalls badly without this on small cases, don't >> know why >> > > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); >> > > > >> > > > // none of these worked. >> > > > // MatColoringSetType(coloring, MATCOLORINGJP); >> > > > // MatColoringSetType(coloring, MATCOLORINGSL); >> > > > // MatColoringSetType(coloring, MATCOLORINGID); >> > > > MatColoringSetFromOptions(coloring); >> > > > >> > > > MatColoringApply(coloring, &iscoloring); >> > > > MatColoringDestroy(&coloring); >> > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); >> > > > >> > > > I have had issues in the past with getting a functional coloring >> setup for finite difference jacobians, and the above is the only >> configuration I've managed to get working successfully. Have there been any >> significant development changes to that area of code since v3.8.3? I'll try >> upgrading in the mean time and hope for the best. >> > > > >> > > > >> > > > >> > > > Any ideas? 
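For reference, a rough sketch of how the fdcoloring created above is usually attached to SNES so that the finite-difference Jacobian is computed through it; FormFunction, ctx and snes are placeholders for the application's residual routine, user context and solver:

/* Attach the residual routine and finish setting up the coloring. */
MatFDColoringSetFunction(fdcoloring, (PetscErrorCode (*)(void))FormFunction, &ctx);
MatFDColoringSetFromOptions(fdcoloring);                /* e.g. -mat_fd_coloring_err, -mat_fd_coloring_umin */
MatFDColoringSetUp(ctx.JPre, iscoloring, fdcoloring);
ISColoringDestroy(&iscoloring);
/* SNES then drives the colored finite-difference Jacobian. */
SNESSetJacobian(snes, ctx.JPre, ctx.JPre, SNESComputeJacobianDefaultColor, fdcoloring);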
>> > > > >> > > > >> > > > Thanks, >> > > > Mark >> > > > >> > > > >> > > > ************************************* >> > > > >> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429773.out >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (functions) on different processors >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by >> mlohry Tue Aug 6 06:05:02 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >> > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Invalid argument >> > > > [0]PETSC ERROR: Enum value must be same on all processes, argument >> # 2 >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by >> mlohry Tue Aug 6 06:05:02 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >> > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> > > > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > > > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple >> Mac OS X to find memory corruption errors >> > > > [0]PETSC ERROR: likely location of problem given in stack below >> > > > [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> > > > [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> > > > [0]PETSC ERROR: is given. >> > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 >> /home/mlohry/build/external/petsc/src/sys/objects/tagm.c >> > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 >> /home/mlohry/build/external/petsc/src/sys/objects/inherit.c >> > > > [0]PETSC ERROR: [0] DMCreate line 36 >> /home/mlohry/build/external/petsc/src/dm/interface/dm.c >> > > > [0]PETSC ERROR: [0] DMShellCreate line 983 >> /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c >> > > > [0]PETSC ERROR: [0] TSGetDM line 5287 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Signal received >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by >> mlohry Tue Aug 6 06:05:02 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >> > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file >> > > > >> > > > >> > > > ************************************* >> > > > >> > > > >> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429158.out >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code >> lines) on different processors >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by >> mlohry Mon Aug 5 23:58:19 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code >> lines) on different processors >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by >> mlohry Mon Aug 5 23:58:19 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c >> > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >> Violation, probably memory access out of range >> > > > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > > > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple >> Mac OS X to find memory corruption errors >> > > > [0]PETSC ERROR: likely location of problem given in stack below >> > > > [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> > > > [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> > > > [0]PETSC ERROR: is given. 
>> > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 >> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c >> > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 >> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c >> > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 >> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c >> > > > [0]PETSC ERROR: [0] MatColoringApply line 357 >> /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c >> > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c >> > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Signal received >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by >> mlohry Mon Aug 5 23:58:19 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file >> > > > >> > > > >> > > > >> > > > ************************* >> > > > >> > > > >> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429134.out >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code >> lines) on different processors >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by >> mlohry Mon Aug 5 23:24:23 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in >> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c >> > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in >> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c >> > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in >> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c >> > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > [0]PETSC ERROR: #6 VecSetType() line 51 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c >> > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Object is in wrong state >> > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by >> mlohry Mon Aug 5 23:24:23 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > >> > > > >> > > > >> > > > ************************************** >> > > > >> > > > >> > > > >> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429102.out >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code >> lines) on different processors >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by >> mlohry Mon Aug 5 22:50:12 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code >> lines) on different processors >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by >> mlohry Mon Aug 5 22:50:12 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code >> lines) on different processors >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by >> mlohry Mon Aug 5 22:50:12 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >> Violation, probably memory access out of range >> > > > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > > > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple >> Mac OS X to find memory corruption errors >> > > > [0]PETSC ERROR: likely location of problem given in stack below >> > > > [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> > > > [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> > > > [0]PETSC ERROR: is given. 
>> > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > [0]PETSC ERROR: Signal received >> > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by >> mlohry Mon Aug 5 22:50:12 2019 >> > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file >> > > > >> > > > >> > > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Aug 11 13:12:52 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sun, 11 Aug 2019 18:12:52 +0000 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> Message-ID: <4379D27A-C950-4733-82F4-2BDDFF93D154@mcs.anl.gov> These are due to attempting to copy the entire matrix to one process and do the sequential coloring there. Definitely won't work for larger problems, we'll need to focus on 1) having useful parallel coloring and 2) maybe using an alternative way to determine the coloring: where does your matrix come from? A mesh? Structured, unstructured, a graph, something else? What type of discretization? 
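If the matrix does come from a DM (for example a structured DMDA with a known stencil), one such alternative is to generate the coloring from the DM in parallel rather than from the assembled matrix, which avoids gathering the matrix onto one process. A minimal sketch with placeholder names dm, Jpre and fdcoloring:

#include <petscdm.h>
#include <petscsnes.h>

PetscErrorCode ColoringFromDM(DM dm, Mat Jpre, MatFDColoring *fdcoloring)
{
  ISColoring     iscoloring;
  PetscErrorCode ierr;

  ierr = DMCreateColoring(dm,IS_COLORING_GLOBAL,&iscoloring);CHKERRQ(ierr);
  ierr = MatFDColoringCreate(Jpre,iscoloring,fdcoloring);CHKERRQ(ierr);
  ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);
  return 0;
}

For a fully unstructured graph with no DM attached this does not help, and a parallel MatColoring is still needed.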
Barry > On Aug 11, 2019, at 10:21 AM, Mark Lohry wrote: > > On the very large case, there does appear to be some kind of overflow ending up with an attempt to allocate too much memory in MatFDColorCreate, even with --with-64-bit-indices. Full terminal output here: > https://raw.githubusercontent.com/mlohry/petsc_miscellany/master/slurm-3451378.out > > In particular: > PETSC ERROR: Memory requested 1036713571771129344 > > Log filename here: > https://github.com/mlohry/petsc_miscellany/blob/master/petsclogfile.0 > > On Sun, Aug 11, 2019 at 9:49 AM Mark Lohry wrote: > Hi Barry, I made a minimum example comparing the colorings on a very small case. You'll need to unzip the jacobian_sparsity.tgz to run it. > > https://github.com/mlohry/petsc_miscellany > > This is sparse block system with 50x50 block sizes, ~7,680 blocks. Comparing the coloring types sl, lf, jp, id, greedy, I get these timings wallclock, running with -np 16: > > SL: 1.5s > LF: 1.3s > JP: 29s ! > ID: 1.4s > greedy: 2s > > As far as I'm aware, JP is the only parallel coloring implemented? It is looking as though I'm simply running out of memory with the sequential methods (I should apologize to my cluster admin for chewing up 10TB and crashing...). > > On this small problem JP is taking 30 seconds wallclock, but that time grows exponentially with larger problems (last I tried it, I killed the job after 24 hours of spinning.) > > Also as I mentioned, the "greedy" method appears to be producing an invalid coloring for me unless I also specify weights "lexical". But "-mat_coloring_test" doesn't complain. I'll have to make a different example to actually show it's an invalid coloring. > > Thanks, > Mark > > > > On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. wrote: > > Mark, > > Would you be able to cook up an example (or examples) that demonstrate the problem (or problems) and how to run it? If you send it to us and we can reproduce the problem then we'll fix it. If need be you can send large matrices to petsc-maint at mcs.anl.gov don't send them to petsc-users since it will reject large files. > > Barry > > > > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: > > > > Thanks Barry, been trying all of the above. I think I've homed in on it to an out-of-memory and/or integer overflow inside MatColoringApply. Which makes some sense since I only have a sequential coloring algorithm working... > > > > Is anyone out there using coloring in parallel? I still have the same previously mentioned issues with MATCOLORINGJP (on small problems takes upwards of 30 minutes to run) which as far as I can see is the only "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on less large problems, MATCOLORINGGREEDY works on less large problems if and only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are failing on larger problems. > > > > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. wrote: > > > > There is also > > > > $ ./configure --help | grep color > > --with-is-color-value-type= > > char, short can store 256, 65536 colors current: short > > > > I can't imagine you have over 65 k colors but something to check > > > > > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: > > > > > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > > > > > Mine as well -- though in later tests I have the same issue when using --with-64-bit-indices. 
Ironically I had removed that flag at some point because the coloring / index set was using a serious chunk of total memory on medium sized problems. > > > > Understood > > > > > > > > Questions on the petsc internals there though: Are matrices indexed with two integers (i,j) so the max matrix dimension is (int limit) x (int limit) or a single integer so the max dimension is sqrt(int limit)? > > > Also I was operating under the assumption the 32 bit limit should only constrain per-process problem sizes (25B over 400 processes giving 62M non-zeros per process), is that not right? > > > > It is mostly right but may not be right for everything in PETSc. For example I don't know about the MatFD code > > > > Since using a debugger is not practical for large code counts to find the point the two processes diverge you can try > > > > -log_trace > > > > or > > > > -log_trace filename > > > > in the second case it will generate one file per core called filename.%d note it will produce a lot of output > > > > Good luck > > > > > > > > > > > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > > > > > Totally understood. I know the pain of only finding an overflow bug after days of waiting in a cluster queue for a big job. > > > > > > We urge you to upgrade. > > > > > > I'll do that today and hope for the best. On first tests on 3.11.3, I still have a couple issues with the coloring code: > > > > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned here: https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html > > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a wrong jacobian unless I also set MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist. > > > > > > Thanks, > > > Mark > > > > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. wrote: > > > > > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > > > > > We urge you to upgrade. > > > > > > Regardless for problems this large you likely need the ./configure option --with-64-bit-indices > > > > > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > > > > > Hopefully this will resolve your problem with large process counts > > > > > > Barry > > > > > > > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users wrote: > > > > > > > > I'm running some larger cases than I have previously with a working code, and I'm running into failures I don't see on smaller cases. Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs successfully on half size case on 200 cores. > > > > > > > > 1) The first error output from petsc is "MPI_Allreduce() called in different locations". Is this a red herring, suggesting some process failed prior to this and processes have diverged? > > > > > > > > 2) I don't think I'm running out of memory -- globally at least. Slurm output shows e.g. > > > > Memory Utilized: 459.15 GB (estimated maximum) > > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) > > > > I did try with and without --64-bit-indices. > > > > > > > > 3) The debug traces seem to vary, see below. I *think* the failure might be happening in the vicinity of a Coloring call. 
I'm using MatFDColoring like so: > > > > > > > > ISColoring iscoloring; > > > > MatFDColoring fdcoloring; > > > > MatColoring coloring; > > > > > > > > MatColoringCreate(ctx.JPre, &coloring); > > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); > > > > > > > > // converges stalls badly without this on small cases, don't know why > > > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > > > > > > // none of these worked. > > > > // MatColoringSetType(coloring, MATCOLORINGJP); > > > > // MatColoringSetType(coloring, MATCOLORINGSL); > > > > // MatColoringSetType(coloring, MATCOLORINGID); > > > > MatColoringSetFromOptions(coloring); > > > > > > > > MatColoringApply(coloring, &iscoloring); > > > > MatColoringDestroy(&coloring); > > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); > > > > > > > > I have had issues in the past with getting a functional coloring setup for finite difference jacobians, and the above is the only configuration I've managed to get working successfully. Have there been any significant development changes to that area of code since v3.8.3? I'll try upgrading in the mean time and hope for the best. > > > > > > > > > > > > > > > > Any ideas? > > > > > > > > > > > > Thanks, > > > > Mark > > > > > > > > > > > > ************************************* > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (functions) on different processors > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Invalid argument > > > > [0]PETSC ERROR: Enum value must be same on all processes, argument # 2 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > > [0]PETSC ERROR: is given. > > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > > > [0]PETSC ERROR: [0] DMCreate line 36 /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > > > [0]PETSC ERROR: [0] DMShellCreate line 983 /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > > > [0]PETSC ERROR: [0] TSGetDM line 5287 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Signal received > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > > > > > > > ************************************* > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > > [0]PETSC ERROR: is given. 
> > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > [0]PETSC ERROR: [0] MatColoringApply line 357 /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Signal received > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > > ************************* > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > [0]PETSC ERROR: #6 VecSetType() line 51 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Object is in wrong state > > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > > > > > > > > > ************************************** > > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > > [0]PETSC ERROR: is given. 
> > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > [0]PETSC ERROR: Signal received > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > > > > From mlohry at gmail.com Sun Aug 11 14:47:31 2019 From: mlohry at gmail.com (Mark Lohry) Date: Sun, 11 Aug 2019 15:47:31 -0400 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: <4379D27A-C950-4733-82F4-2BDDFF93D154@mcs.anl.gov> References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> <4379D27A-C950-4733-82F4-2BDDFF93D154@mcs.anl.gov> Message-ID: Sorry, forgot to reply to the mailing list. where does your matrix come from? A mesh? Structured, unstructured, a > graph, something else? What type of discretization? Unstructured tetrahedral mesh (CGNS, I can give links to the files if that's of interest), the discretization is arbitrary order discontinuous galerkin for compressible navier-stokes. 5 coupled equations x 10 nodes per element for this 2nd order case to give the 50x50 blocks. Each tet cell dependent on neighbors, so for tets 4 extra off-diagonal blocks per cell. 
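(For concreteness, a minimal sketch of how that block structure might be preallocated on the PETSc side, assuming a BAIJ matrix; the local cell count and the 1-diagonal-plus-4-neighbor block bounds below are illustrative placeholders, not taken from the actual code:)

    #include <petscmat.h>

    /* hypothetical sizes, for illustration only */
    PetscInt bs          = 50;     /* 5 equations x 10 nodes per element            */
    PetscInt nLocalCells = 4800;   /* tet cells (block rows) owned by this rank     */
    Mat      J;

    MatCreateBAIJ(PETSC_COMM_WORLD, bs,
                  bs*nLocalCells, bs*nLocalCells,
                  PETSC_DETERMINE, PETSC_DETERMINE,
                  5, NULL,   /* at most self + 4 face neighbors in the local (diagonal) part */
                  4, NULL,   /* at most 4 face neighbors owned by other ranks                */
                  &J);
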
I would expect one could exploit the large block size here in computing the coloring -- the underlying mesh is 2M nodes with the same connectivity as a standard cell-centered finite volume method. On Sun, Aug 11, 2019 at 2:12 PM Smith, Barry F. wrote: > > These are due to attempting to copy the entire matrix to one process and > do the sequential coloring there. Definitely won't work for larger > problems, we'll > > need to focus on > > 1) having useful parallel coloring and > 2) maybe using an alternative way to determine the coloring: > > where does your matrix come from? A mesh? Structured, unstructured, a > graph, something else? What type of discretization? > > Barry > > > > On Aug 11, 2019, at 10:21 AM, Mark Lohry wrote: > > > > On the very large case, there does appear to be some kind of overflow > ending up with an attempt to allocate too much memory in MatFDColorCreate, > even with --with-64-bit-indices. Full terminal output here: > > > https://raw.githubusercontent.com/mlohry/petsc_miscellany/master/slurm-3451378.out > > > > In particular: > > PETSC ERROR: Memory requested 1036713571771129344 > > > > Log filename here: > > https://github.com/mlohry/petsc_miscellany/blob/master/petsclogfile.0 > > > > On Sun, Aug 11, 2019 at 9:49 AM Mark Lohry wrote: > > Hi Barry, I made a minimum example comparing the colorings on a very > small case. You'll need to unzip the jacobian_sparsity.tgz to run it. > > > > https://github.com/mlohry/petsc_miscellany > > > > This is sparse block system with 50x50 block sizes, ~7,680 blocks. > Comparing the coloring types sl, lf, jp, id, greedy, I get these timings > wallclock, running with -np 16: > > > > SL: 1.5s > > LF: 1.3s > > JP: 29s ! > > ID: 1.4s > > greedy: 2s > > > > As far as I'm aware, JP is the only parallel coloring implemented? It is > looking as though I'm simply running out of memory with the sequential > methods (I should apologize to my cluster admin for chewing up 10TB and > crashing...). > > > > On this small problem JP is taking 30 seconds wallclock, but that time > grows exponentially with larger problems (last I tried it, I killed the job > after 24 hours of spinning.) > > > > Also as I mentioned, the "greedy" method appears to be producing an > invalid coloring for me unless I also specify weights "lexical". But > "-mat_coloring_test" doesn't complain. I'll have to make a different > example to actually show it's an invalid coloring. > > > > Thanks, > > Mark > > > > > > > > On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. > wrote: > > > > Mark, > > > > Would you be able to cook up an example (or examples) that > demonstrate the problem (or problems) and how to run it? If you send it to > us and we can reproduce the problem then we'll fix it. If need be you can > send large matrices to petsc-maint at mcs.anl.gov don't send them to > petsc-users since it will reject large files. > > > > Barry > > > > > > > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: > > > > > > Thanks Barry, been trying all of the above. I think I've homed in on > it to an out-of-memory and/or integer overflow inside MatColoringApply. > Which makes some sense since I only have a sequential coloring algorithm > working... > > > > > > Is anyone out there using coloring in parallel? I still have the same > previously mentioned issues with MATCOLORINGJP (on small problems takes > upwards of 30 minutes to run) which as far as I can see is the only > "parallel" implementation. 
MATCOLORINGSL and MATCOLORINGID both work on > less large problems, MATCOLORINGGREEDY works on less large problems if and > only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are > failing on larger problems. > > > > > > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. > wrote: > > > > > > There is also > > > > > > $ ./configure --help | grep color > > > --with-is-color-value-type= > > > char, short can store 256, 65536 colors current: short > > > > > > I can't imagine you have over 65 k colors but something to check > > > > > > > > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: > > > > > > > > My first guess is that the code is getting integer overflow > somewhere. 25 billion is well over the 2 billion that 32 bit integers can > hold. > > > > > > > > Mine as well -- though in later tests I have the same issue when > using --with-64-bit-indices. Ironically I had removed that flag at some > point because the coloring / index set was using a serious chunk of total > memory on medium sized problems. > > > > > > Understood > > > > > > > > > > > Questions on the petsc internals there though: Are matrices indexed > with two integers (i,j) so the max matrix dimension is (int limit) x (int > limit) or a single integer so the max dimension is sqrt(int limit)? > > > > Also I was operating under the assumption the 32 bit limit should > only constrain per-process problem sizes (25B over 400 processes giving 62M > non-zeros per process), is that not right? > > > > > > It is mostly right but may not be right for everything in PETSc. > For example I don't know about the MatFD code > > > > > > Since using a debugger is not practical for large code counts to > find the point the two processes diverge you can try > > > > > > -log_trace > > > > > > or > > > > > > -log_trace filename > > > > > > in the second case it will generate one file per core called > filename.%d note it will produce a lot of output > > > > > > Good luck > > > > > > > > > > > > > > > > > We are adding more tests to nicely handle integer overflow but it > is not easy since it can occur in so many places > > > > > > > > Totally understood. I know the pain of only finding an overflow bug > after days of waiting in a cluster queue for a big job. > > > > > > > > We urge you to upgrade. > > > > > > > > I'll do that today and hope for the best. On first tests on 3.11.3, > I still have a couple issues with the coloring code: > > > > > > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned > here: > https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html > > > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a > wrong jacobian unless I also set MatColoringSetWeightType(coloring, > MAT_COLORING_WEIGHT_LEXICAL); > > > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to > exist. > > > > > > > > Thanks, > > > > Mark > > > > > > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. > wrote: > > > > > > > > My first guess is that the code is getting integer overflow > somewhere. 25 billion is well over the 2 billion that 32 bit integers can > hold. > > > > > > > > We urge you to upgrade. 
> > > > > > > > Regardless for problems this large you likely need the > ./configure option --with-64-bit-indices > > > > > > > > We are adding more tests to nicely handle integer overflow but it > is not easy since it can occur in so many places > > > > > > > > Hopefully this will resolve your problem with large process counts > > > > > > > > Barry > > > > > > > > > > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > > > > > > > I'm running some larger cases than I have previously with a > working code, and I'm running into failures I don't see on smaller cases. > Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. > Runs successfully on half size case on 200 cores. > > > > > > > > > > 1) The first error output from petsc is "MPI_Allreduce() called in > different locations". Is this a red herring, suggesting some process failed > prior to this and processes have diverged? > > > > > > > > > > 2) I don't think I'm running out of memory -- globally at least. > Slurm output shows e.g. > > > > > Memory Utilized: 459.15 GB (estimated maximum) > > > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) > > > > > I did try with and without --64-bit-indices. > > > > > > > > > > 3) The debug traces seem to vary, see below. I *think* the failure > might be happening in the vicinity of a Coloring call. I'm using > MatFDColoring like so: > > > > > > > > > > ISColoring iscoloring; > > > > > MatFDColoring fdcoloring; > > > > > MatColoring coloring; > > > > > > > > > > MatColoringCreate(ctx.JPre, &coloring); > > > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); > > > > > > > > > > // converges stalls badly without this on small cases, don't > know why > > > > > MatColoringSetWeightType(coloring, > MAT_COLORING_WEIGHT_LEXICAL); > > > > > > > > > > // none of these worked. > > > > > // MatColoringSetType(coloring, MATCOLORINGJP); > > > > > // MatColoringSetType(coloring, MATCOLORINGSL); > > > > > // MatColoringSetType(coloring, MATCOLORINGID); > > > > > MatColoringSetFromOptions(coloring); > > > > > > > > > > MatColoringApply(coloring, &iscoloring); > > > > > MatColoringDestroy(&coloring); > > > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); > > > > > > > > > > I have had issues in the past with getting a functional coloring > setup for finite difference jacobians, and the above is the only > configuration I've managed to get working successfully. Have there been any > significant development changes to that area of code since v3.8.3? I'll try > upgrading in the mean time and hope for the best. > > > > > > > > > > > > > > > > > > > > Any ideas? > > > > > > > > > > > > > > > Thanks, > > > > > Mark > > > > > > > > > > > > > > > ************************************* > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429773.out > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (functions) on different processors > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 > by mlohry Tue Aug 6 06:05:02 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Invalid argument > > > > > [0]PETSC ERROR: Enum value must be same on all processes, argument > # 2 > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 > by mlohry Tue Aug 6 06:05:02 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process > (or the batch system) has told this process to end > > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple > Mac OS X to find memory corruption errors > > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > > > [0]PETSC ERROR: is given. 
> > > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 > /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 > /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > > > > [0]PETSC ERROR: [0] DMCreate line 36 > /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > > > > [0]PETSC ERROR: [0] DMShellCreate line 983 > /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > > > > [0]PETSC ERROR: [0] TSGetDM line 5287 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Signal received > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 > by mlohry Tue Aug 6 06:05:02 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > ************************************* > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429158.out > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (code lines) on different processors > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (code lines) on different processors > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple > Mac OS X to find memory corruption errors > > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > > [0]PETSC ERROR: 
--------------------- Stack Frames > ------------------------------------ > > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > > > [0]PETSC ERROR: is given. > > > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line > 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 > /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > > [0]PETSC ERROR: [0] MatColoringApply line 357 > /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Signal received > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by > mlohry Mon Aug 5 23:58:19 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown > file > > > > > > > > > > > > > > > > > > > > ************************* > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429134.out > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (code lines) on different processors > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in > /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in > /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: #6 VecSetType() line 51 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in > /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Object is in wrong state > > > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by > mlohry Mon Aug 5 23:24:23 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in > /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > > > > > > > > > > > > > ************************************** > > > > > > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" > slurm-3429102.out > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (code lines) on different processors > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 > by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (code lines) on different processors > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 > by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations > (code lines) on different processors > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 > by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple > Mac OS X to find memory corruption errors > > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the > function > > > > > [0]PETSC ERROR: is given. 
> > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 > /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 > /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 > /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 > /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Signal received > > > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 > by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options > PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt > --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc > --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx > --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes > COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 > --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS > --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown > file > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at gmail.com Sun Aug 11 20:41:35 2019 From: mlohry at gmail.com (Mark Lohry) Date: Sun, 11 Aug 2019 21:41:35 -0400 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> <4379D27A-C950-4733-82F4-2BDDFF93D154@mcs.anl.gov> Message-ID: So the parallel JP runs just as proportionally slow in serial as it does in parallel. valgrind --tool=callgrind shows essentially 100% of the runtime in jp.c:255-262, within the larger loop commented /* pass two -- color it by looking at nearby vertices and building a mask */ for (j=0;j wrote: > Sorry, forgot to reply to the mailing list. > > where does your matrix come from? A mesh? Structured, unstructured, a >> graph, something else? What type of discretization? 
> > > Unstructured tetrahedral mesh (CGNS, I can give links to the files if > that's of interest), the discretization is arbitrary order discontinuous > galerkin for compressible navier-stokes. 5 coupled equations x 10 nodes per > element for this 2nd order case to give the 50x50 blocks. Each tet cell > dependent on neighbors, so for tets 4 extra off-diagonal blocks per cell. > > I would expect one could exploit the large block size here in computing > the coloring -- the underlying mesh is 2M nodes with the same connectivity > as a standard cell-centered finite volume method. > > > > On Sun, Aug 11, 2019 at 2:12 PM Smith, Barry F. > wrote: > >> >> These are due to attempting to copy the entire matrix to one process >> and do the sequential coloring there. Definitely won't work for larger >> problems, we'll >> >> need to focus on >> >> 1) having useful parallel coloring and >> 2) maybe using an alternative way to determine the coloring: >> >> where does your matrix come from? A mesh? Structured, unstructured, >> a graph, something else? What type of discretization? >> >> Barry >> >> >> > On Aug 11, 2019, at 10:21 AM, Mark Lohry wrote: >> > >> > On the very large case, there does appear to be some kind of overflow >> ending up with an attempt to allocate too much memory in MatFDColorCreate, >> even with --with-64-bit-indices. Full terminal output here: >> > >> https://raw.githubusercontent.com/mlohry/petsc_miscellany/master/slurm-3451378.out >> > >> > In particular: >> > PETSC ERROR: Memory requested 1036713571771129344 >> > >> > Log filename here: >> > https://github.com/mlohry/petsc_miscellany/blob/master/petsclogfile.0 >> > >> > On Sun, Aug 11, 2019 at 9:49 AM Mark Lohry wrote: >> > Hi Barry, I made a minimum example comparing the colorings on a very >> small case. You'll need to unzip the jacobian_sparsity.tgz to run it. >> > >> > https://github.com/mlohry/petsc_miscellany >> > >> > This is sparse block system with 50x50 block sizes, ~7,680 blocks. >> Comparing the coloring types sl, lf, jp, id, greedy, I get these timings >> wallclock, running with -np 16: >> > >> > SL: 1.5s >> > LF: 1.3s >> > JP: 29s ! >> > ID: 1.4s >> > greedy: 2s >> > >> > As far as I'm aware, JP is the only parallel coloring implemented? It >> is looking as though I'm simply running out of memory with the sequential >> methods (I should apologize to my cluster admin for chewing up 10TB and >> crashing...). >> > >> > On this small problem JP is taking 30 seconds wallclock, but that time >> grows exponentially with larger problems (last I tried it, I killed the job >> after 24 hours of spinning.) >> > >> > Also as I mentioned, the "greedy" method appears to be producing an >> invalid coloring for me unless I also specify weights "lexical". But >> "-mat_coloring_test" doesn't complain. I'll have to make a different >> example to actually show it's an invalid coloring. >> > >> > Thanks, >> > Mark >> > >> > >> > >> > On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. >> wrote: >> > >> > Mark, >> > >> > Would you be able to cook up an example (or examples) that >> demonstrate the problem (or problems) and how to run it? If you send it to >> us and we can reproduce the problem then we'll fix it. If need be you can >> send large matrices to petsc-maint at mcs.anl.gov don't send them to >> petsc-users since it will reject large files. >> > >> > Barry >> > >> > >> > > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: >> > > >> > > Thanks Barry, been trying all of the above. 
I think I've homed in on >> it to an out-of-memory and/or integer overflow inside MatColoringApply. >> Which makes some sense since I only have a sequential coloring algorithm >> working... >> > > >> > > Is anyone out there using coloring in parallel? I still have the same >> previously mentioned issues with MATCOLORINGJP (on small problems takes >> upwards of 30 minutes to run) which as far as I can see is the only >> "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on >> less large problems, MATCOLORINGGREEDY works on less large problems if and >> only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are >> failing on larger problems. >> > > >> > > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. >> wrote: >> > > >> > > There is also >> > > >> > > $ ./configure --help | grep color >> > > --with-is-color-value-type= >> > > char, short can store 256, 65536 colors current: short >> > > >> > > I can't imagine you have over 65 k colors but something to check >> > > >> > > >> > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: >> > > > >> > > > My first guess is that the code is getting integer overflow >> somewhere. 25 billion is well over the 2 billion that 32 bit integers can >> hold. >> > > > >> > > > Mine as well -- though in later tests I have the same issue when >> using --with-64-bit-indices. Ironically I had removed that flag at some >> point because the coloring / index set was using a serious chunk of total >> memory on medium sized problems. >> > > >> > > Understood >> > > >> > > > >> > > > Questions on the petsc internals there though: Are matrices indexed >> with two integers (i,j) so the max matrix dimension is (int limit) x (int >> limit) or a single integer so the max dimension is sqrt(int limit)? >> > > > Also I was operating under the assumption the 32 bit limit should >> only constrain per-process problem sizes (25B over 400 processes giving 62M >> non-zeros per process), is that not right? >> > > >> > > It is mostly right but may not be right for everything in PETSc. >> For example I don't know about the MatFD code >> > > >> > > Since using a debugger is not practical for large code counts to >> find the point the two processes diverge you can try >> > > >> > > -log_trace >> > > >> > > or >> > > >> > > -log_trace filename >> > > >> > > in the second case it will generate one file per core called >> filename.%d note it will produce a lot of output >> > > >> > > Good luck >> > > >> > > >> > > >> > > > >> > > > We are adding more tests to nicely handle integer overflow but >> it is not easy since it can occur in so many places >> > > > >> > > > Totally understood. I know the pain of only finding an overflow bug >> after days of waiting in a cluster queue for a big job. >> > > > >> > > > We urge you to upgrade. >> > > > >> > > > I'll do that today and hope for the best. On first tests on 3.11.3, >> I still have a couple issues with the coloring code: >> > > > >> > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned >> here: >> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html >> > > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a >> wrong jacobian unless I also set MatColoringSetWeightType(coloring, >> MAT_COLORING_WEIGHT_LEXICAL); >> > > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to >> exist. >> > > > >> > > > Thanks, >> > > > Mark >> > > > >> > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. 
>> wrote: >> > > > >> > > > My first guess is that the code is getting integer overflow >> somewhere. 25 billion is well over the 2 billion that 32 bit integers can >> hold. >> > > > >> > > > We urge you to upgrade. >> > > > >> > > > Regardless for problems this large you likely need the >> ./configure option --with-64-bit-indices >> > > > >> > > > We are adding more tests to nicely handle integer overflow but >> it is not easy since it can occur in so many places >> > > > >> > > > Hopefully this will resolve your problem with large process >> counts >> > > > >> > > > Barry >> > > > >> > > > >> > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> > > > > >> > > > > I'm running some larger cases than I have previously with a >> working code, and I'm running into failures I don't see on smaller cases. >> Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. >> Runs successfully on half size case on 200 cores. >> > > > > >> > > > > 1) The first error output from petsc is "MPI_Allreduce() called >> in different locations". Is this a red herring, suggesting some process >> failed prior to this and processes have diverged? >> > > > > >> > > > > 2) I don't think I'm running out of memory -- globally at least. >> Slurm output shows e.g. >> > > > > Memory Utilized: 459.15 GB (estimated maximum) >> > > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) >> > > > > I did try with and without --64-bit-indices. >> > > > > >> > > > > 3) The debug traces seem to vary, see below. I *think* the >> failure might be happening in the vicinity of a Coloring call. I'm using >> MatFDColoring like so: >> > > > > >> > > > > ISColoring iscoloring; >> > > > > MatFDColoring fdcoloring; >> > > > > MatColoring coloring; >> > > > > >> > > > > MatColoringCreate(ctx.JPre, &coloring); >> > > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); >> > > > > >> > > > > // converges stalls badly without this on small cases, don't >> know why >> > > > > MatColoringSetWeightType(coloring, >> MAT_COLORING_WEIGHT_LEXICAL); >> > > > > >> > > > > // none of these worked. >> > > > > // MatColoringSetType(coloring, MATCOLORINGJP); >> > > > > // MatColoringSetType(coloring, MATCOLORINGSL); >> > > > > // MatColoringSetType(coloring, MATCOLORINGID); >> > > > > MatColoringSetFromOptions(coloring); >> > > > > >> > > > > MatColoringApply(coloring, &iscoloring); >> > > > > MatColoringDestroy(&coloring); >> > > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); >> > > > > >> > > > > I have had issues in the past with getting a functional coloring >> setup for finite difference jacobians, and the above is the only >> configuration I've managed to get working successfully. Have there been any >> significant development changes to that area of code since v3.8.3? I'll try >> upgrading in the mean time and hope for the best. >> > > > > >> > > > > >> > > > > >> > > > > Any ideas? 
>> > > > > >> > > > > >> > > > > Thanks, >> > > > > Mark >> > > > > >> > > > > >> > > > > ************************************* >> > > > > >> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429773.out >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (functions) on different processors >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 >> by mlohry Tue Aug 6 06:05:02 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >> > > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Invalid argument >> > > > > [0]PETSC ERROR: Enum value must be same on all processes, >> argument # 2 >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 >> by mlohry Tue Aug 6 06:05:02 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >> > > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process >> (or the batch system) has told this process to end >> > > > > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > > > > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and >> Apple Mac OS X to find memory corruption errors >> > > > > [0]PETSC ERROR: likely location of problem given in stack below >> > > > > [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> > > > > [0]PETSC ERROR: is given. >> > > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 >> /home/mlohry/build/external/petsc/src/sys/objects/tagm.c >> > > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 >> /home/mlohry/build/external/petsc/src/sys/objects/inherit.c >> > > > > [0]PETSC ERROR: [0] DMCreate line 36 >> /home/mlohry/build/external/petsc/src/dm/interface/dm.c >> > > > > [0]PETSC ERROR: [0] DMShellCreate line 983 >> /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c >> > > > > [0]PETSC ERROR: [0] TSGetDM line 5287 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Signal received >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 >> by mlohry Tue Aug 6 06:05:02 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >> > > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown >> file >> > > > > >> > > > > >> > > > > ************************************* >> > > > > >> > > > > >> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429158.out >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (code lines) on different processors >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 >> by mlohry Mon Aug 5 23:58:19 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (code lines) on different processors >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 >> by mlohry Mon Aug 5 23:58:19 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c >> > > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >> Violation, probably memory access out of range >> > > > > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > > > > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and >> Apple Mac OS X to find memory corruption errors >> > > > > [0]PETSC ERROR: likely location of problem given in stack below >> > > > > [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> > > > > [0]PETSC ERROR: is given. 
>> > > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 >> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c >> > > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line >> 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c >> > > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 >> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c >> > > > > [0]PETSC ERROR: [0] MatColoringApply line 357 >> /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c >> > > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c >> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Signal received >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 >> by mlohry Mon Aug 5 23:58:19 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown >> file >> > > > > >> > > > > >> > > > > >> > > > > ************************* >> > > > > >> > > > > >> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429134.out >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (code lines) on different processors >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 >> by mlohry Mon Aug 5 23:24:23 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in >> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c >> > > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in >> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c >> > > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in >> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c >> > > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >> > > > > [0]PETSC ERROR: #6 VecSetType() line 51 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c >> > > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in >> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Object is in wrong state >> > > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 >> by mlohry Mon Aug 5 23:24:23 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in >> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >> > > > > >> > > > > >> > > > > >> > > > > ************************************** >> > > > > >> > > > > >> > > > > >> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >> slurm-3429102.out >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (code lines) on different processors >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >> by mlohry Mon Aug 5 22:50:12 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (code lines) on different processors >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >> by mlohry Mon Aug 5 22:50:12 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >> (code lines) on different processors >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >> by mlohry Mon Aug 5 22:50:12 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >> Violation, probably memory access out of range >> > > > > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > > > > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and >> Apple Mac OS X to find memory corruption errors >> > > > > [0]PETSC ERROR: likely location of problem given in stack below >> > > > > [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> > > > > [0]PETSC ERROR: is given. 
>> > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 >> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >> > > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 >> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >> > > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 >> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >> > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 >> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >> > > > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > > > [0]PETSC ERROR: Signal received >> > > > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >> by mlohry Mon Aug 5 22:50:12 2019 >> > > > > [0]PETSC ERROR: Configure options >> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >> --with-mpiexec=/usr/bin/srun >> > > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown >> file >> > > > > >> > > > > >> > > > > >> > > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From stevenbenbow at quintessa.org Mon Aug 12 02:49:47 2019 From: stevenbenbow at quintessa.org (Steve) Date: Mon, 12 Aug 2019 08:49:47 +0100 Subject: [petsc-users] Possibility of using out of date Jacobians with TS In-Reply-To: References: Message-ID: <45e0f72a-0451-da80-7d47-425b0a554e45@quintessa.org> Thanks Mark, if if I see similar performance I'll be very happy! On 11/08/2019 15:17, Mark Lohry wrote: > Anecdotal: I've been *shocked* at how long I can let the > -snes_lag_preconditioner go with -snes_mf_operator. I have it > configured to only recompute the preconditioner whenever it hits my > linear solver iteration limit, which pretty much never happens on > unsteady problems. > > On Fri, Aug 9, 2019 at 11:11 AM Steve via petsc-users > > wrote: > > Thank you Barry, that's very helpful. > > I'll have a play with those various options and see how I get on. > > > On 09/08/2019 15:43, Smith, Barry F. wrote: > >? ? ?Steve, > > > >? ? ? 
?There are two possibilities > > > > 1) completely under user control, when you are asked for a new > Jacobian you can evaluate the current conditions and decide > whether to generate a new one. For example get from SNES the > number of iterations it required and if that is starting to go up > then get a new one or check if the time-step is being cut because > the nonlinear solver is becoming "too hard" and generate a new one. > > > >? ? ?It is also possible to use -snes_mf_operator (or an inline > version) that uses matrix-free to apply the Jacobian and the > Jacobian you provide to compute the preconditioner. This allows > you to keep the current Jacobian/preconditioner even longer before > rebuilding. Here you can use the increase in the number of linear > iterations to decide when to change the Jacobian. > > > > 2) let PETSc decide when to rebuild the Jacobian. This is more > limited since it has no direct measure of how well the Jacobian is > doing. Some possibilities are > > -snes_lag_jacobian -snes_lag_jacobian_persists > -snes_lag_preconditioner -snes_lag_preconditioner_persists This > introduces yet another parameter under your control; you can lag > the generation of the new preconditioner even when you get a new > preconditioner (this makes sense only when you are not using > -snes_mf_operator), > > > >? ? So, at a high level, you have a great deal of freedom to > control when you recreate the Jacobian (and preconditioner), will > be problem dependent and the optimal value will depend on your > problem and specific integrator. Final note, when you rebuild may > also depend on how far you have integrated, when the nonlinear > effects are strong you probably want to rebuild often but when the > solution is close to linearly evolving less often. > > > >? ? If generating the Jacobian/preconditioner is expensive > relative to everything else a good combination is > -snes_mf_operator and a pretty lagged generation of new Jacobians. > > > >? ? ?Barry > > > > > > > > > >> On Aug 9, 2019, at 9:25 AM, Steve via petsc-users > > wrote: > >> > >> Hi, > >> > >> I'm experimenting with the use of PETSc to replace a DAE solver > in an existing code that I use to solve stiff nonlinear problems.? > I expect to use TSBDF in the final instance, and so am currently > playing with it but applied to a simpler linear problem - just to > get some experience with the SNES/KSP/PC controls before diving in > to the hard problem. > >> > >> Below is some output from TSAdapt for the simple linear > problem, using TSBDF and PCLU, so that the linear algebra solve in > the newton loop is direct: > >> > >>? ? ? TSAdapt basic bdf 0:2 step? ?0 accepted t=0 ? ? + > 1.000e-03 dt=2.000e-03? wlte=2.51e-07? wltea=? ?-1 wlter=? ?-1 > >>? ? ? ? ? TSResidual... > >>? ? ? ? ? TSJacobian... calculate > >>? ? ? ? ? TSResidual... > >>? ? ? TSAdapt basic bdf 0:2 step? ?1 accepted t=0.001 ? ? + > 2.000e-03 dt=4.000e-03? wlte=2.83e-07? wltea=? ?-1 wlter=? ?-1 > >>? ? ? ? ? TSResidual... > >>? ? ? ? ? TSJacobian... calculate > >>? ? ? ? ? TSResidual... > >>? ? ? TSAdapt basic bdf 0:2 step? ?2 accepted t=0.003 ? ? + > 4.000e-03 dt=8.000e-03? wlte=1.22e-07? wltea=? ?-1 wlter=? ?-1 > >>? ? ? ? ? TSResidual... > >>? ? ? ? ? TSJacobian... calculate > >>? ? ? ? ? TSResidual... > >> > >> I have added the "TSResidual..." and "TSJacobian..." echoes so > that I can see when PETSc is requesting residuals and Jacobians to > be computed.? (This is the Jacobian routine specified via > TSSetIJacobian.) 
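A minimal sketch of option 2) above, set from code rather than the command line. Everything here is illustrative: the TS handle ts and the error variable ierr are assumed to exist already, the lag value 5 is just an example, and the exact meaning of the lag value is documented on the SNESSetLagJacobian / SNESSetLagPreconditioner manual pages.

    SNES           snes;
    PetscErrorCode ierr;

    ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr);                                /* the SNES used inside the TS integrator         */
    ierr = SNESSetLagJacobian(snes,5);CHKERRQ(ierr);                         /* like -snes_lag_jacobian 5                      */
    ierr = SNESSetLagJacobianPersists(snes,PETSC_TRUE);CHKERRQ(ierr);        /* like -snes_lag_jacobian_persists: keep the lag */
    ierr = SNESSetLagPreconditioner(snes,5);CHKERRQ(ierr);                   /*   across SNES solves, i.e. across time steps   */
    ierr = SNESSetLagPreconditionerPersists(snes,PETSC_TRUE);CHKERRQ(ierr);  /* like -snes_lag_preconditioner_persists         */

Combined with -snes_mf_operator the assembled Jacobian is used only to build the preconditioner, so, as discussed above, it can typically be lagged even more aggressively.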
> >> > >> Regarding the above output, it appears that TS / SNES always > requests a new (I)Jacobian at each new timestep (after the first > residual is calculated).? I can see mathematically why this would > be the default choice, but had hoped that it might be possible for > out-of-date Jacobians to be used until they become inefficient. My > reason for wanting this is that the Jacobian calculations for the > intended application are particularly expensive, but for small > enough timesteps out-of-date Jacobians may be good enough, for a > few steps. > >> > >> Is there any way of specifying that out-of-date (I)Jacobians > can be tolerated (at the expense of increased Newton iterations, > or smaller timesteps)?? Alternatively would it make sense to > include callbacks to TS / SNES from the Jacobian evaluation > function to determine whether sufficiently few iterations have > been used that it might be safe to return the previously > calculated Jacobian (if I store a copy)?? If so, is there any > advice on how I should do this? > >> > >> NB. I see that there is an option for TSRHSJacobianSetReuse(), > but this only applies to the RHS component of the DAE (the G(t,u) > part, using the terminology from the manual), but I am not using > this as ultimately I expect to be solving strongly nonlinear > problems with no "slow" G(t,u) part. > >> > >> Any advice would be greatly appreciated. > >> > >> > -- > Dr Steven J Benbow > Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown > Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK > Tel: 01491 636246? DD: 01491 630051? Web: http://www.quintessa.org > > Quintessa Limited is an employee-owned company registered in > England, Number 3716623. > Registered office: Quintessa Ltd, First Floor, West Wing, Videcom > House, Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK > > If you have received this e-mail in error, please notify > privacy at quintessa.org and delete it > from your system > -- Dr Steven J Benbow Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK Tel: 01491 636246 DD: 01491 630051 Web: http://www.quintessa.org Quintessa Limited is an employee-owned company registered in England, Number 3716623. Registered office: Quintessa Ltd, First Floor, West Wing, Videcom House, Newtown Road, Henley-on-Thames, Oxfordshire RG9 1HG, UK If you have received this e-mail in error, please notify privacy at quintessa.org and delete it from your system -------------- next part -------------- An HTML attachment was scrubbed... URL: From Moritz.Huck at rwth-aachen.de Mon Aug 12 10:25:20 2019 From: Moritz.Huck at rwth-aachen.de (Huck, Moritz) Date: Mon, 12 Aug 2019 15:25:20 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: <5A251CF0-D34E-4823-B0CA-695CC21AC1B5@pnnl.gov> References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> , <5A251CF0-D34E-4823-B0CA-695CC21AC1B5@pnnl.gov> Message-ID: <002a46ca0d73467aa4a7a4f9dfb503ea@rwth-aachen.de> Hi, at the moment I am trying the Precheck version (I will try the event one afterwards). My precheckfunction is (pseudo code): precheckfunction(Vec X,Vec Y,PetscBool *changed){ if any(X+Y Gesendet: Donnerstag, 8. August 2019 19:16:12 An: Huck, Moritz; Smith, Barry F. Cc: petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI Moritz, I think your case will also work with using TSEvent. 
I think your problem is similar, correct me if I am wrong, to my application where I need to constrain the states within some limits, lb \le x. I use events to handle this, where I use two event functions: (i) x ? lb = 0. if x > lb & (ii) \dot{x} = 0 x = lb The first event function is used to detect when x hits the limit lb. Once it hits the limit, the differential equation for x is changed to (x-lb = 0) in the model to hold x at limit lb. For releasing x, there is an event function on the derivative of x, \dot{x}, and x is released on detection of the condition \dot{x} > 0. This is done through the event function \dot{x} = 0 with a positive zero crossing. An example of how the above works is in the example src/ts/examples/tutorials/power_grid/stability_9bus/ex9bus.c. In this example, there is an event function that first checks whether the state VR has hit the upper limit VRMAX. Once it does so, the flag VRatmax is set by the post-event function. The event function is then switched to the \dot{VR} if (!VRatmax[i])) fvalue[2+2*i] = VRMAX[i] - VR; } else { fvalue[2+2*i] = (VR - KA[i]*RF + KA[i]*KF[i]*Efd/TF[i] - KA[i]*(Vref[i] - Vm))/TA[i]; } You can either try TSEvent or what Barry suggested SNESLineSearchSetPreCheck(), or both. Thanks, Shri From: "Huck, Moritz" Date: Wednesday, August 7, 2019 at 8:46 AM To: "Smith, Barry F." Cc: "Abhyankar, Shrirang G" , "petsc-users at mcs.anl.gov" Subject: AW: [petsc-users] Problem with TS and SNES VI Thank you for your response. The sizes are only allowed to go down to a certain value. The non-physical values do also occur during the function evaluations (IFunction). I will try to implment your suggestions with SNESLineSearchSetPreCheck. This would mean I dont have to use SNESVISetVariableBounds at all, right? ________________________________________ Von: Smith, Barry F. > Gesendet: Dienstag, 6. August 2019 17:47:13 An: Huck, Moritz Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI Thanks, very useful. Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. Good luck and let us know how it goes Barry On Aug 6, 2019, at 9:24 AM, Huck, Moritz > wrote: At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. Unphysical values are e.g. particle sizes below zero. My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. 
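For reference, the step-clamping precheck discussed earlier in this thread might look roughly as follows in C. This is only a sketch of the pseudocode Moritz posted combined with Barry's SNESLineSearchSetPreCheck suggestion: the names BoundCtx, ClampStep, lb and boundctx are illustrative, error handling is minimal, the objects ts, snes and ierr are assumed to exist, and it follows the X + Y convention used in this discussion (check the sign convention of the search direction Y for the line search actually in use).

    typedef struct { PetscReal lb; } BoundCtx;                /* illustrative context carrying the lower bound */

    PetscErrorCode ClampStep(SNESLineSearch ls,Vec X,Vec Y,PetscBool *changed,void *ctx)
    {
      BoundCtx          *user = (BoundCtx*)ctx;
      const PetscScalar *x;
      PetscScalar       *y;
      PetscInt          i,n;
      PetscErrorCode    ierr;

      *changed = PETSC_FALSE;
      ierr = VecGetLocalSize(X,&n);CHKERRQ(ierr);
      ierr = VecGetArrayRead(X,&x);CHKERRQ(ierr);
      ierr = VecGetArray(Y,&y);CHKERRQ(ierr);
      for (i=0; i<n; i++) {
        if (PetscRealPart(x[i]+y[i]) < user->lb) {            /* trial point would violate the bound         */
          y[i]     = user->lb - x[i];                         /* shorten this component so x+y lands on lb   */
          *changed = PETSC_TRUE;
        }
      }
      ierr = VecRestoreArray(Y,&y);CHKERRQ(ierr);
      ierr = VecRestoreArrayRead(X,&x);CHKERRQ(ierr);
      return 0;
    }

    /* hook it up: TS -> SNES -> line search */
    SNESLineSearch linesearch;
    ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr);
    ierr = SNESGetLineSearch(snes,&linesearch);CHKERRQ(ierr);
    ierr = SNESLineSearchSetPreCheck(linesearch,ClampStep,&boundctx);CHKERRQ(ierr);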
________________________________________ Von: Smith, Barry F. > Gesendet: Dienstag, 6. August 2019 15:51:16 An: Huck, Moritz Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? Is this within a stage or at the actual time-step after the stage? Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). This information can help us determine what approach you should take. Thanks Barry On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users > wrote: Hi, I think I am missing something here. How would events help to constrain the states. Do you mean to use the event to "pause" to integration an adjust the state manually? Or are the events to enforce smaller timesteps when the state come close to the constraints? Thank you, Moritz ________________________________________ Von: Abhyankar, Shrirang G > Gesendet: Montag, 5. August 2019 17:21:41 An: Huck, Moritz; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html A brief intro to TSEvent can be found here. Thanks, Shri From: petsc-users > on behalf of "Huck, Moritz via petsc-users" > Reply-To: "Huck, Moritz" > Date: Monday, August 5, 2019 at 5:18 AM To: "petsc-users at mcs.anl.gov" > Subject: [petsc-users] Problem with TS and SNES VI Hi, I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. Are there some tolerances I have to set for VI or something like this? Best Regards, Moritz From juaneah at gmail.com Mon Aug 12 13:37:08 2019 From: juaneah at gmail.com (Emmanuel Ayala) Date: Mon, 12 Aug 2019 13:37:08 -0500 Subject: [petsc-users] MatCreateSubMatrix Message-ID: Hello, I have some doubts. I'm trying to create a submatrix (MatCreateSubMatrix). I have an identity matrix A. Then, in each process, I select some indices (owned in this process but in global numbering) that i need to remove (only columns) from the matrix A. These indices are storage in an PetscInt array (arr_cols), created with PetscMalloc1 (size: n_cols). My goal is to use the index set tool (IS). For this purpose, i did the next: 1. I defined the sub-matrix (submat, specifying the size) 2. I created the IS for the rows, in this case taking into account the MatGetOwnershipRange(A,&row_low,&row_high); ISCreateGeneral(COMMUNICATOR,n_rows,arr_rows,PETSC_COPY_VALUES,&is_rows);. 
This consider all the rows owned by the process. 2. I created the IS for columns: ISCreateGeneral(COMMUNICATOR,n_cols,arr_col,PETSC_COPY_VALUES,&is_col); The range for the is_col, was defined taking into account the the columns that will be in IT's "diagonal part", using MatGetOwnershipRangeColumn(A,&col_low,&col_high); 3. I create the submatrix: MatCreateSubMatrix(A,is_row,is_col,MAT_INITIAL_MATRIX,&submat); Questions: a. Which communicator i need to use, MPI_COMM_SELF or PETSC_COMM_WORLD? and why? b. Moreover, my approach to create the submatrix is ok? c. In the example: https://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex49.c.html there is a line 1239: ISSetBlockSize(is,2); what is the meaning of this line? Best regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 12 14:03:47 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 12 Aug 2019 15:03:47 -0400 Subject: [petsc-users] MatCreateSubMatrix In-Reply-To: References: Message-ID: On Mon, Aug 12, 2019 at 2:55 PM Emmanuel Ayala via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, I have some doubts. > > I'm trying to create a submatrix (MatCreateSubMatrix). I have an identity > matrix A. Then, in each process, I select some indices (owned in this > process but in global numbering) that i need to remove (only columns) from > the matrix A. These indices are storage in an PetscInt array (arr_cols), > created with PetscMalloc1 (size: n_cols). My goal is to use the index set > tool (IS). For this purpose, i did the next: > > 1. I defined the sub-matrix (submat, specifying the size) > 2. I created the IS for the rows, in this case taking into account the > MatGetOwnershipRange(A,&row_low,&row_high); > ISCreateGeneral(COMMUNICATOR,n_rows,arr_rows,PETSC_COPY_VALUES,&is_rows);. > This consider all the rows owned by the process. > 2. I created the IS for columns: > ISCreateGeneral(COMMUNICATOR,n_cols,arr_col,PETSC_COPY_VALUES,&is_col); > The range for the is_col, was defined taking into account the the columns > that will be in IT's "diagonal part", using > MatGetOwnershipRangeColumn(A,&col_low,&col_high); > 3. I create the submatrix: > MatCreateSubMatrix(A,is_row,is_col,MAT_INITIAL_MATRIX,&submat); > > Questions: > > a. Which communicator i need to use, MPI_COMM_SELF or PETSC_COMM_WORLD? > and why? > You will generally use the same communicator as A. MatGetSubmatrices() makes sequential matrices with SELF. > b. Moreover, my approach to create the submatrix is ok? > It sounds right. > c. In the example: > > https://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex49.c.html > there is a line > 1239: ISSetBlockSize(is,2); > what is the meaning of this line? > If your matrix has a blocksize, you can simplify indexing. I would get everything working right first. Thanks, Matt > Best regards. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 12 22:58:29 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Tue, 13 Aug 2019 03:58:29 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: <002a46ca0d73467aa4a7a4f9dfb503ea@rwth-aachen.de> References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> <5A251CF0-D34E-4823-B0CA-695CC21AC1B5@pnnl.gov> <002a46ca0d73467aa4a7a4f9dfb503ea@rwth-aachen.de> Message-ID: > On Aug 12, 2019, at 10:25 AM, Huck, Moritz wrote: > > Hi, > at the moment I am trying the Precheck version (I will try the event one afterwards). > My precheckfunction is (pseudo code): > precheckfunction(Vec X,Vec Y,PetscBool *changed){ > if any(X+Y *changed=True > Y[where(X+Y } > Inside precheck around 10-20 occurences of X+Y In my understanding this should not happen, since the precheck should be called before the IFunction call. For the nonlinear system solve once the precheck is done the new nonlinear solution approximation is computed via a line search X = X + lamba Y where lambda > 0 and for most line searches lambda <=1 (for example the SNESLINESEARCHBT will always result in lambda <=1, I am not sure about the other line searchs) X_i = X_i + lambda (lowerbound - X_i) = X_i - lambda X_i + lambda lowerbound = (1 - lambda) X_i + lambda lowerbound => (1 - lambda) lowerbound + lambda lowerbound = lowerbound Thus it seems you are correct, each step that the line search tries should satisfy the bounds. Possible issues: 1) the line search produces lambda > 1. Make sure you use SNESLINESEARCHBT ??? Here you would need to determine exactly when in the algorithm the IFunction is having as input X < lower bound. Somewhere in the ARKIMEX integrator? Are you using fully implicit? You might need to use fully implicit in order to enforce the bound? What I would do is run in the debugger and have it stop inside IFunction when the lower bound is not satisfied. Then do bt to see where the code is, in what part of the algorithms. If inside the line search you'll need to poke around at the values to see why the step could produce something below the bound which in theory it shouldn't Good luck Barry > > ________________________________________ > Von: Abhyankar, Shrirang G > Gesendet: Donnerstag, 8. August 2019 19:16:12 > An: Huck, Moritz; Smith, Barry F. > Cc: petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Moritz, > I think your case will also work with using TSEvent. I think your problem is similar, correct me if I am wrong, to my application where I need to constrain the states within some limits, lb \le x. I use events to handle this, where I use two event functions: > (i) x ? lb = 0. if x > lb & > (ii) \dot{x} = 0 x = lb > > The first event function is used to detect when x hits the limit lb. Once it hits the limit, the differential equation for x is changed to (x-lb = 0) in the model to hold x at limit lb. For releasing x, there is an event function on the derivative of x, \dot{x}, and x is released on detection of the condition \dot{x} > 0. This is done through the event function \dot{x} = 0 with a positive zero crossing. > > An example of how the above works is in the example src/ts/examples/tutorials/power_grid/stability_9bus/ex9bus.c. In this example, there is an event function that first checks whether the state VR has hit the upper limit VRMAX. Once it does so, the flag VRatmax is set by the post-event function. 
The event function is then switched to the \dot{VR} > if (!VRatmax[i])) > fvalue[2+2*i] = VRMAX[i] - VR; > } else { > fvalue[2+2*i] = (VR - KA[i]*RF + KA[i]*KF[i]*Efd/TF[i] - KA[i]*(Vref[i] - Vm))/TA[i]; > } > > You can either try TSEvent or what Barry suggested SNESLineSearchSetPreCheck(), or both. > > Thanks, > Shri > > > From: "Huck, Moritz" > Date: Wednesday, August 7, 2019 at 8:46 AM > To: "Smith, Barry F." > Cc: "Abhyankar, Shrirang G" , "petsc-users at mcs.anl.gov" > Subject: AW: [petsc-users] Problem with TS and SNES VI > > Thank you for your response. > The sizes are only allowed to go down to a certain value. > The non-physical values do also occur during the function evaluations (IFunction). > > I will try to implment your suggestions with SNESLineSearchSetPreCheck. This would mean I dont have to use SNESVISetVariableBounds at all, right? > ________________________________________ > Von: Smith, Barry F. > > Gesendet: Dienstag, 6. August 2019 17:47:13 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Thanks, very useful. > > Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? > > Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? > > If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use > SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. > > For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. > > Good luck and let us know how it goes > > Barry > > > > On Aug 6, 2019, at 9:24 AM, Huck, Moritz > wrote: > > At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. > Unphysical values are e.g. particle sizes below zero. > > My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. > > The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. > ________________________________________ > Von: Smith, Barry F. > > Gesendet: Dienstag, 6. August 2019 15:51:16 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? > > Is this within a stage or at the actual time-step after the stage? > > Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? > > Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). 
> > This information can help us determine what approach you should take. > > Thanks > > Barry > > > On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users > wrote: > > Hi, > I think I am missing something here. > How would events help to constrain the states. > Do you mean to use the event to "pause" to integration an adjust the state manually? > Or are the events to enforce smaller timesteps when the state come close to the constraints? > > Thank you, > Moritz > ________________________________________ > Von: Abhyankar, Shrirang G > > Gesendet: Montag, 5. August 2019 17:21:41 > An: Huck, Moritz; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. > > An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html > > A brief intro to TSEvent can be found here. > > Thanks, > Shri > > > From: petsc-users > on behalf of "Huck, Moritz via petsc-users" > > Reply-To: "Huck, Moritz" > > Date: Monday, August 5, 2019 at 5:18 AM > To: "petsc-users at mcs.anl.gov" > > Subject: [petsc-users] Problem with TS and SNES VI > > Hi, > I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. > The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). > But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. > Are there some tolerances I have to set for VI or something like this? > > Best Regards, > Moritz From tangqi at msu.edu Tue Aug 13 19:27:04 2019 From: tangqi at msu.edu (Tang, Qi) Date: Wed, 14 Aug 2019 00:27:04 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK Message-ID: ?Hi, I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? Is there anything else I could possibly tune in this context? The discretization is through mfem and I use standard H1 for my problem. Thanks, Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 13 20:07:22 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Wed, 14 Aug 2019 01:07:22 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: References: Message-ID: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > ?Hi, > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. The code that computes and reports the noise is in the directory src/snes/interface/noise You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. 
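(For reference, the differencing type and the err value can also be set directly on the matrix-free operator in code. A minimal sketch, in which snes and ierr already exist, P stands for whatever assembled matrix is used to build the preconditioner, and FormPrecJacobian is a placeholder for the user routine, with the usual SNES Jacobian callback signature, that fills P:)

    Mat J;
    ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr);            /* matrix-free operator, as with -snes_mf_operator */
    ierr = MatMFFDSetType(J,MATMFFD_WP);CHKERRQ(ierr);        /* or MATMFFD_DS, i.e. -mat_mffd_type wp|ds        */
    ierr = MatMFFDSetFunctionError(J,1.0e-4);CHKERRQ(ierr);   /* the err parameter, i.e. -mat_mffd_err           */
    ierr = SNESSetJacobian(snes,J,P,FormPrecJacobian,NULL);CHKERRQ(ierr);   /* P is used only to build the preconditioner */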
I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. Barry > > Is there anything else I could possibly tune in this context? > > The discretization is through mfem and I use standard H1 for my problem. > > Thanks, > Qi From swarnava89 at gmail.com Wed Aug 14 16:29:14 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Wed, 14 Aug 2019 14:29:14 -0700 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it Message-ID: Hi PETSc team and users, I am trying to create a 3D dmplex mesh using DMPlexCreateFromCellList, then distribute it, and find out the coordinates of the vertices owned by each process. My cell list is as follows: numCells: 6 numVertices: 7 numCorners: 4 cells: 0 3 2 1 4 0 2 1 6 4 2 1 3 6 2 1 5 4 6 1 3 5 6 1 vertexCoords: -6.043000 -5.233392 -4.924000 -3.021500 0.000000 -4.924000 -3.021500 -3.488928 0.000000 -6.043000 1.744464 0.000000 0.000000 -5.233392 -4.924000 3.021500 0.000000 -4.924000 3.021500 -3.488928 0.000000 After reading this information, I do ierr= DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, &distributedMesh);CHKERRQ(ierr); if (distributedMesh) { printf("mesh is distributed \n"); ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); pCgdft->dmplex = distributedMesh; } DMGetCoordinates(pCgdft->dmplex,&VC); VecView(VC,PETSC_VIEWER_STDOUT_WORLD); On running this with 3 mpi processes, From VecView, I see that all the processes own all the vertices. Why is the dmplex not being distributed? The VecView is : Process [0] -6.043 -5.23339 -4.924 -3.0215 0. -4.924 -3.0215 -3.48893 0. -6.043 1.74446 0. 0. -5.23339 -4.924 3.0215 0. -4.924 3.0215 -3.48893 0. Process [1] -6.043 -5.23339 -4.924 -3.0215 0. -4.924 -3.0215 -3.48893 0. -6.043 1.74446 0. 0. -5.23339 -4.924 3.0215 0. -4.924 3.0215 -3.48893 0. Process [2] -6.043 -5.23339 -4.924 -3.0215 0. -4.924 -3.0215 -3.48893 0. -6.043 1.74446 0. 0. -5.23339 -4.924 3.0215 0. -4.924 3.0215 -3.48893 0. Thanks, SG -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 14 20:48:41 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Aug 2019 21:48:41 -0400 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it In-Reply-To: References: Message-ID: DMView() the mesh before and after distribution, so we can see what we have. Thanks, Matt On Wed, Aug 14, 2019 at 5:30 PM Swarnava Ghosh via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi PETSc team and users, > > I am trying to create a 3D dmplex mesh using DMPlexCreateFromCellList, > then distribute it, and find out the coordinates of the vertices owned by > each process. 
> My cell list is as follows: > numCells: 6 > numVertices: 7 > numCorners: 4 > cells: > 0 > 3 > 2 > 1 > 4 > 0 > 2 > 1 > 6 > 4 > 2 > 1 > 3 > 6 > 2 > 1 > 5 > 4 > 6 > 1 > 3 > 5 > 6 > 1 > vertexCoords: > -6.043000 > -5.233392 > -4.924000 > -3.021500 > 0.000000 > -4.924000 > -3.021500 > -3.488928 > 0.000000 > -6.043000 > 1.744464 > 0.000000 > 0.000000 > -5.233392 > -4.924000 > 3.021500 > 0.000000 > -4.924000 > 3.021500 > -3.488928 > 0.000000 > > After reading this information, I do > ierr= > DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); > > ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, > &distributedMesh);CHKERRQ(ierr); > > if (distributedMesh) { > printf("mesh is distributed \n"); > ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); > pCgdft->dmplex = distributedMesh; > } > > DMGetCoordinates(pCgdft->dmplex,&VC); > VecView(VC,PETSC_VIEWER_STDOUT_WORLD); > > On running this with 3 mpi processes, From VecView, I see that all the > processes own all the vertices. Why is the dmplex not being distributed? > > The VecView is : > Process [0] > -6.043 > -5.23339 > -4.924 > -3.0215 > 0. > -4.924 > -3.0215 > -3.48893 > 0. > -6.043 > 1.74446 > 0. > 0. > -5.23339 > -4.924 > 3.0215 > 0. > -4.924 > 3.0215 > -3.48893 > 0. > Process [1] > -6.043 > -5.23339 > -4.924 > -3.0215 > 0. > -4.924 > -3.0215 > -3.48893 > 0. > -6.043 > 1.74446 > 0. > 0. > -5.23339 > -4.924 > 3.0215 > 0. > -4.924 > 3.0215 > -3.48893 > 0. > Process [2] > -6.043 > -5.23339 > -4.924 > -3.0215 > 0. > -4.924 > -3.0215 > -3.48893 > 0. > -6.043 > 1.74446 > 0. > 0. > -5.23339 > -4.924 > 3.0215 > 0. > -4.924 > 3.0215 > -3.48893 > 0. > > Thanks, > SG > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Wed Aug 14 21:23:35 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Wed, 14 Aug 2019 19:23:35 -0700 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it In-Reply-To: References: Message-ID: Hi Matthew, I added DMView(pCgdft->dmplex,PETSC_VIEWER_STDOUT_WORLD); before and after distribution, and I get the following: dmplex before distribution DM Object: 3 MPI processes type: plex DM_0x84000004_0 in 3 dimensions: 0-cells: 7 7 7 1-cells: 17 17 17 2-cells: 17 17 17 3-cells: 6 6 6 Labels: depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) dmplex after distribution DM Object: Parallel Mesh 3 MPI processes type: plex Parallel Mesh in 3 dimensions: 0-cells: 7 7 7 1-cells: 17 17 17 2-cells: 17 17 17 3-cells: 6 6 6 Labels: depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) Thanks, SG On Wed, Aug 14, 2019 at 6:48 PM Matthew Knepley wrote: > DMView() the mesh before and after distribution, so we can see what we > have. > > Thanks, > > Matt > > On Wed, Aug 14, 2019 at 5:30 PM Swarnava Ghosh via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi PETSc team and users, >> >> I am trying to create a 3D dmplex mesh using DMPlexCreateFromCellList, >> then distribute it, and find out the coordinates of the vertices owned by >> each process. 
>> My cell list is as follows: >> numCells: 6 >> numVertices: 7 >> numCorners: 4 >> cells: >> 0 >> 3 >> 2 >> 1 >> 4 >> 0 >> 2 >> 1 >> 6 >> 4 >> 2 >> 1 >> 3 >> 6 >> 2 >> 1 >> 5 >> 4 >> 6 >> 1 >> 3 >> 5 >> 6 >> 1 >> vertexCoords: >> -6.043000 >> -5.233392 >> -4.924000 >> -3.021500 >> 0.000000 >> -4.924000 >> -3.021500 >> -3.488928 >> 0.000000 >> -6.043000 >> 1.744464 >> 0.000000 >> 0.000000 >> -5.233392 >> -4.924000 >> 3.021500 >> 0.000000 >> -4.924000 >> 3.021500 >> -3.488928 >> 0.000000 >> >> After reading this information, I do >> ierr= >> DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); >> >> ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, >> &distributedMesh);CHKERRQ(ierr); >> >> if (distributedMesh) { >> printf("mesh is distributed \n"); >> ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); >> pCgdft->dmplex = distributedMesh; >> } >> >> DMGetCoordinates(pCgdft->dmplex,&VC); >> VecView(VC,PETSC_VIEWER_STDOUT_WORLD); >> >> On running this with 3 mpi processes, From VecView, I see that all the >> processes own all the vertices. Why is the dmplex not being distributed? >> >> The VecView is : >> Process [0] >> -6.043 >> -5.23339 >> -4.924 >> -3.0215 >> 0. >> -4.924 >> -3.0215 >> -3.48893 >> 0. >> -6.043 >> 1.74446 >> 0. >> 0. >> -5.23339 >> -4.924 >> 3.0215 >> 0. >> -4.924 >> 3.0215 >> -3.48893 >> 0. >> Process [1] >> -6.043 >> -5.23339 >> -4.924 >> -3.0215 >> 0. >> -4.924 >> -3.0215 >> -3.48893 >> 0. >> -6.043 >> 1.74446 >> 0. >> 0. >> -5.23339 >> -4.924 >> 3.0215 >> 0. >> -4.924 >> 3.0215 >> -3.48893 >> 0. >> Process [2] >> -6.043 >> -5.23339 >> -4.924 >> -3.0215 >> 0. >> -4.924 >> -3.0215 >> -3.48893 >> 0. >> -6.043 >> 1.74446 >> 0. >> 0. >> -5.23339 >> -4.924 >> 3.0215 >> 0. >> -4.924 >> 3.0215 >> -3.48893 >> 0. >> >> Thanks, >> SG >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 14 21:35:15 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Aug 2019 22:35:15 -0400 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it In-Reply-To: References: Message-ID: On Wed, Aug 14, 2019 at 10:23 PM Swarnava Ghosh wrote: > Hi Matthew, > > I added DMView(pCgdft->dmplex,PETSC_VIEWER_STDOUT_WORLD); before and after > distribution, and I get the following: > It looks like you are running things with the wrong 'mpirun' Thanks, Matt > dmplex before distribution > DM Object: 3 MPI processes > type: plex > DM_0x84000004_0 in 3 dimensions: > 0-cells: 7 7 7 > 1-cells: 17 17 17 > 2-cells: 17 17 17 > 3-cells: 6 6 6 > Labels: > depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) > dmplex after distribution > DM Object: Parallel Mesh 3 MPI processes > type: plex > Parallel Mesh in 3 dimensions: > 0-cells: 7 7 7 > 1-cells: 17 17 17 > 2-cells: 17 17 17 > 3-cells: 6 6 6 > Labels: > depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) > > Thanks, > SG > > On Wed, Aug 14, 2019 at 6:48 PM Matthew Knepley wrote: > >> DMView() the mesh before and after distribution, so we can see what we >> have. 
>> >> Thanks, >> >> Matt >> >> On Wed, Aug 14, 2019 at 5:30 PM Swarnava Ghosh via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hi PETSc team and users, >>> >>> I am trying to create a 3D dmplex mesh using DMPlexCreateFromCellList, >>> then distribute it, and find out the coordinates of the vertices owned by >>> each process. >>> My cell list is as follows: >>> numCells: 6 >>> numVertices: 7 >>> numCorners: 4 >>> cells: >>> 0 >>> 3 >>> 2 >>> 1 >>> 4 >>> 0 >>> 2 >>> 1 >>> 6 >>> 4 >>> 2 >>> 1 >>> 3 >>> 6 >>> 2 >>> 1 >>> 5 >>> 4 >>> 6 >>> 1 >>> 3 >>> 5 >>> 6 >>> 1 >>> vertexCoords: >>> -6.043000 >>> -5.233392 >>> -4.924000 >>> -3.021500 >>> 0.000000 >>> -4.924000 >>> -3.021500 >>> -3.488928 >>> 0.000000 >>> -6.043000 >>> 1.744464 >>> 0.000000 >>> 0.000000 >>> -5.233392 >>> -4.924000 >>> 3.021500 >>> 0.000000 >>> -4.924000 >>> 3.021500 >>> -3.488928 >>> 0.000000 >>> >>> After reading this information, I do >>> ierr= >>> DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); >>> >>> ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, >>> &distributedMesh);CHKERRQ(ierr); >>> >>> if (distributedMesh) { >>> printf("mesh is distributed \n"); >>> ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); >>> pCgdft->dmplex = distributedMesh; >>> } >>> >>> DMGetCoordinates(pCgdft->dmplex,&VC); >>> VecView(VC,PETSC_VIEWER_STDOUT_WORLD); >>> >>> On running this with 3 mpi processes, From VecView, I see that all the >>> processes own all the vertices. Why is the dmplex not being distributed? >>> >>> The VecView is : >>> Process [0] >>> -6.043 >>> -5.23339 >>> -4.924 >>> -3.0215 >>> 0. >>> -4.924 >>> -3.0215 >>> -3.48893 >>> 0. >>> -6.043 >>> 1.74446 >>> 0. >>> 0. >>> -5.23339 >>> -4.924 >>> 3.0215 >>> 0. >>> -4.924 >>> 3.0215 >>> -3.48893 >>> 0. >>> Process [1] >>> -6.043 >>> -5.23339 >>> -4.924 >>> -3.0215 >>> 0. >>> -4.924 >>> -3.0215 >>> -3.48893 >>> 0. >>> -6.043 >>> 1.74446 >>> 0. >>> 0. >>> -5.23339 >>> -4.924 >>> 3.0215 >>> 0. >>> -4.924 >>> 3.0215 >>> -3.48893 >>> 0. >>> Process [2] >>> -6.043 >>> -5.23339 >>> -4.924 >>> -3.0215 >>> 0. >>> -4.924 >>> -3.0215 >>> -3.48893 >>> 0. >>> -6.043 >>> 1.74446 >>> 0. >>> 0. >>> -5.23339 >>> -4.924 >>> 3.0215 >>> 0. >>> -4.924 >>> 3.0215 >>> -3.48893 >>> 0. >>> >>> Thanks, >>> SG >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tangqi at msu.edu Wed Aug 14 21:41:09 2019 From: tangqi at msu.edu (Tang, Qi) Date: Thu, 15 Aug 2019 02:41:09 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> References: , <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> Message-ID: Thanks for the help, Barry. I tired both ds and wp, and again it depends on if I could find the correct parameter set. It is getting harder as I refine the mesh. So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private or SNESDiffParameterCompute_More in mfem. 
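Roughly, what I have in mind is the sketch below (only an outline: the create/destroy names and the argument list are my guess from the noise directory you pointed to, and snes, u and a stand for my solver, the current state and the direction vector):

    void      *noisectx;                    /* context from the noise code */
    PetscReal  fnoise, h;

    ierr = SNESDiffParameterCreate_More(snes, u, &noisectx);CHKERRQ(ierr);
    ierr = SNESDiffParameterCompute_More(snes, noisectx, u, a, &fnoise, &h);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "noise %g  sqrt(noise) %g  h %g\n",
                       (double)fnoise, (double)PetscSqrtReal(fnoise), (double)h);CHKERRQ(ierr);
    ierr = SNESDiffParameterDestroy_More(noisectx);CHKERRQ(ierr);
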
But it looks like these functions are not in linked in petscsnes.h. How could I call them? Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside SNESSetJacobian would be easier to try? Thanks again, Qi ________________________________ From: Smith, Barry F. Sent: Tuesday, August 13, 2019 9:07 PM To: Tang, Qi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > ?Hi, > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. The code that computes and reports the noise is in the directory src/snes/interface/noise You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). 
Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. Barry > > Is there anything else I could possibly tune in this context? > > The discretization is through mfem and I use standard H1 for my problem. > > Thanks, > Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Wed Aug 14 21:47:52 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Wed, 14 Aug 2019 19:47:52 -0700 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it In-Reply-To: References: Message-ID: Hi Matthew, "It looks like you are running things with the wrong 'mpirun' ", Could you please elaborate on this? I have another DMDA in my code, which is correctly being parallelized. Thanks, SG On Wed, Aug 14, 2019 at 7:35 PM Matthew Knepley wrote: > On Wed, Aug 14, 2019 at 10:23 PM Swarnava Ghosh > wrote: > >> Hi Matthew, >> >> I added DMView(pCgdft->dmplex,PETSC_VIEWER_STDOUT_WORLD); before and >> after distribution, and I get the following: >> > > It looks like you are running things with the wrong 'mpirun' > > Thanks, > > Matt > > >> dmplex before distribution >> DM Object: 3 MPI processes >> type: plex >> DM_0x84000004_0 in 3 dimensions: >> 0-cells: 7 7 7 >> 1-cells: 17 17 17 >> 2-cells: 17 17 17 >> 3-cells: 6 6 6 >> Labels: >> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >> dmplex after distribution >> DM Object: Parallel Mesh 3 MPI processes >> type: plex >> Parallel Mesh in 3 dimensions: >> 0-cells: 7 7 7 >> 1-cells: 17 17 17 >> 2-cells: 17 17 17 >> 3-cells: 6 6 6 >> Labels: >> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >> >> Thanks, >> SG >> >> On Wed, Aug 14, 2019 at 6:48 PM Matthew Knepley >> wrote: >> >>> DMView() the mesh before and after distribution, so we can see what we >>> have. >>> >>> Thanks, >>> >>> Matt >>> >>> On Wed, Aug 14, 2019 at 5:30 PM Swarnava Ghosh via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hi PETSc team and users, >>>> >>>> I am trying to create a 3D dmplex mesh using DMPlexCreateFromCellList, >>>> then distribute it, and find out the coordinates of the vertices owned by >>>> each process. 
>>>> My cell list is as follows: >>>> numCells: 6 >>>> numVertices: 7 >>>> numCorners: 4 >>>> cells: >>>> 0 >>>> 3 >>>> 2 >>>> 1 >>>> 4 >>>> 0 >>>> 2 >>>> 1 >>>> 6 >>>> 4 >>>> 2 >>>> 1 >>>> 3 >>>> 6 >>>> 2 >>>> 1 >>>> 5 >>>> 4 >>>> 6 >>>> 1 >>>> 3 >>>> 5 >>>> 6 >>>> 1 >>>> vertexCoords: >>>> -6.043000 >>>> -5.233392 >>>> -4.924000 >>>> -3.021500 >>>> 0.000000 >>>> -4.924000 >>>> -3.021500 >>>> -3.488928 >>>> 0.000000 >>>> -6.043000 >>>> 1.744464 >>>> 0.000000 >>>> 0.000000 >>>> -5.233392 >>>> -4.924000 >>>> 3.021500 >>>> 0.000000 >>>> -4.924000 >>>> 3.021500 >>>> -3.488928 >>>> 0.000000 >>>> >>>> After reading this information, I do >>>> ierr= >>>> DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); >>>> >>>> ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, >>>> &distributedMesh);CHKERRQ(ierr); >>>> >>>> if (distributedMesh) { >>>> printf("mesh is distributed \n"); >>>> ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); >>>> pCgdft->dmplex = distributedMesh; >>>> } >>>> >>>> DMGetCoordinates(pCgdft->dmplex,&VC); >>>> VecView(VC,PETSC_VIEWER_STDOUT_WORLD); >>>> >>>> On running this with 3 mpi processes, From VecView, I see that all the >>>> processes own all the vertices. Why is the dmplex not being distributed? >>>> >>>> The VecView is : >>>> Process [0] >>>> -6.043 >>>> -5.23339 >>>> -4.924 >>>> -3.0215 >>>> 0. >>>> -4.924 >>>> -3.0215 >>>> -3.48893 >>>> 0. >>>> -6.043 >>>> 1.74446 >>>> 0. >>>> 0. >>>> -5.23339 >>>> -4.924 >>>> 3.0215 >>>> 0. >>>> -4.924 >>>> 3.0215 >>>> -3.48893 >>>> 0. >>>> Process [1] >>>> -6.043 >>>> -5.23339 >>>> -4.924 >>>> -3.0215 >>>> 0. >>>> -4.924 >>>> -3.0215 >>>> -3.48893 >>>> 0. >>>> -6.043 >>>> 1.74446 >>>> 0. >>>> 0. >>>> -5.23339 >>>> -4.924 >>>> 3.0215 >>>> 0. >>>> -4.924 >>>> 3.0215 >>>> -3.48893 >>>> 0. >>>> Process [2] >>>> -6.043 >>>> -5.23339 >>>> -4.924 >>>> -3.0215 >>>> 0. >>>> -4.924 >>>> -3.0215 >>>> -3.48893 >>>> 0. >>>> -6.043 >>>> 1.74446 >>>> 0. >>>> 0. >>>> -5.23339 >>>> -4.924 >>>> 3.0215 >>>> 0. >>>> -4.924 >>>> 3.0215 >>>> -3.48893 >>>> 0. >>>> >>>> Thanks, >>>> SG >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 14 21:57:26 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Thu, 15 Aug 2019 02:57:26 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: References: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> Message-ID: > On Aug 14, 2019, at 9:41 PM, Tang, Qi wrote: > > Thanks for the help, Barry. I tired both ds and wp, and again it depends on if I could find the correct parameter set. It is getting harder as I refine the mesh. > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private or SNESDiffParameterCompute_More in mfem. But it looks like these functions are not in linked in petscsnes.h. How could I call them? They may not be listed in petscsnes.h but I think they should be in the library. 
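If the compiler complains about missing declarations, hand-written prototypes along these lines should be enough (an untested sketch -- please check the exact argument types against the source in src/snes/interface/noise, they may differ):

    PETSC_EXTERN PetscErrorCode SNESDiffParameterCreate_More(SNES,Vec,void**);
    PETSC_EXTERN PetscErrorCode SNESDiffParameterCompute_More(SNES,void*,Vec,Vec,PetscReal*,PetscReal*);
    PETSC_EXTERN PetscErrorCode SNESDiffParameterDestroy_More(void*);
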
You can just stick the prototypes for the functions anywhere you need them for now. You should be able to use ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr); then this info is passed with if (snes->mf) { ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, snes->mf_version);CHKERRQ(ierr); } this routine has if (version == 1) { ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr); ierr = MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr); ierr = MatSetFromOptions(J);CHKERRQ(ierr); } else if (version == 2) { if (!snes->vec_func) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be called first"); #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16) ierr = SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr); #else and this routine has ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr); ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); ierr = VecGetSize(x,&n);CHKERRQ(ierr); ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr); ierr = MatCreate(comm,J);CHKERRQ(ierr); ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr); ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr); ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr); ierr = MatShellSetOperation(*J,MATOP_MULT,(void (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr); ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr); ierr = MatShellSetOperation(*J,MATOP_VIEW,(void (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr); ierr = MatSetUp(*J);CHKERRQ(ierr); > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside SNESSetJacobian would be easier to try? If you use the flag -snes_mf_noise_file filename when it runs it will save all the noise information it computes along the way to that file (yes it is crude and doesn't match the PETSc Viewer/monitor style but it should work). Thus I think you can use it and get all the possible monitoring information without actually writing any code. Just -snes_mf_version 2 -snes_mf_noise_file filename Barry > > Thanks again, > Qi > > From: Smith, Barry F. > Sent: Tuesday, August 13, 2019 9:07 PM > To: Tang, Qi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > > > ?Hi, > > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. > > > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. 
> > > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? > > First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. > > > There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). > > We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. > > The code that computes and reports the noise is in the directory src/snes/interface/noise > > You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) > > The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. > > I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. > > I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. > > Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. > > Barry > > > > > > > > > Is there anything else I could possibly tune in this context? > > > > The discretization is through mfem and I use standard H1 for my problem. 
> > > > Thanks, > > Qi From knepley at gmail.com Wed Aug 14 22:46:58 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 14 Aug 2019 23:46:58 -0400 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it In-Reply-To: References: Message-ID: On Wed, Aug 14, 2019 at 10:48 PM Swarnava Ghosh wrote: > Hi Matthew, > > "It looks like you are running things with the wrong 'mpirun' ", Could you > please elaborate on this? I have another DMDA in my code, which is > correctly being parallelized. > Ah, I see it now. You are feeding in the initial mesh on every process. Normally one process generates the mesh and the other ones just give 0 for num cells and vertices. Thanks, Matt > Thanks, > SG > > On Wed, Aug 14, 2019 at 7:35 PM Matthew Knepley wrote: > >> On Wed, Aug 14, 2019 at 10:23 PM Swarnava Ghosh >> wrote: >> >>> Hi Matthew, >>> >>> I added DMView(pCgdft->dmplex,PETSC_VIEWER_STDOUT_WORLD); before and >>> after distribution, and I get the following: >>> >> >> It looks like you are running things with the wrong 'mpirun' >> >> Thanks, >> >> Matt >> >> >>> dmplex before distribution >>> DM Object: 3 MPI processes >>> type: plex >>> DM_0x84000004_0 in 3 dimensions: >>> 0-cells: 7 7 7 >>> 1-cells: 17 17 17 >>> 2-cells: 17 17 17 >>> 3-cells: 6 6 6 >>> Labels: >>> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >>> dmplex after distribution >>> DM Object: Parallel Mesh 3 MPI processes >>> type: plex >>> Parallel Mesh in 3 dimensions: >>> 0-cells: 7 7 7 >>> 1-cells: 17 17 17 >>> 2-cells: 17 17 17 >>> 3-cells: 6 6 6 >>> Labels: >>> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >>> >>> Thanks, >>> SG >>> >>> On Wed, Aug 14, 2019 at 6:48 PM Matthew Knepley >>> wrote: >>> >>>> DMView() the mesh before and after distribution, so we can see what we >>>> have. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On Wed, Aug 14, 2019 at 5:30 PM Swarnava Ghosh via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hi PETSc team and users, >>>>> >>>>> I am trying to create a 3D dmplex mesh using DMPlexCreateFromCellList, >>>>> then distribute it, and find out the coordinates of the vertices owned by >>>>> each process. 
>>>>> My cell list is as follows: >>>>> numCells: 6 >>>>> numVertices: 7 >>>>> numCorners: 4 >>>>> cells: >>>>> 0 >>>>> 3 >>>>> 2 >>>>> 1 >>>>> 4 >>>>> 0 >>>>> 2 >>>>> 1 >>>>> 6 >>>>> 4 >>>>> 2 >>>>> 1 >>>>> 3 >>>>> 6 >>>>> 2 >>>>> 1 >>>>> 5 >>>>> 4 >>>>> 6 >>>>> 1 >>>>> 3 >>>>> 5 >>>>> 6 >>>>> 1 >>>>> vertexCoords: >>>>> -6.043000 >>>>> -5.233392 >>>>> -4.924000 >>>>> -3.021500 >>>>> 0.000000 >>>>> -4.924000 >>>>> -3.021500 >>>>> -3.488928 >>>>> 0.000000 >>>>> -6.043000 >>>>> 1.744464 >>>>> 0.000000 >>>>> 0.000000 >>>>> -5.233392 >>>>> -4.924000 >>>>> 3.021500 >>>>> 0.000000 >>>>> -4.924000 >>>>> 3.021500 >>>>> -3.488928 >>>>> 0.000000 >>>>> >>>>> After reading this information, I do >>>>> ierr= >>>>> DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); >>>>> >>>>> ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, >>>>> &distributedMesh);CHKERRQ(ierr); >>>>> >>>>> if (distributedMesh) { >>>>> printf("mesh is distributed \n"); >>>>> ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); >>>>> pCgdft->dmplex = distributedMesh; >>>>> } >>>>> >>>>> DMGetCoordinates(pCgdft->dmplex,&VC); >>>>> VecView(VC,PETSC_VIEWER_STDOUT_WORLD); >>>>> >>>>> On running this with 3 mpi processes, From VecView, I see that all the >>>>> processes own all the vertices. Why is the dmplex not being distributed? >>>>> >>>>> The VecView is : >>>>> Process [0] >>>>> -6.043 >>>>> -5.23339 >>>>> -4.924 >>>>> -3.0215 >>>>> 0. >>>>> -4.924 >>>>> -3.0215 >>>>> -3.48893 >>>>> 0. >>>>> -6.043 >>>>> 1.74446 >>>>> 0. >>>>> 0. >>>>> -5.23339 >>>>> -4.924 >>>>> 3.0215 >>>>> 0. >>>>> -4.924 >>>>> 3.0215 >>>>> -3.48893 >>>>> 0. >>>>> Process [1] >>>>> -6.043 >>>>> -5.23339 >>>>> -4.924 >>>>> -3.0215 >>>>> 0. >>>>> -4.924 >>>>> -3.0215 >>>>> -3.48893 >>>>> 0. >>>>> -6.043 >>>>> 1.74446 >>>>> 0. >>>>> 0. >>>>> -5.23339 >>>>> -4.924 >>>>> 3.0215 >>>>> 0. >>>>> -4.924 >>>>> 3.0215 >>>>> -3.48893 >>>>> 0. >>>>> Process [2] >>>>> -6.043 >>>>> -5.23339 >>>>> -4.924 >>>>> -3.0215 >>>>> 0. >>>>> -4.924 >>>>> -3.0215 >>>>> -3.48893 >>>>> 0. >>>>> -6.043 >>>>> 1.74446 >>>>> 0. >>>>> 0. >>>>> -5.23339 >>>>> -4.924 >>>>> 3.0215 >>>>> 0. >>>>> -4.924 >>>>> 3.0215 >>>>> -3.48893 >>>>> 0. >>>>> >>>>> Thanks, >>>>> SG >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From swarnava89 at gmail.com Wed Aug 14 23:50:42 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Wed, 14 Aug 2019 21:50:42 -0700 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it In-Reply-To: References: Message-ID: Hi Matthew, Thank you for your response. It works right now. 
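The change was essentially what you described: only rank 0 now passes the serial cell list and vertex coordinates, the other ranks pass zero counts. As a rough sketch with my variable names (only the rank test is new):

    PetscMPIInt rank;
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
    if (rank) { pCgdft->numCellsESP = 0; pCgdft->NESP = 0; }   /* non-root ranks contribute no cells/vertices */
    ierr = DMPlexCreateFromCellList(PETSC_COMM_WORLD, 3, pCgdft->numCellsESP, pCgdft->NESP,
             pCgdft->numCornersESP, interpolate, pCgdft->cellsESP, 3,
             pCgdft->vertexCoordsESP, &pCgdft->dmplex);CHKERRQ(ierr);
    ierr = DMPlexDistribute(pCgdft->dmplex, 0, &pCgdft->dmplexSF, &distributedMesh);CHKERRQ(ierr);

With that, the views before and after DMPlexDistribute are: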
dmplex before distribution DM Object: 3 MPI processes type: plex DM_0x84000004_0 in 3 dimensions: 0-cells: 7 0 0 1-cells: 17 0 0 2-cells: 17 0 0 3-cells: 6 0 0 Labels: depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) dmplex after distribution DM Object: Parallel Mesh 3 MPI processes type: plex Parallel Mesh in 3 dimensions: 0-cells: 5 5 5 1-cells: 9 9 9 2-cells: 7 7 7 3-cells: 2 2 2 Labels: depth: 4 strata with value/size (0 (5), 1 (9), 2 (7), 3 (2)) Thanks, Swarnava On Wed, Aug 14, 2019 at 8:47 PM Matthew Knepley wrote: > On Wed, Aug 14, 2019 at 10:48 PM Swarnava Ghosh > wrote: > >> Hi Matthew, >> >> "It looks like you are running things with the wrong 'mpirun' ", Could >> you please elaborate on this? I have another DMDA in my code, which is >> correctly being parallelized. >> > > Ah, I see it now. You are feeding in the initial mesh on every process. > Normally one process generates the mesh > and the other ones just give 0 for num cells and vertices. > > Thanks, > > Matt > > >> Thanks, >> SG >> >> On Wed, Aug 14, 2019 at 7:35 PM Matthew Knepley >> wrote: >> >>> On Wed, Aug 14, 2019 at 10:23 PM Swarnava Ghosh >>> wrote: >>> >>>> Hi Matthew, >>>> >>>> I added DMView(pCgdft->dmplex,PETSC_VIEWER_STDOUT_WORLD); before and >>>> after distribution, and I get the following: >>>> >>> >>> It looks like you are running things with the wrong 'mpirun' >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> dmplex before distribution >>>> DM Object: 3 MPI processes >>>> type: plex >>>> DM_0x84000004_0 in 3 dimensions: >>>> 0-cells: 7 7 7 >>>> 1-cells: 17 17 17 >>>> 2-cells: 17 17 17 >>>> 3-cells: 6 6 6 >>>> Labels: >>>> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >>>> dmplex after distribution >>>> DM Object: Parallel Mesh 3 MPI processes >>>> type: plex >>>> Parallel Mesh in 3 dimensions: >>>> 0-cells: 7 7 7 >>>> 1-cells: 17 17 17 >>>> 2-cells: 17 17 17 >>>> 3-cells: 6 6 6 >>>> Labels: >>>> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >>>> >>>> Thanks, >>>> SG >>>> >>>> On Wed, Aug 14, 2019 at 6:48 PM Matthew Knepley >>>> wrote: >>>> >>>>> DMView() the mesh before and after distribution, so we can see what we >>>>> have. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On Wed, Aug 14, 2019 at 5:30 PM Swarnava Ghosh via petsc-users < >>>>> petsc-users at mcs.anl.gov> wrote: >>>>> >>>>>> Hi PETSc team and users, >>>>>> >>>>>> I am trying to create a 3D dmplex mesh >>>>>> using DMPlexCreateFromCellList, then distribute it, and find out the >>>>>> coordinates of the vertices owned by each process. 
>>>>>> My cell list is as follows: >>>>>> numCells: 6 >>>>>> numVertices: 7 >>>>>> numCorners: 4 >>>>>> cells: >>>>>> 0 >>>>>> 3 >>>>>> 2 >>>>>> 1 >>>>>> 4 >>>>>> 0 >>>>>> 2 >>>>>> 1 >>>>>> 6 >>>>>> 4 >>>>>> 2 >>>>>> 1 >>>>>> 3 >>>>>> 6 >>>>>> 2 >>>>>> 1 >>>>>> 5 >>>>>> 4 >>>>>> 6 >>>>>> 1 >>>>>> 3 >>>>>> 5 >>>>>> 6 >>>>>> 1 >>>>>> vertexCoords: >>>>>> -6.043000 >>>>>> -5.233392 >>>>>> -4.924000 >>>>>> -3.021500 >>>>>> 0.000000 >>>>>> -4.924000 >>>>>> -3.021500 >>>>>> -3.488928 >>>>>> 0.000000 >>>>>> -6.043000 >>>>>> 1.744464 >>>>>> 0.000000 >>>>>> 0.000000 >>>>>> -5.233392 >>>>>> -4.924000 >>>>>> 3.021500 >>>>>> 0.000000 >>>>>> -4.924000 >>>>>> 3.021500 >>>>>> -3.488928 >>>>>> 0.000000 >>>>>> >>>>>> After reading this information, I do >>>>>> ierr= >>>>>> DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); >>>>>> >>>>>> ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, >>>>>> &distributedMesh);CHKERRQ(ierr); >>>>>> >>>>>> if (distributedMesh) { >>>>>> printf("mesh is distributed \n"); >>>>>> ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); >>>>>> pCgdft->dmplex = distributedMesh; >>>>>> } >>>>>> >>>>>> DMGetCoordinates(pCgdft->dmplex,&VC); >>>>>> VecView(VC,PETSC_VIEWER_STDOUT_WORLD); >>>>>> >>>>>> On running this with 3 mpi processes, From VecView, I see that all >>>>>> the processes own all the vertices. Why is the dmplex not being >>>>>> distributed? >>>>>> >>>>>> The VecView is : >>>>>> Process [0] >>>>>> -6.043 >>>>>> -5.23339 >>>>>> -4.924 >>>>>> -3.0215 >>>>>> 0. >>>>>> -4.924 >>>>>> -3.0215 >>>>>> -3.48893 >>>>>> 0. >>>>>> -6.043 >>>>>> 1.74446 >>>>>> 0. >>>>>> 0. >>>>>> -5.23339 >>>>>> -4.924 >>>>>> 3.0215 >>>>>> 0. >>>>>> -4.924 >>>>>> 3.0215 >>>>>> -3.48893 >>>>>> 0. >>>>>> Process [1] >>>>>> -6.043 >>>>>> -5.23339 >>>>>> -4.924 >>>>>> -3.0215 >>>>>> 0. >>>>>> -4.924 >>>>>> -3.0215 >>>>>> -3.48893 >>>>>> 0. >>>>>> -6.043 >>>>>> 1.74446 >>>>>> 0. >>>>>> 0. >>>>>> -5.23339 >>>>>> -4.924 >>>>>> 3.0215 >>>>>> 0. >>>>>> -4.924 >>>>>> 3.0215 >>>>>> -3.48893 >>>>>> 0. >>>>>> Process [2] >>>>>> -6.043 >>>>>> -5.23339 >>>>>> -4.924 >>>>>> -3.0215 >>>>>> 0. >>>>>> -4.924 >>>>>> -3.0215 >>>>>> -3.48893 >>>>>> 0. >>>>>> -6.043 >>>>>> 1.74446 >>>>>> 0. >>>>>> 0. >>>>>> -5.23339 >>>>>> -4.924 >>>>>> 3.0215 >>>>>> 0. >>>>>> -4.924 >>>>>> 3.0215 >>>>>> -3.48893 >>>>>> 0. >>>>>> >>>>>> Thanks, >>>>>> SG >>>>>> >>>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://www.cse.buffalo.edu/~knepley/ >>>>> >>>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tangqi at msu.edu Thu Aug 15 00:36:36 2019 From: tangqi at msu.edu (Tang, Qi) Date: Thu, 15 Aug 2019 05:36:36 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: References: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> , Message-ID: Thanks, it works. snes_mf_jorge works for me. It appears to compute h in every ksp. Without -snes_mf_jorge, it is not working. For some reason, it only computes h once, but that h is bad. My gmres residual is not decaying. Indeed, the noise in my function becomes larger when I refine the mesh. I think it makes sense as I use the same time step for different meshes (that is the goal of the preconditioning). However, even when the algorithm is working, sqrt(noise) is much less than the good mat_mffd_err I previously found (10^-6 vs 10^-3). I do not understand why. Although snes_mf_jorge is working, it is very expensive, as it has to evaluate F many times when estimating h. Unfortunately, to achieve the nonlinearity, I have to assemble some operators inside my F. There seem no easy solutions. I will try to compute h multiple times without using snes_mf_jorge. But let me know if you have other suggestions. Thanks! Qi ________________________________ From: Smith, Barry F. Sent: Wednesday, August 14, 2019 10:57 PM To: Tang, Qi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > On Aug 14, 2019, at 9:41 PM, Tang, Qi wrote: > > Thanks for the help, Barry. I tired both ds and wp, and again it depends on if I could find the correct parameter set. It is getting harder as I refine the mesh. > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private or SNESDiffParameterCompute_More in mfem. But it looks like these functions are not in linked in petscsnes.h. How could I call them? They may not be listed in petscsnes.h but I think they should be in the library. You can just stick the prototypes for the functions anywhere you need them for now. 
You should be able to use ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr); then this info is passed with if (snes->mf) { ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, snes->mf_version);CHKERRQ(ierr); } this routine has if (version == 1) { ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr); ierr = MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr); ierr = MatSetFromOptions(J);CHKERRQ(ierr); } else if (version == 2) { if (!snes->vec_func) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be called first"); #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16) ierr = SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr); #else and this routine has ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr); ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); ierr = VecGetSize(x,&n);CHKERRQ(ierr); ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr); ierr = MatCreate(comm,J);CHKERRQ(ierr); ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr); ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr); ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr); ierr = MatShellSetOperation(*J,MATOP_MULT,(void (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr); ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr); ierr = MatShellSetOperation(*J,MATOP_VIEW,(void (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr); ierr = MatSetUp(*J);CHKERRQ(ierr); > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside SNESSetJacobian would be easier to try? If you use the flag -snes_mf_noise_file filename when it runs it will save all the noise information it computes along the way to that file (yes it is crude and doesn't match the PETSc Viewer/monitor style but it should work). Thus I think you can use it and get all the possible monitoring information without actually writing any code. Just -snes_mf_version 2 -snes_mf_noise_file filename Barry > > Thanks again, > Qi > > From: Smith, Barry F. > Sent: Tuesday, August 13, 2019 9:07 PM > To: Tang, Qi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > > > ?Hi, > > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. > > > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. > > > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". 
But is there a simple way to estimate it? > > First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. > > > There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). > > We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. > > The code that computes and reports the noise is in the directory src/snes/interface/noise > > You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) > > The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. > > I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. > > I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. > > Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. > > Barry > > > > > > > > > Is there anything else I could possibly tune in this context? > > > > The discretization is through mfem and I use standard H1 for my problem. > > > > Thanks, > > Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 15 02:30:52 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Thu, 15 Aug 2019 07:30:52 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: References: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> Message-ID: <827791B1-D492-49DE-954B-BCD6E3253360@mcs.anl.gov> > On Aug 15, 2019, at 12:36 AM, Tang, Qi wrote: > > Thanks, it works. snes_mf_jorge works for me. Great. > It appears to compute h in every ksp. Each matrix vector product or each KSPSolve()? From the code looks like each matrix-vector product. > > Without -snes_mf_jorge, it is not working. For some reason, it only computes h once, but that h is bad. My gmres residual is not decaying. > > Indeed, the noise in my function becomes larger when I refine the mesh. I think it makes sense as I use the same time step for different meshes (that is the goal of the preconditioning). However, even when the algorithm is working, sqrt(noise) is much less than the good mat_mffd_err I previously found (10^-6 vs 10^-3). I do not understand why. I have no explanation. The details of the code that computes err are difficult to trace exactly; would take a while. Perhaps there is some parameter in there that is too "conservative"? > > Although snes_mf_jorge is working, it is very expensive, as it has to evaluate F many times when estimating h. Unfortunately, to achieve the nonlinearity, I have to assemble some operators inside my F. There seem no easy solutions. > > I will try to compute h multiple times without using snes_mf_jorge. But let me know if you have other suggestions. Thanks! Yes, it seems to me computing the err less often would be a way to make the code go faster. I looked at the code more closely and noticed a couple of things. if (ctx->jorge) { ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); /* Use the Brown/Saad method to compute h */ } else { /* Compute error if desired */ ierr = SNESGetIterationNumber(snes,&iter);CHKERRQ(ierr); if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { /* Use Jorge's method to compute noise */ ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); ctx->error_rel = PetscSqrtReal(noise); ierr = PetscInfo3(snes,"Using Jorge's noise: noise=%g, sqrt(noise)=%g, h_more=%g\n",(double)noise,(double)ctx->error_rel,(double)h);CHKERRQ(ierr); ctx->compute_err_iter = iter; ctx->need_err = PETSC_FALSE; } So if jorge is set it uses the jorge algorithm for each matrix multiple to compute a new err and h. If jorge is not set but -snes_mf_compute_err is set then it computes a new err and h depending on the parameters (you can run with -info and grep for noise to see the PetscInfo lines printed (maybe add iter to the output to see how often it is recomputed) if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { so the compute_err_freq determines when the err and h are recomputed. The logic is a bit strange so I cannot decipher exactly how often it is recomputing. I'm guess at the first Newton iteration and then some number of iterations later. You could try to rig it so that it is at every new Newton step (this would mean when computing f(x + h a) - f(x) for each new x it will recompute I think). A more "research type" approach to try to reduce the work to a reasonable level would be to keep the same err until "things start to go bad" and then recompute it. But how to measure "when things start to go bad?" 
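One crude way to instrument it, just as a sketch (the 0.9 reduction factor below is completely arbitrary), is to hang a monitor on the KSP that notices when an iteration barely reduces the residual norm:

    typedef struct { PetscReal last; PetscBool stagnating; } StagCtx;

    static PetscErrorCode StagnationMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *vctx)
    {
      StagCtx *c = (StagCtx*)vctx;
      if (it && rnorm > 0.9*c->last) c->stagnating = PETSC_TRUE;   /* poor reduction this iteration */
      c->last = rnorm;
      return 0;
    }

    /* at setup time */
    StagCtx stag = {0.0, PETSC_FALSE};
    ierr = KSPMonitorSet(ksp, StagnationMonitor, &stag, NULL);CHKERRQ(ierr);
    /* after (or during) a solve, stag.stagnating is the hint to recompute err and h */
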
It is related to GMRES stagnating, so you could for example track the convergence of GMRES and if it is less than expected, kill the linear solve, reset the computation of err and then start the KSP at the same point as before. But this could be terribly expensive since all the steps in the most recent KSP are lost. Another possibility is to assume that the err is valid in a certain size ball around the current x. When x + ha or the new x is outside that ball then recompute the err. But how to choose the ball size and is there a way to adjust the ball size depending on how the computations proceed. For example if everything is going great you could slowly increase the size of the ball and hope for the best; but how to detect if the ball got too big (as above but terribly expensive)? Track the size of the err each time it is computed. If it stays about the same for a small number of times just freeze it for a while at that value? But again how to determine when it is no longer good? Just a few wild thoughts, I really have no solid ideas on how to reduce the work requirements. Barry > > Qi > > > From: Smith, Barry F. > Sent: Wednesday, August 14, 2019 10:57 PM > To: Tang, Qi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > On Aug 14, 2019, at 9:41 PM, Tang, Qi wrote: > > > > Thanks for the help, Barry. I tired both ds and wp, and again it depends on if I could find the correct parameter set. It is getting harder as I refine the mesh. > > > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private or SNESDiffParameterCompute_More in mfem. But it looks like these functions are not in linked in petscsnes.h. How could I call them? > > They may not be listed in petscsnes.h but I think they should be in the library. You can just stick the prototypes for the functions anywhere you need them for now. 
> > > You should be able to use > > ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr); > > then this info is passed with > > if (snes->mf) { > ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, snes->mf_version);CHKERRQ(ierr); > } > > this routine has > > if (version == 1) { > ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr); > ierr = MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr); > ierr = MatSetFromOptions(J);CHKERRQ(ierr); > } else if (version == 2) { > if (!snes->vec_func) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be called first"); > #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16) > ierr = SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr); > #else > > and this routine has > > ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr); > ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); > ierr = VecGetSize(x,&n);CHKERRQ(ierr); > ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr); > ierr = MatCreate(comm,J);CHKERRQ(ierr); > ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr); > ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr); > ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr); > ierr = MatShellSetOperation(*J,MATOP_MULT,(void (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr); > ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr); > ierr = MatShellSetOperation(*J,MATOP_VIEW,(void (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr); > ierr = MatSetUp(*J);CHKERRQ(ierr); > > > > > > > > > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside SNESSetJacobian would be easier to try? > > If you use the flag -snes_mf_noise_file filename when it runs it will save all the noise information it computes along the way to that file (yes it is crude and doesn't match the PETSc Viewer/monitor style but it should work). > > Thus I think you can use it and get all the possible monitoring information without actually writing any code. Just > > -snes_mf_version 2 > -snes_mf_noise_file filename > > > > Barry > > > > > Thanks again, > > Qi > > > > From: Smith, Barry F. > > Sent: Tuesday, August 13, 2019 9:07 PM > > To: Tang, Qi > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > > > > > ?Hi, > > > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. > > > > > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. 
For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. > > > > > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? > > > > First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. > > > > > > There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). > > > > We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. > > > > The code that computes and reports the noise is in the directory src/snes/interface/noise > > > > You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) > > > > The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. > > > > I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. > > > > I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. > > > > Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. > > > > Barry > > > > > > > > > > > > > > > > Is there anything else I could possibly tune in this context? > > > > > > The discretization is through mfem and I use standard H1 for my problem. 
> > > > > > Thanks, > > > Qi From knepley at gmail.com Thu Aug 15 07:48:36 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Aug 2019 08:48:36 -0400 Subject: [petsc-users] Creating a 3D dmplex mesh with cell list and distributing it In-Reply-To: References: Message-ID: On Thu, Aug 15, 2019 at 12:50 AM Swarnava Ghosh wrote: > Hi Matthew, > > Thank you for your response. It works right now. > Great! Thanks, Matt > dmplex before distribution > DM Object: 3 MPI processes > type: plex > DM_0x84000004_0 in 3 dimensions: > 0-cells: 7 0 0 > 1-cells: 17 0 0 > 2-cells: 17 0 0 > 3-cells: 6 0 0 > Labels: > depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) > dmplex after distribution > DM Object: Parallel Mesh 3 MPI processes > type: plex > Parallel Mesh in 3 dimensions: > 0-cells: 5 5 5 > 1-cells: 9 9 9 > 2-cells: 7 7 7 > 3-cells: 2 2 2 > Labels: > depth: 4 strata with value/size (0 (5), 1 (9), 2 (7), 3 (2)) > > Thanks, > Swarnava > > On Wed, Aug 14, 2019 at 8:47 PM Matthew Knepley wrote: > >> On Wed, Aug 14, 2019 at 10:48 PM Swarnava Ghosh >> wrote: >> >>> Hi Matthew, >>> >>> "It looks like you are running things with the wrong 'mpirun' ", Could >>> you please elaborate on this? I have another DMDA in my code, which is >>> correctly being parallelized. >>> >> >> Ah, I see it now. You are feeding in the initial mesh on every process. >> Normally one process generates the mesh >> and the other ones just give 0 for num cells and vertices. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> SG >>> >>> On Wed, Aug 14, 2019 at 7:35 PM Matthew Knepley >>> wrote: >>> >>>> On Wed, Aug 14, 2019 at 10:23 PM Swarnava Ghosh >>>> wrote: >>>> >>>>> Hi Matthew, >>>>> >>>>> I added DMView(pCgdft->dmplex,PETSC_VIEWER_STDOUT_WORLD); before and >>>>> after distribution, and I get the following: >>>>> >>>> >>>> It looks like you are running things with the wrong 'mpirun' >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> dmplex before distribution >>>>> DM Object: 3 MPI processes >>>>> type: plex >>>>> DM_0x84000004_0 in 3 dimensions: >>>>> 0-cells: 7 7 7 >>>>> 1-cells: 17 17 17 >>>>> 2-cells: 17 17 17 >>>>> 3-cells: 6 6 6 >>>>> Labels: >>>>> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >>>>> dmplex after distribution >>>>> DM Object: Parallel Mesh 3 MPI processes >>>>> type: plex >>>>> Parallel Mesh in 3 dimensions: >>>>> 0-cells: 7 7 7 >>>>> 1-cells: 17 17 17 >>>>> 2-cells: 17 17 17 >>>>> 3-cells: 6 6 6 >>>>> Labels: >>>>> depth: 4 strata with value/size (0 (7), 1 (17), 2 (17), 3 (6)) >>>>> >>>>> Thanks, >>>>> SG >>>>> >>>>> On Wed, Aug 14, 2019 at 6:48 PM Matthew Knepley >>>>> wrote: >>>>> >>>>>> DMView() the mesh before and after distribution, so we can see what >>>>>> we have. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> On Wed, Aug 14, 2019 at 5:30 PM Swarnava Ghosh via petsc-users < >>>>>> petsc-users at mcs.anl.gov> wrote: >>>>>> >>>>>>> Hi PETSc team and users, >>>>>>> >>>>>>> I am trying to create a 3D dmplex mesh >>>>>>> using DMPlexCreateFromCellList, then distribute it, and find out the >>>>>>> coordinates of the vertices owned by each process. 
>>>>>>> My cell list is as follows: >>>>>>> numCells: 6 >>>>>>> numVertices: 7 >>>>>>> numCorners: 4 >>>>>>> cells: >>>>>>> 0 >>>>>>> 3 >>>>>>> 2 >>>>>>> 1 >>>>>>> 4 >>>>>>> 0 >>>>>>> 2 >>>>>>> 1 >>>>>>> 6 >>>>>>> 4 >>>>>>> 2 >>>>>>> 1 >>>>>>> 3 >>>>>>> 6 >>>>>>> 2 >>>>>>> 1 >>>>>>> 5 >>>>>>> 4 >>>>>>> 6 >>>>>>> 1 >>>>>>> 3 >>>>>>> 5 >>>>>>> 6 >>>>>>> 1 >>>>>>> vertexCoords: >>>>>>> -6.043000 >>>>>>> -5.233392 >>>>>>> -4.924000 >>>>>>> -3.021500 >>>>>>> 0.000000 >>>>>>> -4.924000 >>>>>>> -3.021500 >>>>>>> -3.488928 >>>>>>> 0.000000 >>>>>>> -6.043000 >>>>>>> 1.744464 >>>>>>> 0.000000 >>>>>>> 0.000000 >>>>>>> -5.233392 >>>>>>> -4.924000 >>>>>>> 3.021500 >>>>>>> 0.000000 >>>>>>> -4.924000 >>>>>>> 3.021500 >>>>>>> -3.488928 >>>>>>> 0.000000 >>>>>>> >>>>>>> After reading this information, I do >>>>>>> ierr= >>>>>>> DMPlexCreateFromCellList(PETSC_COMM_WORLD,3,pCgdft->numCellsESP,pCgdft->NESP,pCgdft->numCornersESP,interpolate,pCgdft->cellsESP,3,pCgdft->vertexCoordsESP,&pCgdft->dmplex); >>>>>>> >>>>>>> ierr = DMPlexDistribute(pCgdft->dmplex,0,&pCgdft->dmplexSF, >>>>>>> &distributedMesh);CHKERRQ(ierr); >>>>>>> >>>>>>> if (distributedMesh) { >>>>>>> printf("mesh is distributed \n"); >>>>>>> ierr = DMDestroy(&pCgdft->dmplex);CHKERRQ(ierr); >>>>>>> pCgdft->dmplex = distributedMesh; >>>>>>> } >>>>>>> >>>>>>> DMGetCoordinates(pCgdft->dmplex,&VC); >>>>>>> VecView(VC,PETSC_VIEWER_STDOUT_WORLD); >>>>>>> >>>>>>> On running this with 3 mpi processes, From VecView, I see that all >>>>>>> the processes own all the vertices. Why is the dmplex not being >>>>>>> distributed? >>>>>>> >>>>>>> The VecView is : >>>>>>> Process [0] >>>>>>> -6.043 >>>>>>> -5.23339 >>>>>>> -4.924 >>>>>>> -3.0215 >>>>>>> 0. >>>>>>> -4.924 >>>>>>> -3.0215 >>>>>>> -3.48893 >>>>>>> 0. >>>>>>> -6.043 >>>>>>> 1.74446 >>>>>>> 0. >>>>>>> 0. >>>>>>> -5.23339 >>>>>>> -4.924 >>>>>>> 3.0215 >>>>>>> 0. >>>>>>> -4.924 >>>>>>> 3.0215 >>>>>>> -3.48893 >>>>>>> 0. >>>>>>> Process [1] >>>>>>> -6.043 >>>>>>> -5.23339 >>>>>>> -4.924 >>>>>>> -3.0215 >>>>>>> 0. >>>>>>> -4.924 >>>>>>> -3.0215 >>>>>>> -3.48893 >>>>>>> 0. >>>>>>> -6.043 >>>>>>> 1.74446 >>>>>>> 0. >>>>>>> 0. >>>>>>> -5.23339 >>>>>>> -4.924 >>>>>>> 3.0215 >>>>>>> 0. >>>>>>> -4.924 >>>>>>> 3.0215 >>>>>>> -3.48893 >>>>>>> 0. >>>>>>> Process [2] >>>>>>> -6.043 >>>>>>> -5.23339 >>>>>>> -4.924 >>>>>>> -3.0215 >>>>>>> 0. >>>>>>> -4.924 >>>>>>> -3.0215 >>>>>>> -3.48893 >>>>>>> 0. >>>>>>> -6.043 >>>>>>> 1.74446 >>>>>>> 0. >>>>>>> 0. >>>>>>> -5.23339 >>>>>>> -4.924 >>>>>>> 3.0215 >>>>>>> 0. >>>>>>> -4.924 >>>>>>> 3.0215 >>>>>>> -3.48893 >>>>>>> 0. >>>>>>> >>>>>>> Thanks, >>>>>>> SG >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>> >>>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_mckinnell at aol.co.uk Thu Aug 15 09:59:28 2019 From: d_mckinnell at aol.co.uk (Daniel Mckinnell) Date: Thu, 15 Aug 2019 14:59:28 +0000 (UTC) Subject: [petsc-users] Accessing Global Vector Data with References: <2145279736.4118543.1565881168869.ref@mail.yahoo.com> Message-ID: <2145279736.4118543.1565881168869@mail.yahoo.com> Hi,Attached is a way I came up with to access data in a global vector, is this the best way to do this or are there other ways? It would seem intuitive to use the global PetscSection and VecGetValuesSection but this doesn't seem to work on global vectors. Instead I have used VecGetValues and VecSetValues, however I also have a problem with these when extracting more than one value, I initialise the output of VecGetValues as PetscScalar *values; and then call VecGetValues(Other stuff... , values). This seems to work some times and not others and I can't find any rhyme or reason to it? Finally I was wondering if there is a good reference code base on Github including DMPlex that would be helpful in viewing applications of the DMPlex functions?Thanks,Daniel Mckinnell -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: for_petsc_forum.cpp Type: text/x-c++src Size: 2772 bytes Desc: not available URL: From knepley at gmail.com Thu Aug 15 10:09:46 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Aug 2019 11:09:46 -0400 Subject: [petsc-users] Accessing Global Vector Data with In-Reply-To: <2145279736.4118543.1565881168869@mail.yahoo.com> References: <2145279736.4118543.1565881168869.ref@mail.yahoo.com> <2145279736.4118543.1565881168869@mail.yahoo.com> Message-ID: On Thu, Aug 15, 2019 at 10:59 AM Daniel Mckinnell via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > Attached is a way I came up with to access data in a global vector, is > this the best way to do this or are there other ways? It would seem > intuitive to use the global PetscSection and VecGetValuesSection but this > doesn't seem to work on global vectors. > > Instead I have used VecGetValues and VecSetValues, however I also have a > problem with these when extracting more than one value, I initialise the > output of VecGetValues as PetscScalar *values; and then call VecGetValues(Other > stuff... , values). This seems to work some times and not others and I > can't find any rhyme or reason to it? > I guess I should write something about this. I like to think of it as a sort of decision tree. 1) Get just local values, meaning those owned by this process These can be obtained from either a local or global vector. 
2) Get ghosted values, meaning those values lying on unowned points that exist in the default PetscSF These can only get obtained from a local vector 3) Get arbitrary values You must use a VecScatter or custom PetscSF to get these For 1) and 2), I think the best way to get values normally is to use https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointLocalRef.html https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointGlobalRef.html These also have Read versions, and Field version to split off a particular field in the Section. Does this help? Thanks, Matt > Finally I was wondering if there is a good reference code base on Github > including DMPlex that would be helpful in viewing applications of the > DMPlex functions? > Thanks, > Daniel Mckinnell > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_mckinnell at aol.co.uk Thu Aug 15 11:08:53 2019 From: d_mckinnell at aol.co.uk (Daniel Mckinnell) Date: Thu, 15 Aug 2019 16:08:53 +0000 (UTC) Subject: [petsc-users] Fwd: Accessing Global Vector Data with In-Reply-To: <801821969.4134946.1565885148972@mail.yahoo.com> References: <2145279736.4118543.1565881168869.ref@mail.yahoo.com> <2145279736.4118543.1565881168869@mail.yahoo.com> <801821969.4134946.1565885148972@mail.yahoo.com> Message-ID: <1774482897.4127969.1565885333418@mail.yahoo.com> -----Original Message----- From: Daniel Mckinnell To: knepley Sent: Thu, 15 Aug 2019 17:05 Subject: Re: [petsc-users] Accessing Global Vector Data with Thank you, it seemed to work nicely for the fields associated with cells but I'm having problems with the field on vertices. The code for altering the field is as follows: for (int i = vStart; i < vEnd; i++) ??? { ??????? PetscScalar *values; ??????? PetscInt vsize, voff; ??????? PetscSectionGetDof(globalsection, i, &vsize); ??????? PetscSectionGetOffset(globalsection, i, &voff); ??????? VecGetArray(u, &values); ??????? if (voff >= 0) ??????? { ??????????? PetscScalar *p; ??????????? DMPlexPointLocalRef(*dm, i, values, &p); ??????????? p[0] = i + 1; ??????? } ??????? VecRestoreArray(u, &values); The Global PetscSection is: PetscSection Object: 2 MPI processes ? type not yet set Process 0: ? (?? 0) dim? 2 offset?? 0 ? (?? 1) dim? 2 offset?? 2 ? (?? 2) dim -3 offset -13 ? (?? 3) dim -3 offset -15 ? (?? 4) dim? 1 offset?? 4 ? (?? 5) dim? 1 offset?? 5 ? (?? 6) dim? 1 offset?? 6 ? (?? 7) dim -2 offset -17 ? (?? 8) dim -2 offset -18 ? (?? 9) dim -2 offset -19 ? (? 10) dim -2 offset -20 ? (? 11) dim -2 offset -21 ? (? 12) dim -2 offset -22 ? (? 13) dim? 1 offset?? 7 ? (? 14) dim? 1 offset?? 8 ? (? 15) dim? 1 offset?? 9 ? (? 16) dim? 1 offset? 10 ? (? 17) dim? 1 offset? 11 ? (? 18) dim -2 offset -23 ? (? 19) dim -2 offset -24 ? (? 20) dim -2 offset -25 ? (? 21) dim -2 offset -26 ? (? 22) dim -2 offset -27 ? (? 23) dim -2 offset -28 ? (? 24) dim -2 offset -29 Process 1: ? (?? 0) dim? 2 offset? 12 ? (?? 1) dim? 2 offset? 14 ? (?? 2) dim -3 offset? -1 ? (?? 3) dim -3 offset? -3 ? (?? 4) dim? 1 offset? 16 ? (?? 5) dim? 1 offset? 17 ? (?? 6) dim? 1 offset? 18 ? (?? 7) dim? 1 offset? 19 ? (?? 8) dim? 1 offset? 20 ? (?? 9) dim? 1 offset? 21 ? (? 10) dim -2 offset? -5 ? (? 11) dim -2 offset? -6 ? (? 12) dim -2 offset? -7 ? (? 13) dim? 1 offset? 22 ? (? 14) dim? 1 offset? 23 ? (? 
15) dim? 1 offset? 24 ? (? 16) dim? 1 offset? 25 ? (? 17) dim? 1 offset? 26 ? (? 18) dim? 1 offset? 27 ? (? 19) dim? 1 offset? 28 ? (? 20) dim -2 offset? -8 ? (? 21) dim -2 offset? -9 ? (? 22) dim -2 offset -10 ? (? 23) dim -2 offset -11 ? (? 24) dim -2 offset -12 where points 4-6 on proc 0 and points 4-9 on proc 1 are vertices. The altered vec is: Vec Object: 2 MPI processes ? type: mpi Process [0] 0. 0. 0. 0. 0. 0. 0. 0. 5. 6. 7. 0. Process [1] 0. 0. 0. 0. 0. 0. 0. 0. 5. 6. 7. 8. 9. 10. 0. 0. 0. However the altered values do not correspond with the offsets in the PetscSection and I'm not sure what is going wrong.Thanks for all of the help.Daniel -----Original Message----- From: Matthew Knepley via petsc-users To: Daniel Mckinnell CC: PETSc Sent: Thu, 15 Aug 2019 16:11 Subject: Re: [petsc-users] Accessing Global Vector Data with On Thu, Aug 15, 2019 at 10:59 AM Daniel Mckinnell via petsc-users wrote: Hi,Attached is a way I came up with to access data in a global vector, is this the best way to do this or are there other ways? It would seem intuitive to use the global PetscSection and VecGetValuesSection but this doesn't seem to work on global vectors. Instead I have used VecGetValues and VecSetValues, however I also have a problem with these when extracting more than one value, I initialise the output of VecGetValues as PetscScalar *values; and then call VecGetValues(Other stuff... , values). This seems to work some times and not others and I can't find any rhyme or reason to it? I guess I should write something about this. I like to think of it as a sort of decision tree. 1) Get just local values, meaning those owned by this process ? ? These can be obtained from either a local or global vector. 2) Get ghosted values, meaning those values lying on unowned points that exist in the default PetscSF ? ? These can only get obtained from a local vector 3) Get arbitrary values ? ? You must use a VecScatter or custom PetscSF to get these For 1) and 2), I think the best way to get values normally is to use ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointLocalRef.html??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointGlobalRef.html These also have Read versions, and Field version to split off a particular field in the Section. ? Does this help? ? ? Thanks, ? ? ? Matt? Finally I was wondering if there is a good reference code base on Github including DMPlex that would be helpful in viewing applications of the DMPlex functions?Thanks,Daniel Mckinnell -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu Aug 15 11:18:22 2019 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 15 Aug 2019 09:18:22 -0700 Subject: [petsc-users] different dof in DMDA creating In-Reply-To: <6ee19350-d059-703b-9c52-b03ae6bf7a4e@gmail.com> References: <6ee19350-d059-703b-9c52-b03ae6bf7a4e@gmail.com> Message-ID: <4c326b30-1180-9c0a-8e64-5753502abac2@gmail.com> Hi Barry and Matt, Would you please give me some advice on the functions I need to use to set different dof to the specified nodes. For now, I use DMPlexCreateSection that dof is uniform throughout the domain. I am a bit lost in choosing the right DMPlex functions, unfortunately. 
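One possible shape of the answer, as a rough sketch rather than a drop-in: build the PetscSection by hand over the DMPlex chart, give the fracture-connected vertices 2 dof and every other vertex 1 dof, and attach it to the DM so that DMCreateGlobalVector() picks up the layout. Here isFractureVertex() is a hypothetical predicate standing in for however the fracture nodes are identified, and DMSetSection() is the name used in this PETSc generation (DMSetDefaultSection earlier, DMSetLocalSection in later releases).

PetscSection s;
PetscInt     pStart, pEnd, vStart, vEnd, v;

ierr = DMPlexGetChart(dm,&pStart,&pEnd);CHKERRQ(ierr);
ierr = DMPlexGetDepthStratum(dm,0,&vStart,&vEnd);CHKERRQ(ierr);       /* the vertices */
ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm),&s);CHKERRQ(ierr);
ierr = PetscSectionSetChart(s,pStart,pEnd);CHKERRQ(ierr);
for (v = vStart; v < vEnd; ++v) {
  ierr = PetscSectionSetDof(s,v,isFractureVertex(v) ? 2 : 1);CHKERRQ(ierr);  /* per-point dof */
}
ierr = PetscSectionSetUp(s);CHKERRQ(ierr);
ierr = DMSetSection(dm,s);CHKERRQ(ierr);   /* DMCreateGlobalVector/DMCreateLocalVector now follow this layout */
ierr = PetscSectionDestroy(&s);CHKERRQ(ierr);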
Thanks and regards, Danyang On 2019-02-07 1:53 p.m., Danyang Su wrote: > Thanks, Barry. DMPlex also works for my code. > > Danyang > > On 2019-02-07 1:14 p.m., Smith, Barry F. wrote: >> ?? No, you would need to use the more flexible DMPlex >> >> >>> On Feb 7, 2019, at 3:04 PM, Danyang Su via petsc-users >>> wrote: >>> >>> Dear PETSc Users, >>> >>> Does DMDA support different number of degrees of freedom for >>> different node? For example I have a 2D subsurface flow problem with >>> the default dof = 1 throughout the domain. Now I want to add some >>> sparse fractures in the domain. For the nodes connected to the >>> sparse fractures, I want to set dof to 2. Is it possible to set dof >>> to 2 for those nodes only? >>> >>> Thanks, >>> >>> Danyang >>> From knepley at gmail.com Thu Aug 15 11:22:04 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 15 Aug 2019 12:22:04 -0400 Subject: [petsc-users] different dof in DMDA creating In-Reply-To: <4c326b30-1180-9c0a-8e64-5753502abac2@gmail.com> References: <6ee19350-d059-703b-9c52-b03ae6bf7a4e@gmail.com> <4c326b30-1180-9c0a-8e64-5753502abac2@gmail.com> Message-ID: On Thu, Aug 15, 2019 at 12:18 PM Danyang Su wrote: > Hi Barry and Matt, > > Would you please give me some advice on the functions I need to use to > set different dof to the specified nodes. For now, I use > DMPlexCreateSection that dof is uniform throughout the domain. I am a > bit lost in choosing the right DMPlex functions, unfortunately. > Chapter 10 of the PETSc User Manual discusses low-level data layout using Section. How about we start there and then clarify anything that is confusing. Thanks, Matt > Thanks and regards, > > Danyang > > On 2019-02-07 1:53 p.m., Danyang Su wrote: > > Thanks, Barry. DMPlex also works for my code. > > > > Danyang > > > > On 2019-02-07 1:14 p.m., Smith, Barry F. wrote: > >> No, you would need to use the more flexible DMPlex > >> > >> > >>> On Feb 7, 2019, at 3:04 PM, Danyang Su via petsc-users > >>> wrote: > >>> > >>> Dear PETSc Users, > >>> > >>> Does DMDA support different number of degrees of freedom for > >>> different node? For example I have a 2D subsurface flow problem with > >>> the default dof = 1 throughout the domain. Now I want to add some > >>> sparse fractures in the domain. For the nodes connected to the > >>> sparse fractures, I want to set dof to 2. Is it possible to set dof > >>> to 2 for those nodes only? > >>> > >>> Thanks, > >>> > >>> Danyang > >>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From danyang.su at gmail.com Thu Aug 15 11:33:25 2019 From: danyang.su at gmail.com (Danyang Su) Date: Thu, 15 Aug 2019 09:33:25 -0700 Subject: [petsc-users] different dof in DMDA creating In-Reply-To: References: <6ee19350-d059-703b-9c52-b03ae6bf7a4e@gmail.com> <4c326b30-1180-9c0a-8e64-5753502abac2@gmail.com> Message-ID: <994acc05-046f-e5c6-e2ec-5a94fa113068@gmail.com> Hi Matt, Thanks for the quick reply. Will let you know when I am confused in using it. Regards, Danyang On 2019-08-15 9:22 a.m., Matthew Knepley wrote: > On Thu, Aug 15, 2019 at 12:18 PM Danyang Su > wrote: > > Hi Barry and Matt, > > Would you please give me some advice on the functions I need to > use to > set different dof to the specified nodes. 
For now, I use > DMPlexCreateSection that dof is uniform throughout the domain. I am a > bit lost in choosing the right DMPlex functions, unfortunately. > > > Chapter 10 of the PETSc User Manual discusses low-level data layout > using Section. How about we start > there and then clarify anything that is confusing. > > ? Thanks, > > ? ? Matt > > Thanks and regards, > > Danyang > > On 2019-02-07 1:53 p.m., Danyang Su wrote: > > Thanks, Barry. DMPlex also works for my code. > > > > Danyang > > > > On 2019-02-07 1:14 p.m., Smith, Barry F. wrote: > >> ?? No, you would need to use the more flexible DMPlex > >> > >> > >>> On Feb 7, 2019, at 3:04 PM, Danyang Su via petsc-users > >>> > wrote: > >>> > >>> Dear PETSc Users, > >>> > >>> Does DMDA support different number of degrees of freedom for > >>> different node? For example I have a 2D subsurface flow > problem with > >>> the default dof = 1 throughout the domain. Now I want to add some > >>> sparse fractures in the domain. For the nodes connected to the > >>> sparse fractures, I want to set dof to 2. Is it possible to > set dof > >>> to 2 for those nodes only? > >>> > >>> Thanks, > >>> > >>> Danyang > >>> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfaibussowitsch at anl.gov Fri Aug 16 10:21:58 2019 From: jfaibussowitsch at anl.gov (Faibussowitsch, Jacob) Date: Fri, 16 Aug 2019 15:21:58 +0000 Subject: [petsc-users] Working Group Beginners: Feedback On Layout Message-ID: Hello All PETSC Developers/Users! As many of you may or may not know, PETSc recently held an all-hands strategic meeting to chart the medium term course for the group. As part of this meeting a working group was formed to focus on beginner tutorial guides aimed at bringing new users up to speed on how to program basic to intermediate PETSc scripts. We have just completed a first draft of our template for these guides and would like to ask you all for your feedback! Any and all feedback would be greatly appreciated, however please limit your feedback to the general layout and structure. The visual presentation of the web page and content is still all a WIP, and is not necessarily representative of the finished product. That being said, in order to keep the project moving forward we will soft-cap feedback collection by the end of next Friday (August 23) so that we can get started on writing the tutorials and integrating them with the rest of the revamped user-guides. Please email me directly at jfaibussowitsch at anl.gov with your comments! Be sure to include specific details and examples of what you like and don?t like with your mail. Here is the template: http://patricksanan.com/temp/_build/html/introductory_tutorial_ksp.html Sincerely, Jacob Faibussowitsch -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 16 10:59:23 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 Aug 2019 11:59:23 -0400 Subject: [petsc-users] [petsc-dev] Working Group Beginners: Feedback On Layout In-Reply-To: References: Message-ID: On Fri, Aug 16, 2019 at 11:22 AM Faibussowitsch, Jacob via petsc-dev < petsc-dev at mcs.anl.gov> wrote: > Hello All PETSC Developers/Users! 
> > As many of you may or may not know, PETSc recently held an all-hands > strategic meeting to chart the medium term course for the group. As part of > this meeting a working group was formed to focus on beginner tutorial > guides aimed at bringing new users up to speed on how to program basic to > intermediate PETSc scripts. We have just completed a first draft of our > template for these guides and would like to ask you all for your feedback! > Any and all feedback would be greatly appreciated, however please limit > your feedback to the general *layout* and *structure*. The visual > presentation of the web page and content is still all a WIP, and is not > necessarily representative of the finished product. > > That being said, in order to keep the project moving forward we will *soft-cap > feedback collection by the end of next Friday (August 23)* so that we can > get started on writing the tutorials and integrating them with the rest of > the revamped user-guides. Please email me directly at > jfaibussowitsch at anl.gov with your comments! Be sure to include specific > details and examples of what you like and don?t like with your mail. > > Here is the template: > http://patricksanan.com/temp/_build/html/introductory_tutorial_ksp.html > I would really like to see a link to the full source right at the top. I like to look at everything most times rather than having it broken up, but I see the pedagogical value in the pieces. I would also like to see the build/run section also have instructions for running it in the test system. Thanks, Matt > Sincerely, > > Jacob Faibussowitsch > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From joslorgom at gmail.com Fri Aug 16 11:21:18 2019 From: joslorgom at gmail.com (=?UTF-8?Q?Jos=C3=A9_Lorenzo?=) Date: Fri, 16 Aug 2019 18:21:18 +0200 Subject: [petsc-users] VecAXPY Message-ID: Hello, I am struggling with a strange error when using VecAXPY. I have a ghost vector H that needs to be updated as H = H + eta * dH - eta_old * dH However, for some reason I obtain different results when using call VecAXPY(H, eta - eta_old, dH, ierr) instead of call VecAXPY(H, - eta_old, dH, ierr) call VecAXPY(H, eta, dH, ierr) where eta and eta_old are PetscScalars. The first option seems to provide a wrong output, but I do not understand what can go wrong in such simple operation. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 16 11:25:55 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 Aug 2019 12:25:55 -0400 Subject: [petsc-users] VecAXPY In-Reply-To: References: Message-ID: On Fri, Aug 16, 2019 at 12:22 PM Jos? Lorenzo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I am struggling with a strange error when using VecAXPY. I have a ghost > vector H that needs to be updated as > > H = H + eta * dH - eta_old * dH > > However, for some reason I obtain different results when using > > call VecAXPY(H, eta - eta_old, dH, ierr) > > instead of > > call VecAXPY(H, - eta_old, dH, ierr) > > call VecAXPY(H, eta, dH, ierr) > > where eta and eta_old are PetscScalars. > > The first option seems to provide a wrong output, but I do not understand > what can go wrong in such simple operation. 
> Fortran is really unforgiving about inputs. Try declaring a new PetscScalar diff = eta - eta_old and trying the first option with that. Thanks, Matt > Thank you. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From joslorgom at gmail.com Fri Aug 16 11:57:09 2019 From: joslorgom at gmail.com (=?UTF-8?Q?Jos=C3=A9_Lorenzo?=) Date: Fri, 16 Aug 2019 18:57:09 +0200 Subject: [petsc-users] VecAXPY In-Reply-To: References: Message-ID: I tried that one as well but the result was the same as using eta-eta_old when calling VecAXPY. Thank you. El vie., 16 ago. 2019 18:26, Matthew Knepley escribi?: > On Fri, Aug 16, 2019 at 12:22 PM Jos? Lorenzo via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I am struggling with a strange error when using VecAXPY. I have a ghost >> vector H that needs to be updated as >> >> H = H + eta * dH - eta_old * dH >> >> However, for some reason I obtain different results when using >> >> call VecAXPY(H, eta - eta_old, dH, ierr) >> >> instead of >> >> call VecAXPY(H, - eta_old, dH, ierr) >> >> call VecAXPY(H, eta, dH, ierr) >> >> where eta and eta_old are PetscScalars. >> >> The first option seems to provide a wrong output, but I do not understand >> what can go wrong in such simple operation. >> > Fortran is really unforgiving about inputs. Try declaring a new > PetscScalar diff = eta - eta_old and trying the first option with that. > > Thanks, > > Matt > >> Thank you. >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 16 12:15:51 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 16 Aug 2019 13:15:51 -0400 Subject: [petsc-users] VecAXPY In-Reply-To: References: Message-ID: On Fri, Aug 16, 2019 at 12:57 PM Jos? Lorenzo wrote: > I tried that one as well but the result was the same as using eta-eta_old > when calling VecAXPY. > Do you think you could send a small example of getting different answers? We are fairly confident of the correctness, so something else must be going on. Thanks, Matt > Thank you. > > El vie., 16 ago. 2019 18:26, Matthew Knepley escribi?: > >> On Fri, Aug 16, 2019 at 12:22 PM Jos? Lorenzo via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hello, >>> >>> I am struggling with a strange error when using VecAXPY. I have a ghost >>> vector H that needs to be updated as >>> >>> H = H + eta * dH - eta_old * dH >>> >>> However, for some reason I obtain different results when using >>> >>> call VecAXPY(H, eta - eta_old, dH, ierr) >>> >>> instead of >>> >>> call VecAXPY(H, - eta_old, dH, ierr) >>> >>> call VecAXPY(H, eta, dH, ierr) >>> >>> where eta and eta_old are PetscScalars. >>> >>> The first option seems to provide a wrong output, but I do not >>> understand what can go wrong in such simple operation. >>> >> Fortran is really unforgiving about inputs. Try declaring a new >> PetscScalar diff = eta - eta_old and trying the first option with that. >> >> Thanks, >> >> Matt >> >>> Thank you. 
>>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri Aug 16 13:00:55 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 16 Aug 2019 19:00:55 +0100 Subject: [petsc-users] VecAXPY In-Reply-To: References: Message-ID: On Fri, 16 Aug 2019 at 17:22, Jos? Lorenzo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > I am struggling with a strange error when using VecAXPY. I have a ghost > vector H that needs to be updated as > > H = H + eta * dH - eta_old * dH > > However, for some reason I obtain different results when using > > call VecAXPY(H, eta - eta_old, dH, ierr) > > instead of > > call VecAXPY(H, - eta_old, dH, ierr) > > call VecAXPY(H, eta, dH, ierr) > Does the code work if you do the following? call VecAXPY(H, eta, dH, ierr) eta_old = -eta_old call VecAXPY(H, eta_old, dH, ierr) Fortran pass variables by reference not value, so I don't think it is valid to pass in -eta_old as the argument. > where eta and eta_old are PetscScalars. > > The first option seems to provide a wrong output, but I do not understand > what can go wrong in such simple operation. > > Thank you. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri Aug 16 14:02:27 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 16 Aug 2019 20:02:27 +0100 Subject: [petsc-users] [petsc-dev] Working Group Beginners: Feedback On Layout In-Reply-To: References: Message-ID: I think it would useful to have links to all the man pages in the table of contents. I also think it would be useful to have links to the man pages for specific key functions which are fundamental to the objectives of the tutorial. These could appear at the end of the tutorial under a new section heading (eg "Further reading"). It would be good to keep the list of man pages displayed to a minimum to avoid info overload and obscuring the primary objectives of the tut. Overall I like it. Nice work. Cheers, Dave On Fri, 16 Aug 2019 at 16:22, Faibussowitsch, Jacob via petsc-dev < petsc-dev at mcs.anl.gov> wrote: > Hello All PETSC Developers/Users! > > As many of you may or may not know, PETSc recently held an all-hands > strategic meeting to chart the medium term course for the group. As part of > this meeting a working group was formed to focus on beginner tutorial > guides aimed at bringing new users up to speed on how to program basic to > intermediate PETSc scripts. We have just completed a first draft of our > template for these guides and would like to ask you all for your feedback! > Any and all feedback would be greatly appreciated, however please limit > your feedback to the general *layout* and *structure*. The visual > presentation of the web page and content is still all a WIP, and is not > necessarily representative of the finished product. 
> > That being said, in order to keep the project moving forward we will *soft-cap > feedback collection by the end of next Friday (August 23)* so that we can > get started on writing the tutorials and integrating them with the rest of > the revamped user-guides. Please email me directly at > jfaibussowitsch at anl.gov with your comments! Be sure to include specific > details and examples of what you like and don?t like with your mail. > > Here is the template: > http://patricksanan.com/temp/_build/html/introductory_tutorial_ksp.html > > Sincerely, > > Jacob Faibussowitsch > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri Aug 16 14:15:58 2019 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 16 Aug 2019 20:15:58 +0100 Subject: [petsc-users] [petsc-dev] Working Group Beginners: Feedback On Layout In-Reply-To: References: Message-ID: On Fri, 16 Aug 2019 at 20:02, Dave May wrote: > I think it would useful to have links to all the man pages in the table of > contents. > Sorry - what I wrote was ambiguous. I am proposing "a single link to all man pages" in the ToC. E.g. a link to this https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/singleindex.html > I also think it would be useful to have links to the man pages for > specific key functions which are fundamental to the objectives of the > tutorial. These could appear at the end of the tutorial under a new section > heading (eg "Further reading"). It would be good to keep the list of man > pages displayed to a minimum to avoid info overload and obscuring the > primary objectives of the tut. > > Overall I like it. Nice work. > > Cheers, > Dave > > On Fri, 16 Aug 2019 at 16:22, Faibussowitsch, Jacob via petsc-dev < > petsc-dev at mcs.anl.gov> wrote: > >> Hello All PETSC Developers/Users! >> >> As many of you may or may not know, PETSc recently held an all-hands >> strategic meeting to chart the medium term course for the group. As part of >> this meeting a working group was formed to focus on beginner tutorial >> guides aimed at bringing new users up to speed on how to program basic to >> intermediate PETSc scripts. We have just completed a first draft of our >> template for these guides and would like to ask you all for your feedback! >> Any and all feedback would be greatly appreciated, however please limit >> your feedback to the general *layout* and *structure*. The visual >> presentation of the web page and content is still all a WIP, and is not >> necessarily representative of the finished product. >> >> That being said, in order to keep the project moving forward we will *soft-cap >> feedback collection by the end of next Friday (August 23)* so that we >> can get started on writing the tutorials and integrating them with the rest >> of the revamped user-guides. Please email me directly at >> jfaibussowitsch at anl.gov with your comments! Be sure to include specific >> details and examples of what you like and don?t like with your mail. >> >> Here is the template: >> http://patricksanan.com/temp/_build/html/introductory_tutorial_ksp.html >> >> Sincerely, >> >> Jacob Faibussowitsch >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jfaibussowitsch at anl.gov Fri Aug 16 17:35:56 2019 From: jfaibussowitsch at anl.gov (Faibussowitsch, Jacob) Date: Fri, 16 Aug 2019 22:35:56 +0000 Subject: [petsc-users] [petsc-dev] Working Group Beginners: Feedback On Layout In-Reply-To: References: Message-ID: <50BEF078-25D3-43AE-9B1A-526902D1BF67@anl.gov> I also think it would be useful to have links to the man pages for specific key functions which are fundamental to the objectives of the tutorial. These could appear at the end of the tutorial under a new section heading (eg "Further reading"). It would be good to keep the list of man pages displayed to a minimum to avoid info overload and obscuring the primary objectives of the tut. This is a key goal of the tutorials, in that a new user should not be overloaded with too much info but also have to look in 10 different places in order to get all of the information they need. In the finished product every Petsc function in the text will have links to the user manual or man pages. Its not currently working in the example shown due to some limitations in the rst framework were using but it will we working in the final product. Another key aspect here is that we are also revamping the user manual and bringing it to a bit more prominence than it was before, the idea being that these tutorials show the basics of how to interact with Petsc and get something going, and then have links all over the tutorials keywords that would go to the user manual where there is much more detailed information. Thank you for sharing your thoughts, I will relay them back to the rest of the team! Best, Jacob On Aug 16, 2019, at 2:02 PM, Dave May > wrote: I think it would useful to have links to all the man pages in the table of contents. I also think it would be useful to have links to the man pages for specific key functions which are fundamental to the objectives of the tutorial. These could appear at the end of the tutorial under a new section heading (eg "Further reading"). It would be good to keep the list of man pages displayed to a minimum to avoid info overload and obscuring the primary objectives of the tut. Overall I like it. Nice work. Cheers, Dave On Fri, 16 Aug 2019 at 16:22, Faibussowitsch, Jacob via petsc-dev > wrote: Hello All PETSC Developers/Users! As many of you may or may not know, PETSc recently held an all-hands strategic meeting to chart the medium term course for the group. As part of this meeting a working group was formed to focus on beginner tutorial guides aimed at bringing new users up to speed on how to program basic to intermediate PETSc scripts. We have just completed a first draft of our template for these guides and would like to ask you all for your feedback! Any and all feedback would be greatly appreciated, however please limit your feedback to the general layout and structure. The visual presentation of the web page and content is still all a WIP, and is not necessarily representative of the finished product. That being said, in order to keep the project moving forward we will soft-cap feedback collection by the end of next Friday (August 23) so that we can get started on writing the tutorials and integrating them with the rest of the revamped user-guides. Please email me directly at jfaibussowitsch at anl.gov with your comments! Be sure to include specific details and examples of what you like and don?t like with your mail. 
Here is the template: http://patricksanan.com/temp/_build/html/introductory_tutorial_ksp.html Sincerely, Jacob Faibussowitsch -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 16 22:52:37 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Sat, 17 Aug 2019 03:52:37 +0000 Subject: [petsc-users] VecAXPY In-Reply-To: References: Message-ID: What version of PETSc are you using and are you using the standard precision and scalar so that PetscScalar is double precision? Also what Fortran compiler? Barry > On Aug 16, 2019, at 11:21 AM, Jos? Lorenzo via petsc-users wrote: > > Hello, > > I am struggling with a strange error when using VecAXPY. I have a ghost vector H that needs to be updated as > > H = H + eta * dH - eta_old * dH > > However, for some reason I obtain different results when using > > call VecAXPY(H, eta - eta_old, dH, ierr) > > instead of > > call VecAXPY(H, - eta_old, dH, ierr) > > call VecAXPY(H, eta, dH, ierr) > > where eta and eta_old are PetscScalars. > > The first option seems to provide a wrong output, but I do not understand what can go wrong in such simple operation. > > Thank you. > From mlohry at gmail.com Sun Aug 18 13:19:19 2019 From: mlohry at gmail.com (Mark Lohry) Date: Sun, 18 Aug 2019 14:19:19 -0400 Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts In-Reply-To: References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> <4379D27A-C950-4733-82F4-2BDDFF93D154@mcs.anl.gov> Message-ID: Barry, thanks for your suggestion to do the serial coloring on the mesh itself / block size 1 case first, and then manually color the blocks. Works like a charm. The 2 million cell case is small enough to create the sparse system on one process and color it in about a second. On Sun, Aug 11, 2019 at 9:41 PM Mark Lohry wrote: > So the parallel JP runs just as proportionally slow in serial as it does > in parallel. > > valgrind --tool=callgrind shows essentially 100% of the runtime in > jp.c:255-262, within the larger loop commented > /* pass two -- color it by looking at nearby vertices and building a mask > */ > > for (j=0;j if (seen[cols[j]] != cidx) { > bidx++; > seen[cols[j]] = cidx; > idxbuf[bidx] = cols[j]; > distbuf[bidx] = dist+1; > } > } > > I'll dig into how this algorithm is supposed to work, but anything obvious > in there? It kinda feels like something is doing something N^2 or worse > when it doesn't need to be. > > On Sun, Aug 11, 2019 at 3:47 PM Mark Lohry wrote: > >> Sorry, forgot to reply to the mailing list. >> >> where does your matrix come from? A mesh? Structured, unstructured, a >>> graph, something else? What type of discretization? >> >> >> Unstructured tetrahedral mesh (CGNS, I can give links to the files if >> that's of interest), the discretization is arbitrary order discontinuous >> galerkin for compressible navier-stokes. 5 coupled equations x 10 nodes per >> element for this 2nd order case to give the 50x50 blocks. Each tet cell >> dependent on neighbors, so for tets 4 extra off-diagonal blocks per cell. >> >> I would expect one could exploit the large block size here in computing >> the coloring -- the underlying mesh is 2M nodes with the same connectivity >> as a standard cell-centered finite volume method. >> >> >> >> On Sun, Aug 11, 2019 at 2:12 PM Smith, Barry F. 
>> wrote: >> >>> >>> These are due to attempting to copy the entire matrix to one process >>> and do the sequential coloring there. Definitely won't work for larger >>> problems, we'll >>> >>> need to focus on >>> >>> 1) having useful parallel coloring and >>> 2) maybe using an alternative way to determine the coloring: >>> >>> where does your matrix come from? A mesh? Structured, unstructured, >>> a graph, something else? What type of discretization? >>> >>> Barry >>> >>> >>> > On Aug 11, 2019, at 10:21 AM, Mark Lohry wrote: >>> > >>> > On the very large case, there does appear to be some kind of overflow >>> ending up with an attempt to allocate too much memory in MatFDColorCreate, >>> even with --with-64-bit-indices. Full terminal output here: >>> > >>> https://raw.githubusercontent.com/mlohry/petsc_miscellany/master/slurm-3451378.out >>> > >>> > In particular: >>> > PETSC ERROR: Memory requested 1036713571771129344 >>> > >>> > Log filename here: >>> > https://github.com/mlohry/petsc_miscellany/blob/master/petsclogfile.0 >>> > >>> > On Sun, Aug 11, 2019 at 9:49 AM Mark Lohry wrote: >>> > Hi Barry, I made a minimum example comparing the colorings on a very >>> small case. You'll need to unzip the jacobian_sparsity.tgz to run it. >>> > >>> > https://github.com/mlohry/petsc_miscellany >>> > >>> > This is sparse block system with 50x50 block sizes, ~7,680 blocks. >>> Comparing the coloring types sl, lf, jp, id, greedy, I get these timings >>> wallclock, running with -np 16: >>> > >>> > SL: 1.5s >>> > LF: 1.3s >>> > JP: 29s ! >>> > ID: 1.4s >>> > greedy: 2s >>> > >>> > As far as I'm aware, JP is the only parallel coloring implemented? It >>> is looking as though I'm simply running out of memory with the sequential >>> methods (I should apologize to my cluster admin for chewing up 10TB and >>> crashing...). >>> > >>> > On this small problem JP is taking 30 seconds wallclock, but that time >>> grows exponentially with larger problems (last I tried it, I killed the job >>> after 24 hours of spinning.) >>> > >>> > Also as I mentioned, the "greedy" method appears to be producing an >>> invalid coloring for me unless I also specify weights "lexical". But >>> "-mat_coloring_test" doesn't complain. I'll have to make a different >>> example to actually show it's an invalid coloring. >>> > >>> > Thanks, >>> > Mark >>> > >>> > >>> > >>> > On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. >>> wrote: >>> > >>> > Mark, >>> > >>> > Would you be able to cook up an example (or examples) that >>> demonstrate the problem (or problems) and how to run it? If you send it to >>> us and we can reproduce the problem then we'll fix it. If need be you can >>> send large matrices to petsc-maint at mcs.anl.gov don't send them to >>> petsc-users since it will reject large files. >>> > >>> > Barry >>> > >>> > >>> > > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: >>> > > >>> > > Thanks Barry, been trying all of the above. I think I've homed in on >>> it to an out-of-memory and/or integer overflow inside MatColoringApply. >>> Which makes some sense since I only have a sequential coloring algorithm >>> working... >>> > > >>> > > Is anyone out there using coloring in parallel? I still have the >>> same previously mentioned issues with MATCOLORINGJP (on small problems >>> takes upwards of 30 minutes to run) which as far as I can see is the only >>> "parallel" implementation. 
MATCOLORINGSL and MATCOLORINGID both work on >>> less large problems, MATCOLORINGGREEDY works on less large problems if and >>> only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are >>> failing on larger problems. >>> > > >>> > > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. >>> wrote: >>> > > >>> > > There is also >>> > > >>> > > $ ./configure --help | grep color >>> > > --with-is-color-value-type= >>> > > char, short can store 256, 65536 colors current: short >>> > > >>> > > I can't imagine you have over 65 k colors but something to check >>> > > >>> > > >>> > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: >>> > > > >>> > > > My first guess is that the code is getting integer overflow >>> somewhere. 25 billion is well over the 2 billion that 32 bit integers can >>> hold. >>> > > > >>> > > > Mine as well -- though in later tests I have the same issue when >>> using --with-64-bit-indices. Ironically I had removed that flag at some >>> point because the coloring / index set was using a serious chunk of total >>> memory on medium sized problems. >>> > > >>> > > Understood >>> > > >>> > > > >>> > > > Questions on the petsc internals there though: Are matrices >>> indexed with two integers (i,j) so the max matrix dimension is (int limit) >>> x (int limit) or a single integer so the max dimension is sqrt(int limit)? >>> > > > Also I was operating under the assumption the 32 bit limit should >>> only constrain per-process problem sizes (25B over 400 processes giving 62M >>> non-zeros per process), is that not right? >>> > > >>> > > It is mostly right but may not be right for everything in PETSc. >>> For example I don't know about the MatFD code >>> > > >>> > > Since using a debugger is not practical for large code counts to >>> find the point the two processes diverge you can try >>> > > >>> > > -log_trace >>> > > >>> > > or >>> > > >>> > > -log_trace filename >>> > > >>> > > in the second case it will generate one file per core called >>> filename.%d note it will produce a lot of output >>> > > >>> > > Good luck >>> > > >>> > > >>> > > >>> > > > >>> > > > We are adding more tests to nicely handle integer overflow but >>> it is not easy since it can occur in so many places >>> > > > >>> > > > Totally understood. I know the pain of only finding an overflow >>> bug after days of waiting in a cluster queue for a big job. >>> > > > >>> > > > We urge you to upgrade. >>> > > > >>> > > > I'll do that today and hope for the best. On first tests on >>> 3.11.3, I still have a couple issues with the coloring code: >>> > > > >>> > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned >>> here: >>> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html >>> > > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces >>> a wrong jacobian unless I also set MatColoringSetWeightType(coloring, >>> MAT_COLORING_WEIGHT_LEXICAL); >>> > > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to >>> exist. >>> > > > >>> > > > Thanks, >>> > > > Mark >>> > > > >>> > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. >>> wrote: >>> > > > >>> > > > My first guess is that the code is getting integer overflow >>> somewhere. 25 billion is well over the 2 billion that 32 bit integers can >>> hold. >>> > > > >>> > > > We urge you to upgrade. 
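As an aside on the approach that eventually worked here (coloring the block-size-1 mesh graph and then manually coloring the blocks, per the top of this thread), one way that expansion could look is sketched below. It assumes pointColor[] holds a distance-2 coloring of the locally owned mesh points, obtained from whatever coloring was run on the point connectivity, and that every bs-by-bs block is treated as dense; ExpandPointColoringToBlocks and its arguments are illustrative names, not PETSc API, and this is not necessarily the code Mark used.

#include <petscis.h>

/* Column j of the block at point p gets color pointColor[p]*bs + j. Two columns
   then share a final color only if their points share a point color, and such
   points are never adjacent within distance 2, so the columns never touch a
   common row. */
static PetscErrorCode ExpandPointColoringToBlocks(MPI_Comm comm,PetscInt nLocalPoints,
                                                  const ISColoringValue pointColor[],
                                                  PetscInt nPointColors,PetscInt bs,
                                                  ISColoring *blockColoring)
{
  PetscErrorCode  ierr;
  ISColoringValue *colors;
  PetscInt        p,j;

  ierr = PetscMalloc1(nLocalPoints*bs,&colors);CHKERRQ(ierr);
  for (p = 0; p < nLocalPoints; p++) {
    for (j = 0; j < bs; j++) colors[p*bs+j] = (ISColoringValue)(pointColor[p]*bs + j);
  }
  ierr = ISColoringCreate(comm,nPointColors*bs,nLocalPoints*bs,colors,PETSC_OWN_POINTER,blockColoring);CHKERRQ(ierr);
  return 0;
}

The resulting ISColoring can be handed to MatFDColoringCreate() on the block system exactly as in the snippet quoted further down in this thread. Note that with 50x50 blocks the expanded color count is 50 times the point color count, which is worth checking against the default short ISColoringValue limit mentioned above (65536 colors).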
>>> > > > >>> > > > Regardless for problems this large you likely need the >>> ./configure option --with-64-bit-indices >>> > > > >>> > > > We are adding more tests to nicely handle integer overflow but >>> it is not easy since it can occur in so many places >>> > > > >>> > > > Hopefully this will resolve your problem with large process >>> counts >>> > > > >>> > > > Barry >>> > > > >>> > > > >>> > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> > > > > >>> > > > > I'm running some larger cases than I have previously with a >>> working code, and I'm running into failures I don't see on smaller cases. >>> Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. >>> Runs successfully on half size case on 200 cores. >>> > > > > >>> > > > > 1) The first error output from petsc is "MPI_Allreduce() called >>> in different locations". Is this a red herring, suggesting some process >>> failed prior to this and processes have diverged? >>> > > > > >>> > > > > 2) I don't think I'm running out of memory -- globally at least. >>> Slurm output shows e.g. >>> > > > > Memory Utilized: 459.15 GB (estimated maximum) >>> > > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) >>> > > > > I did try with and without --64-bit-indices. >>> > > > > >>> > > > > 3) The debug traces seem to vary, see below. I *think* the >>> failure might be happening in the vicinity of a Coloring call. I'm using >>> MatFDColoring like so: >>> > > > > >>> > > > > ISColoring iscoloring; >>> > > > > MatFDColoring fdcoloring; >>> > > > > MatColoring coloring; >>> > > > > >>> > > > > MatColoringCreate(ctx.JPre, &coloring); >>> > > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); >>> > > > > >>> > > > > // converges stalls badly without this on small cases, don't >>> know why >>> > > > > MatColoringSetWeightType(coloring, >>> MAT_COLORING_WEIGHT_LEXICAL); >>> > > > > >>> > > > > // none of these worked. >>> > > > > // MatColoringSetType(coloring, MATCOLORINGJP); >>> > > > > // MatColoringSetType(coloring, MATCOLORINGSL); >>> > > > > // MatColoringSetType(coloring, MATCOLORINGID); >>> > > > > MatColoringSetFromOptions(coloring); >>> > > > > >>> > > > > MatColoringApply(coloring, &iscoloring); >>> > > > > MatColoringDestroy(&coloring); >>> > > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); >>> > > > > >>> > > > > I have had issues in the past with getting a functional coloring >>> setup for finite difference jacobians, and the above is the only >>> configuration I've managed to get working successfully. Have there been any >>> significant development changes to that area of code since v3.8.3? I'll try >>> upgrading in the mean time and hope for the best. >>> > > > > >>> > > > > >>> > > > > >>> > > > > Any ideas? >>> > > > > >>> > > > > >>> > > > > Thanks, >>> > > > > Mark >>> > > > > >>> > > > > >>> > > > > ************************************* >>> > > > > >>> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >>> slurm-3429773.out >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >>> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >>> (functions) on different processors >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. 
>>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 >>> by mlohry Tue Aug 6 06:05:02 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >>> > > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Invalid argument >>> > > > > [0]PETSC ERROR: Enum value must be same on all processes, >>> argument # 2 >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 >>> by mlohry Tue Aug 6 06:05:02 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >>> > > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process >>> (or the batch system) has told this process to end >>> > > > > [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> > > > > [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and >>> Apple Mac OS X to find memory corruption errors >>> > > > > [0]PETSC ERROR: likely location of problem given in stack below >>> > > > > [0]PETSC ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are >>> not available, >>> > > > > [0]PETSC ERROR: INSTEAD the line number of the start of >>> the function >>> > > > > [0]PETSC ERROR: is given. 
>>> > > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 >>> /home/mlohry/build/external/petsc/src/sys/objects/tagm.c >>> > > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 >>> /home/mlohry/build/external/petsc/src/sys/objects/inherit.c >>> > > > > [0]PETSC ERROR: [0] DMCreate line 36 >>> /home/mlohry/build/external/petsc/src/dm/interface/dm.c >>> > > > > [0]PETSC ERROR: [0] DMShellCreate line 983 >>> /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c >>> > > > > [0]PETSC ERROR: [0] TSGetDM line 5287 >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Signal received >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 >>> by mlohry Tue Aug 6 06:05:02 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun --with-64-bit-indices >>> > > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown >>> file >>> > > > > >>> > > > > >>> > > > > ************************************* >>> > > > > >>> > > > > >>> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >>> slurm-3429158.out >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >>> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >>> (code lines) on different processors >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. 
>>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 >>> by mlohry Mon Aug 5 23:58:19 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >>> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >>> (code lines) on different processors >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 >>> by mlohry Mon Aug 5 23:58:19 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in >>> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >>> > > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in >>> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >>> > > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in >>> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >>> > > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in >>> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c >>> > > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in >>> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >>> > > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>> Violation, probably memory access out of range >>> > > > > [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> > > > > [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and >>> Apple Mac OS X to find memory 
corruption errors >>> > > > > [0]PETSC ERROR: likely location of problem given in stack below >>> > > > > [0]PETSC ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are >>> not available, >>> > > > > [0]PETSC ERROR: INSTEAD the line number of the start of >>> the function >>> > > > > [0]PETSC ERROR: is given. >>> > > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 >>> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c >>> > > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line >>> 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c >>> > > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 >>> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c >>> > > > > [0]PETSC ERROR: [0] MatColoringApply line 357 >>> /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c >>> > > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 >>> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >>> > > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 >>> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >>> > > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 >>> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c >>> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 >>> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >>> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Signal received >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. 
>>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 >>> by mlohry Mon Aug 5 23:58:19 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown >>> file >>> > > > > >>> > > > > >>> > > > > >>> > > > > ************************* >>> > > > > >>> > > > > >>> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >>> slurm-3429134.out >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >>> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >>> (code lines) on different processors >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 >>> by mlohry Mon Aug 5 23:24:23 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in >>> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c >>> > > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in >>> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c >>> > > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in >>> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c >>> > > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in >>> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >>> > > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in >>> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c >>> > > > > [0]PETSC ERROR: #6 VecSetType() line 51 in >>> /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c >>> > > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in >>> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Object is in wrong state >>> > > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. 
>>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 >>> by mlohry Mon Aug 5 23:24:23 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in >>> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c >>> > > > > >>> > > > > >>> > > > > >>> > > > > ************************************** >>> > > > > >>> > > > > >>> > > > > >>> > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" >>> slurm-3429102.out >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >>> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >>> (code lines) on different processors >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >>> by mlohry Mon Aug 5 22:50:12 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >>> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >>> (code lines) on different processors >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. 
>>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >>> by mlohry Mon Aug 5 22:50:12 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Petsc has generated inconsistent data >>> > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations >>> (code lines) on different processors >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >>> by mlohry Mon Aug 5 22:50:12 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in >>> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >>> > > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in >>> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >>> > > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in >>> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >>> > > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in >>> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >>> > > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>> Violation, probably memory access out of range >>> > > > > [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> > > > > [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and >>> Apple Mac OS X to find memory corruption errors >>> > > > > [0]PETSC ERROR: likely location of problem given in stack below >>> > > > > [0]PETSC 
ERROR: --------------------- Stack Frames >>> ------------------------------------ >>> > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are >>> not available, >>> > > > > [0]PETSC ERROR: INSTEAD the line number of the start of >>> the function >>> > > > > [0]PETSC ERROR: is given. >>> > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 >>> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >>> > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 >>> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >>> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 >>> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >>> > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 >>> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >>> > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 >>> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c >>> > > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 >>> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c >>> > > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 >>> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c >>> > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 >>> /home/mlohry/build/external/petsc/src/ts/interface/ts.c >>> > > > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > > [0]PETSC ERROR: Signal received >>> > > > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 >>> > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 >>> by mlohry Mon Aug 5 22:50:12 2019 >>> > > > > [0]PETSC ERROR: Configure options >>> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt >>> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc >>> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx >>> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes >>> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 >>> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS >>> --with-mpiexec=/usr/bin/srun >>> > > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown >>> file >>> > > > > >>> > > > > >>> > > > > >>> > > > >>> > > >>> > >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Aug 18 13:38:35 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Sun, 18 Aug 2019 18:38:35 +0000
Subject: [petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts
In-Reply-To:
References: <79C34557-36B4-4243-94D6-0FDDB228593F@mcs.anl.gov> <20CF735B-247A-4A0D-BBF7-8DD25AB95E51@mcs.anl.gov> <4379D27A-C950-4733-82F4-2BDDFF93D154@mcs.anl.gov>
Message-ID:

   Excellent, we'll still need to fix the parallel coloring but I'm glad we can put that off :-)

   Barry

> On Aug 18, 2019, at 1:19 PM, Mark Lohry wrote: > > Barry, thanks for your suggestion to do the serial coloring on the mesh itself / block size 1 case first, and then manually color the blocks. Works like a charm (a sketch of this block expansion appears below). The 2 million cell case is small enough to create the sparse system on one process and color it in about a second. > > On Sun, Aug 11, 2019 at 9:41 PM Mark Lohry wrote: > So the parallel JP runs just as proportionally slow in serial as it does in parallel. > > valgrind --tool=callgrind shows essentially 100% of the runtime in jp.c:255-262, within the larger loop commented > /* pass two -- color it by looking at nearby vertices and building a mask */ > > for (j=0;j<ncols;j++) { > if (seen[cols[j]] != cidx) { > bidx++; > seen[cols[j]] = cidx; > idxbuf[bidx] = cols[j]; > distbuf[bidx] = dist+1; > } > } > > I'll dig into how this algorithm is supposed to work, but anything obvious in there? It kinda feels like something is doing something N^2 or worse when it doesn't need to be. > > On Sun, Aug 11, 2019 at 3:47 PM Mark Lohry wrote: > Sorry, forgot to reply to the mailing list. > > where does your matrix come from? A mesh? Structured, unstructured, a graph, something else? What type of discretization? > > Unstructured tetrahedral mesh (CGNS, I can give links to the files if that's of interest), the discretization is arbitrary order discontinuous galerkin for compressible navier-stokes. 5 coupled equations x 10 nodes per element for this 2nd order case to give the 50x50 blocks. Each tet cell dependent on neighbors, so for tets 4 extra off-diagonal blocks per cell. > > I would expect one could exploit the large block size here in computing the coloring -- the underlying mesh is 2M nodes with the same connectivity as a standard cell-centered finite volume method. > > > > On Sun, Aug 11, 2019 at 2:12 PM Smith, Barry F. wrote: > > These are due to attempting to copy the entire matrix to one process and do the sequential coloring there. Definitely won't work for larger problems, we'll > > need to focus on > > 1) having useful parallel coloring and > 2) maybe using an alternative way to determine the coloring: > > where does your matrix come from? A mesh? Structured, unstructured, a graph, something else? What type of discretization? > > Barry > > > > On Aug 11, 2019, at 10:21 AM, Mark Lohry wrote: > > > > On the very large case, there does appear to be some kind of overflow ending up with an attempt to allocate too much memory in MatFDColorCreate, even with --with-64-bit-indices. Full terminal output here: > > https://raw.githubusercontent.com/mlohry/petsc_miscellany/master/slurm-3451378.out > > > > In particular: > > PETSC ERROR: Memory requested 1036713571771129344 > > > > Log filename here: > > https://github.com/mlohry/petsc_miscellany/blob/master/petsclogfile.0 > > > > On Sun, Aug 11, 2019 at 9:49 AM Mark Lohry wrote: > > Hi Barry, I made a minimum example comparing the colorings on a very small case. You'll need to unzip the jacobian_sparsity.tgz to run it.
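The block-expansion idea Mark describes above -- color the block-size-1 cell-connectivity matrix sequentially, then hand-color the blocks -- might look roughly like the sketch below. The helper name, the cellColor input (one color per cell, e.g. taken from a sequential MatColoringApply() on the scalar connectivity matrix), and the assumption that every dof of a cell couples to every dof of a neighboring cell are illustrative, not Mark's actual code.

    #include <petscmat.h>

    /* Hypothetical helper: expand a block-size-1 (per-cell) coloring into a coloring
       of the blocked Jacobian with block size bs. Assumes every dof of a cell conflicts
       with every dof of a neighboring cell, so dof k of a cell with scalar color c is
       given the blocked color c*bs + k. */
    static PetscErrorCode ExpandCellColoringToBlocks(MPI_Comm comm, PetscInt ncells, PetscInt bs,
                                                     const ISColoringValue *cellColor, PetscInt nCellColors,
                                                     ISColoring *blockColoring)
    {
      ISColoringValue *blockColor;
      PetscInt         c, k;
      PetscErrorCode   ierr;

      ierr = PetscMalloc1(ncells*bs, &blockColor);CHKERRQ(ierr);
      for (c = 0; c < ncells; c++) {
        for (k = 0; k < bs; k++) {
          /* note: ISColoringValue is unsigned short by default, so nCellColors*bs must stay
             below 65536 (see the --with-is-color-value-type note earlier in this thread) */
          blockColor[c*bs + k] = (ISColoringValue)(cellColor[c]*bs + k);
        }
      }
      /* PETSC_OWN_POINTER hands ownership of blockColor to the ISColoring */
      ierr = ISColoringCreate(comm, nCellColors*bs, ncells*bs, blockColor, PETSC_OWN_POINTER, blockColoring);CHKERRQ(ierr);
      return 0;
    }

The resulting ISColoring would then be passed to MatFDColoringCreate() on the blocked Jacobian, exactly as in Mark's existing setup; the expensive coloring itself is only ever computed on the small block-size-1 matrix.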
> > > > https://github.com/mlohry/petsc_miscellany > > > > This is sparse block system with 50x50 block sizes, ~7,680 blocks. Comparing the coloring types sl, lf, jp, id, greedy, I get these timings wallclock, running with -np 16: > > > > SL: 1.5s > > LF: 1.3s > > JP: 29s ! > > ID: 1.4s > > greedy: 2s > > > > As far as I'm aware, JP is the only parallel coloring implemented? It is looking as though I'm simply running out of memory with the sequential methods (I should apologize to my cluster admin for chewing up 10TB and crashing...). > > > > On this small problem JP is taking 30 seconds wallclock, but that time grows exponentially with larger problems (last I tried it, I killed the job after 24 hours of spinning.) > > > > Also as I mentioned, the "greedy" method appears to be producing an invalid coloring for me unless I also specify weights "lexical". But "-mat_coloring_test" doesn't complain. I'll have to make a different example to actually show it's an invalid coloring. > > > > Thanks, > > Mark > > > > > > > > On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. wrote: > > > > Mark, > > > > Would you be able to cook up an example (or examples) that demonstrate the problem (or problems) and how to run it? If you send it to us and we can reproduce the problem then we'll fix it. If need be you can send large matrices to petsc-maint at mcs.anl.gov don't send them to petsc-users since it will reject large files. > > > > Barry > > > > > > > On Aug 10, 2019, at 1:56 PM, Mark Lohry wrote: > > > > > > Thanks Barry, been trying all of the above. I think I've homed in on it to an out-of-memory and/or integer overflow inside MatColoringApply. Which makes some sense since I only have a sequential coloring algorithm working... > > > > > > Is anyone out there using coloring in parallel? I still have the same previously mentioned issues with MATCOLORINGJP (on small problems takes upwards of 30 minutes to run) which as far as I can see is the only "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on less large problems, MATCOLORINGGREEDY works on less large problems if and only if I set weight type to MAT_COLORING_WEIGHT_LEXICAL, and all 3 are failing on larger problems. > > > > > > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. wrote: > > > > > > There is also > > > > > > $ ./configure --help | grep color > > > --with-is-color-value-type= > > > char, short can store 256, 65536 colors current: short > > > > > > I can't imagine you have over 65 k colors but something to check > > > > > > > > > > On Aug 6, 2019, at 8:19 AM, Mark Lohry wrote: > > > > > > > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > > > > > > > Mine as well -- though in later tests I have the same issue when using --with-64-bit-indices. Ironically I had removed that flag at some point because the coloring / index set was using a serious chunk of total memory on medium sized problems. > > > > > > Understood > > > > > > > > > > > Questions on the petsc internals there though: Are matrices indexed with two integers (i,j) so the max matrix dimension is (int limit) x (int limit) or a single integer so the max dimension is sqrt(int limit)? > > > > Also I was operating under the assumption the 32 bit limit should only constrain per-process problem sizes (25B over 400 processes giving 62M non-zeros per process), is that not right? > > > > > > It is mostly right but may not be right for everything in PETSc. 
For example I don't know about the MatFD code > > > > > > Since using a debugger is not practical for large code counts to find the point the two processes diverge you can try > > > > > > -log_trace > > > > > > or > > > > > > -log_trace filename > > > > > > in the second case it will generate one file per core called filename.%d note it will produce a lot of output > > > > > > Good luck > > > > > > > > > > > > > > > > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > > > > > > > Totally understood. I know the pain of only finding an overflow bug after days of waiting in a cluster queue for a big job. > > > > > > > > We urge you to upgrade. > > > > > > > > I'll do that today and hope for the best. On first tests on 3.11.3, I still have a couple issues with the coloring code: > > > > > > > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned here: https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html > > > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a wrong jacobian unless I also set MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist. > > > > > > > > Thanks, > > > > Mark > > > > > > > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. wrote: > > > > > > > > My first guess is that the code is getting integer overflow somewhere. 25 billion is well over the 2 billion that 32 bit integers can hold. > > > > > > > > We urge you to upgrade. > > > > > > > > Regardless for problems this large you likely need the ./configure option --with-64-bit-indices > > > > > > > > We are adding more tests to nicely handle integer overflow but it is not easy since it can occur in so many places > > > > > > > > Hopefully this will resolve your problem with large process counts > > > > > > > > Barry > > > > > > > > > > > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users wrote: > > > > > > > > > > I'm running some larger cases than I have previously with a working code, and I'm running into failures I don't see on smaller cases. Failures are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs successfully on half size case on 200 cores. > > > > > > > > > > 1) The first error output from petsc is "MPI_Allreduce() called in different locations". Is this a red herring, suggesting some process failed prior to this and processes have diverged? > > > > > > > > > > 2) I don't think I'm running out of memory -- globally at least. Slurm output shows e.g. > > > > > Memory Utilized: 459.15 GB (estimated maximum) > > > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node) > > > > > I did try with and without --64-bit-indices. > > > > > > > > > > 3) The debug traces seem to vary, see below. I *think* the failure might be happening in the vicinity of a Coloring call. I'm using MatFDColoring like so: > > > > > > > > > > ISColoring iscoloring; > > > > > MatFDColoring fdcoloring; > > > > > MatColoring coloring; > > > > > > > > > > MatColoringCreate(ctx.JPre, &coloring); > > > > > MatColoringSetType(coloring, MATCOLORINGGREEDY); > > > > > > > > > > // converges stalls badly without this on small cases, don't know why > > > > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL); > > > > > > > > > > // none of these worked. 
> > > > > // MatColoringSetType(coloring, MATCOLORINGJP); > > > > > // MatColoringSetType(coloring, MATCOLORINGSL); > > > > > // MatColoringSetType(coloring, MATCOLORINGID); > > > > > MatColoringSetFromOptions(coloring); > > > > > > > > > > MatColoringApply(coloring, &iscoloring); > > > > > MatColoringDestroy(&coloring); > > > > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring); > > > > > > > > > > I have had issues in the past with getting a functional coloring setup for finite difference jacobians, and the above is the only configuration I've managed to get working successfully. Have there been any significant development changes to that area of code since v3.8.3? I'll try upgrading in the mean time and hope for the best. > > > > > > > > > > > > > > > > > > > > Any ideas? > > > > > > > > > > > > > > > Thanks, > > > > > Mark > > > > > > > > > > > > > > > ************************************* > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (functions) on different processors > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Invalid argument > > > > > [0]PETSC ERROR: Enum value must be same on all processes, argument # 2 > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > > > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > > > [0]PETSC ERROR: is given. > > > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130 /home/mlohry/build/external/petsc/src/sys/objects/tagm.c > > > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34 /home/mlohry/build/external/petsc/src/sys/objects/inherit.c > > > > > [0]PETSC ERROR: [0] DMCreate line 36 /home/mlohry/build/external/petsc/src/dm/interface/dm.c > > > > > [0]PETSC ERROR: [0] DMShellCreate line 983 /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c > > > > > [0]PETSC ERROR: [0] TSGetDM line 5287 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Signal received > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by mlohry Tue Aug 6 06:05:02 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun --with-64-bit-indices > > > > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > ************************************* > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > > > [0]PETSC ERROR: is given. 
> > > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497 /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559 /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c > > > > > [0]PETSC ERROR: [0] MatColoringApply line 357 /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c > > > > > [0]PETSC ERROR: [0] VecSetSizes line 1308 /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605 /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Signal received > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by mlohry Mon Aug 5 23:58:19 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > > > > > > ************************* > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in /home/mlohry/build/external/petsc/src/sys/utils/psplit.c > > > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c > > > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c > > > > > [0]PETSC ERROR: #6 VecSetType() line 51 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c > > > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Object is in wrong state > > > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1 > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by mlohry Mon Aug 5 23:24:23 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c > > > > > > > > > > > > > > > > > > > > ************************************** > > > > > > > > > > > > > > > > > > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Petsc has generated inconsistent data > > > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > > > [0]PETSC ERROR: likely location of problem given in stack below > > > > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > > > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > > > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > > > [0]PETSC ERROR: is given. 
> > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454 /home/mlohry/build/external/petsc/src/mat/utils/matstash.c > > > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676 /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c > > > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167 /home/mlohry/build/external/petsc/src/mat/interface/matrix.c > > > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248 /home/mlohry/build/external/petsc/src/ts/interface/ts.c > > > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > > > [0]PETSC ERROR: Signal received > > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017 > > > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by mlohry Mon Aug 5 22:50:12 2019 > > > > > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1 --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS --with-mpiexec=/usr/bin/srun > > > > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file > > > > > > > > > > > > > > > > > > > > > > > > > From d_mckinnell at aol.co.uk Mon Aug 19 04:05:51 2019 From: d_mckinnell at aol.co.uk (Daniel Mckinnell) Date: Mon, 19 Aug 2019 09:05:51 +0000 (UTC) Subject: [petsc-users] Fwd: Accessing Global Vector Data with In-Reply-To: <1918107909.762470.1566204356701@mail.yahoo.com> References: <2145279736.4118543.1565881168869.ref@mail.yahoo.com> <2145279736.4118543.1565881168869@mail.yahoo.com> <801821969.4134946.1565885148972@mail.yahoo.com> <1918107909.762470.1566204356701@mail.yahoo.com> Message-ID: <1462650183.755054.1566205551926@mail.yahoo.com> -----Original Message----- From: Daniel Mckinnell To: knepley Sent: Mon, 19 Aug 2019 9:45 Subject: Re: [petsc-users] Accessing Global Vector Data with Thank you for the help, the "if (voff >= 0)" line was to prevent errors from looping over ghost vertices but it is better to just loop from the start of the vertices to the end of the physical vertices. 
I am really just trying to create an example file of a method to access the associated field data of a sample vertex in the global Vec, alter the field data and then set it back into the Vec object.
Best Regards,
Daniel

-----Original Message-----
From: Matthew Knepley
To: Daniel Mckinnell
CC: PETSc
Sent: Thu, 15 Aug 2019 17:19
Subject: Re: [petsc-users] Accessing Global Vector Data with

On Thu, Aug 15, 2019 at 12:05 PM Daniel Mckinnell wrote:
Thank you, it seemed to work nicely for the fields associated with cells but I'm having problems with the field on vertices. The code for altering the field is as follows:

    for (int i = vStart; i < vEnd; i++)
    {
        PetscScalar *values;
        PetscInt vsize, voff;
        PetscSectionGetDof(globalsection, i, &vsize);
        PetscSectionGetOffset(globalsection, i, &voff);

You do not need the Section calls I think.

        VecGetArray(u, &values);

Hoist GetArray out of v loop

        if (voff >= 0)

This check is not needed I think. What you are doing here is a mix of local and global. What do you actually want to do?

        {
            PetscScalar *p;
            DMPlexPointLocalRef(*dm, i, values, &p);
            p[0] = i + 1;
        }
        VecRestoreArray(u, &values);

Hoist RestoreArray out of v loop.

  Thanks,

    Matt

The Global PetscSection is:

PetscSection Object: 2 MPI processes
  type not yet set
Process 0:
  (   0) dim  2 offset   0
  (   1) dim  2 offset   2
  (   2) dim -3 offset -13
  (   3) dim -3 offset -15
  (   4) dim  1 offset   4
  (   5) dim  1 offset   5
  (   6) dim  1 offset   6
  (   7) dim -2 offset -17
  (   8) dim -2 offset -18
  (   9) dim -2 offset -19
  (  10) dim -2 offset -20
  (  11) dim -2 offset -21
  (  12) dim -2 offset -22
  (  13) dim  1 offset   7
  (  14) dim  1 offset   8
  (  15) dim  1 offset   9
  (  16) dim  1 offset  10
  (  17) dim  1 offset  11
  (  18) dim -2 offset -23
  (  19) dim -2 offset -24
  (  20) dim -2 offset -25
  (  21) dim -2 offset -26
  (  22) dim -2 offset -27
  (  23) dim -2 offset -28
  (  24) dim -2 offset -29
Process 1:
  (   0) dim  2 offset  12
  (   1) dim  2 offset  14
  (   2) dim -3 offset  -1
  (   3) dim -3 offset  -3
  (   4) dim  1 offset  16
  (   5) dim  1 offset  17
  (   6) dim  1 offset  18
  (   7) dim  1 offset  19
  (   8) dim  1 offset  20
  (   9) dim  1 offset  21
  (  10) dim -2 offset  -5
  (  11) dim -2 offset  -6
  (  12) dim -2 offset  -7
  (  13) dim  1 offset  22
  (  14) dim  1 offset  23
  (  15) dim  1 offset  24
  (  16) dim  1 offset  25
  (  17) dim  1 offset  26
  (  18) dim  1 offset  27
  (  19) dim  1 offset  28
  (  20) dim -2 offset  -8
  (  21) dim -2 offset  -9
  (  22) dim -2 offset -10
  (  23) dim -2 offset -11
  (  24) dim -2 offset -12

where points 4-6 on proc 0 and points 4-9 on proc 1 are vertices. The altered vec is:

Vec Object: 2 MPI processes
  type: mpi
Process [0]
0. 0. 0. 0. 0. 0. 0. 0. 5. 6. 7. 0.
Process [1]
0. 0. 0. 0. 0. 0. 0. 0. 5. 6. 7. 8. 9. 10. 0. 0. 0.

However the altered values do not correspond with the offsets in the PetscSection and I'm not sure what is going wrong.
Thanks for all of the help.
Daniel
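For reference, a minimal sketch of the loop above with Matt's suggestions applied (no Section calls, the array access hoisted out of the loop, and DMPlexPointGlobalRef resolving the offset) might look like the following. The use of DMPlexGetDepthStratum for the vertex range and the skip-when-NULL handling of unowned points are assumptions based on the DMPlexPointGlobalRef manual page, not code from this thread.

    #include <petscdmplex.h>

    /* Sketch: write one value per locally owned vertex directly into the global Vec u,
       where u was created with DMCreateGlobalVector(dm) and dm carries a default section. */
    static PetscErrorCode SetVertexValues(DM dm, Vec u)
    {
      PetscScalar   *array;
      PetscInt       v, vStart, vEnd;
      PetscErrorCode ierr;

      ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);CHKERRQ(ierr); /* depth 0 = vertices */
      ierr = VecGetArray(u, &array);CHKERRQ(ierr);                       /* hoisted out of the loop */
      for (v = vStart; v < vEnd; v++) {
        PetscScalar *p = NULL;

        ierr = DMPlexPointGlobalRef(dm, v, array, &p);CHKERRQ(ierr);
        if (p) p[0] = v + 1;  /* p is NULL for points this rank does not own in the global Vec */
      }
      ierr = VecRestoreArray(u, &array);CHKERRQ(ierr);
      return 0;
    }

Compared with the original loop, the if (voff >= 0) test and the PetscSectionGetOffset call disappear because DMPlexPointGlobalRef already looks up the global offset and signals unowned points by returning NULL.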
It would seem intuitive to use the global PetscSection and VecGetValuesSection but this doesn't seem to work on global vectors. Instead I have used VecGetValues and VecSetValues, however I also have a problem with these when extracting more than one value, I initialise the output of VecGetValues as PetscScalar *values; and then call VecGetValues(Other stuff... , values). This seems to work some times and not others and I can't find any rhyme or reason to it? I guess I should write something about this. I like to think of it as a sort of decision tree. 1) Get just local values, meaning those owned by this process ? ? These can be obtained from either a local or global vector. 2) Get ghosted values, meaning those values lying on unowned points that exist in the default PetscSF ? ? These can only get obtained from a local vector 3) Get arbitrary values ? ? You must use a VecScatter or custom PetscSF to get these For 1) and 2), I think the best way to get values normally is to use ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointLocalRef.html??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointGlobalRef.html These also have Read versions, and Field version to split off a particular field in the Section. ? Does this help? ? ? Thanks, ? ? ? Matt? Finally I was wondering if there is a good reference code base on Github including DMPlex that would be helpful in viewing applications of the DMPlex functions?Thanks,Daniel Mckinnell -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From d_mckinnell at aol.co.uk Mon Aug 19 09:24:12 2019 From: d_mckinnell at aol.co.uk (Daniel Mckinnell) Date: Mon, 19 Aug 2019 14:24:12 +0000 (UTC) Subject: [petsc-users] Accessing Global Vector Data with References: <882440.847303.1566224652448.ref@mail.yahoo.com> Message-ID: <882440.847303.1566224652448@mail.yahoo.com> Sorry I'm still a bit confused about this. I changed DMPlexPointLocalRef for DMPlexPointGlobalRef as shown in my attached code but got the following error: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: ---------------------? 
Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR:?????? INSTEAD the line number of the start of the function [0]PETSC ERROR:?????? is given. [0]PETSC ERROR: [0] DMPlexPointGlobalRef line 313 /home/daniel/petsc/src/dm/impls/plex/plexpoint.c [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: ---------------------? Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR:?????? INSTEAD the line number of the start of the function [1]PETSC ERROR:?????? is given. [1]PETSC ERROR: [1] DMPlexPointGlobalRef line 313 /home/daniel/petsc/src/dm/impls/plex/plexpoint.c [1]PETSC ERROR: -------------------------------------------------------------------------- I am guessing that this is because the array is not a global one but I'm not sure how you can produce a global array from a global Vec?Thanks,Daniel -----Original Message----- From: Matthew Knepley To: Daniel Mckinnell CC: PETSc Sent: Mon, 19 Aug 2019 12:13 Subject: Re: [petsc-users] Accessing Global Vector Data with On Mon, Aug 19, 2019 at 4:46 AM Daniel Mckinnell wrote: Thank you for the help, the "if (voff >= 0)" line was to prevent errors from looping over ghost vertices but it is better to just loop from the start of the vertices to the end of the physical vertices. I am really just trying to create an example file of a method to access the associated field data of a sample vertex in the global Vec, alter the field data and then set it back into the Vec object. The easiest thing to do is use ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointGlobalRef.html ? Thanks, ? ? ? Matt? Best Regards,Daniel -----Original Message----- From: Matthew Knepley To: Daniel Mckinnell CC: PETSc Sent: Thu, 15 Aug 2019 17:19 Subject: Re: [petsc-users] Accessing Global Vector Data with On Thu, Aug 15, 2019 at 12:05 PM Daniel Mckinnell wrote: Thank you, it seemed to work nicely for the fields associated with cells but I'm having problems with the field on vertices. The code for altering the field is as follows: for (int i = vStart; i < vEnd; i++) ??? { ??????? PetscScalar *values; ??????? PetscInt vsize, voff; ??????? PetscSectionGetDof(globalsection, i, &vsize); ??????? PetscSectionGetOffset(globalsection, i, &voff); You do not need the Section calls I think.? ??????? VecGetArray(u, &values); Hoist GetArray out of v loop? ??????? if (voff >= 0) This check is not needed I think. What you are doing here is a mix of local and global.What do you actually want to?do? ??????? { ??????????? PetscScalar *p; ??????????? DMPlexPointLocalRef(*dm, i, values, &p); ??????????? p[0] = i + 1; ??????? } ??????? VecRestoreArray(u, &values); Hoist RestoreArray out of v loop. ? Thanks, ? ? Matt? The Global PetscSection is: PetscSection Object: 2 MPI processes ? type not yet set Process 0: ? (?? 0) dim? 2 offset?? 0 ? (?? 1) dim? 2 offset?? 2 ? (?? 2) dim -3 offset -13 ? (?? 3) dim -3 offset -15 ? (?? 4) dim? 1 offset?? 4 ? (?? 5) dim? 1 offset?? 5 ? (?? 6) dim? 1 offset?? 6 ? (?? 7) dim -2 offset -17 ? (?? 8) dim -2 offset -18 ? (?? 9) dim -2 offset -19 ? (? 10) dim -2 offset -20 ? (? 11) dim -2 offset -21 ? (? 12) dim -2 offset -22 ? (? 13) dim? 1 offset?? 7 ? (? 14) dim? 1 offset?? 8 ? (? 15) dim? 1 offset?? 9 ? (? 16) dim? 1 offset? 10 ? (? 17) dim? 1 offset? 11 ? (? 18) dim -2 offset -23 ? (? 19) dim -2 offset -24 ? (? 20) dim -2 offset -25 ? (? 
21) dim -2 offset -26 ? (? 22) dim -2 offset -27 ? (? 23) dim -2 offset -28 ? (? 24) dim -2 offset -29 Process 1: ? (?? 0) dim? 2 offset? 12 ? (?? 1) dim? 2 offset? 14 ? (?? 2) dim -3 offset? -1 ? (?? 3) dim -3 offset? -3 ? (?? 4) dim? 1 offset? 16 ? (?? 5) dim? 1 offset? 17 ? (?? 6) dim? 1 offset? 18 ? (?? 7) dim? 1 offset? 19 ? (?? 8) dim? 1 offset? 20 ? (?? 9) dim? 1 offset? 21 ? (? 10) dim -2 offset? -5 ? (? 11) dim -2 offset? -6 ? (? 12) dim -2 offset? -7 ? (? 13) dim? 1 offset? 22 ? (? 14) dim? 1 offset? 23 ? (? 15) dim? 1 offset? 24 ? (? 16) dim? 1 offset? 25 ? (? 17) dim? 1 offset? 26 ? (? 18) dim? 1 offset? 27 ? (? 19) dim? 1 offset? 28 ? (? 20) dim -2 offset? -8 ? (? 21) dim -2 offset? -9 ? (? 22) dim -2 offset -10 ? (? 23) dim -2 offset -11 ? (? 24) dim -2 offset -12 where points 4-6 on proc 0 and points 4-9 on proc 1 are vertices. The altered vec is: Vec Object: 2 MPI processes ? type: mpi Process [0] 0. 0. 0. 0. 0. 0. 0. 0. 5. 6. 7. 0. Process [1] 0. 0. 0. 0. 0. 0. 0. 0. 5. 6. 7. 8. 9. 10. 0. 0. 0. However the altered values do not correspond with the offsets in the PetscSection and I'm not sure what is going wrong.Thanks for all of the help.Daniel -----Original Message----- From: Matthew Knepley via petsc-users To: Daniel Mckinnell CC: PETSc Sent: Thu, 15 Aug 2019 16:11 Subject: Re: [petsc-users] Accessing Global Vector Data with On Thu, Aug 15, 2019 at 10:59 AM Daniel Mckinnell via petsc-users wrote: Hi,Attached is a way I came up with to access data in a global vector, is this the best way to do this or are there other ways? It would seem intuitive to use the global PetscSection and VecGetValuesSection but this doesn't seem to work on global vectors. Instead I have used VecGetValues and VecSetValues, however I also have a problem with these when extracting more than one value, I initialise the output of VecGetValues as PetscScalar *values; and then call VecGetValues(Other stuff... , values). This seems to work some times and not others and I can't find any rhyme or reason to it? I guess I should write something about this. I like to think of it as a sort of decision tree. 1) Get just local values, meaning those owned by this process ? ? These can be obtained from either a local or global vector. 2) Get ghosted values, meaning those values lying on unowned points that exist in the default PetscSF ? ? These can only get obtained from a local vector 3) Get arbitrary values ? ? You must use a VecScatter or custom PetscSF to get these For 1) and 2), I think the best way to get values normally is to use ??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointLocalRef.html??https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointGlobalRef.html These also have Read versions, and Field version to split off a particular field in the Section. ? Does this help? ? ? Thanks, ? ? ? Matt? Finally I was wondering if there is a good reference code base on Github including DMPlex that would be helpful in viewing applications of the DMPlex functions?Thanks,Daniel Mckinnell -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: for_petsc_forum.cpp Type: text/x-c++src Size: 2543 bytes Desc: not available URL: From knepley at gmail.com Mon Aug 19 10:41:59 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 Aug 2019 11:41:59 -0400 Subject: [petsc-users] Accessing Global Vector Data with In-Reply-To: <882440.847303.1566224652448@mail.yahoo.com> References: <882440.847303.1566224652448.ref@mail.yahoo.com> <882440.847303.1566224652448@mail.yahoo.com> Message-ID: On Mon, Aug 19, 2019 at 10:24 AM Daniel Mckinnell wrote: > Sorry I'm still a bit confused about this. I changed DMPlexPointLocalRef > for DMPlexPointGlobalRef as shown in my attached code but got the > following error: > You do not need the global section. I have rewritten your code and it works for me. Thanks, Matt > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] DMPlexPointGlobalRef line 313 > /home/daniel/petsc/src/dm/impls/plex/plexpoint.c > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] DMPlexPointGlobalRef line 313 > /home/daniel/petsc/src/dm/impls/plex/plexpoint.c > [1]PETSC ERROR: > -------------------------------------------------------------------------- > > I am guessing that this is because the array is not a global one but I'm > not sure how you can produce a global array from a global Vec? 
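
[The attached tester.c is not preserved in the archive, but a minimal sketch of the pattern Matt describes earlier in the thread - drop the Section calls, hoist VecGetArray out of the loop, and use DMPlexPointGlobalRef directly on the global Vec - might look roughly like the following. The function name is made up, the DM is assumed to already have a PetscSection attached with one dof per vertex, and vertices without global storage on this rank are skipped.]

#include <petscdmplex.h>

/* Sketch only: write a value into the vertex dof of a *global* Vec. */
static PetscErrorCode SetVertexValues(DM dm, Vec u)
{
  PetscScalar    *array;
  PetscInt        vStart, vEnd, v;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);CHKERRQ(ierr); /* depth 0 = vertices */
  ierr = VecGetArray(u, &array);CHKERRQ(ierr);                        /* hoisted out of the loop */
  for (v = vStart; v < vEnd; ++v) {
    PetscScalar *p;

    ierr = DMPlexPointGlobalRef(dm, v, array, &p);CHKERRQ(ierr);
    if (!p) continue;             /* vertices this rank does not own should come back NULL */
    p[0] = (PetscScalar)(v + 1);  /* alter the field value at this vertex */
  }
  ierr = VecRestoreArray(u, &array);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}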
> Thanks, > Daniel > > > -----Original Message----- > From: Matthew Knepley > To: Daniel Mckinnell > CC: PETSc > Sent: Mon, 19 Aug 2019 12:13 > Subject: Re: [petsc-users] Accessing Global Vector Data with > > On Mon, Aug 19, 2019 at 4:46 AM Daniel Mckinnell > wrote: > > Thank you for the help, the "if (voff >= 0)" line was to prevent errors > from looping over ghost vertices but it is better to just loop from the > start of the vertices to the end of the physical vertices. > > I am really just trying to create an example file of a method to access > the associated field data of a sample vertex in the global Vec, alter the > field data and then set it back into the Vec object. > > > The easiest thing to do is use > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointGlobalRef.html > > Thanks, > > Matt > > > Best Regards, > Daniel > > > -----Original Message----- > From: Matthew Knepley > To: Daniel Mckinnell > CC: PETSc > Sent: Thu, 15 Aug 2019 17:19 > Subject: Re: [petsc-users] Accessing Global Vector Data with > > On Thu, Aug 15, 2019 at 12:05 PM Daniel Mckinnell > wrote: > > Thank you, it seemed to work nicely for the fields associated with cells > but I'm having problems with the field on vertices. The code for altering > the field is as follows: > > for (int i = vStart; i < vEnd; i++) > { > PetscScalar *values; > PetscInt vsize, voff; > PetscSectionGetDof(globalsection, i, &vsize); > PetscSectionGetOffset(globalsection, i, &voff); > > > You do not need the Section calls I think. > > > VecGetArray(u, &values); > > > Hoist GetArray out of v loop > > > if (voff >= 0) > > > This check is not needed I think. What you are doing here is a mix of > local and global. > What do you actually want to do? > > { > PetscScalar *p; > DMPlexPointLocalRef(*dm, i, values, &p); > p[0] = i + 1; > } > VecRestoreArray(u, &values); > > > Hoist RestoreArray out of v loop. > > Thanks, > > Matt > > > > The Global PetscSection is: > > PetscSection Object: 2 MPI processes > type not yet set > Process 0: > ( 0) dim 2 offset 0 > ( 1) dim 2 offset 2 > ( 2) dim -3 offset -13 > ( 3) dim -3 offset -15 > ( 4) dim 1 offset 4 > ( 5) dim 1 offset 5 > ( 6) dim 1 offset 6 > ( 7) dim -2 offset -17 > ( 8) dim -2 offset -18 > ( 9) dim -2 offset -19 > ( 10) dim -2 offset -20 > ( 11) dim -2 offset -21 > ( 12) dim -2 offset -22 > ( 13) dim 1 offset 7 > ( 14) dim 1 offset 8 > ( 15) dim 1 offset 9 > ( 16) dim 1 offset 10 > ( 17) dim 1 offset 11 > ( 18) dim -2 offset -23 > ( 19) dim -2 offset -24 > ( 20) dim -2 offset -25 > ( 21) dim -2 offset -26 > ( 22) dim -2 offset -27 > ( 23) dim -2 offset -28 > ( 24) dim -2 offset -29 > Process 1: > ( 0) dim 2 offset 12 > ( 1) dim 2 offset 14 > ( 2) dim -3 offset -1 > ( 3) dim -3 offset -3 > ( 4) dim 1 offset 16 > ( 5) dim 1 offset 17 > ( 6) dim 1 offset 18 > ( 7) dim 1 offset 19 > ( 8) dim 1 offset 20 > ( 9) dim 1 offset 21 > ( 10) dim -2 offset -5 > ( 11) dim -2 offset -6 > ( 12) dim -2 offset -7 > ( 13) dim 1 offset 22 > ( 14) dim 1 offset 23 > ( 15) dim 1 offset 24 > ( 16) dim 1 offset 25 > ( 17) dim 1 offset 26 > ( 18) dim 1 offset 27 > ( 19) dim 1 offset 28 > ( 20) dim -2 offset -8 > ( 21) dim -2 offset -9 > ( 22) dim -2 offset -10 > ( 23) dim -2 offset -11 > ( 24) dim -2 offset -12 > > where points 4-6 on proc 0 and points 4-9 on proc 1 are vertices. > > The altered vec is: > > Vec Object: 2 MPI processes > type: mpi > Process [0] > 0. > 0. > 0. > 0. > 0. > 0. > 0. > 0. > 5. > 6. > 7. > 0. > Process [1] > 0. > 0. > 0. > 0. > 0. > 0. > 0. > 0. > 5. > 6. > 7. 
> 8. > 9. > 10. > 0. > 0. > 0. > > However the altered values do not correspond with the offsets in the > PetscSection and I'm not sure what is going wrong. > Thanks for all of the help. > Daniel > > > -----Original Message----- > From: Matthew Knepley via petsc-users > To: Daniel Mckinnell > CC: PETSc > Sent: Thu, 15 Aug 2019 16:11 > Subject: Re: [petsc-users] Accessing Global Vector Data with > > On Thu, Aug 15, 2019 at 10:59 AM Daniel Mckinnell via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > Hi, > Attached is a way I came up with to access data in a global vector, is > this the best way to do this or are there other ways? It would seem > intuitive to use the global PetscSection and VecGetValuesSection but this > doesn't seem to work on global vectors. > > Instead I have used VecGetValues and VecSetValues, however I also have a > problem with these when extracting more than one value, I initialise the > output of VecGetValues as PetscScalar *values; and then call VecGetValues(Other > stuff... , values). This seems to work some times and not others and I > can't find any rhyme or reason to it? > > > I guess I should write something about this. I like to think of it as a > sort of decision tree. > > 1) Get just local values, meaning those owned by this process > > These can be obtained from either a local or global vector. > > 2) Get ghosted values, meaning those values lying on unowned points that > exist in the default PetscSF > > These can only get obtained from a local vector > > 3) Get arbitrary values > > You must use a VecScatter or custom PetscSF to get these > > For 1) and 2), I think the best way to get values normally is to use > > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointLocalRef.html > > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMPLEX/DMPlexPointGlobalRef.html > > These also have Read versions, and Field version to split off a particular > field in the Section. > > Does this help? > > Thanks, > > Matt > > > Finally I was wondering if there is a good reference code base on Github > including DMPlex that would be helpful in viewing applications of the > DMPlex functions? > Thanks, > Daniel Mckinnell > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tester.c Type: application/octet-stream Size: 2773 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Aug 19 14:51:53 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Mon, 19 Aug 2019 19:51:53 +0000
Subject: [petsc-users] IMPORTANT PETSc repository changed from Bitbucket to GitLab
References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov>
Message-ID: <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov>

   PETSc folks.

   This announcement is for people who access PETSc from the BitBucket repository or post issues or have other activities with the BitBucket repository.

   We have changed the location of the PETSc repository from BitBucket to GitLab. For each copy of the repository you need to do

git remote set-url origin git at gitlab.com:petsc/petsc.git
or
git remote set-url origin https://gitlab.com/petsc/petsc.git

You will likely also want to set up an account on GitLab and remember to set the ssh key information.

If you previously had write permission to the petsc repository and cannot write to the new repository please email bsmith at mcs.anl.gov with your GitLab username and the email used.

Please do not make pull requests to the GitLab site yet; we will be manually processing the PRs from the BitBucket site over the next couple of days as we implement the testing.

Please be patient, this is all new to us and it may take a few days to get out all the glitches.

Thanks for your support

Barry

The reason for switching to GitLab is that it has a better testing system than BitBucket.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From balay at mcs.anl.gov  Mon Aug 19 15:47:48 2019
From: balay at mcs.anl.gov (Balay, Satish)
Date: Mon, 19 Aug 2019 20:47:48 +0000
Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bitbucket to GitLab
In-Reply-To: <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov>
References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov>
Message-ID: 

A note: The bitbucket repository is saved at https://bitbucket.org/petsc/petsc-pre-gitlab

The git part is now read-only. The other parts [issues, PRs, wiki etc] are perhaps writable - but we should avoid that.

Satish

On Mon, 19 Aug 2019, Smith, Barry F. via petsc-dev wrote:

> 
>    PETSc folks.
> 
>    This announcement is for people who access PETSc from the BitBucket repository or post issues or have other activities with the BitBucket repository.
> 
>    We have changed the location of the PETSc repository from BitBucket to GitLab. For each copy of the repository you need to do
> 
> git remote set-url origin git at gitlab.com:petsc/petsc.git
> or
> git remote set-url origin https://gitlab.com/petsc/petsc.git
> 
> You will likely also want to set up an account on GitLab and remember to set the ssh key information.
> 
> If you previously had write permission to the petsc repository and cannot write to the new repository please email bsmith at mcs.anl.gov with your GitLab username and the email used.
> 
> Please do not make pull requests to the GitLab site yet; we will be manually processing the PRs from the BitBucket site over the next couple of days as we implement the testing.
> 
> Please be patient, this is all new to us and it may take a few days to get out all the glitches.
> 
> Thanks for your support
> 
> Barry
> 
> The reason for switching to GitLab is that it has a better testing system than BitBucket.
We hope that will allow us to test and manage > pull requests more rapidly, efficiently and accurately, thus allowing us to improve and add to PETSc more quickly. > From KJiao at slb.com Mon Aug 19 15:49:28 2019 From: KJiao at slb.com (Kun Jiao) Date: Mon, 19 Aug 2019 20:49:28 +0000 Subject: [petsc-users] PetscInitialize take various time to run on different node Message-ID: Hi, I am running into a problem, different node/process take different time to run petscinitialize. The difference could be 5~10mins. I put some check before petscinitialize, and after petscinitialize. MPI_Init(argc, args); int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank ); std::cout << "I am here :"<< rank << "!!!!" << std::endl; PetscErrorCode ierr = PetscInitialize(&argc, &args, (char*)0, help); std::cout << "You am here 2:"<< rank << "!!!!" << std::endl; "I am here" message was printout for every process, but not "You are here 2" message. Very strange. Any suggestion is appreciated. Regards, Kun Schlumberger-Private -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 19 16:37:42 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Mon, 19 Aug 2019 21:37:42 +0000 Subject: [petsc-users] PetscInitialize take various time to run on different node In-Reply-To: References: Message-ID: You need to start up a case of two processes (or more) that "hang" and use a debugger to attach to it to see where it is in the code that would make it hang. Each debugger has its own syntax to attach to a running process, so you'll need to check, but it is straight forward. Barry > On Aug 19, 2019, at 3:49 PM, Kun Jiao via petsc-users wrote: > > Hi, > > I am running into a problem, different node/process take different time to run petscinitialize. The difference could be 5~10mins. > > I put some check before petscinitialize, and after petscinitialize. > > MPI_Init(argc, args); > > int rank; > MPI_Comm_rank(MPI_COMM_WORLD, &rank ); > > std::cout << "I am here :"<< rank << "!!!!" << std::endl; > > PetscErrorCode ierr = PetscInitialize(&argc, &args, (char*)0, help); > > std::cout << "You am here 2:"<< rank << "!!!!" << std::endl; > > > ?I am here? message was printout for every process, but not ?You are here 2? message. > > Very strange. > > Any suggestion is appreciated. > > Regards, > Kun > > > Schlumberger-Private From knepley at gmail.com Mon Aug 19 17:41:59 2019 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 19 Aug 2019 18:41:59 -0400 Subject: [petsc-users] PetscInitialize take various time to run on different node In-Reply-To: References: Message-ID: On Mon, Aug 19, 2019 at 4:49 PM Kun Jiao via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > > > I am running into a problem, different node/process take different time to > run petscinitialize. The difference could be 5~10mins. > This sounds like an MPI installation error. I would consult your sysadmin. Thanks, Matt > I put some check before petscinitialize, and after petscinitialize. > > > > MPI_Init(argc, args); > > > > int rank; > > MPI_Comm_rank(MPI_COMM_WORLD, &rank ); > > > > std::cout << "I am here :"<< rank << "!!!!" << std::endl; > > > > PetscErrorCode ierr = PetscInitialize(&argc, &args, (char*)0, *help*); > > > > std::cout << "You am here 2:"<< rank << "!!!!" << std::endl; > > > > > > ?I am here? message was printout for every process, but not ?You are here > 2? message. > > > > Very strange. > > > > Any suggestion is appreciated. 
> > > Regards, > > Kun > > > > Schlumberger-Private > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tangqi at msu.edu Mon Aug 19 19:51:44 2019 From: tangqi at msu.edu (Tang, Qi) Date: Tue, 20 Aug 2019 00:51:44 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: <827791B1-D492-49DE-954B-BCD6E3253360@mcs.anl.gov> References: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> , <827791B1-D492-49DE-954B-BCD6E3253360@mcs.anl.gov> Message-ID: Thanks again, Barry. I am testing more based on your suggestions. One thing I do not understand is when I use -mat_mffd_type wp -mat_mffd_err 1e-3 -ksp_type fgmres. It computes "h" of JFNK twice sequentially in each KSP iteration. For instance, the output in info looks like ... [0] MatMult_MFFD(): Current differencing parameter: 1.953103182078e-09 [0] MatMult_MFFD(): Current differencing parameter: 6.054449182838e-01 ... I found it called MatMult_MFFD twice in a row. Do you happen to know why petsc calls MatMult_MFFD twice when using fgmre? I also do not understand the relationship between these two h. They are very off (because apparently ||a|| is very different). On the other hand, h is only computed once when I use gmres. The h there is clearly scaled with mat_mffd_err. That is easy to understand. However, I think I need to use fgmres because my preconditioner is not quite linear. BTW, my snes view is: SNES Object: () 16 MPI processes type: newtonls maximum iterations=20, maximum function evaluations=10000 tolerances: relative=0.0001, absolute=0., solution=0. total number of linear solver iterations=70 total number of function evaluations=149 norm schedule ALWAYS Eisenstat-Walker computation of KSP relative tolerance (version 3) rtol_0=0.3, rtol_max=0.9, threshold=0.1 gamma=0.9, alpha=1.5, alpha2=1.5 SNESLineSearch Object: () 16 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: () 16 MPI processes type: fgmres restart=50, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=0.0564799, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: () 16 MPI processes type: shell JFNK preconditioner No information available on the mfem::Solver Number of preconditioners created by the factory 8 linear system matrix followed by preconditioner matrix: Mat Object: 16 MPI processes type: mffd rows=787968, cols=787968 Matrix-free approximation: err=0.001 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: 16 MPI processes type: nest rows=787968, cols=787968 Matrix object: type=nest, rows=3, cols=3 MatNest structure: (0,0) : type=mpiaij, rows=262656, cols=262656 (0,1) : type=mpiaij, rows=262656, cols=262656 (0,2) : NULL (1,0) : type=mpiaij, rows=262656, cols=262656 (1,1) : type=mpiaij, rows=262656, cols=262656 (1,2) : NULL (2,0) : type=mpiaij, rows=262656, cols=262656 (2,1) : NULL (2,2) : type=mpiaij, rows=262656, cols=262656 ________________________________ From: Smith, Barry F. 
Sent: Thursday, August 15, 2019 3:30 AM To: Tang, Qi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > On Aug 15, 2019, at 12:36 AM, Tang, Qi wrote: > > Thanks, it works. snes_mf_jorge works for me. Great. > It appears to compute h in every ksp. Each matrix vector product or each KSPSolve()? From the code looks like each matrix-vector product. > > Without -snes_mf_jorge, it is not working. For some reason, it only computes h once, but that h is bad. My gmres residual is not decaying. > > Indeed, the noise in my function becomes larger when I refine the mesh. I think it makes sense as I use the same time step for different meshes (that is the goal of the preconditioning). However, even when the algorithm is working, sqrt(noise) is much less than the good mat_mffd_err I previously found (10^-6 vs 10^-3). I do not understand why. I have no explanation. The details of the code that computes err are difficult to trace exactly; would take a while. Perhaps there is some parameter in there that is too "conservative"? > > Although snes_mf_jorge is working, it is very expensive, as it has to evaluate F many times when estimating h. Unfortunately, to achieve the nonlinearity, I have to assemble some operators inside my F. There seem no easy solutions. > > I will try to compute h multiple times without using snes_mf_jorge. But let me know if you have other suggestions. Thanks! Yes, it seems to me computing the err less often would be a way to make the code go faster. I looked at the code more closely and noticed a couple of things. if (ctx->jorge) { ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); /* Use the Brown/Saad method to compute h */ } else { /* Compute error if desired */ ierr = SNESGetIterationNumber(snes,&iter);CHKERRQ(ierr); if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { /* Use Jorge's method to compute noise */ ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); ctx->error_rel = PetscSqrtReal(noise); ierr = PetscInfo3(snes,"Using Jorge's noise: noise=%g, sqrt(noise)=%g, h_more=%g\n",(double)noise,(double)ctx->error_rel,(double)h);CHKERRQ(ierr); ctx->compute_err_iter = iter; ctx->need_err = PETSC_FALSE; } So if jorge is set it uses the jorge algorithm for each matrix multiple to compute a new err and h. If jorge is not set but -snes_mf_compute_err is set then it computes a new err and h depending on the parameters (you can run with -info and grep for noise to see the PetscInfo lines printed (maybe add iter to the output to see how often it is recomputed) if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { so the compute_err_freq determines when the err and h are recomputed. The logic is a bit strange so I cannot decipher exactly how often it is recomputing. I'm guess at the first Newton iteration and then some number of iterations later. You could try to rig it so that it is at every new Newton step (this would mean when computing f(x + h a) - f(x) for each new x it will recompute I think). A more "research type" approach to try to reduce the work to a reasonable level would be to keep the same err until "things start to go bad" and then recompute it. But how to measure "when things start to go bad?" 
It is related to GMRES stagnating, so you could for example track the convergence of GMRES and if it is less than expected, kill the linear solve, reset the computation of err and then start the KSP at the same point as before. But this could be terribly expensive since all the steps in the most recent KSP are lost. Another possibility is to assume that the err is valid in a certain size ball around the current x. When x + ha or the new x is outside that ball then recompute the err. But how to choose the ball size and is there a way to adjust the ball size depending on how the computations proceed. For example if everything is going great you could slowly increase the size of the ball and hope for the best; but how to detect if the ball got too big (as above but terribly expensive)? Track the size of the err each time it is computed. If it stays about the same for a small number of times just freeze it for a while at that value? But again how to determine when it is no longer good? Just a few wild thoughts, I really have no solid ideas on how to reduce the work requirements. Barry > > Qi > > > From: Smith, Barry F. > Sent: Wednesday, August 14, 2019 10:57 PM > To: Tang, Qi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > On Aug 14, 2019, at 9:41 PM, Tang, Qi wrote: > > > > Thanks for the help, Barry. I tired both ds and wp, and again it depends on if I could find the correct parameter set. It is getting harder as I refine the mesh. > > > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private or SNESDiffParameterCompute_More in mfem. But it looks like these functions are not in linked in petscsnes.h. How could I call them? > > They may not be listed in petscsnes.h but I think they should be in the library. You can just stick the prototypes for the functions anywhere you need them for now. 
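
[For instance, declarations along the following lines could be dropped into the application source. The argument lists are only inferred from the calls quoted later in this thread, so treat them as a guess to be checked against the PETSc source (src/snes/interface/noise), not as the official prototypes.]

/* Guessed prototypes, inferred from the calls
   SNESDefaultMatrixFreeCreate2(snes, snes->vec_func, &J) and
   SNESDiffParameterCompute_More(snes, ctx->data, U, a, &noise, &h)
   quoted elsewhere in this thread; verify against the PETSc source. */
PETSC_EXTERN PetscErrorCode SNESDefaultMatrixFreeCreate2(SNES, Vec, Mat *);
PETSC_EXTERN PetscErrorCode SNESDiffParameterCompute_More(SNES, void *, Vec, Vec, double *, double *);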
> > > You should be able to use > > ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr); > > then this info is passed with > > if (snes->mf) { > ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, snes->mf_version);CHKERRQ(ierr); > } > > this routine has > > if (version == 1) { > ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr); > ierr = MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr); > ierr = MatSetFromOptions(J);CHKERRQ(ierr); > } else if (version == 2) { > if (!snes->vec_func) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be called first"); > #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16) > ierr = SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr); > #else > > and this routine has > > ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr); > ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); > ierr = VecGetSize(x,&n);CHKERRQ(ierr); > ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr); > ierr = MatCreate(comm,J);CHKERRQ(ierr); > ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr); > ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr); > ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr); > ierr = MatShellSetOperation(*J,MATOP_MULT,(void (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr); > ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr); > ierr = MatShellSetOperation(*J,MATOP_VIEW,(void (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr); > ierr = MatSetUp(*J);CHKERRQ(ierr); > > > > > > > > > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside SNESSetJacobian would be easier to try? > > If you use the flag -snes_mf_noise_file filename when it runs it will save all the noise information it computes along the way to that file (yes it is crude and doesn't match the PETSc Viewer/monitor style but it should work). > > Thus I think you can use it and get all the possible monitoring information without actually writing any code. Just > > -snes_mf_version 2 > -snes_mf_noise_file filename > > > > Barry > > > > > Thanks again, > > Qi > > > > From: Smith, Barry F. > > Sent: Tuesday, August 13, 2019 9:07 PM > > To: Tang, Qi > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > > > > > ?Hi, > > > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. > > > > > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. 
For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. > > > > > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? > > > > First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. > > > > > > There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). > > > > We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. > > > > The code that computes and reports the noise is in the directory src/snes/interface/noise > > > > You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) > > > > The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. > > > > I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. > > > > I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. > > > > Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. > > > > Barry > > > > > > > > > > > > > > > > Is there anything else I could possibly tune in this context? > > > > > > The discretization is through mfem and I use standard H1 for my problem. > > > > > > Thanks, > > > Qi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Mon Aug 19 21:04:27 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 20 Aug 2019 02:04:27 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: References: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> <827791B1-D492-49DE-954B-BCD6E3253360@mcs.anl.gov> Message-ID: If I run a "standard" example, such as snes/examples/tutorials/ex19 it does not have this "double" business with fgmres/gmres I think what you are seeing is related to where and how the shell preconditioner is working. By default GMRES uses left preconditioning (you can change it to right with -ksp_pc_side right) while FMGRES has to use right preconditioner (it makes no mathematical sense with left preconditioning). My first guess is if you use GMRES with -ksp_pc_side right you will see the double "business". Right preconditioned GMRES/FGMRES solves A B y= b so an application of the operator is A B but then once y is compute it needs to compute x = B y which is another application of your preconditioner, hence requires another matrix free operation. The h is different for the two multiplies because the a is different. I don't see that this would happen for every KSP iteration, I think there will be one "extra" for each KSP solve. I you truly see two MatMult_MFFD() for each GMRES/FGMRES I would run in the debugger, put a break point in MatMult_MFFD() to see exactly where each one is called. From the manual page MATMFFD_WP - Implements an alternative approach for computing the differencing parameter h used with the finite difference based matrix-free Jacobian. This code implements the strategy of M. Pernice and H. Walker: h = error_rel * sqrt(1 + ||U||) / ||a|| Notes: 1) || U || does not change between linear iterations so is reused 2) In GMRES || a || == 1 and so does not need to ever be computed except at restart when it is recomputed. > On Aug 19, 2019, at 7:51 PM, Tang, Qi wrote: > > Thanks again, Barry. I am testing more based on your suggestions. > > One thing I do not understand is when I use -mat_mffd_type wp -mat_mffd_err 1e-3 -ksp_type fgmres. It computes "h" of JFNK twice sequentially in each KSP iteration. For instance, the output in info looks like > ... > [0] MatMult_MFFD(): Current differencing parameter: 1.953103182078e-09 > [0] MatMult_MFFD(): Current differencing parameter: 6.054449182838e-01 > ... > I found it called MatMult_MFFD twice in a row. Do you happen to know why petsc calls MatMult_MFFD twice when using fgmre? I also do not understand the relationship between these two h. They are very off (because apparently ||a|| is very different). > > On the other hand, h is only computed once when I use gmres. The h there is clearly scaled with mat_mffd_err. That is easy to understand. However, I think I need to use fgmres because my preconditioner is not quite linear. > > BTW, my snes view is: > > SNES Object: () 16 MPI processes > type: newtonls > maximum iterations=20, maximum function evaluations=10000 > tolerances: relative=0.0001, absolute=0., solution=0. 
> total number of linear solver iterations=70 > total number of function evaluations=149 > norm schedule ALWAYS > Eisenstat-Walker computation of KSP relative tolerance (version 3) > rtol_0=0.3, rtol_max=0.9, threshold=0.1 > gamma=0.9, alpha=1.5, alpha2=1.5 > SNESLineSearch Object: () 16 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: () 16 MPI processes > type: fgmres > restart=50, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.0564799, absolute=1e-50, divergence=10000. > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: () 16 MPI processes > type: shell > JFNK preconditioner > No information available on the mfem::Solver > Number of preconditioners created by the factory 8 > linear system matrix followed by preconditioner matrix: > Mat Object: 16 MPI processes > type: mffd > rows=787968, cols=787968 > Matrix-free approximation: > err=0.001 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 16 MPI processes > type: nest > rows=787968, cols=787968 > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=262656, cols=262656 > (0,1) : type=mpiaij, rows=262656, cols=262656 > (0,2) : NULL > (1,0) : type=mpiaij, rows=262656, cols=262656 > (1,1) : type=mpiaij, rows=262656, cols=262656 > (1,2) : NULL > (2,0) : type=mpiaij, rows=262656, cols=262656 > (2,1) : NULL > (2,2) : type=mpiaij, rows=262656, cols=262656 > From: Smith, Barry F. > Sent: Thursday, August 15, 2019 3:30 AM > To: Tang, Qi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > On Aug 15, 2019, at 12:36 AM, Tang, Qi wrote: > > > > Thanks, it works. snes_mf_jorge works for me. > > Great. > > > It appears to compute h in every ksp. > > Each matrix vector product or each KSPSolve()? From the code looks like each matrix-vector product. > > > > > Without -snes_mf_jorge, it is not working. For some reason, it only computes h once, but that h is bad. My gmres residual is not decaying. > > > > > Indeed, the noise in my function becomes larger when I refine the mesh. I think it makes sense as I use the same time step for different meshes (that is the goal of the preconditioning). However, even when the algorithm is working, sqrt(noise) is much less than the good mat_mffd_err I previously found (10^-6 vs 10^-3). I do not understand why. > > I have no explanation. The details of the code that computes err are difficult to trace exactly; would take a while. Perhaps there is some parameter in there that is too "conservative"? > > > > > Although snes_mf_jorge is working, it is very expensive, as it has to evaluate F many times when estimating h. Unfortunately, to achieve the nonlinearity, I have to assemble some operators inside my F. There seem no easy solutions. > > > > I will try to compute h multiple times without using snes_mf_jorge. But let me know if you have other suggestions. Thanks! > > Yes, it seems to me computing the err less often would be a way to make the code go faster. I looked at the code more closely and noticed a couple of things. 
> > if (ctx->jorge) { > ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); > > /* Use the Brown/Saad method to compute h */ > } else { > /* Compute error if desired */ > ierr = SNESGetIterationNumber(snes,&iter);CHKERRQ(ierr); > if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { > /* Use Jorge's method to compute noise */ > ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); > > ctx->error_rel = PetscSqrtReal(noise); > > ierr = PetscInfo3(snes,"Using Jorge's noise: noise=%g, sqrt(noise)=%g, h_more=%g\n",(double)noise,(double)ctx->error_rel,(double)h);CHKERRQ(ierr); > > ctx->compute_err_iter = iter; > ctx->need_err = PETSC_FALSE; > } > > So if jorge is set it uses the jorge algorithm for each matrix multiple to compute a new err and h. If jorge is not set but -snes_mf_compute_err is set then it computes a new err and h depending on the parameters (you can run with -info and grep for noise to see the PetscInfo lines printed (maybe add iter to the output to see how often it is recomputed) > > if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { > > so the compute_err_freq determines when the err and h are recomputed. The logic is a bit strange so I cannot decipher exactly how often it is recomputing. I'm guess at the first Newton iteration and then some number of iterations later. > > You could try to rig it so that it is at every new Newton step (this would mean when computing f(x + h a) - f(x) for each new x it will recompute I think). > > > A more "research type" approach to try to reduce the work to a reasonable level would be to keep the same err until "things start to go bad" and then recompute it. But how to measure "when things start to go bad?" It is related to GMRES stagnating, so you could for example track the convergence of GMRES and if it is less than expected, kill the linear solve, reset the computation of err and then start the KSP at the same point as before. But this could be terribly expensive since all the steps in the most recent KSP are lost. > > Another possibility is to assume that the err is valid in a certain size ball around the current x. When x + ha or the new x is outside that ball then recompute the err. But how to choose the ball size and is there a way to adjust the ball size depending on how the computations proceed. For example if everything is going great you could slowly increase the size of the ball and hope for the best; but how to detect if the ball got too big (as above but terribly expensive)? > > Track the size of the err each time it is computed. If it stays about the same for a small number of times just freeze it for a while at that value? But again how to determine when it is no longer good? > > Just a few wild thoughts, I really have no solid ideas on how to reduce the work requirements. > > Barry > > > > > > > > > > Qi > > > > > > From: Smith, Barry F. > > Sent: Wednesday, August 14, 2019 10:57 PM > > To: Tang, Qi > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > On Aug 14, 2019, at 9:41 PM, Tang, Qi wrote: > > > > > > Thanks for the help, Barry. I tired both ds and wp, and again it depends on if I could find the correct parameter set. It is getting harder as I refine the mesh. 
> > > > > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private or SNESDiffParameterCompute_More in mfem. But it looks like these functions are not in linked in petscsnes.h. How could I call them? > > > > They may not be listed in petscsnes.h but I think they should be in the library. You can just stick the prototypes for the functions anywhere you need them for now. > > > > > > You should be able to use > > > > ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr); > > > > then this info is passed with > > > > if (snes->mf) { > > ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, snes->mf_version);CHKERRQ(ierr); > > } > > > > this routine has > > > > if (version == 1) { > > ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr); > > ierr = MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr); > > ierr = MatSetFromOptions(J);CHKERRQ(ierr); > > } else if (version == 2) { > > if (!snes->vec_func) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be called first"); > > #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16) > > ierr = SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr); > > #else > > > > and this routine has > > > > ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr); > > ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); > > ierr = VecGetSize(x,&n);CHKERRQ(ierr); > > ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr); > > ierr = MatCreate(comm,J);CHKERRQ(ierr); > > ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr); > > ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr); > > ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_MULT,(void (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_VIEW,(void (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr); > > ierr = MatSetUp(*J);CHKERRQ(ierr); > > > > > > > > > > > > > > > > > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside SNESSetJacobian would be easier to try? > > > > If you use the flag -snes_mf_noise_file filename when it runs it will save all the noise information it computes along the way to that file (yes it is crude and doesn't match the PETSc Viewer/monitor style but it should work). > > > > Thus I think you can use it and get all the possible monitoring information without actually writing any code. Just > > > > -snes_mf_version 2 > > -snes_mf_noise_file filename > > > > > > > > Barry > > > > > > > > Thanks again, > > > Qi > > > > > > From: Smith, Barry F. > > > Sent: Tuesday, August 13, 2019 9:07 PM > > > To: Tang, Qi > > > Cc: petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > > > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > > > > > > > ?Hi, > > > > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. 
> > > > > > > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. > > > > > > > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? > > > > > > First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. > > > > > > > > > There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). > > > > > > We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. > > > > > > The code that computes and reports the noise is in the directory src/snes/interface/noise > > > > > > You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) > > > > > > The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. > > > > > > I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. > > > > > > I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. 
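[As a concrete illustration of the err discussion above (not code from this thread): if roughly the last four digits of F(u) are noise, the relative function error is about 1e-12 and err should be sqrt(1e-12) = 1e-6, two digits larger than the default sqrt(machine epsilon) of about 1e-8. A minimal sketch of wiring such a hand-chosen err into the standard version-1 matrix-free operator, assuming snes and a preconditioning matrix P already exist:

  Mat       J;
  PetscReal fnoise = 1.0e-12;   /* assumed relative noise in the function evaluation */

  ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr);
  ierr = MatMFFDSetType(J,MATMFFD_WP);CHKERRQ(ierr);                      /* or MATMFFD_DS */
  ierr = MatMFFDSetFunctionError(J,PetscSqrtReal(fnoise));CHKERRQ(ierr);  /* err = 1e-6 here, same role as -mat_mffd_err */
  ierr = SNESSetJacobian(snes,J,P,MatMFFDComputeJacobian,NULL);CHKERRQ(ierr);

The same value can of course be passed on the command line with -mat_mffd_type wp -mat_mffd_err 1e-6; the point is only that err is the square root of the estimated relative function error, not the error itself.]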
> > > > > > Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > Is there anything else I could possibly tune in this context? > > > > > > > > The discretization is through mfem and I use standard H1 for my problem. > > > > > > > > Thanks, > > > > Qi From zakaryah at gmail.com Tue Aug 20 00:42:09 2019 From: zakaryah at gmail.com (zakaryah .) Date: Tue, 20 Aug 2019 01:42:09 -0400 Subject: [petsc-users] Jacobian for DMComposite, containing DMRedundant Message-ID: I'm working on solving a nonlinear system with a DMComposite, which is composed of a DMRedundant and a DMDA. The DMRedundant has a single dof, while the DMDA can be large (n grid points, where n is a few million). The Jacobian is then (n+1)x(n+1). I set it by calling MatGetLocalSubMatrix 4 times, to get the submatrices J11 (1 x 1), J12 (1 x n), J21 (n x 1), and J22 (n x n). In the past, setting the values for these matrices seemed to work fine - the Jacobian was correct, the code was valgrind quiet, etc. Since then, something has changed - I went from v. 3.11.0 to 3.11.3, my sysadmin may have fiddled with MPI, I'm experimenting with different problem sizes and number of processors, etc. Now I am getting segfaults, and the valgrind trace has not been terribly illuminating. Both valgrind and the debugger indicate the segfault originating in PetscSortIntWithArrayPair_Private, with the back trace showing nested calls to this function running *thousands* deep... I'm not sure but I suspect that the trace to PetscSortIntWithArrayPair_Private originates in MatAssemblyBegin. I've created a working example which is minimal for my purposes (although probably not fun for anyone else to inspect), and I've narrowed the issue down to the code which sets the values of J12, the submatrix with a single row (the DMRedundant dof) and many columns (n). If I don't set any values in that submatrix, and change the function so that this Jacobian is correct, then the code goes back to being valgrind quiet, completing correctly (for the modified problem), and not throwing a segfault. If I run in serial, it also works. Searching the archive, I found an old thread about segfaults in MatAssemblyBegin when a very large number of values were set between flushes, and Matt said that even though there should not be a segfault, it is a good idea in practice to frequently flush the assembly within a loop. I guess my question is whether you think the segfault I am getting is due to setting too many matrix values, and how I should go about flushing the assembly. Since I add values to J12 using MatSetValuesLocal, do I need to call MatRestoreLocalSubmatrix to restore J12, MatAssemblyBegin and End with MAT_FLUSH_ASSEMBLY on the full Jacobian, then call MatGetLocalSubmatrix to get J12 again, all inside the loop? One last point - when I call MatSetValuesLocal on J12, I do that on all processors, with each processor accessing only the columns which correspond to the grid points it owns. Is there anything wrong with doing that? Do I need to only set values on the processor which owns the DMRedundant? If I call MatGetOwnershipRange on J12, it appears that all processes own the matrix's single row. Thanks for any help you can share. 
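[For reference, one way to arrange the flushing asked about above is to loop over globally agreed chunks of the DMDA column space, so that MatAssemblyBegin/End, which are collective, are called the same number of times on every process. The sketch below assumes J is the full DMComposite Jacobian, the index sets is[] come from DMCompositeGetLocalISs() with the DMRedundant added to the composite first, and N is the global number of DMDA points; the MatSetValuesLocal calls themselves are left schematic.

  Mat      J12;
  PetscInt chunk = 100000, start;   /* chunk size is arbitrary */

  for (start = 0; start < N; start += chunk) {
    ierr = MatGetLocalSubMatrix(J,is[0],is[1],&J12);CHKERRQ(ierr);
    /* ... MatSetValuesLocal(J12,...) for the locally owned columns whose
       global index falls in [start, start+chunk), exactly as before ... */
    ierr = MatRestoreLocalSubMatrix(J,is[0],is[1],&J12);CHKERRQ(ierr);
    ierr = MatAssemblyBegin(J,MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(J,MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

Whether this is needed at all depends on the answer below; if the root cause is the stash communication itself, periodic flushing only works around it.]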
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 20 01:07:48 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 20 Aug 2019 06:07:48 +0000 Subject: [petsc-users] Jacobian for DMComposite, containing DMRedundant In-Reply-To: References: Message-ID: <0831A110-9967-4E31-8965-62EA48B1723C@anl.gov> Before the latest release we changed the way we handled communicating matrix entries that needed to be sent between processors; normally that way would be much faster but another user reported a pathologically case where the PetscSortIntWithArrayPair() was producing a huge number of nested calls. That user actually provided a fix (a better sort) that resolved his problem. You can either run with -matstash_legacy (may need a prefix on the this your matrix has a prefix) to get the old behavior or You can switch to the master branch in the PETSc repository version to use it. Note we have switched to Gitlab for our repository. You can get the master branch with git clone https://gitlab.com/petsc/petsc Please let us know if this does not resolve your problem Barry > On Aug 20, 2019, at 12:42 AM, zakaryah . via petsc-users wrote: > > I'm working on solving a nonlinear system with a DMComposite, which is composed of a DMRedundant and a DMDA. The DMRedundant has a single dof, while the DMDA can be large (n grid points, where n is a few million). The Jacobian is then (n+1)x(n+1). I set it by calling MatGetLocalSubMatrix 4 times, to get the submatrices J11 (1 x 1), J12 (1 x n), J21 (n x 1), and J22 (n x n). In the past, setting the values for these matrices seemed to work fine - the Jacobian was correct, the code was valgrind quiet, etc. Since then, something has changed - I went from v. 3.11.0 to 3.11.3, my sysadmin may have fiddled with MPI, I'm experimenting with different problem sizes and number of processors, etc. > > Now I am getting segfaults, and the valgrind trace has not been terribly illuminating. Both valgrind and the debugger indicate the segfault originating in PetscSortIntWithArrayPair_Private, with the back trace showing nested calls to this function running *thousands* deep... I'm not sure but I suspect that the trace to PetscSortIntWithArrayPair_Private originates in MatAssemblyBegin. > > I've created a working example which is minimal for my purposes (although probably not fun for anyone else to inspect), and I've narrowed the issue down to the code which sets the values of J12, the submatrix with a single row (the DMRedundant dof) and many columns (n). If I don't set any values in that submatrix, and change the function so that this Jacobian is correct, then the code goes back to being valgrind quiet, completing correctly (for the modified problem), and not throwing a segfault. If I run in serial, it also works. > > Searching the archive, I found an old thread about segfaults in MatAssemblyBegin when a very large number of values were set between flushes, and Matt said that even though there should not be a segfault, it is a good idea in practice to frequently flush the assembly within a loop. > > I guess my question is whether you think the segfault I am getting is due to setting too many matrix values, and how I should go about flushing the assembly. Since I add values to J12 using MatSetValuesLocal, do I need to call MatRestoreLocalSubmatrix to restore J12, MatAssemblyBegin and End with MAT_FLUSH_ASSEMBLY on the full Jacobian, then call MatGetLocalSubmatrix to get J12 again, all inside the loop? 
> > One last point - when I call MatSetValuesLocal on J12, I do that on all processors, with each processor accessing only the columns which correspond to the grid points it owns. Is there anything wrong with doing that? Do I need to only set values on the processor which owns the DMRedundant? If I call MatGetOwnershipRange on J12, it appears that all processes own the matrix's single row. > > Thanks for any help you can share. > > > From dalcinl at gmail.com Tue Aug 20 03:51:21 2019 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 20 Aug 2019 11:51:21 +0300 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bucketbit to GitLab In-Reply-To: References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> Message-ID: Dear Satish, are you planning to migrate petsc4py? On Mon, 19 Aug 2019 at 23:47, Balay, Satish via petsc-dev < petsc-dev at mcs.anl.gov> wrote: > A note: > > The bitbucket repository is saved at > https://bitbucket.org/petsc/petsc-pre-gitlab > > The git part is now read-only. The other parts [issues, PRs, wiki etc] are > perhaps writable - but we should avoid that. > > Satish > > On Mon, 19 Aug 2019, Smith, Barry F. via petsc-dev wrote: > > > > > > > PETSc folks. > > > > This announcement is for people who access PETSc from the BitBucket > repository or post issues or have other activities with the Bitbucket > repository > > > > We have changed the location of the PETSc repository from BitBucket > to GitLab. For each copy of the repository you need to do > > > > git remote set-url origin git at gitlab.com >:petsc/petsc.git > > or > > git remote set-url origin https://gitlab.com/petsc/petsc.git > > > > You will likely also want to set up an account on gitlab and remember > to set the ssh key information > > > > if you previously had write permission to the petsc repository and > cannot write to the new repository please email bsmith at mcs.anl.gov bsmith at mcs.anl.gov> with your GitLab > > username and the email used > > > > Please do not make pull requests to the Gitlab site yet; we will be > manually processing the PR from the BitBucket site over the next couple of > > days as we implement the testing. > > > > Please be patient, this is all new to use and it may take a few days > to get out all the glitches. > > > > Thanks for your support > > > > Barry > > > > The reason for switching to GitLab is that it has a better testing > system than BitBucket and Gitlab. We hope that will allow us to test and > manage > > pull requests more rapidly, efficiently and accurately, thus allowing > us to improve and add to PETSc more quickly. > > > > -- Lisandro Dalcin ============ Research Scientist Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From zakaryah at gmail.com Tue Aug 20 10:55:42 2019 From: zakaryah at gmail.com (zakaryah .) Date: Tue, 20 Aug 2019 11:55:42 -0400 Subject: [petsc-users] Jacobian for DMComposite, containing DMRedundant In-Reply-To: <0831A110-9967-4E31-8965-62EA48B1723C@anl.gov> References: <0831A110-9967-4E31-8965-62EA48B1723C@anl.gov> Message-ID: Thanks Barry - adding -matstash_legacy seemed to fix everything. Does it look like I also stumbled onto the pathological case? On Tue, Aug 20, 2019 at 2:07 AM Smith, Barry F. 
wrote: > > Before the latest release we changed the way we handled communicating > matrix entries that needed to be sent between processors; normally that way > would be much faster but another user reported a pathologically case where > the PetscSortIntWithArrayPair() was producing a huge number of nested > calls. That user actually provided a fix (a better sort) that resolved his > problem. > > You can either run with -matstash_legacy (may need a prefix on the this > your matrix has a prefix) to get the old behavior or > > You can switch to the master branch in the PETSc repository version to > use it. Note we have switched to Gitlab for our repository. You can get > the master branch with git clone https://gitlab.com/petsc/petsc > > Please let us know if this does not resolve your problem > > Barry > > > > > > > On Aug 20, 2019, at 12:42 AM, zakaryah . via petsc-users < > petsc-users at mcs.anl.gov> wrote: > > > > I'm working on solving a nonlinear system with a DMComposite, which is > composed of a DMRedundant and a DMDA. The DMRedundant has a single dof, > while the DMDA can be large (n grid points, where n is a few million). The > Jacobian is then (n+1)x(n+1). I set it by calling MatGetLocalSubMatrix 4 > times, to get the submatrices J11 (1 x 1), J12 (1 x n), J21 (n x 1), and > J22 (n x n). In the past, setting the values for these matrices seemed to > work fine - the Jacobian was correct, the code was valgrind quiet, etc. > Since then, something has changed - I went from v. 3.11.0 to 3.11.3, my > sysadmin may have fiddled with MPI, I'm experimenting with different > problem sizes and number of processors, etc. > > > > Now I am getting segfaults, and the valgrind trace has not been terribly > illuminating. Both valgrind and the debugger indicate the segfault > originating in PetscSortIntWithArrayPair_Private, with the back trace > showing nested calls to this function running *thousands* deep... I'm not > sure but I suspect that the trace to PetscSortIntWithArrayPair_Private > originates in MatAssemblyBegin. > > > > I've created a working example which is minimal for my purposes > (although probably not fun for anyone else to inspect), and I've narrowed > the issue down to the code which sets the values of J12, the submatrix with > a single row (the DMRedundant dof) and many columns (n). If I don't set > any values in that submatrix, and change the function so that this Jacobian > is correct, then the code goes back to being valgrind quiet, completing > correctly (for the modified problem), and not throwing a segfault. If I > run in serial, it also works. > > > > Searching the archive, I found an old thread about segfaults in > MatAssemblyBegin when a very large number of values were set between > flushes, and Matt said that even though there should not be a segfault, it > is a good idea in practice to frequently flush the assembly within a loop. > > > > I guess my question is whether you think the segfault I am getting is > due to setting too many matrix values, and how I should go about flushing > the assembly. Since I add values to J12 using MatSetValuesLocal, do I need > to call MatRestoreLocalSubmatrix to restore J12, MatAssemblyBegin and End > with MAT_FLUSH_ASSEMBLY on the full Jacobian, then call > MatGetLocalSubmatrix to get J12 again, all inside the loop? > > > > One last point - when I call MatSetValuesLocal on J12, I do that on all > processors, with each processor accessing only the columns which correspond > to the grid points it owns. 
Is there anything wrong with doing that? Do I > need to only set values on the processor which owns the DMRedundant? If I > call MatGetOwnershipRange on J12, it appears that all processes own the > matrix's single row. > > > > Thanks for any help you can share. > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 20 14:13:10 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Tue, 20 Aug 2019 19:13:10 +0000 Subject: [petsc-users] Jacobian for DMComposite, containing DMRedundant In-Reply-To: References: <0831A110-9967-4E31-8965-62EA48B1723C@anl.gov> Message-ID: > On Aug 20, 2019, at 10:55 AM, zakaryah . via petsc-users wrote: > > Thanks Barry - adding -matstash_legacy seemed to fix everything. Does it look like I also stumbled onto the pathological case? Yes, > > On Tue, Aug 20, 2019 at 2:07 AM Smith, Barry F. wrote: > > Before the latest release we changed the way we handled communicating matrix entries that needed to be sent between processors; normally that way would be much faster but another user reported a pathologically case where the PetscSortIntWithArrayPair() was producing a huge number of nested calls. That user actually provided a fix (a better sort) that resolved his problem. > > You can either run with -matstash_legacy (may need a prefix on the this your matrix has a prefix) to get the old behavior or > > You can switch to the master branch in the PETSc repository version to use it. Note we have switched to Gitlab for our repository. You can get the master branch with git clone https://gitlab.com/petsc/petsc > > Please let us know if this does not resolve your problem > > Barry > > > > > > > On Aug 20, 2019, at 12:42 AM, zakaryah . via petsc-users wrote: > > > > I'm working on solving a nonlinear system with a DMComposite, which is composed of a DMRedundant and a DMDA. The DMRedundant has a single dof, while the DMDA can be large (n grid points, where n is a few million). The Jacobian is then (n+1)x(n+1). I set it by calling MatGetLocalSubMatrix 4 times, to get the submatrices J11 (1 x 1), J12 (1 x n), J21 (n x 1), and J22 (n x n). In the past, setting the values for these matrices seemed to work fine - the Jacobian was correct, the code was valgrind quiet, etc. Since then, something has changed - I went from v. 3.11.0 to 3.11.3, my sysadmin may have fiddled with MPI, I'm experimenting with different problem sizes and number of processors, etc. > > > > Now I am getting segfaults, and the valgrind trace has not been terribly illuminating. Both valgrind and the debugger indicate the segfault originating in PetscSortIntWithArrayPair_Private, with the back trace showing nested calls to this function running *thousands* deep... I'm not sure but I suspect that the trace to PetscSortIntWithArrayPair_Private originates in MatAssemblyBegin. > > > > I've created a working example which is minimal for my purposes (although probably not fun for anyone else to inspect), and I've narrowed the issue down to the code which sets the values of J12, the submatrix with a single row (the DMRedundant dof) and many columns (n). If I don't set any values in that submatrix, and change the function so that this Jacobian is correct, then the code goes back to being valgrind quiet, completing correctly (for the modified problem), and not throwing a segfault. If I run in serial, it also works. 
> > > > Searching the archive, I found an old thread about segfaults in MatAssemblyBegin when a very large number of values were set between flushes, and Matt said that even though there should not be a segfault, it is a good idea in practice to frequently flush the assembly within a loop. > > > > I guess my question is whether you think the segfault I am getting is due to setting too many matrix values, and how I should go about flushing the assembly. Since I add values to J12 using MatSetValuesLocal, do I need to call MatRestoreLocalSubmatrix to restore J12, MatAssemblyBegin and End with MAT_FLUSH_ASSEMBLY on the full Jacobian, then call MatGetLocalSubmatrix to get J12 again, all inside the loop? > > > > One last point - when I call MatSetValuesLocal on J12, I do that on all processors, with each processor accessing only the columns which correspond to the grid points it owns. Is there anything wrong with doing that? Do I need to only set values on the processor which owns the DMRedundant? If I call MatGetOwnershipRange on J12, it appears that all processes own the matrix's single row. > > > > Thanks for any help you can share. > > > > > > > From fdkong.jd at gmail.com Tue Aug 20 16:40:36 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 20 Aug 2019 15:40:36 -0600 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bucketbit to GitLab In-Reply-To: References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> Message-ID: Any way to have the link https://bitbucket.org/petsc/petsc valid (read only) for a couple of weeks? We have some tests that dynamically picks up a particular version of PETSc (master, maint, tag). The link was removed now, and that disabled us to have daily tests. If we could keep this link for a couple of weeks, that will give me a grace period to update all the impacted tests. Any particular reason that we have to rename petsc to petsc-pre-gitlab? Thanks, Fande Kong, On Tue, Aug 20, 2019 at 2:52 AM Lisandro Dalcin via petsc-dev < petsc-dev at mcs.anl.gov> wrote: > Dear Satish, are you planning to migrate petsc4py? > > On Mon, 19 Aug 2019 at 23:47, Balay, Satish via petsc-dev < > petsc-dev at mcs.anl.gov> wrote: > >> A note: >> >> The bitbucket repository is saved at >> https://bitbucket.org/petsc/petsc-pre-gitlab >> >> The git part is now read-only. The other parts [issues, PRs, wiki etc] >> are perhaps writable - but we should avoid that. >> >> Satish >> >> On Mon, 19 Aug 2019, Smith, Barry F. via petsc-dev wrote: >> >> > >> > >> > PETSc folks. >> > >> > This announcement is for people who access PETSc from the BitBucket >> repository or post issues or have other activities with the Bitbucket >> repository >> > >> > We have changed the location of the PETSc repository from BitBucket >> to GitLab. 
For each copy of the repository you need to do >> > >> > git remote set-url origin git at gitlab.com> >:petsc/petsc.git >> > or >> > git remote set-url origin https://gitlab.com/petsc/petsc.git >> > >> > You will likely also want to set up an account on gitlab and >> remember to set the ssh key information >> > >> > if you previously had write permission to the petsc repository and >> cannot write to the new repository please email bsmith at mcs.anl.gov >> with your GitLab >> > username and the email used >> > >> > Please do not make pull requests to the Gitlab site yet; we will be >> manually processing the PR from the BitBucket site over the next couple of >> > days as we implement the testing. >> > >> > Please be patient, this is all new to use and it may take a few >> days to get out all the glitches. >> > >> > Thanks for your support >> > >> > Barry >> > >> > The reason for switching to GitLab is that it has a better testing >> system than BitBucket and Gitlab. We hope that will allow us to test and >> manage >> > pull requests more rapidly, efficiently and accurately, thus >> allowing us to improve and add to PETSc more quickly. >> > >> >> > > -- > Lisandro Dalcin > ============ > Research Scientist > Extreme Computing Research Center (ECRC) > King Abdullah University of Science and Technology (KAUST) > http://ecrc.kaust.edu.sa/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 20 19:05:25 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 21 Aug 2019 00:05:25 +0000 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bucketbit to GitLab In-Reply-To: References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> Message-ID: <7C8FC122-A7D3-445B-BA2E-34DCB7071ED1@anl.gov> Is it so difficult for you to change the link in your test system to petsc-pre-gitlab? Barry > On Aug 20, 2019, at 4:40 PM, Fande Kong via petsc-dev wrote: > > Any way to have the link https://bitbucket.org/petsc/petsc valid (read only) for a couple of weeks? We have some tests that dynamically picks up a particular version of PETSc (master, maint, tag). The link was removed now, and that disabled us to have daily tests. If we could keep this link for a couple of weeks, that will give me a grace period to update all the impacted tests. > > Any particular reason that we have to rename petsc to petsc-pre-gitlab? > > Thanks, > > Fande Kong, > > On Tue, Aug 20, 2019 at 2:52 AM Lisandro Dalcin via petsc-dev wrote: > Dear Satish, are you planning to migrate petsc4py? > > On Mon, 19 Aug 2019 at 23:47, Balay, Satish via petsc-dev wrote: > A note: > > The bitbucket repository is saved at https://bitbucket.org/petsc/petsc-pre-gitlab > > The git part is now read-only. The other parts [issues, PRs, wiki etc] are perhaps writable - but we should avoid that. > > Satish > > On Mon, 19 Aug 2019, Smith, Barry F. via petsc-dev wrote: > > > > > > > PETSc folks. > > > > This announcement is for people who access PETSc from the BitBucket repository or post issues or have other activities with the Bitbucket repository > > > > We have changed the location of the PETSc repository from BitBucket to GitLab. 
For each copy of the repository you need to do > > > > git remote set-url origin git at gitlab.com:petsc/petsc.git > > or > > git remote set-url origin https://gitlab.com/petsc/petsc.git > > > > You will likely also want to set up an account on gitlab and remember to set the ssh key information > > > > if you previously had write permission to the petsc repository and cannot write to the new repository please email bsmith at mcs.anl.gov with your GitLab > > username and the email used > > > > Please do not make pull requests to the Gitlab site yet; we will be manually processing the PR from the BitBucket site over the next couple of > > days as we implement the testing. > > > > Please be patient, this is all new to use and it may take a few days to get out all the glitches. > > > > Thanks for your support > > > > Barry > > > > The reason for switching to GitLab is that it has a better testing system than BitBucket and Gitlab. We hope that will allow us to test and manage > > pull requests more rapidly, efficiently and accurately, thus allowing us to improve and add to PETSc more quickly. > > > > > > -- > Lisandro Dalcin > ============ > Research Scientist > Extreme Computing Research Center (ECRC) > King Abdullah University of Science and Technology (KAUST) > http://ecrc.kaust.edu.sa/ From fdkong.jd at gmail.com Tue Aug 20 19:22:47 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 20 Aug 2019 18:22:47 -0600 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bucketbit to GitLab In-Reply-To: <7C8FC122-A7D3-445B-BA2E-34DCB7071ED1@anl.gov> References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> <7C8FC122-A7D3-445B-BA2E-34DCB7071ED1@anl.gov> Message-ID: <2C6D39D2-784E-452E-8FF1-DBFCE8269A7A@gmail.com> Never mind. I updated everything to point to gitlab. Everything looks fine for me now. Fande > On Aug 20, 2019, at 6:05 PM, Smith, Barry F. wrote: > > > Is it so difficult for you to change the link in your test system to petsc-pre-gitlab? > > Barry > > >> On Aug 20, 2019, at 4:40 PM, Fande Kong via petsc-dev wrote: >> >> Any way to have the link https://bitbucket.org/petsc/petsc valid (read only) for a couple of weeks? We have some tests that dynamically picks up a particular version of PETSc (master, maint, tag). The link was removed now, and that disabled us to have daily tests. If we could keep this link for a couple of weeks, that will give me a grace period to update all the impacted tests. >> >> Any particular reason that we have to rename petsc to petsc-pre-gitlab? >> >> Thanks, >> >> Fande Kong, >> >> On Tue, Aug 20, 2019 at 2:52 AM Lisandro Dalcin via petsc-dev wrote: >> Dear Satish, are you planning to migrate petsc4py? >> >> On Mon, 19 Aug 2019 at 23:47, Balay, Satish via petsc-dev wrote: >> A note: >> >> The bitbucket repository is saved at https://bitbucket.org/petsc/petsc-pre-gitlab >> >> The git part is now read-only. The other parts [issues, PRs, wiki etc] are perhaps writable - but we should avoid that. >> >> Satish >> >>> On Mon, 19 Aug 2019, Smith, Barry F. via petsc-dev wrote: >>> >>> >>> >>> PETSc folks. >>> >>> This announcement is for people who access PETSc from the BitBucket repository or post issues or have other activities with the Bitbucket repository >>> >>> We have changed the location of the PETSc repository from BitBucket to GitLab. 
For each copy of the repository you need to do >>> >>> git remote set-url origin git at gitlab.com:petsc/petsc.git >>> or >>> git remote set-url origin https://gitlab.com/petsc/petsc.git >>> >>> You will likely also want to set up an account on gitlab and remember to set the ssh key information >>> >>> if you previously had write permission to the petsc repository and cannot write to the new repository please email bsmith at mcs.anl.gov with your GitLab >>> username and the email used >>> >>> Please do not make pull requests to the Gitlab site yet; we will be manually processing the PR from the BitBucket site over the next couple of >>> days as we implement the testing. >>> >>> Please be patient, this is all new to use and it may take a few days to get out all the glitches. >>> >>> Thanks for your support >>> >>> Barry >>> >>> The reason for switching to GitLab is that it has a better testing system than BitBucket and Gitlab. We hope that will allow us to test and manage >>> pull requests more rapidly, efficiently and accurately, thus allowing us to improve and add to PETSc more quickly. >>> >> >> >> >> -- >> Lisandro Dalcin >> ============ >> Research Scientist >> Extreme Computing Research Center (ECRC) >> King Abdullah University of Science and Technology (KAUST) >> http://ecrc.kaust.edu.sa/ > From bsmith at mcs.anl.gov Tue Aug 20 19:25:34 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 21 Aug 2019 00:25:34 +0000 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bucketbit to GitLab In-Reply-To: <2C6D39D2-784E-452E-8FF1-DBFCE8269A7A@gmail.com> References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> <7C8FC122-A7D3-445B-BA2E-34DCB7071ED1@anl.gov> <2C6D39D2-784E-452E-8FF1-DBFCE8269A7A@gmail.com> Message-ID: Great > On Aug 20, 2019, at 7:22 PM, Fande Kong wrote: > > Never mind. I updated everything to point to gitlab. Everything looks fine for me now. > > > Fande > >> On Aug 20, 2019, at 6:05 PM, Smith, Barry F. wrote: >> >> >> Is it so difficult for you to change the link in your test system to petsc-pre-gitlab? >> >> Barry >> >> >>> On Aug 20, 2019, at 4:40 PM, Fande Kong via petsc-dev wrote: >>> >>> Any way to have the link https://bitbucket.org/petsc/petsc valid (read only) for a couple of weeks? We have some tests that dynamically picks up a particular version of PETSc (master, maint, tag). The link was removed now, and that disabled us to have daily tests. If we could keep this link for a couple of weeks, that will give me a grace period to update all the impacted tests. >>> >>> Any particular reason that we have to rename petsc to petsc-pre-gitlab? >>> >>> Thanks, >>> >>> Fande Kong, >>> >>> On Tue, Aug 20, 2019 at 2:52 AM Lisandro Dalcin via petsc-dev wrote: >>> Dear Satish, are you planning to migrate petsc4py? >>> >>> On Mon, 19 Aug 2019 at 23:47, Balay, Satish via petsc-dev wrote: >>> A note: >>> >>> The bitbucket repository is saved at https://bitbucket.org/petsc/petsc-pre-gitlab >>> >>> The git part is now read-only. The other parts [issues, PRs, wiki etc] are perhaps writable - but we should avoid that. >>> >>> Satish >>> >>>> On Mon, 19 Aug 2019, Smith, Barry F. via petsc-dev wrote: >>>> >>>> >>>> >>>> PETSc folks. >>>> >>>> This announcement is for people who access PETSc from the BitBucket repository or post issues or have other activities with the Bitbucket repository >>>> >>>> We have changed the location of the PETSc repository from BitBucket to GitLab. 
For each copy of the repository you need to do >>>> >>>> git remote set-url origin git at gitlab.com:petsc/petsc.git >>>> or >>>> git remote set-url origin https://gitlab.com/petsc/petsc.git >>>> >>>> You will likely also want to set up an account on gitlab and remember to set the ssh key information >>>> >>>> if you previously had write permission to the petsc repository and cannot write to the new repository please email bsmith at mcs.anl.gov with your GitLab >>>> username and the email used >>>> >>>> Please do not make pull requests to the Gitlab site yet; we will be manually processing the PR from the BitBucket site over the next couple of >>>> days as we implement the testing. >>>> >>>> Please be patient, this is all new to use and it may take a few days to get out all the glitches. >>>> >>>> Thanks for your support >>>> >>>> Barry >>>> >>>> The reason for switching to GitLab is that it has a better testing system than BitBucket and Gitlab. We hope that will allow us to test and manage >>>> pull requests more rapidly, efficiently and accurately, thus allowing us to improve and add to PETSc more quickly. >>>> >>> >>> >>> >>> -- >>> Lisandro Dalcin >>> ============ >>> Research Scientist >>> Extreme Computing Research Center (ECRC) >>> King Abdullah University of Science and Technology (KAUST) >>> http://ecrc.kaust.edu.sa/ >> From J.Zhang-10 at tudelft.nl Wed Aug 21 06:34:50 2019 From: J.Zhang-10 at tudelft.nl (Jian Zhang - 3ME) Date: Wed, 21 Aug 2019 11:34:50 +0000 Subject: [petsc-users] Getting the connectivity from DMPlex Message-ID: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> Hi guys, I am trying to get the element connectivity from DMPlex. The input is the element id, and the output should be the vertice ids. Which function should I use to achieve this? Thanks in advance. Best, Jian -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 21 06:42:41 2019 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 Aug 2019 05:42:41 -0600 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> Message-ID: <87lfvm7nim.fsf@jedbrown.org> Jian Zhang - 3ME via petsc-users writes: > Hi guys, > > I am trying to get the element connectivity from DMPlex. The input is the element id, and the output should be the vertice ids. Which function should I use to achieve this? Thanks in advance. See DMPlexGetCone or DMPlexGetClosureIndices. From J.Zhang-10 at tudelft.nl Wed Aug 21 06:48:36 2019 From: J.Zhang-10 at tudelft.nl (Jian Zhang - 3ME) Date: Wed, 21 Aug 2019 11:48:36 +0000 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: <87lfvm7nim.fsf@jedbrown.org> References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl>, <87lfvm7nim.fsf@jedbrown.org> Message-ID: Hi Jed, Thank you very much. I tried to use DMPlexGetCone, but the output is the edge ids, not the vertice ids. For the function DMPlexGetClosureIndices, I can not find it in petsc4py. Do you know any alternative ways to solve this? Best, Jian ________________________________ From: Jed Brown Sent: 21 August 2019 13:42:41 To: Jian Zhang - 3ME; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Getting the connectivity from DMPlex Jian Zhang - 3ME via petsc-users writes: > Hi guys, > > I am trying to get the element connectivity from DMPlex. The input is the element id, and the output should be the vertice ids. Which function should I use to achieve this? 
Thanks in advance. See DMPlexGetCone or DMPlexGetClosureIndices. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 21 06:52:27 2019 From: jed at jedbrown.org (Jed Brown) Date: Wed, 21 Aug 2019 05:52:27 -0600 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> <87lfvm7nim.fsf@jedbrown.org> Message-ID: <87imqq7n2c.fsf@jedbrown.org> Jian Zhang - 3ME writes: > Hi Jed, > > Thank you very much. I tried to use DMPlexGetCone, but the output is > the edge ids, not the vertice ids. This means you have an interpolated mesh (edges represented explicitly in the data structure). > For the function DMPlexGetClosureIndices, I can not find it in > petsc4py. Do you know any alternative ways to solve this? Add it to petsc4py or ask for it to be added. I'm sorry I can't do it at the moment. From knepley at gmail.com Wed Aug 21 07:03:19 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 21 Aug 2019 08:03:19 -0400 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> <87lfvm7nim.fsf@jedbrown.org> Message-ID: On Wed, Aug 21, 2019 at 7:48 AM Jian Zhang - 3ME via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi Jed, > > Thank you very much. I tried to use DMPlexGetCone, but the output is the > edge ids, not the vertice ids. For the function DMPlexGetClosureIndices, > I can not find it in petsc4py. Do you know any alternative ways to solve > this? > DMPlexGetClosure() should be in petsc4py. It also returns orientations, which you can just filter out. Thanks, Matt > Best, > > Jian > ------------------------------ > *From:* Jed Brown > *Sent:* 21 August 2019 13:42:41 > *To:* Jian Zhang - 3ME; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Getting the connectivity from DMPlex > > Jian Zhang - 3ME via petsc-users writes: > > > Hi guys, > > > > I am trying to get the element connectivity from DMPlex. The input is > the element id, and the output should be the vertice ids. Which function > should I use to achieve this? Thanks in advance. > > See DMPlexGetCone or DMPlexGetClosureIndices. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Zhang-10 at tudelft.nl Wed Aug 21 07:21:05 2019 From: J.Zhang-10 at tudelft.nl (Jian Zhang - 3ME) Date: Wed, 21 Aug 2019 12:21:05 +0000 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> <87lfvm7nim.fsf@jedbrown.org> , Message-ID: Hi Matthew, Thanks for your answer. I just looked at the pestc4py 3.11.0 (I think it is the newest one). Actually, I did not find DMPlexGetClosure function in the DMPlex.pyx, also not in DM.pyx. Best, Jian ________________________________ From: Matthew Knepley Sent: 21 August 2019 14:03:19 To: Jian Zhang - 3ME Cc: Jed Brown; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Getting the connectivity from DMPlex On Wed, Aug 21, 2019 at 7:48 AM Jian Zhang - 3ME via petsc-users > wrote: Hi Jed, Thank you very much. I tried to use DMPlexGetCone, but the output is the edge ids, not the vertice ids. For the function DMPlexGetClosureIndices, I can not find it in petsc4py. 
Do you know any alternative ways to solve this? DMPlexGetClosure() should be in petsc4py. It also returns orientations, which you can just filter out. Thanks, Matt Best, Jian ________________________________ From: Jed Brown > Sent: 21 August 2019 13:42:41 To: Jian Zhang - 3ME; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Getting the connectivity from DMPlex Jian Zhang - 3ME via petsc-users > writes: > Hi guys, > > I am trying to get the element connectivity from DMPlex. The input is the element id, and the output should be the vertice ids. Which function should I use to achieve this? Thanks in advance. See DMPlexGetCone or DMPlexGetClosureIndices. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 21 07:24:56 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 21 Aug 2019 08:24:56 -0400 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> <87lfvm7nim.fsf@jedbrown.org> Message-ID: On Wed, Aug 21, 2019 at 8:21 AM Jian Zhang - 3ME wrote: > Hi Matthew, > > Thanks for your answer. I just looked at the pestc4py 3.11.0 (I think it > is the newest one). > > Actually, I did not find DMPlexGetClosure function in the DMPlex.pyx, also > not in DM.pyx. > > Sorry, I messed up the name in my mail. Its getTransitiveClosure() https://bitbucket.org/petsc/petsc4py/src/9598d8fb2bb0baa79f99ebf76991ddecd1fd173e/src/PETSc/DMPlex.pyx#lines-332 Thanks, Matt > Best, > > Jian > ------------------------------ > *From:* Matthew Knepley > *Sent:* 21 August 2019 14:03:19 > *To:* Jian Zhang - 3ME > *Cc:* Jed Brown; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Getting the connectivity from DMPlex > > On Wed, Aug 21, 2019 at 7:48 AM Jian Zhang - 3ME via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hi Jed, >> >> Thank you very much. I tried to use DMPlexGetCone, but the output is the >> edge ids, not the vertice ids. For the function DMPlexGetClosureIndices, >> I can not find it in petsc4py. Do you know any alternative ways to solve >> this? >> > DMPlexGetClosure() should be in petsc4py. It also returns orientations, > which you can just filter out. > > Thanks, > > Matt > > >> Best, >> >> Jian >> ------------------------------ >> *From:* Jed Brown >> *Sent:* 21 August 2019 13:42:41 >> *To:* Jian Zhang - 3ME; petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] Getting the connectivity from DMPlex >> >> Jian Zhang - 3ME via petsc-users writes: >> >> > Hi guys, >> > >> > I am trying to get the element connectivity from DMPlex. The input is >> the element id, and the output should be the vertice ids. Which function >> should I use to achieve this? Thanks in advance. >> >> See DMPlexGetCone or DMPlexGetClosureIndices. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Zhang-10 at tudelft.nl Wed Aug 21 07:35:07 2019 From: J.Zhang-10 at tudelft.nl (Jian Zhang - 3ME) Date: Wed, 21 Aug 2019 12:35:07 +0000 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> <87lfvm7nim.fsf@jedbrown.org> , Message-ID: <031dca4c46bf4ffe887757ead1ecfb3c@tudelft.nl> Hi Matthew, That is ok. May I ask you one more question? Sorry. I just start to learn how to use DMPLex. If I have an msh file including the triangular elements (t3). After I use DMPlex to read this mesh, how can I know the element type (actually which is t3) by using the functions inside DMPlex? Best, Jian ________________________________ From: Matthew Knepley Sent: 21 August 2019 14:24:56 To: Jian Zhang - 3ME Cc: Jed Brown; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Getting the connectivity from DMPlex On Wed, Aug 21, 2019 at 8:21 AM Jian Zhang - 3ME > wrote: Hi Matthew, Thanks for your answer. I just looked at the pestc4py 3.11.0 (I think it is the newest one). Actually, I did not find DMPlexGetClosure function in the DMPlex.pyx, also not in DM.pyx. Sorry, I messed up the name in my mail. Its getTransitiveClosure() https://bitbucket.org/petsc/petsc4py/src/9598d8fb2bb0baa79f99ebf76991ddecd1fd173e/src/PETSc/DMPlex.pyx#lines-332 Thanks, Matt Best, Jian ________________________________ From: Matthew Knepley > Sent: 21 August 2019 14:03:19 To: Jian Zhang - 3ME Cc: Jed Brown; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Getting the connectivity from DMPlex On Wed, Aug 21, 2019 at 7:48 AM Jian Zhang - 3ME via petsc-users > wrote: Hi Jed, Thank you very much. I tried to use DMPlexGetCone, but the output is the edge ids, not the vertice ids. For the function DMPlexGetClosureIndices, I can not find it in petsc4py. Do you know any alternative ways to solve this? DMPlexGetClosure() should be in petsc4py. It also returns orientations, which you can just filter out. Thanks, Matt Best, Jian ________________________________ From: Jed Brown > Sent: 21 August 2019 13:42:41 To: Jian Zhang - 3ME; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Getting the connectivity from DMPlex Jian Zhang - 3ME via petsc-users > writes: > Hi guys, > > I am trying to get the element connectivity from DMPlex. The input is the element id, and the output should be the vertice ids. Which function should I use to achieve this? Thanks in advance. See DMPlexGetCone or DMPlexGetClosureIndices. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Aug 21 07:41:48 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 21 Aug 2019 08:41:48 -0400 Subject: [petsc-users] Getting the connectivity from DMPlex In-Reply-To: <031dca4c46bf4ffe887757ead1ecfb3c@tudelft.nl> References: <6b0a4d89c88b480793843b8604033eb6@tudelft.nl> <87lfvm7nim.fsf@jedbrown.org> <031dca4c46bf4ffe887757ead1ecfb3c@tudelft.nl> Message-ID: On Wed, Aug 21, 2019 at 8:35 AM Jian Zhang - 3ME wrote: > Hi Matthew, > > That is ok. May I ask you one more question? > > Sorry. I just start to learn how to use DMPLex. If I have an msh file > including the triangular elements (t3). After I use DMPlex to read this > mesh, how can I know the element type (actually which is t3) by using the > functions inside DMPlex? > They are cells with cone size 3. Matt > Best, > > Jian > ------------------------------ > *From:* Matthew Knepley > *Sent:* 21 August 2019 14:24:56 > *To:* Jian Zhang - 3ME > *Cc:* Jed Brown; petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Getting the connectivity from DMPlex > > On Wed, Aug 21, 2019 at 8:21 AM Jian Zhang - 3ME > wrote: > >> Hi Matthew, >> >> Thanks for your answer. I just looked at the pestc4py 3.11.0 (I think it >> is the newest one). >> >> Actually, I did not find DMPlexGetClosure function in the DMPlex.pyx, >> also not in DM.pyx. >> >> Sorry, I messed up the name in my mail. Its getTransitiveClosure() > > > https://bitbucket.org/petsc/petsc4py/src/9598d8fb2bb0baa79f99ebf76991ddecd1fd173e/src/PETSc/DMPlex.pyx#lines-332 > > Thanks, > > Matt > >> Best, >> >> Jian >> ------------------------------ >> *From:* Matthew Knepley >> *Sent:* 21 August 2019 14:03:19 >> *To:* Jian Zhang - 3ME >> *Cc:* Jed Brown; petsc-users at mcs.anl.gov >> *Subject:* Re: [petsc-users] Getting the connectivity from DMPlex >> >> On Wed, Aug 21, 2019 at 7:48 AM Jian Zhang - 3ME via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hi Jed, >>> >>> Thank you very much. I tried to use DMPlexGetCone, but the output is the >>> edge ids, not the vertice ids. For the function DMPlexGetClosureIndices, >>> I can not find it in petsc4py. Do you know any alternative ways to solve >>> this? >>> >> DMPlexGetClosure() should be in petsc4py. It also returns orientations, >> which you can just filter out. >> >> Thanks, >> >> Matt >> >> >>> Best, >>> >>> Jian >>> ------------------------------ >>> *From:* Jed Brown >>> *Sent:* 21 August 2019 13:42:41 >>> *To:* Jian Zhang - 3ME; petsc-users at mcs.anl.gov >>> *Subject:* Re: [petsc-users] Getting the connectivity from DMPlex >>> >>> Jian Zhang - 3ME via petsc-users writes: >>> >>> > Hi guys, >>> > >>> > I am trying to get the element connectivity from DMPlex. The input is >>> the element id, and the output should be the vertice ids. Which function >>> should I use to achieve this? Thanks in advance. >>> >>> See DMPlexGetCone or DMPlexGetClosureIndices. >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
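[For completeness, a C-side sketch of the approach described above, which petsc4py's getTransitiveClosure() wraps: take the transitive closure of a cell and keep only the points that fall in the vertex depth stratum, and identify triangle (t3) cells by a cone size of 3. dm and cell are assumed to exist already, and error handling follows the usual ierr/CHKERRQ pattern.

  PetscInt  vStart, vEnd, coneSize, npoints = 0, i;
  PetscInt *closure = NULL;

  ierr = DMPlexGetDepthStratum(dm,0,&vStart,&vEnd);CHKERRQ(ierr);   /* depth 0 = vertices */
  ierr = DMPlexGetConeSize(dm,cell,&coneSize);CHKERRQ(ierr);        /* 3 for a triangle cell */
  ierr = DMPlexGetTransitiveClosure(dm,cell,PETSC_TRUE,&npoints,&closure);CHKERRQ(ierr);
  for (i = 0; i < npoints; i++) {
    PetscInt point = closure[2*i];   /* closure stores (point, orientation) pairs */
    if (point >= vStart && point < vEnd) {
      ierr = PetscPrintf(PETSC_COMM_SELF,"cell %D has vertex %D\n",cell,point);CHKERRQ(ierr);
    }
  }
  ierr = DMPlexRestoreTransitiveClosure(dm,cell,PETSC_TRUE,&npoints,&closure);CHKERRQ(ierr);

The filtering by depth stratum is what drops the edge points that DMPlexGetCone returns for an interpolated mesh.]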
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tangqi at msu.edu Wed Aug 21 10:05:19 2019 From: tangqi at msu.edu (Tang, Qi) Date: Wed, 21 Aug 2019 15:05:19 +0000 Subject: [petsc-users] How to choose mat_mffd_err in JFNK In-Reply-To: References: <459CE6EB-5359-4D35-B90F-CECFD32C0EB4@anl.gov> <827791B1-D492-49DE-954B-BCD6E3253360@mcs.anl.gov> , Message-ID: I made a mistake that I turned on -ksp_monitor_true_residual in ~/.petscrc on my cluster. The extra MatMult_MFFD comes from the ksp monitor. Sorry about the confusion. Thanks again for your help, Barry. We will do more tests based on your suggestion. ________________________________ From: Smith, Barry F. Sent: Monday, August 19, 2019 10:04 PM To: Tang, Qi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK If I run a "standard" example, such as snes/examples/tutorials/ex19 it does not have this "double" business with fgmres/gmres I think what you are seeing is related to where and how the shell preconditioner is working. By default GMRES uses left preconditioning (you can change it to right with -ksp_pc_side right) while FMGRES has to use right preconditioner (it makes no mathematical sense with left preconditioning). My first guess is if you use GMRES with -ksp_pc_side right you will see the double "business". Right preconditioned GMRES/FGMRES solves A B y= b so an application of the operator is A B but then once y is compute it needs to compute x = B y which is another application of your preconditioner, hence requires another matrix free operation. The h is different for the two multiplies because the a is different. I don't see that this would happen for every KSP iteration, I think there will be one "extra" for each KSP solve. I you truly see two MatMult_MFFD() for each GMRES/FGMRES I would run in the debugger, put a break point in MatMult_MFFD() to see exactly where each one is called. From the manual page MATMFFD_WP - Implements an alternative approach for computing the differencing parameter h used with the finite difference based matrix-free Jacobian. This code implements the strategy of M. Pernice and H. Walker: h = error_rel * sqrt(1 + ||U||) / ||a|| Notes: 1) || U || does not change between linear iterations so is reused 2) In GMRES || a || == 1 and so does not need to ever be computed except at restart when it is recomputed. > On Aug 19, 2019, at 7:51 PM, Tang, Qi wrote: > > Thanks again, Barry. I am testing more based on your suggestions. > > One thing I do not understand is when I use -mat_mffd_type wp -mat_mffd_err 1e-3 -ksp_type fgmres. It computes "h" of JFNK twice sequentially in each KSP iteration. For instance, the output in info looks like > ... > [0] MatMult_MFFD(): Current differencing parameter: 1.953103182078e-09 > [0] MatMult_MFFD(): Current differencing parameter: 6.054449182838e-01 > ... > I found it called MatMult_MFFD twice in a row. Do you happen to know why petsc calls MatMult_MFFD twice when using fgmre? I also do not understand the relationship between these two h. They are very off (because apparently ||a|| is very different). > > On the other hand, h is only computed once when I use gmres. The h there is clearly scaled with mat_mffd_err. 
That is easy to understand. However, I think I need to use fgmres because my preconditioner is not quite linear. > > BTW, my snes view is: > > SNES Object: () 16 MPI processes > type: newtonls > maximum iterations=20, maximum function evaluations=10000 > tolerances: relative=0.0001, absolute=0., solution=0. > total number of linear solver iterations=70 > total number of function evaluations=149 > norm schedule ALWAYS > Eisenstat-Walker computation of KSP relative tolerance (version 3) > rtol_0=0.3, rtol_max=0.9, threshold=0.1 > gamma=0.9, alpha=1.5, alpha2=1.5 > SNESLineSearch Object: () 16 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: () 16 MPI processes > type: fgmres > restart=50, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.0564799, absolute=1e-50, divergence=10000. > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: () 16 MPI processes > type: shell > JFNK preconditioner > No information available on the mfem::Solver > Number of preconditioners created by the factory 8 > linear system matrix followed by preconditioner matrix: > Mat Object: 16 MPI processes > type: mffd > rows=787968, cols=787968 > Matrix-free approximation: > err=0.001 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 16 MPI processes > type: nest > rows=787968, cols=787968 > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=262656, cols=262656 > (0,1) : type=mpiaij, rows=262656, cols=262656 > (0,2) : NULL > (1,0) : type=mpiaij, rows=262656, cols=262656 > (1,1) : type=mpiaij, rows=262656, cols=262656 > (1,2) : NULL > (2,0) : type=mpiaij, rows=262656, cols=262656 > (2,1) : NULL > (2,2) : type=mpiaij, rows=262656, cols=262656 > From: Smith, Barry F. > Sent: Thursday, August 15, 2019 3:30 AM > To: Tang, Qi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > On Aug 15, 2019, at 12:36 AM, Tang, Qi wrote: > > > > Thanks, it works. snes_mf_jorge works for me. > > Great. > > > It appears to compute h in every ksp. > > Each matrix vector product or each KSPSolve()? From the code looks like each matrix-vector product. > > > > > Without -snes_mf_jorge, it is not working. For some reason, it only computes h once, but that h is bad. My gmres residual is not decaying. > > > > > Indeed, the noise in my function becomes larger when I refine the mesh. I think it makes sense as I use the same time step for different meshes (that is the goal of the preconditioning). However, even when the algorithm is working, sqrt(noise) is much less than the good mat_mffd_err I previously found (10^-6 vs 10^-3). I do not understand why. > > I have no explanation. The details of the code that computes err are difficult to trace exactly; would take a while. Perhaps there is some parameter in there that is too "conservative"? > > > > > Although snes_mf_jorge is working, it is very expensive, as it has to evaluate F many times when estimating h. Unfortunately, to achieve the nonlinearity, I have to assemble some operators inside my F. There seem no easy solutions. 
> > > > I will try to compute h multiple times without using snes_mf_jorge. But let me know if you have other suggestions. Thanks! > > Yes, it seems to me computing the err less often would be a way to make the code go faster. I looked at the code more closely and noticed a couple of things. > > if (ctx->jorge) { > ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); > > /* Use the Brown/Saad method to compute h */ > } else { > /* Compute error if desired */ > ierr = SNESGetIterationNumber(snes,&iter);CHKERRQ(ierr); > if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { > /* Use Jorge's method to compute noise */ > ierr = SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); > > ctx->error_rel = PetscSqrtReal(noise); > > ierr = PetscInfo3(snes,"Using Jorge's noise: noise=%g, sqrt(noise)=%g, h_more=%g\n",(double)noise,(double)ctx->error_rel,(double)h);CHKERRQ(ierr); > > ctx->compute_err_iter = iter; > ctx->need_err = PETSC_FALSE; > } > > So if jorge is set it uses the jorge algorithm for each matrix multiple to compute a new err and h. If jorge is not set but -snes_mf_compute_err is set then it computes a new err and h depending on the parameters (you can run with -info and grep for noise to see the PetscInfo lines printed (maybe add iter to the output to see how often it is recomputed) > > if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { > > so the compute_err_freq determines when the err and h are recomputed. The logic is a bit strange so I cannot decipher exactly how often it is recomputing. I'm guess at the first Newton iteration and then some number of iterations later. > > You could try to rig it so that it is at every new Newton step (this would mean when computing f(x + h a) - f(x) for each new x it will recompute I think). > > > A more "research type" approach to try to reduce the work to a reasonable level would be to keep the same err until "things start to go bad" and then recompute it. But how to measure "when things start to go bad?" It is related to GMRES stagnating, so you could for example track the convergence of GMRES and if it is less than expected, kill the linear solve, reset the computation of err and then start the KSP at the same point as before. But this could be terribly expensive since all the steps in the most recent KSP are lost. > > Another possibility is to assume that the err is valid in a certain size ball around the current x. When x + ha or the new x is outside that ball then recompute the err. But how to choose the ball size and is there a way to adjust the ball size depending on how the computations proceed. For example if everything is going great you could slowly increase the size of the ball and hope for the best; but how to detect if the ball got too big (as above but terribly expensive)? > > Track the size of the err each time it is computed. If it stays about the same for a small number of times just freeze it for a while at that value? But again how to determine when it is no longer good? > > Just a few wild thoughts, I really have no solid ideas on how to reduce the work requirements. > > Barry > > > > > > > > > > Qi > > > > > > From: Smith, Barry F. 
> > Sent: Wednesday, August 14, 2019 10:57 PM > > To: Tang, Qi > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > On Aug 14, 2019, at 9:41 PM, Tang, Qi wrote: > > > > > > Thanks for the help, Barry. I tired both ds and wp, and again it depends on if I could find the correct parameter set. It is getting harder as I refine the mesh. > > > > > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private or SNESDiffParameterCompute_More in mfem. But it looks like these functions are not in linked in petscsnes.h. How could I call them? > > > > They may not be listed in petscsnes.h but I think they should be in the library. You can just stick the prototypes for the functions anywhere you need them for now. > > > > > > You should be able to use > > > > ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr); > > > > then this info is passed with > > > > if (snes->mf) { > > ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, snes->mf_version);CHKERRQ(ierr); > > } > > > > this routine has > > > > if (version == 1) { > > ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr); > > ierr = MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr); > > ierr = MatSetFromOptions(J);CHKERRQ(ierr); > > } else if (version == 2) { > > if (!snes->vec_func) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be called first"); > > #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16) > > ierr = SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr); > > #else > > > > and this routine has > > > > ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr); > > ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); > > ierr = VecGetSize(x,&n);CHKERRQ(ierr); > > ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr); > > ierr = MatCreate(comm,J);CHKERRQ(ierr); > > ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr); > > ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr); > > ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_MULT,(void (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_VIEW,(void (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr); > > ierr = MatSetUp(*J);CHKERRQ(ierr); > > > > > > > > > > > > > > > > > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside SNESSetJacobian would be easier to try? > > > > If you use the flag -snes_mf_noise_file filename when it runs it will save all the noise information it computes along the way to that file (yes it is crude and doesn't match the PETSc Viewer/monitor style but it should work). > > > > Thus I think you can use it and get all the possible monitoring information without actually writing any code. Just > > > > -snes_mf_version 2 > > -snes_mf_noise_file filename > > > > > > > > Barry > > > > > > > > Thanks again, > > > Qi > > > > > > From: Smith, Barry F. 
> > > Sent: Tuesday, August 13, 2019 9:07 PM > > > To: Tang, Qi > > > Cc: petsc-users at mcs.anl.gov > > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > > > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users wrote: > > > > > > > > ?Hi, > > > > I am using JFNK, inexact Newton and a shell "physics-based" preconditioning to solve some multiphysics problems. I have been playing with mat_mffd_err, and it gives me some results I do not fully understand. > > > > > > > > I believe the default value of mat_mffd_err is 10^-8 for double precision, which seem too large for my problems. When I test a problem essentially still in the linear regime, I expect my converged "unpreconditioned resid norm of KSP" should be identical to "SNES Function norm" (Should I?). This is exactly what I found if I could find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it happens, the whole algorithm works as expected. When those two norms are off, the inexact Newton becomes very inefficient. For instance, it may take many ksp iterations to converge but the snes norm is only reduced slightly. > > > > > > > > According to the manual, mat_mffd_err should be "square root of relative error in function evaluation". But is there a simple way to estimate it? > > > > > > First a related note: there are two different algorithms that PETSc provides for computing the factor h using the err parameter. They can be set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some people have better luck with one or the other for their problem. > > > > > > > > > There is some code in PETSc to compute what is called the "noise" of the function which in theory leads to a better err value. For example if say the last four digits of your function are "noise" (that is meaningless stuff) which is possible for complicated multiphysics problems due to round off in the function evaluations then you should use an err that is 2 digits bigger than the default (note this is just the square root of the "function epsilon" instead of the machine epsilon, because the "function epsilon" is much larger than the machine epsilon). > > > > > > We never had a great model for how to hook the noise computation code up into SNES so it is a bit disconnected, we should revisit this. Perhaps you will find it useful and we can figure out together how to hook it in cleanly for use. > > > > > > The code that computes and reports the noise is in the directory src/snes/interface/noise > > > > > > You can use this version with the option -snes_mf_version 2 (version 1 is the normal behavior) > > > > > > The code has the ability to print to a file or the screen information about the noise it is finding, maybe with command line options, you'll have to look directly at the code to see how. > > > > > > I don't like the current setup because if you use the noise based computation of h you have to use SNESMatrixFreeMult2_Private() which is a slightly different routine for doing the actually differencing than the two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it can't use the standard differencing routines but call the noise routine to compute h (just requires some minor code refactoring). Also I don't see an automatic way to just compute the noise of a function at a bunch of points independent of actually using the noise to compute and use h. It is sort of there in the routine SNESDiffParameterCompute_More() but that routine is not documented or user friendly. 
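What "rigging up" a call to SNESDiffParameterCompute_More(), as suggested just below, could look like is roughly the following. This is only a sketch: the prototypes are not in petscsnes.h and are declared by hand here, so the argument types should be checked against src/snes/interface/noise; U and a stand for the current solution and a representative direction (for example the latest Newton step) and are placeholders, not variables from this thread.

    extern PetscErrorCode SNESDiffParameterCreate_More(SNES,Vec,void**);
    extern PetscErrorCode SNESDiffParameterCompute_More(SNES,void*,Vec,Vec,PetscReal*,PetscReal*);
    extern PetscErrorCode SNESDiffParameterDestroy_More(void*);

    void      *noisectx;
    PetscReal noise,h;
    ierr = SNESDiffParameterCreate_More(snes,U,&noisectx);CHKERRQ(ierr);
    ierr = SNESDiffParameterCompute_More(snes,noisectx,U,a,&noise,&h);CHKERRQ(ierr); /* noise of F near U and the h it implies */
    ierr = PetscPrintf(PETSC_COMM_WORLD,"noise %g  sqrt(noise) %g  h %g\n",(double)noise,(double)PetscSqrtReal(noise),(double)h);CHKERRQ(ierr);
    ierr = SNESDiffParameterDestroy_More(noisectx);CHKERRQ(ierr);

Comparing sqrt(noise) against a hand-tuned mat_mffd_err is exactly the check described in the surrounding discussion.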
> > > > > > I would start by rigging up calls to SNESDiffParameterCompute_More() to see what it says the noise of your function is, based on your hand determined values optimal values for err it should be roughly the square of them. Then if this makes sense you can trying using the -snes_mf_version 2 code to see if it helps the matrix free multiple behave consistently well for your problem by adaptively computing the err and using that in computing h. > > > > > > Good luck and I'd love to hear back from you if it helps and suggestions/code for making this more integrated with SNES so it is easier to use. For example perhaps we want a function SNESComputeNoise() that uses SNESDiffParameterCompute_More() to compute and report the noise for a range of values of u. > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > Is there anything else I could possibly tune in this context? > > > > > > > > The discretization is through mfem and I use standard H1 for my problem. > > > > > > > > Thanks, > > > > Qi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfaibussowitsch at anl.gov Wed Aug 21 10:35:47 2019 From: jfaibussowitsch at anl.gov (Faibussowitsch, Jacob) Date: Wed, 21 Aug 2019 15:35:47 +0000 Subject: [petsc-users] [Reminder] Working Group Beginners: Feedback On Layout Message-ID: <685AB87F-1935-479E-93B4-DF79CBAA0439@anl.gov> Hello Again All PETSc Developers/Users! Friendly reminder to continue to submit your feedback on the project template by this Friday August 23! The more we hear from developers, and most importantly you the users, the better these tutorials will be! Any and all ideas are welcome so please do not hesitate to send me your feedback! I have already heard back form a number of you with excellent comments, so please do not hesitate to reach out. As many of you may or may not know, PETSc recently held an all-hands strategic meeting to chart the medium term course for the group. As part of this meeting a working group was formed to focus on beginner tutorial guides aimed at bringing new users up to speed on how to program basic to intermediate PETSc scripts. We have just completed a first draft of our template for these guides and would like to ask you all for your feedback! Any and all feedback would be greatly appreciated, however please limit your feedback to the general layout and structure. The visual presentation of the web page and content is still all a WIP, and is not necessarily representative of the finished product. That being said, in order to keep the project moving forward we will soft-cap feedback collection by the end of next Friday (August 23) so that we can get started on writing the tutorials and integrating them with the rest of the revamped user-guides. Please email me directly at jfaibussowitsch at anl.gov with your comments! Be sure to include specific details and examples of what you like and don?t like with your mail. Here is the template: http://patricksanan.com/temp/_build/html/introductory_tutorial_ksp.html Sincerely, Jacob Faibussowitsch -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.gutzwiller at gmail.com Wed Aug 21 15:20:01 2019 From: david.gutzwiller at gmail.com (David Gutzwiller) Date: Wed, 21 Aug 2019 13:20:01 -0700 Subject: [petsc-users] CUDA-Aware MPI & PETSc Message-ID: Hello, I'm currently using PETSc for the GPU acceleration of simple Krylov solver with GMRES, without preconditioning. This is within the framework of our in-house multigrid solver. 
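PETSc of this vintage selects the GPU execution path through the vector and matrix types rather than through the KSP object itself, so a configuration like the one described above usually comes down to run-time options along the lines of

    -vec_type cuda -mat_type aijcusparse -ksp_type gmres -pc_type none

on a library built with --with-cuda. These are the standard type/solver flags, not options quoted from this thread.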
I am getting a good GPU speedup on the finest grid level but progressively worse performance on each coarse level. This is not surprising, but I still hope to squeeze out some more performance, hopefully making it worthwhile to run some or all of the coarse grids on the GPU. I started investigating with nvprof / nsight and essentially came to the same conclusion that Xiangdong reported in a recent thread (July 16, "MemCpy (HtoD and DtoH) in Krylov solver"). My question is a follow-up to that thread: The MPI communication is staged from the host, which results in some H<->D transfers for every mat-vec operation. A CUDA-aware MPI implementation might avoid these transfers for communication between ranks that are assigned to the same accelerator. Has this been implemented or tested? In our solver we typically run with multiple MPI ranks all assigned to a single device, and running with a single rank is not really feasible as we still have a sizable amount of work for the CPU to chew through. Thus, I think quite a lot of the H<->D transfers could be avoided if I can skip the MPI staging on the host. I am quite new to PETSc so I wanted to ask around before blindly digging into this. Thanks for your help, David Virus-free. www.avast.com <#m_-3093947404852640465_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> -------------- next part -------------- An HTML attachment was scrubbed... URL: From julyzll06 at gmail.com Wed Aug 21 21:48:22 2019 From: julyzll06 at gmail.com (Lailai Zhu) Date: Wed, 21 Aug 2019 22:48:22 -0400 Subject: [petsc-users] errors when using elemental with petsc3.10.5 Message-ID: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> hi, dear petsc developers, I am having a problem when using the external solver elemental. I installed petsc3.10.5 version with the flag --download-elemental-commit=v0.87.7 the installation seems to be ok. However, it seems that i may not be able to use the elemental solver though. I followed this page https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html to interface the elemental solver, namely, MatSetType(A,MATELEMENTAL); or set it via the command line '*-mat_type elemental*', in either case, i will get the following error, [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: http://www.mcs.anl.gov/petsc/documentation/installation.html#external [0]PETSC ERROR: Unknown Mat type given: elemental [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 May i ask whether there will be a way or some specific petsc versions that are able to use the elemental solver? Thanks in advance, best, lailai -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Aug 21 21:58:46 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 22 Aug 2019 02:58:46 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> Message-ID: To install elemental - you use: --download-elemental=1 [not --download-elemental-commit=v0.87.7] Satish On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > hi, dear petsc developers, > > I am having a problem when using the external solver elemental. 
> I installed petsc3.10.5 version with the flag > --download-elemental-commit=v0.87.7 > the installation seems to be ok. However, it seems that i may not be able > to use the elemental solver though. > > I followed this page > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > to interface the elemental solver, namely, > MatSetType(A,MATELEMENTAL); > or set it via the command line '*-mat_type elemental*', > > in either case, i will get the following error, > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: > http://www.mcs.anl.gov/petsc/documentation/installation.html#external > [0]PETSC ERROR: Unknown Mat type given: elemental > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > > May i ask whether there will be a way or some specific petsc versions that are > able to use the elemental solver? > > Thanks in advance, > > best, > lailai > From knepley at gmail.com Wed Aug 21 21:58:33 2019 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 21 Aug 2019 22:58:33 -0400 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> Message-ID: Send configure.log Thanks, Matt On Wed, Aug 21, 2019 at 10:56 PM Lailai Zhu via petsc-users < petsc-users at mcs.anl.gov> wrote: > hi, dear petsc developers, > > I am having a problem when using the external solver elemental. > I installed petsc3.10.5 version with the flag > --download-elemental-commit=v0.87.7 > the installation seems to be ok. However, it seems that i may not be able > to use the elemental solver though. > > I followed this page > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > to interface the elemental solver, namely, > MatSetType(A,MATELEMENTAL); > or set it via the command line '*-mat_type elemental*', > > in either case, i will get the following error, > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: > http://www.mcs.anl.gov/petsc/documentation/installation.html#external > [0]PETSC ERROR: Unknown Mat type given: elemental > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > > May i ask whether there will be a way or some specific petsc versions that > are able to use the elemental solver? > > Thanks in advance, > > best, > lailai > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From julyzll06 at gmail.com Wed Aug 21 22:48:03 2019 From: julyzll06 at gmail.com (Lailai Zhu) Date: Wed, 21 Aug 2019 23:48:03 -0400 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> Message-ID: <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> hi, Satish, i tried to do it following your suggestion, i get the following errors when installing. here is my configuration, any ideas? best, lailai ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1? --with-batch=0? --with-mpi=1 --with-debugging=0? CXXOPTFLAGS="-g -O3"? COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 --download-blacs=1? --download-scalapack=1? --download-hypre=1 --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort? --download-amd=1 --download-anamod=1 --download-blopex=1 --download-dscpack=1???? --download-sprng=1 --download-superlu=1 --with-cxx-dialect=C++11 --download-metis --download-parmetis pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function `MatCreate_SeqSBAIJ': sbaij.c:(.text+0x1bc45): undefined reference to `MatConvert_SeqSBAIJ_Elemental' ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: relocation R_X86_64_PC32 against undefined hidden symbol `MatConvert_SeqSBAIJ_Elemental' can not be used when making a shared object ld: final link failed: Bad value gmakefile:86: recipe for target 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 make[2]: Leaving directory '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: recipe for target 'gnumake' failed make[1]: *** [gnumake] Error 2 make[1]: Leaving directory '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' **************************ERROR************************************* ? Error during compile, check pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log ? Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov On 8/21/19 10:58 PM, Balay, Satish wrote: > To install elemental - you use: --download-elemental=1 [not --download-elemental-commit=v0.87.7] > > Satish > > > On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >> hi, dear petsc developers, >> >> I am having a problem when using the external solver elemental. >> I installed petsc3.10.5 version with the flag >> --download-elemental-commit=v0.87.7 >> the installation seems to be ok. However, it seems that i may not be able >> to use the elemental solver though. >> >> I followed this page >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html >> to interface the elemental solver, namely, >> MatSetType(A,MATELEMENTAL); >> or set it via the command line '*-mat_type elemental*', >> >> in either case, i will get the following error, >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: >> http://www.mcs.anl.gov/petsc/documentation/installation.html#external >> [0]PETSC ERROR: Unknown Mat type given: elemental >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for >> trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 >> >> May i ask whether there will be a way or some specific petsc versions that are >> able to use the elemental solver? >> >> Thanks in advance, >> >> best, >> lailai >> From balay at mcs.anl.gov Wed Aug 21 23:41:21 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 22 Aug 2019 04:41:21 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> Message-ID: Can you run 'make' again and see if this error goes away? Satish On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > hi, Satish, > i tried to do it following your suggestion, i get the following errors when > installing. > here is my configuration, > > any ideas? > > best, > lailai > > ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1? > --with-batch=0? --with-mpi=1 --with-debugging=0? CXXOPTFLAGS="-g -O3"? > COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 > -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 > --download-blacs=1? --download-scalapack=1? --download-hypre=1 > --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort? > --download-amd=1 --download-anamod=1 --download-blopex=1 > --download-dscpack=1???? --download-sprng=1 --download-superlu=1 > --with-cxx-dialect=C++11 --download-metis --download-parmetis > > > > pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function > `MatCreate_SeqSBAIJ': > sbaij.c:(.text+0x1bc45): undefined reference to > `MatConvert_SeqSBAIJ_Elemental' > ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: relocation > R_X86_64_PC32 against undefined hidden symbol `MatConvert_SeqSBAIJ_Elemental' > can not be used when making a shared object > ld: final link failed: Bad value > gmakefile:86: recipe for target > 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed > make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 > make[2]: Leaving directory '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: > recipe for target 'gnumake' failed > make[1]: *** [gnumake] Error 2 > make[1]: Leaving directory '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > **************************ERROR************************************* > ? Error during compile, check > pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log > ? Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to > petsc-maint at mcs.anl.gov > > On 8/21/19 10:58 PM, Balay, Satish wrote: > > To install elemental - you use: --download-elemental=1 [not > > --download-elemental-commit=v0.87.7] > > > > Satish > > > > > > On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > > > >> hi, dear petsc developers, > >> > >> I am having a problem when using the external solver elemental. > >> I installed petsc3.10.5 version with the flag > >> --download-elemental-commit=v0.87.7 > >> the installation seems to be ok. However, it seems that i may not be able > >> to use the elemental solver though. 
> >> > >> I followed this page > >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > >> to interface the elemental solver, namely, > >> MatSetType(A,MATELEMENTAL); > >> or set it via the command line '*-mat_type elemental*', > >> > >> in either case, i will get the following error, > >> > >> [0]PETSC ERROR: --------------------- Error Message > >> -------------------------------------------------------------- > >> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: > >> http://www.mcs.anl.gov/petsc/documentation/installation.html#external > >> [0]PETSC ERROR: Unknown Mat type given: elemental > >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > >> trouble shooting. > >> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > >> > >> May i ask whether there will be a way or some specific petsc versions that > >> are > >> able to use the elemental solver? > >> > >> Thanks in advance, > >> > >> best, > >> lailai > >> > > From jczhang at mcs.anl.gov Thu Aug 22 09:22:30 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 22 Aug 2019 14:22:30 +0000 Subject: [petsc-users] CUDA-Aware MPI & PETSc In-Reply-To: References: Message-ID: This feature is under active development. I hope I can make it usable in a couple of weeks. Thanks. --Junchao Zhang On Wed, Aug 21, 2019 at 3:21 PM David Gutzwiller via petsc-users > wrote: Hello, I'm currently using PETSc for the GPU acceleration of simple Krylov solver with GMRES, without preconditioning. This is within the framework of our in-house multigrid solver. I am getting a good GPU speedup on the finest grid level but progressively worse performance on each coarse level. This is not surprising, but I still hope to squeeze out some more performance, hopefully making it worthwhile to run some or all of the coarse grids on the GPU. I started investigating with nvprof / nsight and essentially came to the same conclusion that Xiangdong reported in a recent thread (July 16, "MemCpy (HtoD and DtoH) in Krylov solver"). My question is a follow-up to that thread: The MPI communication is staged from the host, which results in some H<->D transfers for every mat-vec operation. A CUDA-aware MPI implementation might avoid these transfers for communication between ranks that are assigned to the same accelerator. Has this been implemented or tested? In our solver we typically run with multiple MPI ranks all assigned to a single device, and running with a single rank is not really feasible as we still have a sizable amount of work for the CPU to chew through. Thus, I think quite a lot of the H<->D transfers could be avoided if I can skip the MPI staging on the host. I am quite new to PETSc so I wanted to ask around before blindly digging into this. Thanks for your help, David [https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-animated-no-repeat-v1.gif] Virus-free. www.avast.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.gutzwiller at gmail.com Thu Aug 22 11:33:23 2019 From: david.gutzwiller at gmail.com (David Gutzwiller) Date: Thu, 22 Aug 2019 09:33:23 -0700 Subject: [petsc-users] CUDA-Aware MPI & PETSc In-Reply-To: References: Message-ID: Hello Junchao, Spectacular news! I have our production code running on Summit (Power9 + Nvidia V100) and on local x86 workstations, and I can definitely provide comparative benchmark data with this feature once it is ready. 
Just let me know when it is available for testing and I'll be happy to contribute. Thanks, -David Virus-free. www.avast.com <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> On Thu, Aug 22, 2019 at 7:22 AM Zhang, Junchao wrote: > This feature is under active development. I hope I can make it usable in a > couple of weeks. Thanks. > --Junchao Zhang > > > On Wed, Aug 21, 2019 at 3:21 PM David Gutzwiller via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello, >> >> I'm currently using PETSc for the GPU acceleration of simple Krylov >> solver with GMRES, without preconditioning. This is within the framework >> of our in-house multigrid solver. I am getting a good GPU speedup on the >> finest grid level but progressively worse performance on each coarse >> level. This is not surprising, but I still hope to squeeze out some more >> performance, hopefully making it worthwhile to run some or all of the >> coarse grids on the GPU. >> >> I started investigating with nvprof / nsight and essentially came to the >> same conclusion that Xiangdong reported in a recent thread (July 16, >> "MemCpy (HtoD and DtoH) in Krylov solver"). My question is a follow-up to >> that thread: >> >> The MPI communication is staged from the host, which results in some >> H<->D transfers for every mat-vec operation. A CUDA-aware MPI >> implementation might avoid these transfers for communication between ranks >> that are assigned to the same accelerator. Has this been implemented or >> tested? >> >> In our solver we typically run with multiple MPI ranks all assigned to a >> single device, and running with a single rank is not really feasible as we >> still have a sizable amount of work for the CPU to chew through. Thus, I >> think quite a lot of the H<->D transfers could be avoided if I can skip the >> MPI staging on the host. I am quite new to PETSc so I wanted to ask around >> before blindly digging into this. >> >> Thanks for your help, >> >> David >> >> >> Virus-free. >> www.avast.com >> >> <#m_2897511617293957267_m_5808030803546790052_m_-3093947404852640465_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Aug 22 11:59:59 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 22 Aug 2019 16:59:59 +0000 Subject: [petsc-users] [petsc-dev] IMPORTANT PETSc repository changed from Bucketbit to GitLab In-Reply-To: <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> References: <87D5FE0A-00FB-4A5D-82D1-9D8BAA31F258@mcs.anl.gov> <59054920-7BD3-48BD-B9F6-E9724429155F@mcs.anl.gov> Message-ID: On Mon, 19 Aug 2019, Smith, Barry F. via petsc-dev wrote: > > > PETSc folks. > > This announcement is for people who access PETSc from the BitBucket repository or post issues or have other activities with the Bitbucket repository > > We have changed the location of the PETSc repository from BitBucket to GitLab. 
For each copy of the repository you need to do > > git remote set-url origin git at gitlab.com:petsc/petsc.git > or > git remote set-url origin https://gitlab.com/petsc/petsc.git > > You will likely also want to set up an account on gitlab and remember to set the ssh key information > > if you previously had write permission to the petsc repository and cannot write to the new repository please email bsmith at mcs.anl.gov with your GitLab > username and the email used > > Please do not make pull requests to the Gitlab site yet; we will be manually processing the PR from the BitBucket site over the next couple of > days as we implement the testing. > > Please be patient, this is all new to use and it may take a few days to get out all the glitches. Just an update: We are still in the process of setting up the CI at gitlab. So we are not yet ready to process PRs [or Merge Requests (MRs) in gitlab terminology] As of now - we have the old jenkins equivalent [and a few additional] tests working with gitlab setup. i.e https://gitlab.com/petsc/petsc/pipelines/77669506 But we are yet to migrate all the regular [aka next] tests to this infrastructure. Satish > > Thanks for your support > > Barry > > The reason for switching to GitLab is that it has a better testing system than BitBucket and Gitlab. We hope that will allow us to test and manage > pull requests more rapidly, efficiently and accurately, thus allowing us to improve and add to PETSc more quickly. > From jczhang at mcs.anl.gov Thu Aug 22 13:03:28 2019 From: jczhang at mcs.anl.gov (Zhang, Junchao) Date: Thu, 22 Aug 2019 18:03:28 +0000 Subject: [petsc-users] CUDA-Aware MPI & PETSc In-Reply-To: References: Message-ID: Definitely I will do. Thanks. --Junchao Zhang On Thu, Aug 22, 2019 at 11:34 AM David Gutzwiller > wrote: Hello Junchao, Spectacular news! I have our production code running on Summit (Power9 + Nvidia V100) and on local x86 workstations, and I can definitely provide comparative benchmark data with this feature once it is ready. Just let me know when it is available for testing and I'll be happy to contribute. Thanks, -David [https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-animated-no-repeat-v1.gif] Virus-free. www.avast.com On Thu, Aug 22, 2019 at 7:22 AM Zhang, Junchao > wrote: This feature is under active development. I hope I can make it usable in a couple of weeks. Thanks. --Junchao Zhang On Wed, Aug 21, 2019 at 3:21 PM David Gutzwiller via petsc-users > wrote: Hello, I'm currently using PETSc for the GPU acceleration of simple Krylov solver with GMRES, without preconditioning. This is within the framework of our in-house multigrid solver. I am getting a good GPU speedup on the finest grid level but progressively worse performance on each coarse level. This is not surprising, but I still hope to squeeze out some more performance, hopefully making it worthwhile to run some or all of the coarse grids on the GPU. I started investigating with nvprof / nsight and essentially came to the same conclusion that Xiangdong reported in a recent thread (July 16, "MemCpy (HtoD and DtoH) in Krylov solver"). My question is a follow-up to that thread: The MPI communication is staged from the host, which results in some H<->D transfers for every mat-vec operation. A CUDA-aware MPI implementation might avoid these transfers for communication between ranks that are assigned to the same accelerator. Has this been implemented or tested? 
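To make the host-staging point concrete, the difference in question is roughly the following; this is a generic illustration of the two communication patterns, not PETSc code, with d_buf a device pointer, h_buf a host buffer, and the sizes, ranks, and tags placeholders:

    /* host-staged exchange: an extra D->H and H->D copy around each message */
    cudaMemcpy(h_buf,d_buf,n*sizeof(double),cudaMemcpyDeviceToHost);
    MPI_Send(h_buf,n,MPI_DOUBLE,dest,tag,comm);
    MPI_Recv(h_buf,n,MPI_DOUBLE,src,tag,comm,MPI_STATUS_IGNORE);
    cudaMemcpy(d_buf,h_buf,n*sizeof(double),cudaMemcpyHostToDevice);

    /* CUDA-aware MPI: the device pointer is handed to MPI directly */
    MPI_Send(d_buf,n,MPI_DOUBLE,dest,tag,comm);
    MPI_Recv(d_buf,n,MPI_DOUBLE,src,tag,comm,MPI_STATUS_IGNORE);

Whether the second form is used inside a library depends on both the MPI build and on whether the library stages its buffers on the host, which is what the question above is getting at.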
In our solver we typically run with multiple MPI ranks all assigned to a single device, and running with a single rank is not really feasible as we still have a sizable amount of work for the CPU to chew through. Thus, I think quite a lot of the H<->D transfers could be avoided if I can skip the MPI staging on the host. I am quite new to PETSc so I wanted to ask around before blindly digging into this. Thanks for your help, David [https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-animated-no-repeat-v1.gif] Virus-free. www.avast.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From julyzll06 at gmail.com Thu Aug 22 15:01:13 2019 From: julyzll06 at gmail.com (Lailai Zhu) Date: Thu, 22 Aug 2019 16:01:13 -0400 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> Message-ID: <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> sorry, Satish, but it does not seem to solve the problem. best, lailai On 8/22/19 12:41 AM, Balay, Satish wrote: > Can you run 'make' again and see if this error goes away? > > Satish > > On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >> hi, Satish, >> i tried to do it following your suggestion, i get the following errors when >> installing. >> here is my configuration, >> >> any ideas? >> >> best, >> lailai >> >> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 >> --with-batch=0? --with-mpi=1 --with-debugging=0? CXXOPTFLAGS="-g -O3" >> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 >> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 >> --download-blacs=1? --download-scalapack=1? --download-hypre=1 >> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort >> --download-amd=1 --download-anamod=1 --download-blopex=1 >> --download-dscpack=1???? --download-sprng=1 --download-superlu=1 >> --with-cxx-dialect=C++11 --download-metis --download-parmetis >> >> >> >> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function >> `MatCreate_SeqSBAIJ': >> sbaij.c:(.text+0x1bc45): undefined reference to >> `MatConvert_SeqSBAIJ_Elemental' >> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: relocation >> R_X86_64_PC32 against undefined hidden symbol `MatConvert_SeqSBAIJ_Elemental' >> can not be used when making a shared object >> ld: final link failed: Bad value >> gmakefile:86: recipe for target >> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed >> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 >> make[2]: Leaving directory '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: >> recipe for target 'gnumake' failed >> make[1]: *** [gnumake] Error 2 >> make[1]: Leaving directory '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >> **************************ERROR************************************* >> ? Error during compile, check >> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log >> ? 
Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to >> petsc-maint at mcs.anl.gov >> >> On 8/21/19 10:58 PM, Balay, Satish wrote: >>> To install elemental - you use: --download-elemental=1 [not >>> --download-elemental-commit=v0.87.7] >>> >>> Satish >>> >>> >>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>> >>>> hi, dear petsc developers, >>>> >>>> I am having a problem when using the external solver elemental. >>>> I installed petsc3.10.5 version with the flag >>>> --download-elemental-commit=v0.87.7 >>>> the installation seems to be ok. However, it seems that i may not be able >>>> to use the elemental solver though. >>>> >>>> I followed this page >>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html >>>> to interface the elemental solver, namely, >>>> MatSetType(A,MATELEMENTAL); >>>> or set it via the command line '*-mat_type elemental*', >>>> >>>> in either case, i will get the following error, >>>> >>>> [0]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: >>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external >>>> [0]PETSC ERROR: Unknown Mat type given: elemental >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for >>>> trouble shooting. >>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 >>>> >>>> May i ask whether there will be a way or some specific petsc versions that >>>> are >>>> able to use the elemental solver? >>>> >>>> Thanks in advance, >>>> >>>> best, >>>> lailai >>>> >> From balay at mcs.anl.gov Thu Aug 22 15:16:07 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 22 Aug 2019 20:16:07 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> Message-ID: Any reason for using petsc-3.10.5 and not latest petsc-3.11? I suggest starting from scatch and rebuilding. And if you still have issues - send corresponding configure.log and make.log Satish On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > sorry, Satish, > > but it does not seem to solve the problem. > > best, > lailai > > On 8/22/19 12:41 AM, Balay, Satish wrote: > > Can you run 'make' again and see if this error goes away? > > > > Satish > > > > On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > > > >> hi, Satish, > >> i tried to do it following your suggestion, i get the following errors when > >> installing. > >> here is my configuration, > >> > >> any ideas? > >> > >> best, > >> lailai > >> > >> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 > >> --with-batch=0? --with-mpi=1 --with-debugging=0? CXXOPTFLAGS="-g -O3" > >> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 > >> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 > >> --download-blacs=1? --download-scalapack=1? --download-hypre=1 > >> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > >> --download-amd=1 --download-anamod=1 --download-blopex=1 > >> --download-dscpack=1???? 
--download-sprng=1 --download-superlu=1 > >> --with-cxx-dialect=C++11 --download-metis --download-parmetis > >> > >> > >> > >> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function > >> `MatCreate_SeqSBAIJ': > >> sbaij.c:(.text+0x1bc45): undefined reference to > >> `MatConvert_SeqSBAIJ_Elemental' > >> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: relocation > >> R_X86_64_PC32 against undefined hidden symbol > >> `MatConvert_SeqSBAIJ_Elemental' > >> can not be used when making a shared object > >> ld: final link failed: Bad value > >> gmakefile:86: recipe for target > >> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed > >> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 > >> make[2]: Leaving directory > >> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: > >> recipe for target 'gnumake' failed > >> make[1]: *** [gnumake] Error 2 > >> make[1]: Leaving directory > >> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >> **************************ERROR************************************* > >> ? Error during compile, check > >> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log > >> ? Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to > >> petsc-maint at mcs.anl.gov > >> > >> On 8/21/19 10:58 PM, Balay, Satish wrote: > >>> To install elemental - you use: --download-elemental=1 [not > >>> --download-elemental-commit=v0.87.7] > >>> > >>> Satish > >>> > >>> > >>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>> > >>>> hi, dear petsc developers, > >>>> > >>>> I am having a problem when using the external solver elemental. > >>>> I installed petsc3.10.5 version with the flag > >>>> --download-elemental-commit=v0.87.7 > >>>> the installation seems to be ok. However, it seems that i may not be able > >>>> to use the elemental solver though. > >>>> > >>>> I followed this page > >>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > >>>> to interface the elemental solver, namely, > >>>> MatSetType(A,MATELEMENTAL); > >>>> or set it via the command line '*-mat_type elemental*', > >>>> > >>>> in either case, i will get the following error, > >>>> > >>>> [0]PETSC ERROR: --------------------- Error Message > >>>> -------------------------------------------------------------- > >>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: > >>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external > >>>> [0]PETSC ERROR: Unknown Mat type given: elemental > >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>>> for > >>>> trouble shooting. > >>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > >>>> > >>>> May i ask whether there will be a way or some specific petsc versions > >>>> that > >>>> are > >>>> able to use the elemental solver? 
> >>>> > >>>> Thanks in advance, > >>>> > >>>> best, > >>>> lailai > >>>> > >> > > From julyzll06 at gmail.com Thu Aug 22 16:06:59 2019 From: julyzll06 at gmail.com (Lailai Zhu) Date: Thu, 22 Aug 2019 17:06:59 -0400 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> Message-ID: <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> hi, Satish, as you have suggested, i compiled a new version using 3.11.3, it compiles well, the errors occur in checking. i also attach the errors of check. thanks very much, lailai On 8/22/19 4:16 PM, Balay, Satish wrote: > Any reason for using petsc-3.10.5 and not latest petsc-3.11? > > I suggest starting from scatch and rebuilding. > > And if you still have issues - send corresponding configure.log and make.log > > Satish > > On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > >> sorry, Satish, >> >> but it does not seem to solve the problem. >> >> best, >> lailai >> >> On 8/22/19 12:41 AM, Balay, Satish wrote: >>> Can you run 'make' again and see if this error goes away? >>> >>> Satish >>> >>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>> >>>> hi, Satish, >>>> i tried to do it following your suggestion, i get the following errors when >>>> installing. >>>> here is my configuration, >>>> >>>> any ideas? >>>> >>>> best, >>>> lailai >>>> >>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 >>>> --with-batch=0? --with-mpi=1 --with-debugging=0? CXXOPTFLAGS="-g -O3" >>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 >>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 >>>> --download-blacs=1? --download-scalapack=1? --download-hypre=1 >>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort >>>> --download-amd=1 --download-anamod=1 --download-blopex=1 >>>> --download-dscpack=1???? --download-sprng=1 --download-superlu=1 >>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis >>>> >>>> >>>> >>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function >>>> `MatCreate_SeqSBAIJ': >>>> sbaij.c:(.text+0x1bc45): undefined reference to >>>> `MatConvert_SeqSBAIJ_Elemental' >>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: relocation >>>> R_X86_64_PC32 against undefined hidden symbol >>>> `MatConvert_SeqSBAIJ_Elemental' >>>> can not be used when making a shared object >>>> ld: final link failed: Bad value >>>> gmakefile:86: recipe for target >>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed >>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 >>>> make[2]: Leaving directory >>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: >>>> recipe for target 'gnumake' failed >>>> make[1]: *** [gnumake] Error 2 >>>> make[1]: Leaving directory >>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>> **************************ERROR************************************* >>>> ? Error during compile, check >>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log >>>> ? 
Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to >>>> petsc-maint at mcs.anl.gov >>>> >>>> On 8/21/19 10:58 PM, Balay, Satish wrote: >>>>> To install elemental - you use: --download-elemental=1 [not >>>>> --download-elemental-commit=v0.87.7] >>>>> >>>>> Satish >>>>> >>>>> >>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>> >>>>>> hi, dear petsc developers, >>>>>> >>>>>> I am having a problem when using the external solver elemental. >>>>>> I installed petsc3.10.5 version with the flag >>>>>> --download-elemental-commit=v0.87.7 >>>>>> the installation seems to be ok. However, it seems that i may not be able >>>>>> to use the elemental solver though. >>>>>> >>>>>> I followed this page >>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html >>>>>> to interface the elemental solver, namely, >>>>>> MatSetType(A,MATELEMENTAL); >>>>>> or set it via the command line '*-mat_type elemental*', >>>>>> >>>>>> in either case, i will get the following error, >>>>>> >>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing package: >>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external >>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental >>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. >>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 >>>>>> >>>>>> May i ask whether there will be a way or some specific petsc versions >>>>>> that >>>>>> are >>>>>> able to use the elemental solver? >>>>>> >>>>>> Thanks in advance, >>>>>> >>>>>> best, >>>>>> lailai >>>>>> -------------- next part -------------- Running test examples to verify correct installation Using PETSC_DIR=/somedir/petsc3.11.3_intel19_mpich3.3 and PETSC_ARCH=pet3.11.3-intel19-mpich3.3 Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process See http://www.mcs.anl.gov/petsc/documentation/faq.html ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes See http://www.mcs.anl.gov/petsc/documentation/faq.html ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world 1,5c1,2 < lid velocity = 0.0016, prandtl # = 1., grashof # = 1. < 0 SNES Function norm 0.0406612 < 1 SNES Function norm 4.12227e-06 < 2 SNES Function norm 6.098e-11 < Number of SNES iterations = 2 --- > ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world > ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world /somedir/petsc3.11.3_intel19_mpich3.3/src/snes/examples/tutorials Possible problem with ex19 running with hypre, diffs above ========================================= Possible error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/documentation/faq.html ./ex5f: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world Completed test examples -------------- next part -------------- A non-text attachment was scrubbed... 
Name: make.log Type: text/x-log Size: 390445 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 4448426 bytes Desc: not available URL: From balay at mcs.anl.gov Thu Aug 22 17:17:45 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 22 Aug 2019 22:17:45 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> Message-ID: > ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world For some reason the wrong parmetis library is getting picked up. I don't know why. Can you copy/paste the log from the following? cd src/snes/examples/tutorials make PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19 ldd ex19 cd /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib ldd *.so Satish On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > hi, Satish, > > as you have suggested, i compiled a new version using 3.11.3, > it compiles well, the errors occur in checking. i also attach > the errors of check. thanks very much, > > lailai > > On 8/22/19 4:16 PM, Balay, Satish wrote: > > Any reason for using petsc-3.10.5 and not latest petsc-3.11? > > > > I suggest starting from scatch and rebuilding. > > > > And if you still have issues - send corresponding configure.log and make.log > > > > Satish > > > > On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > > > >> sorry, Satish, > >> > >> but it does not seem to solve the problem. > >> > >> best, > >> lailai > >> > >> On 8/22/19 12:41 AM, Balay, Satish wrote: > >>> Can you run 'make' again and see if this error goes away? > >>> > >>> Satish > >>> > >>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>> > >>>> hi, Satish, > >>>> i tried to do it following your suggestion, i get the following errors > >>>> when > >>>> installing. > >>>> here is my configuration, > >>>> > >>>> any ideas? > >>>> > >>>> best, > >>>> lailai > >>>> > >>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 > >>>> --with-batch=0? --with-mpi=1 --with-debugging=0? CXXOPTFLAGS="-g -O3" > >>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 > >>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 > >>>> --download-blacs=1? --download-scalapack=1? --download-hypre=1 > >>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > >>>> --download-amd=1 --download-anamod=1 --download-blopex=1 > >>>> --download-dscpack=1???? 
--download-sprng=1 --download-superlu=1 > >>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis > >>>> > >>>> > >>>> > >>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function > >>>> `MatCreate_SeqSBAIJ': > >>>> sbaij.c:(.text+0x1bc45): undefined reference to > >>>> `MatConvert_SeqSBAIJ_Elemental' > >>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: > >>>> relocation > >>>> R_X86_64_PC32 against undefined hidden symbol > >>>> `MatConvert_SeqSBAIJ_Elemental' > >>>> can not be used when making a shared object > >>>> ld: final link failed: Bad value > >>>> gmakefile:86: recipe for target > >>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed > >>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 > >>>> make[2]: Leaving directory > >>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: > >>>> recipe for target 'gnumake' failed > >>>> make[1]: *** [gnumake] Error 2 > >>>> make[1]: Leaving directory > >>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>> **************************ERROR************************************* > >>>> ? Error during compile, check > >>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log > >>>> ? Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to > >>>> petsc-maint at mcs.anl.gov > >>>> > >>>> On 8/21/19 10:58 PM, Balay, Satish wrote: > >>>>> To install elemental - you use: --download-elemental=1 [not > >>>>> --download-elemental-commit=v0.87.7] > >>>>> > >>>>> Satish > >>>>> > >>>>> > >>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>> > >>>>>> hi, dear petsc developers, > >>>>>> > >>>>>> I am having a problem when using the external solver elemental. > >>>>>> I installed petsc3.10.5 version with the flag > >>>>>> --download-elemental-commit=v0.87.7 > >>>>>> the installation seems to be ok. However, it seems that i may not be > >>>>>> able > >>>>>> to use the elemental solver though. > >>>>>> > >>>>>> I followed this page > >>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > >>>>>> to interface the elemental solver, namely, > >>>>>> MatSetType(A,MATELEMENTAL); > >>>>>> or set it via the command line '*-mat_type elemental*', > >>>>>> > >>>>>> in either case, i will get the following error, > >>>>>> > >>>>>> [0]PETSC ERROR: --------------------- Error Message > >>>>>> -------------------------------------------------------------- > >>>>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing > >>>>>> package: > >>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external > >>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental > >>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>>>>> for > >>>>>> trouble shooting. > >>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > >>>>>> > >>>>>> May i ask whether there will be a way or some specific petsc versions > >>>>>> that > >>>>>> are > >>>>>> able to use the elemental solver? > >>>>>> > >>>>>> Thanks in advance, > >>>>>> > >>>>>> best, > >>>>>> lailai > >>>>>> > > > From bsmith at mcs.anl.gov Thu Aug 22 18:13:45 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Thu, 22 Aug 2019 23:13:45 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> Message-ID: You have a copy of parmetis installed in /usr/lib this is a systems directory and many compilers and linkers automatically find libraries in that location and it is often difficult to avoid have the compilers/linkers use these. In general you never want to install external software such as parmetis, PETSc, MPI, etc in systems directories (/usr/ and /usr/local) You should delete this library (and the includes in /usr/include) Barry > On Aug 22, 2019, at 5:17 PM, Balay, Satish via petsc-users wrote: > > >> ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world > > For some reason the wrong parmetis library is getting picked up. I don't know why. > > Can you copy/paste the log from the following? > > cd src/snes/examples/tutorials > make PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19 > ldd ex19 > > cd /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib > ldd *.so > > Satish > > On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > >> hi, Satish, >> >> as you have suggested, i compiled a new version using 3.11.3, >> it compiles well, the errors occur in checking. i also attach >> the errors of check. thanks very much, >> >> lailai >> >> On 8/22/19 4:16 PM, Balay, Satish wrote: >>> Any reason for using petsc-3.10.5 and not latest petsc-3.11? >>> >>> I suggest starting from scatch and rebuilding. >>> >>> And if you still have issues - send corresponding configure.log and make.log >>> >>> Satish >>> >>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: >>> >>>> sorry, Satish, >>>> >>>> but it does not seem to solve the problem. >>>> >>>> best, >>>> lailai >>>> >>>> On 8/22/19 12:41 AM, Balay, Satish wrote: >>>>> Can you run 'make' again and see if this error goes away? >>>>> >>>>> Satish >>>>> >>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>> >>>>>> hi, Satish, >>>>>> i tried to do it following your suggestion, i get the following errors >>>>>> when >>>>>> installing. >>>>>> here is my configuration, >>>>>> >>>>>> any ideas? 
>>>>>> >>>>>> best, >>>>>> lailai >>>>>> >>>>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 >>>>>> --with-batch=0 --with-mpi=1 --with-debugging=0 CXXOPTFLAGS="-g -O3" >>>>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 >>>>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 >>>>>> --download-blacs=1 --download-scalapack=1 --download-hypre=1 >>>>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort >>>>>> --download-amd=1 --download-anamod=1 --download-blopex=1 >>>>>> --download-dscpack=1 --download-sprng=1 --download-superlu=1 >>>>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis >>>>>> >>>>>> >>>>>> >>>>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function >>>>>> `MatCreate_SeqSBAIJ': >>>>>> sbaij.c:(.text+0x1bc45): undefined reference to >>>>>> `MatConvert_SeqSBAIJ_Elemental' >>>>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: >>>>>> relocation >>>>>> R_X86_64_PC32 against undefined hidden symbol >>>>>> `MatConvert_SeqSBAIJ_Elemental' >>>>>> can not be used when making a shared object >>>>>> ld: final link failed: Bad value >>>>>> gmakefile:86: recipe for target >>>>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed >>>>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 >>>>>> make[2]: Leaving directory >>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: >>>>>> recipe for target 'gnumake' failed >>>>>> make[1]: *** [gnumake] Error 2 >>>>>> make[1]: Leaving directory >>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>>>> **************************ERROR************************************* >>>>>> Error during compile, check >>>>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log >>>>>> Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to >>>>>> petsc-maint at mcs.anl.gov >>>>>> >>>>>> On 8/21/19 10:58 PM, Balay, Satish wrote: >>>>>>> To install elemental - you use: --download-elemental=1 [not >>>>>>> --download-elemental-commit=v0.87.7] >>>>>>> >>>>>>> Satish >>>>>>> >>>>>>> >>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>>>> >>>>>>>> hi, dear petsc developers, >>>>>>>> >>>>>>>> I am having a problem when using the external solver elemental. >>>>>>>> I installed petsc3.10.5 version with the flag >>>>>>>> --download-elemental-commit=v0.87.7 >>>>>>>> the installation seems to be ok. However, it seems that i may not be >>>>>>>> able >>>>>>>> to use the elemental solver though. >>>>>>>> >>>>>>>> I followed this page >>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html >>>>>>>> to interface the elemental solver, namely, >>>>>>>> MatSetType(A,MATELEMENTAL); >>>>>>>> or set it via the command line '*-mat_type elemental*', >>>>>>>> >>>>>>>> in either case, i will get the following error, >>>>>>>> >>>>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>>>> -------------------------------------------------------------- >>>>>>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing >>>>>>>> package: >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external >>>>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental >>>>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for >>>>>>>> trouble shooting. 
>>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 >>>>>>>> >>>>>>>> May i ask whether there will be a way or some specific petsc versions >>>>>>>> that >>>>>>>> are >>>>>>>> able to use the elemental solver? >>>>>>>> >>>>>>>> Thanks in advance, >>>>>>>> >>>>>>>> best, >>>>>>>> lailai >>>>>>>> >> >> >> From balay at mcs.anl.gov Thu Aug 22 18:48:01 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 22 Aug 2019 23:48:01 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> Message-ID: Compilers are supposed to prefer libraries in specified -L path before system stuff. >>>>>> balay at es^~ $ ls /usr/lib/lib*metis* /usr/lib/libmetis.a /usr/lib/libmetis.so.3.1 /usr/lib/libparmetis.so@ /usr/lib/libscotchmetis-5.1.so /usr/lib/libscotchmetis.so@ /usr/lib/libmetis.so@ /usr/lib/libparmetis.a /usr/lib/libparmetis.so.3.1 /usr/lib/libscotchmetis.a balay at es^~ $ <<<< And we have these files installed and they don't cause problems. And its not always practical to uninstall system stuff [esp on multi-user machines] Satish On Thu, 22 Aug 2019, Smith, Barry F. wrote: > > You have a copy of parmetis installed in /usr/lib this is a systems directory and many compilers and linkers automatically find libraries in that location and it is often difficult to avoid have the compilers/linkers use these. In general you never want to install external software such as parmetis, PETSc, MPI, etc in systems directories (/usr/ and /usr/local) > > You should delete this library (and the includes in /usr/include) > > Barry > > > > > > On Aug 22, 2019, at 5:17 PM, Balay, Satish via petsc-users wrote: > > > > > >> ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world > > > > For some reason the wrong parmetis library is getting picked up. I don't know why. > > > > Can you copy/paste the log from the following? > > > > cd src/snes/examples/tutorials > > make PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19 > > ldd ex19 > > > > cd /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib > > ldd *.so > > > > Satish > > > > On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > > > >> hi, Satish, > >> > >> as you have suggested, i compiled a new version using 3.11.3, > >> it compiles well, the errors occur in checking. i also attach > >> the errors of check. thanks very much, > >> > >> lailai > >> > >> On 8/22/19 4:16 PM, Balay, Satish wrote: > >>> Any reason for using petsc-3.10.5 and not latest petsc-3.11? > >>> > >>> I suggest starting from scatch and rebuilding. > >>> > >>> And if you still have issues - send corresponding configure.log and make.log > >>> > >>> Satish > >>> > >>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > >>> > >>>> sorry, Satish, > >>>> > >>>> but it does not seem to solve the problem. > >>>> > >>>> best, > >>>> lailai > >>>> > >>>> On 8/22/19 12:41 AM, Balay, Satish wrote: > >>>>> Can you run 'make' again and see if this error goes away? > >>>>> > >>>>> Satish > >>>>> > >>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>> > >>>>>> hi, Satish, > >>>>>> i tried to do it following your suggestion, i get the following errors > >>>>>> when > >>>>>> installing. > >>>>>> here is my configuration, > >>>>>> > >>>>>> any ideas? 
> >>>>>> > >>>>>> best, > >>>>>> lailai > >>>>>> > >>>>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 > >>>>>> --with-batch=0 --with-mpi=1 --with-debugging=0 CXXOPTFLAGS="-g -O3" > >>>>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 > >>>>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 > >>>>>> --download-blacs=1 --download-scalapack=1 --download-hypre=1 > >>>>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > >>>>>> --download-amd=1 --download-anamod=1 --download-blopex=1 > >>>>>> --download-dscpack=1 --download-sprng=1 --download-superlu=1 > >>>>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis > >>>>>> > >>>>>> > >>>>>> > >>>>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function > >>>>>> `MatCreate_SeqSBAIJ': > >>>>>> sbaij.c:(.text+0x1bc45): undefined reference to > >>>>>> `MatConvert_SeqSBAIJ_Elemental' > >>>>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: > >>>>>> relocation > >>>>>> R_X86_64_PC32 against undefined hidden symbol > >>>>>> `MatConvert_SeqSBAIJ_Elemental' > >>>>>> can not be used when making a shared object > >>>>>> ld: final link failed: Bad value > >>>>>> gmakefile:86: recipe for target > >>>>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed > >>>>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 > >>>>>> make[2]: Leaving directory > >>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: > >>>>>> recipe for target 'gnumake' failed > >>>>>> make[1]: *** [gnumake] Error 2 > >>>>>> make[1]: Leaving directory > >>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>>>> **************************ERROR************************************* > >>>>>> Error during compile, check > >>>>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log > >>>>>> Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to > >>>>>> petsc-maint at mcs.anl.gov > >>>>>> > >>>>>> On 8/21/19 10:58 PM, Balay, Satish wrote: > >>>>>>> To install elemental - you use: --download-elemental=1 [not > >>>>>>> --download-elemental-commit=v0.87.7] > >>>>>>> > >>>>>>> Satish > >>>>>>> > >>>>>>> > >>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>>>> > >>>>>>>> hi, dear petsc developers, > >>>>>>>> > >>>>>>>> I am having a problem when using the external solver elemental. > >>>>>>>> I installed petsc3.10.5 version with the flag > >>>>>>>> --download-elemental-commit=v0.87.7 > >>>>>>>> the installation seems to be ok. However, it seems that i may not be > >>>>>>>> able > >>>>>>>> to use the elemental solver though. > >>>>>>>> > >>>>>>>> I followed this page > >>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > >>>>>>>> to interface the elemental solver, namely, > >>>>>>>> MatSetType(A,MATELEMENTAL); > >>>>>>>> or set it via the command line '*-mat_type elemental*', > >>>>>>>> > >>>>>>>> in either case, i will get the following error, > >>>>>>>> > >>>>>>>> [0]PETSC ERROR: --------------------- Error Message > >>>>>>>> -------------------------------------------------------------- > >>>>>>>> [0]PETSC ERROR: Unknown type. 
Check for miss-spelling or missing > >>>>>>>> package: > >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external > >>>>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental > >>>>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>>>>>>> for > >>>>>>>> trouble shooting. > >>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > >>>>>>>> > >>>>>>>> May i ask whether there will be a way or some specific petsc versions > >>>>>>>> that > >>>>>>>> are > >>>>>>>> able to use the elemental solver? > >>>>>>>> > >>>>>>>> Thanks in advance, > >>>>>>>> > >>>>>>>> best, > >>>>>>>> lailai > >>>>>>>> > >> > >> > >> > From bsmith at mcs.anl.gov Thu Aug 22 19:03:58 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 23 Aug 2019 00:03:58 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> Message-ID: > On Aug 22, 2019, at 6:48 PM, Balay, Satish wrote: > > Compilers are supposed to prefer libraries in specified -L path before system stuff. Suppose to. > >>>>>>> > balay at es^~ $ ls /usr/lib/lib*metis* > /usr/lib/libmetis.a /usr/lib/libmetis.so.3.1 /usr/lib/libparmetis.so@ /usr/lib/libscotchmetis-5.1.so /usr/lib/libscotchmetis.so@ > /usr/lib/libmetis.so@ /usr/lib/libparmetis.a /usr/lib/libparmetis.so.3.1 /usr/lib/libscotchmetis.a > balay at es^~ $ This is really bad system management, there is no reason for them to be there nor should they be there. bsmith at es:~$ ldd /usr/lib/libparmetis.so.3.1 linux-vdso.so.1 => (0x00007fff9115e000) libmpi.so.1 => /usr/lib/libmpi.so.1 (0x00007faf95f87000) libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faf95c81000) libmetis.so.3.1 => /usr/lib/libmetis.so.3.1 (0x00007faf95a34000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf9566b000) libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007faf95468000) libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007faf95228000) libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007faf9501e000) libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf94e00000) /lib64/ld-linux-x86-64.so.2 (0x00007faf9654e000) libnuma.so.1 => /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 (0x00007faf94bf5000) libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf949f1000) You have something in /usr/lib referring to something in /soft/com/packages/pgi/ ?? and of course that refers back to bsmith at es:~$ ls -l /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 lrwxrwxrwx 1 fritz voice 38 Mar 29 15:59 /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 -> /usr/lib/x86_64-linux-gnu/libnuma.so.1 I stand by my statement, it is bad policy to put any stuff like this in system directories. > <<<< > > And we have these files installed and they don't cause problems. And its not always practical to uninstall system stuff > [esp on multi-user machines] I agree it is not always practical or possible to remove them. barry > > Satish > > > On Thu, 22 Aug 2019, Smith, Barry F. wrote: > >> >> You have a copy of parmetis installed in /usr/lib this is a systems directory and many compilers and linkers automatically find libraries in that location and it is often difficult to avoid have the compilers/linkers use these. 
In general you never want to install external software such as parmetis, PETSc, MPI, etc in systems directories (/usr/ and /usr/local) >> >> You should delete this library (and the includes in /usr/include) >> >> Barry >> >> >> >> >>> On Aug 22, 2019, at 5:17 PM, Balay, Satish via petsc-users wrote: >>> >>> >>>> ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world >>> >>> For some reason the wrong parmetis library is getting picked up. I don't know why. >>> >>> Can you copy/paste the log from the following? >>> >>> cd src/snes/examples/tutorials >>> make PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19 >>> ldd ex19 >>> >>> cd /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib >>> ldd *.so >>> >>> Satish >>> >>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: >>> >>>> hi, Satish, >>>> >>>> as you have suggested, i compiled a new version using 3.11.3, >>>> it compiles well, the errors occur in checking. i also attach >>>> the errors of check. thanks very much, >>>> >>>> lailai >>>> >>>> On 8/22/19 4:16 PM, Balay, Satish wrote: >>>>> Any reason for using petsc-3.10.5 and not latest petsc-3.11? >>>>> >>>>> I suggest starting from scatch and rebuilding. >>>>> >>>>> And if you still have issues - send corresponding configure.log and make.log >>>>> >>>>> Satish >>>>> >>>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>> >>>>>> sorry, Satish, >>>>>> >>>>>> but it does not seem to solve the problem. >>>>>> >>>>>> best, >>>>>> lailai >>>>>> >>>>>> On 8/22/19 12:41 AM, Balay, Satish wrote: >>>>>>> Can you run 'make' again and see if this error goes away? >>>>>>> >>>>>>> Satish >>>>>>> >>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>>>> >>>>>>>> hi, Satish, >>>>>>>> i tried to do it following your suggestion, i get the following errors >>>>>>>> when >>>>>>>> installing. >>>>>>>> here is my configuration, >>>>>>>> >>>>>>>> any ideas? 
>>>>>>>> >>>>>>>> best, >>>>>>>> lailai >>>>>>>> >>>>>>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 >>>>>>>> --with-batch=0 --with-mpi=1 --with-debugging=0 CXXOPTFLAGS="-g -O3" >>>>>>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 >>>>>>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 >>>>>>>> --download-blacs=1 --download-scalapack=1 --download-hypre=1 >>>>>>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort >>>>>>>> --download-amd=1 --download-anamod=1 --download-blopex=1 >>>>>>>> --download-dscpack=1 --download-sprng=1 --download-superlu=1 >>>>>>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function >>>>>>>> `MatCreate_SeqSBAIJ': >>>>>>>> sbaij.c:(.text+0x1bc45): undefined reference to >>>>>>>> `MatConvert_SeqSBAIJ_Elemental' >>>>>>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: >>>>>>>> relocation >>>>>>>> R_X86_64_PC32 against undefined hidden symbol >>>>>>>> `MatConvert_SeqSBAIJ_Elemental' >>>>>>>> can not be used when making a shared object >>>>>>>> ld: final link failed: Bad value >>>>>>>> gmakefile:86: recipe for target >>>>>>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed >>>>>>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 >>>>>>>> make[2]: Leaving directory >>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>>>>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: >>>>>>>> recipe for target 'gnumake' failed >>>>>>>> make[1]: *** [gnumake] Error 2 >>>>>>>> make[1]: Leaving directory >>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>>>>>> **************************ERROR************************************* >>>>>>>> Error during compile, check >>>>>>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log >>>>>>>> Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to >>>>>>>> petsc-maint at mcs.anl.gov >>>>>>>> >>>>>>>> On 8/21/19 10:58 PM, Balay, Satish wrote: >>>>>>>>> To install elemental - you use: --download-elemental=1 [not >>>>>>>>> --download-elemental-commit=v0.87.7] >>>>>>>>> >>>>>>>>> Satish >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>>>>>> >>>>>>>>>> hi, dear petsc developers, >>>>>>>>>> >>>>>>>>>> I am having a problem when using the external solver elemental. >>>>>>>>>> I installed petsc3.10.5 version with the flag >>>>>>>>>> --download-elemental-commit=v0.87.7 >>>>>>>>>> the installation seems to be ok. However, it seems that i may not be >>>>>>>>>> able >>>>>>>>>> to use the elemental solver though. >>>>>>>>>> >>>>>>>>>> I followed this page >>>>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html >>>>>>>>>> to interface the elemental solver, namely, >>>>>>>>>> MatSetType(A,MATELEMENTAL); >>>>>>>>>> or set it via the command line '*-mat_type elemental*', >>>>>>>>>> >>>>>>>>>> in either case, i will get the following error, >>>>>>>>>> >>>>>>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>>>>>> -------------------------------------------------------------- >>>>>>>>>> [0]PETSC ERROR: Unknown type. 
Check for miss-spelling or missing >>>>>>>>>> package: >>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external >>>>>>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental >>>>>>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>>> for >>>>>>>>>> trouble shooting. >>>>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 >>>>>>>>>> >>>>>>>>>> May i ask whether there will be a way or some specific petsc versions >>>>>>>>>> that >>>>>>>>>> are >>>>>>>>>> able to use the elemental solver? >>>>>>>>>> >>>>>>>>>> Thanks in advance, >>>>>>>>>> >>>>>>>>>> best, >>>>>>>>>> lailai >>>>>>>>>> >>>> >>>> >>>> >> > From julyzll06 at gmail.com Thu Aug 22 19:44:12 2019 From: julyzll06 at gmail.com (Lailai Zhu) Date: Thu, 22 Aug 2019 20:44:12 -0400 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> Message-ID: <3ea599e5-478b-9e8e-9f16-9d07a7368e22@gmail.com> Thank you guys,? after i remove the system's metis-related things, it compiles well and go through the check. however when i try to use the elemental solver, it still does not work out. i basically get the segmentation fault error, which does not appear when i use the standard dense matrix and jacobi sovers. thanks in advance, best, lailai [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. On 8/22/19 8:03 PM, Smith, Barry F. wrote: > >> On Aug 22, 2019, at 6:48 PM, Balay, Satish wrote: >> >> Compilers are supposed to prefer libraries in specified -L path before system stuff. > Suppose to. > >> balay at es^~ $ ls /usr/lib/lib*metis* >> /usr/lib/libmetis.a /usr/lib/libmetis.so.3.1 /usr/lib/libparmetis.so@ /usr/lib/libscotchmetis-5.1.so /usr/lib/libscotchmetis.so@ >> /usr/lib/libmetis.so@ /usr/lib/libparmetis.a /usr/lib/libparmetis.so.3.1 /usr/lib/libscotchmetis.a >> balay at es^~ $ > This is really bad system management, there is no reason for them to be there nor should they be there. 
> > bsmith at es:~$ ldd /usr/lib/libparmetis.so.3.1 > linux-vdso.so.1 => (0x00007fff9115e000) > libmpi.so.1 => /usr/lib/libmpi.so.1 (0x00007faf95f87000) > libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faf95c81000) > libmetis.so.3.1 => /usr/lib/libmetis.so.3.1 (0x00007faf95a34000) > libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf9566b000) > libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007faf95468000) > libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007faf95228000) > libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007faf9501e000) > libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf94e00000) > /lib64/ld-linux-x86-64.so.2 (0x00007faf9654e000) > libnuma.so.1 => /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 (0x00007faf94bf5000) > libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf949f1000) > > You have something in /usr/lib referring to something in /soft/com/packages/pgi/ ?? > > and of course that refers back to > > bsmith at es:~$ ls -l /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 > lrwxrwxrwx 1 fritz voice 38 Mar 29 15:59 /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 -> /usr/lib/x86_64-linux-gnu/libnuma.so.1 > > I stand by my statement, it is bad policy to put any stuff like this in system directories. > > >> <<<< >> >> And we have these files installed and they don't cause problems. And its not always practical to uninstall system stuff >> [esp on multi-user machines] > I agree it is not always practical or possible to remove them. > > barry > >> Satish >> >> >> On Thu, 22 Aug 2019, Smith, Barry F. wrote: >> >>> You have a copy of parmetis installed in /usr/lib this is a systems directory and many compilers and linkers automatically find libraries in that location and it is often difficult to avoid have the compilers/linkers use these. In general you never want to install external software such as parmetis, PETSc, MPI, etc in systems directories (/usr/ and /usr/local) >>> >>> You should delete this library (and the includes in /usr/include) >>> >>> Barry >>> >>> >>> >>> >>>> On Aug 22, 2019, at 5:17 PM, Balay, Satish via petsc-users wrote: >>>> >>>> >>>>> ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world >>>> For some reason the wrong parmetis library is getting picked up. I don't know why. >>>> >>>> Can you copy/paste the log from the following? >>>> >>>> cd src/snes/examples/tutorials >>>> make PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19 >>>> ldd ex19 >>>> >>>> cd /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib >>>> ldd *.so >>>> >>>> Satish >>>> >>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: >>>> >>>>> hi, Satish, >>>>> >>>>> as you have suggested, i compiled a new version using 3.11.3, >>>>> it compiles well, the errors occur in checking. i also attach >>>>> the errors of check. thanks very much, >>>>> >>>>> lailai >>>>> >>>>> On 8/22/19 4:16 PM, Balay, Satish wrote: >>>>>> Any reason for using petsc-3.10.5 and not latest petsc-3.11? >>>>>> >>>>>> I suggest starting from scatch and rebuilding. >>>>>> >>>>>> And if you still have issues - send corresponding configure.log and make.log >>>>>> >>>>>> Satish >>>>>> >>>>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>>> >>>>>>> sorry, Satish, >>>>>>> >>>>>>> but it does not seem to solve the problem. 
>>>>>>> >>>>>>> best, >>>>>>> lailai >>>>>>> >>>>>>> On 8/22/19 12:41 AM, Balay, Satish wrote: >>>>>>>> Can you run 'make' again and see if this error goes away? >>>>>>>> >>>>>>>> Satish >>>>>>>> >>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>>>>> >>>>>>>>> hi, Satish, >>>>>>>>> i tried to do it following your suggestion, i get the following errors >>>>>>>>> when >>>>>>>>> installing. >>>>>>>>> here is my configuration, >>>>>>>>> >>>>>>>>> any ideas? >>>>>>>>> >>>>>>>>> best, >>>>>>>>> lailai >>>>>>>>> >>>>>>>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 >>>>>>>>> --with-batch=0 --with-mpi=1 --with-debugging=0 CXXOPTFLAGS="-g -O3" >>>>>>>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 >>>>>>>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 >>>>>>>>> --download-blacs=1 --download-scalapack=1 --download-hypre=1 >>>>>>>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort >>>>>>>>> --download-amd=1 --download-anamod=1 --download-blopex=1 >>>>>>>>> --download-dscpack=1 --download-sprng=1 --download-superlu=1 >>>>>>>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function >>>>>>>>> `MatCreate_SeqSBAIJ': >>>>>>>>> sbaij.c:(.text+0x1bc45): undefined reference to >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental' >>>>>>>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: >>>>>>>>> relocation >>>>>>>>> R_X86_64_PC32 against undefined hidden symbol >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental' >>>>>>>>> can not be used when making a shared object >>>>>>>>> ld: final link failed: Bad value >>>>>>>>> gmakefile:86: recipe for target >>>>>>>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed >>>>>>>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 >>>>>>>>> make[2]: Leaving directory >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>>>>>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: >>>>>>>>> recipe for target 'gnumake' failed >>>>>>>>> make[1]: *** [gnumake] Error 2 >>>>>>>>> make[1]: Leaving directory >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' >>>>>>>>> **************************ERROR************************************* >>>>>>>>> Error during compile, check >>>>>>>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log >>>>>>>>> Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to >>>>>>>>> petsc-maint at mcs.anl.gov >>>>>>>>> >>>>>>>>> On 8/21/19 10:58 PM, Balay, Satish wrote: >>>>>>>>>> To install elemental - you use: --download-elemental=1 [not >>>>>>>>>> --download-elemental-commit=v0.87.7] >>>>>>>>>> >>>>>>>>>> Satish >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: >>>>>>>>>> >>>>>>>>>>> hi, dear petsc developers, >>>>>>>>>>> >>>>>>>>>>> I am having a problem when using the external solver elemental. >>>>>>>>>>> I installed petsc3.10.5 version with the flag >>>>>>>>>>> --download-elemental-commit=v0.87.7 >>>>>>>>>>> the installation seems to be ok. However, it seems that i may not be >>>>>>>>>>> able >>>>>>>>>>> to use the elemental solver though. 
>>>>>>>>>>> >>>>>>>>>>> I followed this page >>>>>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html >>>>>>>>>>> to interface the elemental solver, namely, >>>>>>>>>>> MatSetType(A,MATELEMENTAL); >>>>>>>>>>> or set it via the command line '*-mat_type elemental*', >>>>>>>>>>> >>>>>>>>>>> in either case, i will get the following error, >>>>>>>>>>> >>>>>>>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>>>>>>> -------------------------------------------------------------- >>>>>>>>>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing >>>>>>>>>>> package: >>>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external >>>>>>>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental >>>>>>>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>>>> for >>>>>>>>>>> trouble shooting. >>>>>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 >>>>>>>>>>> >>>>>>>>>>> May i ask whether there will be a way or some specific petsc versions >>>>>>>>>>> that >>>>>>>>>>> are >>>>>>>>>>> able to use the elemental solver? >>>>>>>>>>> >>>>>>>>>>> Thanks in advance, >>>>>>>>>>> >>>>>>>>>>> best, >>>>>>>>>>> lailai >>>>>>>>>>> >>>>> >>>>> From knepley at gmail.com Thu Aug 22 19:54:13 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 22 Aug 2019 20:54:13 -0400 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: <3ea599e5-478b-9e8e-9f16-9d07a7368e22@gmail.com> References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> <3ea599e5-478b-9e8e-9f16-9d07a7368e22@gmail.com> Message-ID: On Thu, Aug 22, 2019 at 8:45 PM Lailai Zhu via petsc-users < petsc-users at mcs.anl.gov> wrote: > Thank you guys, after i remove the system's metis-related things, > it compiles well and go through the check. however when i try to > use the elemental solver, it still does not work out. i basically get > the segmentation fault error, which does not appear when i use > the standard dense matrix and jacobi sovers. thanks in advance, > > best, > lailai > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > [0]PETSC ERROR: to get more information on the crash. > It would be really helpful to get a stack trace from the debugger. Thanks, Matt > > On 8/22/19 8:03 PM, Smith, Barry F. wrote: > > > >> On Aug 22, 2019, at 6:48 PM, Balay, Satish wrote: > >> > >> Compilers are supposed to prefer libraries in specified -L path before > system stuff. > > Suppose to. > > > >> balay at es^~ $ ls /usr/lib/lib*metis* > >> /usr/lib/libmetis.a /usr/lib/libmetis.so.3.1 > /usr/lib/libparmetis.so@ /usr/lib/libscotchmetis-5.1.so > /usr/lib/libscotchmetis.so@ > >> /usr/lib/libmetis.so@ /usr/lib/libparmetis.a > /usr/lib/libparmetis.so.3.1 /usr/lib/libscotchmetis.a > >> balay at es^~ $ > > This is really bad system management, there is no reason for them to > be there nor should they be there. 
> > > > bsmith at es:~$ ldd /usr/lib/libparmetis.so.3.1 > > linux-vdso.so.1 => (0x00007fff9115e000) > > libmpi.so.1 => /usr/lib/libmpi.so.1 (0x00007faf95f87000) > > libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faf95c81000) > > libmetis.so.3.1 => /usr/lib/libmetis.so.3.1 (0x00007faf95a34000) > > libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf9566b000) > > libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 > (0x00007faf95468000) > > libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 > (0x00007faf95228000) > > libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 > (0x00007faf9501e000) > > libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 > (0x00007faf94e00000) > > /lib64/ld-linux-x86-64.so.2 (0x00007faf9654e000) > > libnuma.so.1 => > /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 > (0x00007faf94bf5000) > > libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf949f1000) > > > > You have something in /usr/lib referring to something in > /soft/com/packages/pgi/ ?? > > > > and of course that refers back to > > > > bsmith at es:~$ ls -l > /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 > > lrwxrwxrwx 1 fritz voice 38 Mar 29 15:59 > /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 -> > /usr/lib/x86_64-linux-gnu/libnuma.so.1 > > > > I stand by my statement, it is bad policy to put any stuff like this in > system directories. > > > > > >> <<<< > >> > >> And we have these files installed and they don't cause problems. And > its not always practical to uninstall system stuff > >> [esp on multi-user machines] > > I agree it is not always practical or possible to remove them. > > > > barry > > > >> Satish > >> > >> > >> On Thu, 22 Aug 2019, Smith, Barry F. wrote: > >> > >>> You have a copy of parmetis installed in /usr/lib this is a systems > directory and many compilers and linkers automatically find libraries in > that location and it is often difficult to avoid have the compilers/linkers > use these. In general you never want to install external software such as > parmetis, PETSc, MPI, etc in systems directories (/usr/ and /usr/local) > >>> > >>> You should delete this library (and the includes in /usr/include) > >>> > >>> Barry > >>> > >>> > >>> > >>> > >>>> On Aug 22, 2019, at 5:17 PM, Balay, Satish via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >>>> > >>>> > >>>>> ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined > symbol: ompi_mpi_comm_world > >>>> For some reason the wrong parmetis library is getting picked up. I > don't know why. > >>>> > >>>> Can you copy/paste the log from the following? > >>>> > >>>> cd src/snes/examples/tutorials > >>>> make > PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19 > >>>> ldd ex19 > >>>> > >>>> cd > /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib > >>>> ldd *.so > >>>> > >>>> Satish > >>>> > >>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>> > >>>>> hi, Satish, > >>>>> > >>>>> as you have suggested, i compiled a new version using 3.11.3, > >>>>> it compiles well, the errors occur in checking. i also attach > >>>>> the errors of check. thanks very much, > >>>>> > >>>>> lailai > >>>>> > >>>>> On 8/22/19 4:16 PM, Balay, Satish wrote: > >>>>>> Any reason for using petsc-3.10.5 and not latest petsc-3.11? > >>>>>> > >>>>>> I suggest starting from scatch and rebuilding. 
> >>>>>> > >>>>>> And if you still have issues - send corresponding configure.log and > make.log > >>>>>> > >>>>>> Satish > >>>>>> > >>>>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>>> > >>>>>>> sorry, Satish, > >>>>>>> > >>>>>>> but it does not seem to solve the problem. > >>>>>>> > >>>>>>> best, > >>>>>>> lailai > >>>>>>> > >>>>>>> On 8/22/19 12:41 AM, Balay, Satish wrote: > >>>>>>>> Can you run 'make' again and see if this error goes away? > >>>>>>>> > >>>>>>>> Satish > >>>>>>>> > >>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>>>>> > >>>>>>>>> hi, Satish, > >>>>>>>>> i tried to do it following your suggestion, i get the following > errors > >>>>>>>>> when > >>>>>>>>> installing. > >>>>>>>>> here is my configuration, > >>>>>>>>> > >>>>>>>>> any ideas? > >>>>>>>>> > >>>>>>>>> best, > >>>>>>>>> lailai > >>>>>>>>> > >>>>>>>>> ./config/configure.py --with-c++-support > --known-mpi-shared-libraries=1 > >>>>>>>>> --with-batch=0 --with-mpi=1 --with-debugging=0 CXXOPTFLAGS="-g > -O3" > >>>>>>>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip > -axCORE-AVX2 > >>>>>>>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl > --download-elemental=1 > >>>>>>>>> --download-blacs=1 --download-scalapack=1 --download-hypre=1 > >>>>>>>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ > --with-fc=mpifort > >>>>>>>>> --download-amd=1 --download-anamod=1 --download-blopex=1 > >>>>>>>>> --download-dscpack=1 --download-sprng=1 --download-superlu=1 > >>>>>>>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In > function > >>>>>>>>> `MatCreate_SeqSBAIJ': > >>>>>>>>> sbaij.c:(.text+0x1bc45): undefined reference to > >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental' > >>>>>>>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: > >>>>>>>>> relocation > >>>>>>>>> R_X86_64_PC32 against undefined hidden symbol > >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental' > >>>>>>>>> can not be used when making a shared object > >>>>>>>>> ld: final link failed: Bad value > >>>>>>>>> gmakefile:86: recipe for target > >>>>>>>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed > >>>>>>>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] > Error 1 > >>>>>>>>> make[2]: Leaving directory > >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>>>>>>> > ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: > >>>>>>>>> recipe for target 'gnumake' failed > >>>>>>>>> make[1]: *** [gnumake] Error 2 > >>>>>>>>> make[1]: Leaving directory > >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>>>>>>> > **************************ERROR************************************* > >>>>>>>>> Error during compile, check > >>>>>>>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log > >>>>>>>>> Send it and > pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to > >>>>>>>>> petsc-maint at mcs.anl.gov > >>>>>>>>> > >>>>>>>>> On 8/21/19 10:58 PM, Balay, Satish wrote: > >>>>>>>>>> To install elemental - you use: --download-elemental=1 [not > >>>>>>>>>> --download-elemental-commit=v0.87.7] > >>>>>>>>>> > >>>>>>>>>> Satish > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>>>>>>> > >>>>>>>>>>> hi, dear petsc developers, > >>>>>>>>>>> > >>>>>>>>>>> I am having a problem when using the external solver elemental. 
> >>>>>>>>>>> I installed petsc3.10.5 version with the flag > >>>>>>>>>>> --download-elemental-commit=v0.87.7 > >>>>>>>>>>> the installation seems to be ok. However, it seems that i may > not be > >>>>>>>>>>> able > >>>>>>>>>>> to use the elemental solver though. > >>>>>>>>>>> > >>>>>>>>>>> I followed this page > >>>>>>>>>>> > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > >>>>>>>>>>> to interface the elemental solver, namely, > >>>>>>>>>>> MatSetType(A,MATELEMENTAL); > >>>>>>>>>>> or set it via the command line '*-mat_type elemental*', > >>>>>>>>>>> > >>>>>>>>>>> in either case, i will get the following error, > >>>>>>>>>>> > >>>>>>>>>>> [0]PETSC ERROR: --------------------- Error Message > >>>>>>>>>>> -------------------------------------------------------------- > >>>>>>>>>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or > missing > >>>>>>>>>>> package: > >>>>>>>>>>> > http://www.mcs.anl.gov/petsc/documentation/installation.html#external > >>>>>>>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental > >>>>>>>>>>> [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html > >>>>>>>>>>> for > >>>>>>>>>>> trouble shooting. > >>>>>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > >>>>>>>>>>> > >>>>>>>>>>> May i ask whether there will be a way or some specific petsc > versions > >>>>>>>>>>> that > >>>>>>>>>>> are > >>>>>>>>>>> able to use the elemental solver? > >>>>>>>>>>> > >>>>>>>>>>> Thanks in advance, > >>>>>>>>>>> > >>>>>>>>>>> best, > >>>>>>>>>>> lailai > >>>>>>>>>>> > >>>>> > >>>>> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Aug 22 20:06:47 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 23 Aug 2019 01:06:47 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> Message-ID: On Fri, 23 Aug 2019, Smith, Barry F. via petsc-users wrote: > bsmith at es:~$ ldd /usr/lib/libparmetis.so.3.1 > linux-vdso.so.1 => (0x00007fff9115e000) > libmpi.so.1 => /usr/lib/libmpi.so.1 (0x00007faf95f87000) > libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faf95c81000) > libmetis.so.3.1 => /usr/lib/libmetis.so.3.1 (0x00007faf95a34000) > libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf9566b000) > libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007faf95468000) > libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007faf95228000) > libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007faf9501e000) > libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf94e00000) > /lib64/ld-linux-x86-64.so.2 (0x00007faf9654e000) > libnuma.so.1 => /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 (0x00007faf94bf5000) > libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf949f1000) > > You have something in /usr/lib referring to something in /soft/com/packages/pgi/ ?? Must be due to LD_LIBRARY_PATH. 
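A quick way to confirm which copy the run-time loader actually resolves (a sketch only - run it from the tutorials directory where ex19 was built; LD_DEBUG is a glibc loader feature):

echo $LD_LIBRARY_PATH                            # what the environment forces on the loader
ldd ./ex19 | grep -i parmetis                    # which parmetis the executable ends up with
LD_DEBUG=libs ./ex19 2>&1 | grep -i parmetis     # step-by-step trace of the loader's search order
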
Checking back the original issue: LD_LIBRARY_PATH=/home/lailai/nonroot/mpi/mpich/3.3-intel19/lib:/opt/intel/lib/intel64:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib: Ok - all this stuff in LD_LIBRARY_PATH is the trigger of the original issue. Satish From bsmith at mcs.anl.gov Thu Aug 22 21:12:29 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Fri, 23 Aug 2019 02:12:29 +0000 Subject: [petsc-users] errors when using elemental with petsc3.10.5 In-Reply-To: References: <4aeeda91-0111-fc0b-2401-fca5dd8ab96c@gmail.com> <8fc74926-1c3e-463f-0672-d311df83c8a2@gmail.com> <55a68eaf-fb75-92da-364c-37d2659bf757@gmail.com> <2b3e8086-4034-4d28-8241-69a45eb9b3f8@gmail.com> <3ea599e5-478b-9e8e-9f16-9d07a7368e22@gmail.com> Message-ID: <54938B7B-3375-4D1E-A107-477A4F0A46C2@mcs.anl.gov> Does this crash on a PETSc example that uses elemental? For example run src/ksp/ksp/examples/tests/ex40.c: with the arguments -pc_type lu -pc_factor_mat_solver_type elemental To get the stack trace Add the command line option -start_in_debugger noxterm when you run your program and when it starts up type c (for continue) when it crashes type bt (for backtrace) send all the output. Barry > On Aug 22, 2019, at 7:54 PM, Matthew Knepley wrote: > > On Thu, Aug 22, 2019 at 8:45 PM Lailai Zhu via petsc-users wrote: > Thank you guys, after i remove the system's metis-related things, > it compiles well and go through the check. however when i try to > use the elemental solver, it still does not work out. i basically get > the segmentation fault error, which does not appear when i use > the standard dense matrix and jacobi sovers. thanks in advance, > > best, > lailai > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > [0]PETSC ERROR: to get more information on the crash. > > It would be really helpful to get a stack trace from the debugger. > > Thanks, > > Matt > > > On 8/22/19 8:03 PM, Smith, Barry F. wrote: > > > >> On Aug 22, 2019, at 6:48 PM, Balay, Satish wrote: > >> > >> Compilers are supposed to prefer libraries in specified -L path before system stuff. > > Suppose to. > > > >> balay at es^~ $ ls /usr/lib/lib*metis* > >> /usr/lib/libmetis.a /usr/lib/libmetis.so.3.1 /usr/lib/libparmetis.so@ /usr/lib/libscotchmetis-5.1.so /usr/lib/libscotchmetis.so@ > >> /usr/lib/libmetis.so@ /usr/lib/libparmetis.a /usr/lib/libparmetis.so.3.1 /usr/lib/libscotchmetis.a > >> balay at es^~ $ > > This is really bad system management, there is no reason for them to be there nor should they be there. 
> > > > bsmith at es:~$ ldd /usr/lib/libparmetis.so.3.1 > > linux-vdso.so.1 => (0x00007fff9115e000) > > libmpi.so.1 => /usr/lib/libmpi.so.1 (0x00007faf95f87000) > > libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007faf95c81000) > > libmetis.so.3.1 => /usr/lib/libmetis.so.3.1 (0x00007faf95a34000) > > libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faf9566b000) > > libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007faf95468000) > > libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007faf95228000) > > libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007faf9501e000) > > libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007faf94e00000) > > /lib64/ld-linux-x86-64.so.2 (0x00007faf9654e000) > > libnuma.so.1 => /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 (0x00007faf94bf5000) > > libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007faf949f1000) > > > > You have something in /usr/lib referring to something in /soft/com/packages/pgi/ ?? > > > > and of course that refers back to > > > > bsmith at es:~$ ls -l /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 > > lrwxrwxrwx 1 fritz voice 38 Mar 29 15:59 /soft/com/packages/pgi/19.3/linux86-64/19.3/lib/libnuma.so.1 -> /usr/lib/x86_64-linux-gnu/libnuma.so.1 > > > > I stand by my statement, it is bad policy to put any stuff like this in system directories. > > > > > >> <<<< > >> > >> And we have these files installed and they don't cause problems. And its not always practical to uninstall system stuff > >> [esp on multi-user machines] > > I agree it is not always practical or possible to remove them. > > > > barry > > > >> Satish > >> > >> > >> On Thu, 22 Aug 2019, Smith, Barry F. wrote: > >> > >>> You have a copy of parmetis installed in /usr/lib this is a systems directory and many compilers and linkers automatically find libraries in that location and it is often difficult to avoid have the compilers/linkers use these. In general you never want to install external software such as parmetis, PETSc, MPI, etc in systems directories (/usr/ and /usr/local) > >>> > >>> You should delete this library (and the includes in /usr/include) > >>> > >>> Barry > >>> > >>> > >>> > >>> > >>>> On Aug 22, 2019, at 5:17 PM, Balay, Satish via petsc-users wrote: > >>>> > >>>> > >>>>> ./ex19: symbol lookup error: /usr/lib/libparmetis.so: undefined symbol: ompi_mpi_comm_world > >>>> For some reason the wrong parmetis library is getting picked up. I don't know why. > >>>> > >>>> Can you copy/paste the log from the following? > >>>> > >>>> cd src/snes/examples/tutorials > >>>> make PETSC_DIR=/home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3 ex19 > >>>> ldd ex19 > >>>> > >>>> cd /home/lailai/nonroot/petsc/petsc3.11.3_intel19_mpich3.3/pet3.11.3-intel19-mpich3.3/lib > >>>> ldd *.so > >>>> > >>>> Satish > >>>> > >>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>> > >>>>> hi, Satish, > >>>>> > >>>>> as you have suggested, i compiled a new version using 3.11.3, > >>>>> it compiles well, the errors occur in checking. i also attach > >>>>> the errors of check. thanks very much, > >>>>> > >>>>> lailai > >>>>> > >>>>> On 8/22/19 4:16 PM, Balay, Satish wrote: > >>>>>> Any reason for using petsc-3.10.5 and not latest petsc-3.11? > >>>>>> > >>>>>> I suggest starting from scatch and rebuilding. 
> >>>>>> > >>>>>> And if you still have issues - send corresponding configure.log and make.log > >>>>>> > >>>>>> Satish > >>>>>> > >>>>>> On Thu, 22 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>>> > >>>>>>> sorry, Satish, > >>>>>>> > >>>>>>> but it does not seem to solve the problem. > >>>>>>> > >>>>>>> best, > >>>>>>> lailai > >>>>>>> > >>>>>>> On 8/22/19 12:41 AM, Balay, Satish wrote: > >>>>>>>> Can you run 'make' again and see if this error goes away? > >>>>>>>> > >>>>>>>> Satish > >>>>>>>> > >>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>>>>> > >>>>>>>>> hi, Satish, > >>>>>>>>> i tried to do it following your suggestion, i get the following errors > >>>>>>>>> when > >>>>>>>>> installing. > >>>>>>>>> here is my configuration, > >>>>>>>>> > >>>>>>>>> any ideas? > >>>>>>>>> > >>>>>>>>> best, > >>>>>>>>> lailai > >>>>>>>>> > >>>>>>>>> ./config/configure.py --with-c++-support --known-mpi-shared-libraries=1 > >>>>>>>>> --with-batch=0 --with-mpi=1 --with-debugging=0 CXXOPTFLAGS="-g -O3" > >>>>>>>>> COPTFLAGS="-O3 -ip -axCORE-AVX2 -xSSE4.2" FOPTFLAGS="-O3 -ip -axCORE-AVX2 > >>>>>>>>> -xSSE4.2" --with-blas-lapack-dir=/opt/intel/mkl --download-elemental=1 > >>>>>>>>> --download-blacs=1 --download-scalapack=1 --download-hypre=1 > >>>>>>>>> --download-plapack=1 --with-cc=mpicc --with-cxx=mpic++ --with-fc=mpifort > >>>>>>>>> --download-amd=1 --download-anamod=1 --download-blopex=1 > >>>>>>>>> --download-dscpack=1 --download-sprng=1 --download-superlu=1 > >>>>>>>>> --with-cxx-dialect=C++11 --download-metis --download-parmetis > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: In function > >>>>>>>>> `MatCreate_SeqSBAIJ': > >>>>>>>>> sbaij.c:(.text+0x1bc45): undefined reference to > >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental' > >>>>>>>>> ld: pet3.10.5-intel19-mpich3.3/obj/mat/impls/sbaij/seq/sbaij.o: > >>>>>>>>> relocation > >>>>>>>>> R_X86_64_PC32 against undefined hidden symbol > >>>>>>>>> `MatConvert_SeqSBAIJ_Elemental' > >>>>>>>>> can not be used when making a shared object > >>>>>>>>> ld: final link failed: Bad value > >>>>>>>>> gmakefile:86: recipe for target > >>>>>>>>> 'pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5' failed > >>>>>>>>> make[2]: *** [pet3.10.5-intel19-mpich3.3/lib/libpetsc.so.3.10.5] Error 1 > >>>>>>>>> make[2]: Leaving directory > >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>>>>>>> ........................../petsc3.10.5_intel19_mpich3.3/lib/petsc/conf/rules:81: > >>>>>>>>> recipe for target 'gnumake' failed > >>>>>>>>> make[1]: *** [gnumake] Error 2 > >>>>>>>>> make[1]: Leaving directory > >>>>>>>>> '/usr/nonroot/petsc/petsc3.10.5_intel19_mpich3.3' > >>>>>>>>> **************************ERROR************************************* > >>>>>>>>> Error during compile, check > >>>>>>>>> pet3.10.5-intel19-mpich3.3/lib/petsc/conf/make.log > >>>>>>>>> Send it and pet3.10.5-intel19-mpich3.3/lib/petsc/conf/configure.log to > >>>>>>>>> petsc-maint at mcs.anl.gov > >>>>>>>>> > >>>>>>>>> On 8/21/19 10:58 PM, Balay, Satish wrote: > >>>>>>>>>> To install elemental - you use: --download-elemental=1 [not > >>>>>>>>>> --download-elemental-commit=v0.87.7] > >>>>>>>>>> > >>>>>>>>>> Satish > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Wed, 21 Aug 2019, Lailai Zhu via petsc-users wrote: > >>>>>>>>>> > >>>>>>>>>>> hi, dear petsc developers, > >>>>>>>>>>> > >>>>>>>>>>> I am having a problem when using the external solver elemental. 
> >>>>>>>>>>> I installed petsc3.10.5 version with the flag > >>>>>>>>>>> --download-elemental-commit=v0.87.7 > >>>>>>>>>>> the installation seems to be ok. However, it seems that i may not be > >>>>>>>>>>> able > >>>>>>>>>>> to use the elemental solver though. > >>>>>>>>>>> > >>>>>>>>>>> I followed this page > >>>>>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATELEMENTAL.html > >>>>>>>>>>> to interface the elemental solver, namely, > >>>>>>>>>>> MatSetType(A,MATELEMENTAL); > >>>>>>>>>>> or set it via the command line '*-mat_type elemental*', > >>>>>>>>>>> > >>>>>>>>>>> in either case, i will get the following error, > >>>>>>>>>>> > >>>>>>>>>>> [0]PETSC ERROR: --------------------- Error Message > >>>>>>>>>>> -------------------------------------------------------------- > >>>>>>>>>>> [0]PETSC ERROR: Unknown type. Check for miss-spelling or missing > >>>>>>>>>>> package: > >>>>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/installation.html#external > >>>>>>>>>>> [0]PETSC ERROR: Unknown Mat type given: elemental > >>>>>>>>>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > >>>>>>>>>>> for > >>>>>>>>>>> trouble shooting. > >>>>>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.10.5, Mar, 28, 2019 > >>>>>>>>>>> > >>>>>>>>>>> May i ask whether there will be a way or some specific petsc versions > >>>>>>>>>>> that > >>>>>>>>>>> are > >>>>>>>>>>> able to use the elemental solver? > >>>>>>>>>>> > >>>>>>>>>>> Thanks in advance, > >>>>>>>>>>> > >>>>>>>>>>> best, > >>>>>>>>>>> lailai > >>>>>>>>>>> > >>>>> > >>>>> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ From edoardo.alinovi at gmail.com Tue Aug 27 03:05:52 2019 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Tue, 27 Aug 2019 09:05:52 +0100 Subject: [petsc-users] DMPlex for cell centred finite volume Message-ID: Hello PETSc users and developers, I hope you are doing well! Today I have a general question about DMplex to see if it can be usefull to me or not. I have my fancy finite volume solver (cell centered, incompressible NS) which uses petsc as linear solver engine. Thanks to your suggestions so far it is now running well :) I would like to see if I can do another step incapsulatng in petsc also the mesh managment part. I think that Dmplex is the way to go since my code is fully unstructured. I have red some papers and lectures by Matt around the web, but still I have not found an answer to this question: Can dmplex dial with cell centered data arrangement and provide some support for basic operation (e. g. interpolation between partitions, face value calculation etc)? Thank you very much! Edo -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 27 03:33:46 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 27 Aug 2019 04:33:46 -0400 Subject: [petsc-users] DMPlex for cell centred finite volume In-Reply-To: References: Message-ID: On Tue, Aug 27, 2019 at 4:07 AM Edoardo alinovi via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello PETSc users and developers, > I hope you are doing well! Today I have a general question about DMplex to > see if it can be usefull to me or not. > > I have my fancy finite volume solver (cell centered, incompressible NS) > which uses petsc as linear solver engine. 
Thanks to your suggestions so far > it is now running well :) > > I would like to see if I can do another step incapsulatng in petsc also > the mesh managment part. I think that Dmplex is the way to go since my code > is fully unstructured. > > I have red some papers and lectures by Matt around the web, but still I > have not found an answer to this question: > Can dmplex dial with cell centered data arrangement and provide some > support for basic operation (e. g. interpolation between partitions, face > value calculation etc)? > 1) Cell centered data Definitely yes. This is the same data layout as P0 finite elements. You just assign the PetscSection k dofs per cell. 2) Interpolation between partitions I assume you mean ghost values from other parallel partitions. You can do this by using overlap=1 in DMPlexDisrtibute(). 3) Face value calculation You can do Riemann solves by looping over faces, grabbing values from the neighboring cells, doing the calculation, and updating cell values. We carry out this kind of computation in TS ex11. That example attempts to do everything, so it is messy, but we can help you understand what it is doing in any part that is unclear. 4) FV boundary conditions You can use DMPlexConstructGhostCells() to put artificial cells around the boundary that you use to prescribe fluxes. Thanks, Matt > Thank you very much! > > Edo > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.schoebel at fz-juelich.de Tue Aug 27 03:44:31 2019 From: r.schoebel at fz-juelich.de (=?UTF-8?Q?Ruth_Sch=c3=b6bel?=) Date: Tue, 27 Aug 2019 10:44:31 +0200 Subject: [petsc-users] high order interpolation Message-ID: <038ddd52-2a8e-ce6c-d09b-99ba20d043b7@fz-juelich.de> Hi, I am on the way to write a program in python where I need parallel high order interpolation. I saw there exists ?createInterpolation(self, DM dm)?, but this seems to be accurate only for constant functions. Is there a way to get a higher order interpolation matrix? If it does not exist maybe someone has already written a code himself which I may use? Thanks Ruth ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Volker Rieke Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt ------------------------------------------------------------------------------------------------ ------------------------------------------------------------------------------------------------ From edoardo.alinovi at gmail.com Tue Aug 27 03:54:24 2019 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Tue, 27 Aug 2019 09:54:24 +0100 Subject: [petsc-users] DMPlex for cell centred finite volume In-Reply-To: References: Message-ID: Hello Matt, Thanks for your kind replay. Using DMplex and changing the data structure is not a straightforward task as you can imagine. 
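A minimal C sketch of the setup Matt describes in points 1, 2 and 4 above. It assumes a DMPlex dm already created from the mesh, a single scalar unknown per cell, and omits error checking (ierr/CHKERRQ); DMSetSection() may be named DMSetDefaultSection() in older PETSc releases, so treat the exact routine names as version dependent rather than as code from this thread:

  DM           dm, dmDist, dmGhost;
  PetscSection section;
  PetscInt     cStart, cEnd, c;

  /* distribute with an overlap of one cell so that values from
     neighbouring partitions are available locally (point 2) */
  DMPlexDistribute(dm, 1, NULL, &dmDist);
  if (dmDist) { DMDestroy(&dm); dm = dmDist; }

  /* artificial cells along the physical boundary, used later to
     prescribe boundary fluxes (point 4) */
  DMPlexConstructGhostCells(dm, NULL, NULL, &dmGhost);
  if (dmGhost) { DMDestroy(&dm); dm = dmGhost; }

  /* cell-centred (P0-like) layout: k dofs on every cell and nothing
     elsewhere (point 1); here k = 1 */
  DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);    /* height 0 = cells */
  PetscSectionCreate(PetscObjectComm((PetscObject)dm), &section);
  PetscSectionSetChart(section, cStart, cEnd);
  for (c = cStart; c < cEnd; ++c) PetscSectionSetDof(section, c, 1);
  PetscSectionSetUp(section);
  DMSetSection(dm, section);
  PetscSectionDestroy(&section);
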
Up to now I am in the preliminaries and I am trying to understand how to reuse most of all my subroutines (I am using fortran) such as fluxes and coeffs assembly, numerical schemes and so o in order to avoid to rewrite all the code :p I will start with your first two suggestions and see if it is an affordable task :) For the face value, I am not interested in Reimann solvers. Being incompressible I am using Central difference, QUICK, SOU, Upwind or TVD-family schemes, but maybe I can use my stuff to get it. I think I can be happy enough having the ghost cell centers available and some way to deal with the BC (I guess this will be the most tricky part). Basically I would like to use DMplex at low level just to have a support for the parallel mesh management. Have you got some example for P0 finite elements and overlap=1? Thank you! :) Il giorno mar 27 ago 2019 alle ore 09:33 Matthew Knepley ha scritto: > On Tue, Aug 27, 2019 at 4:07 AM Edoardo alinovi via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Hello PETSc users and developers, >> I hope you are doing well! Today I have a general question about DMplex >> to see if it can be usefull to me or not. >> >> I have my fancy finite volume solver (cell centered, incompressible NS) >> which uses petsc as linear solver engine. Thanks to your suggestions so far >> it is now running well :) >> >> I would like to see if I can do another step incapsulatng in petsc also >> the mesh managment part. I think that Dmplex is the way to go since my code >> is fully unstructured. >> >> I have red some papers and lectures by Matt around the web, but still I >> have not found an answer to this question: >> Can dmplex dial with cell centered data arrangement and provide some >> support for basic operation (e. g. interpolation between partitions, face >> value calculation etc)? >> > > 1) Cell centered data > > Definitely yes. This is the same data layout as P0 finite elements. You > just assign the PetscSection k dofs per cell. > > 2) Interpolation between partitions > > I assume you mean ghost values from other parallel partitions. You can do > this by using overlap=1 in DMPlexDisrtibute(). > > 3) Face value calculation > > You can do Riemann solves by looping over faces, grabbing values from the > neighboring cells, doing the calculation, and > updating cell values. We carry out this kind of computation in TS ex11. > That example attempts to do everything, so it is > messy, but we can help you understand what it is doing in any part that is > unclear. > > 4) FV boundary conditions > > You can use DMPlexConstructGhostCells() to put artificial cells around the > boundary that you use to prescribe fluxes. > > Thanks, > > Matt > > >> Thank you very much! >> >> Edo >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Zhang-10 at tudelft.nl Tue Aug 27 05:21:22 2019 From: J.Zhang-10 at tudelft.nl (Jian Zhang - 3ME) Date: Tue, 27 Aug 2019 10:21:22 +0000 Subject: [petsc-users] Questions about the physical groups Message-ID: Hi all, I have a mesh file where I defined a physical group named "volume" whose id is 0. For DMPlex, I know that I can use this id (0) to access this physical group. 
Here I would like to ask that is it possible to use the physical group name "volume" to achieve this in DMPlex. Thank you very much. Bests, Jian -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 27 05:42:24 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 27 Aug 2019 06:42:24 -0400 Subject: [petsc-users] high order interpolation In-Reply-To: <038ddd52-2a8e-ce6c-d09b-99ba20d043b7@fz-juelich.de> References: <038ddd52-2a8e-ce6c-d09b-99ba20d043b7@fz-juelich.de> Message-ID: On Tue, Aug 27, 2019 at 4:44 AM Ruth Sch?bel via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > I am on the way to write a program in python where I need parallel high > order interpolation. > I saw there exists ?createInterpolation(self, DM dm)?, but this seems to > be accurate only for constant functions. Linear functions. > Is there a way to get a higher > order interpolation matrix? If it does not exist maybe someone has > already written a code himself which I may use? > All the pieces are in PETSc, just not hooked up in the way you need. Here we are dispatching to local interpolation: https://gitlab.com/petsc/petsc/blob/master/src/snes/utils/dmplexsnes.c#L886 It is currently hard coded to linear interpolation, as you point out. However, PETSc has high order interpolation. You would want something like this loop https://gitlab.com/petsc/petsc/blob/master/include/petsc/private/petscfeimpl.h#L216 but with a tabulation for your points, which you can get from https://gitlab.com/petsc/petsc/blob/master/src/dm/dt/fe/interface/fe.c#L862 I know this is more complicated than it should be, but we have not quite worked out what should be exposed to the user. I would just replace that short, linear DMPlexInterpolationEvalute() routine with a higher order one which assumes a PetscFE or PetscDS that comes in with it. Thanks, Matt > Thanks Ruth > > > > > ------------------------------------------------------------------------------------------------ > > ------------------------------------------------------------------------------------------------ > Forschungszentrum Juelich GmbH > 52425 Juelich > Sitz der Gesellschaft: Juelich > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 > Vorsitzender des Aufsichtsrats: MinDir Volker Rieke > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, > Prof. Dr. Sebastian M. Schmidt > > ------------------------------------------------------------------------------------------------ > > ------------------------------------------------------------------------------------------------ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 27 05:47:43 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 27 Aug 2019 06:47:43 -0400 Subject: [petsc-users] Questions about the physical groups In-Reply-To: References: Message-ID: On Tue, Aug 27, 2019 at 6:22 AM Jian Zhang - 3ME via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi all, > > I have a mesh file where I defined a physical group named "volume" whose > id is 0. > > For DMPlex, I know that I can use this id (0) to access this physical > group. 
Here I would like to ask that is it possible to use the physical > group name "volume" to achieve this in DMPlex. Thank you very much. > > Do you mean the $PhysicalNames section in a GMsh file? No, right now we ignore those names. We could maintain a translation table in the DM between the names and integers, but it seemed to specialized to GMsh. We are willing to listen to arguments however. Thanks, Matt > Bests, > > Jian > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 27 05:54:08 2019 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 27 Aug 2019 06:54:08 -0400 Subject: [petsc-users] DMPlex for cell centred finite volume In-Reply-To: References: Message-ID: On Tue, Aug 27, 2019 at 4:54 AM Edoardo alinovi wrote: > Hello Matt, > > Thanks for your kind replay. Using DMplex and changing the data structure > is not a straightforward task as you can imagine. Up to now I am in the > preliminaries and I am trying to understand how to reuse most of all my > subroutines (I am using fortran) such as fluxes and coeffs assembly, > numerical schemes and so o in order to avoid to rewrite all the code :p I > will start with your first two suggestions and see if it is an affordable > task :) > > For the face value, I am not interested in Reimann solvers. Being > incompressible I am using Central difference, QUICK, SOU, Upwind or > TVD-family schemes, but maybe I can use my stuff to get it. > > I think I can be happy enough having the ghost cell centers available and > some way to deal with the BC (I guess this will be the most tricky part). > Basically I would like to use DMplex at low level just to have a support > for the parallel mesh management. > Have you got some example for P0 finite elements and overlap=1? > This is one group already using PETSc in exactly this way for a FV code. Maybe it would help if I gave a high-level overview of this usage pattern. 1) Hopefully replacing your current mesh topology/geometry with Plex is not hard. Let me know if you have any questions about this. 2) If you give overlap=1 on distribution, then support(face) for any regular face gives the two cells on either side. You probably want to screen out faces between two ghost cells. There is a DMLabel for doing that. 3) A PetscSection just gives the data layout over a mesh. The idea here is to make it describe _exactly_ the layout you already have. The DMGetGlobalVector() and DMGetLocalVector() should produce vectors in your existing order and you can reuse all your code. Thanks, Matt > Thank you! :) > > Il giorno mar 27 ago 2019 alle ore 09:33 Matthew Knepley < > knepley at gmail.com> ha scritto: > >> On Tue, Aug 27, 2019 at 4:07 AM Edoardo alinovi via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Hello PETSc users and developers, >>> I hope you are doing well! Today I have a general question about DMplex >>> to see if it can be usefull to me or not. >>> >>> I have my fancy finite volume solver (cell centered, incompressible NS) >>> which uses petsc as linear solver engine. Thanks to your suggestions so far >>> it is now running well :) >>> >>> I would like to see if I can do another step incapsulatng in petsc also >>> the mesh managment part. I think that Dmplex is the way to go since my code >>> is fully unstructured. 
>>> >>> I have red some papers and lectures by Matt around the web, but still I >>> have not found an answer to this question: >>> Can dmplex dial with cell centered data arrangement and provide some >>> support for basic operation (e. g. interpolation between partitions, face >>> value calculation etc)? >>> >> >> 1) Cell centered data >> >> Definitely yes. This is the same data layout as P0 finite elements. You >> just assign the PetscSection k dofs per cell. >> >> 2) Interpolation between partitions >> >> I assume you mean ghost values from other parallel partitions. You can do >> this by using overlap=1 in DMPlexDisrtibute(). >> >> 3) Face value calculation >> >> You can do Riemann solves by looping over faces, grabbing values from the >> neighboring cells, doing the calculation, and >> updating cell values. We carry out this kind of computation in TS ex11. >> That example attempts to do everything, so it is >> messy, but we can help you understand what it is doing in any part that >> is unclear. >> >> 4) FV boundary conditions >> >> You can use DMPlexConstructGhostCells() to put artificial cells around >> the boundary that you use to prescribe fluxes. >> >> Thanks, >> >> Matt >> >> >>> Thank you very much! >>> >>> Edo >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Zhang-10 at tudelft.nl Tue Aug 27 06:25:53 2019 From: J.Zhang-10 at tudelft.nl (Jian Zhang - 3ME) Date: Tue, 27 Aug 2019 11:25:53 +0000 Subject: [petsc-users] Questions about the physical groups In-Reply-To: References: , Message-ID: <69eaee4d39bf45f88457966bdad9a305@tudelft.nl> Thank you very much for your reply. As a user, if I define a physical group named "right" in Gmsh, I also would like to access it by using this name in DMPlex. It could be convenient and understandable for the user. I think it would be nice if you guys can add this translation table. Thanks a lot. Best, Jian ________________________________ From: Matthew Knepley Sent: 27 August 2019 12:47:43 To: Jian Zhang - 3ME Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Questions about the physical groups On Tue, Aug 27, 2019 at 6:22 AM Jian Zhang - 3ME via petsc-users > wrote: Hi all, I have a mesh file where I defined a physical group named "volume" whose id is 0. For DMPlex, I know that I can use this id (0) to access this physical group. Here I would like to ask that is it possible to use the physical group name "volume" to achieve this in DMPlex. Thank you very much. Do you mean the $PhysicalNames section in a GMsh file? No, right now we ignore those names. We could maintain a translation table in the DM between the names and integers, but it seemed to specialized to GMsh. We are willing to listen to arguments however. Thanks, Matt Bests, Jian -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
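Related to the physical-group question above, a short C sketch of how a group can be looked up by its integer id once the mesh has been read (for example with DMPlexCreateFromFile()). The label names "Cell Sets" and "Face Sets" are the ones the Gmsh reader normally creates for volume and surface physical groups, but that is an assumption here; running with -dm_view shows which labels are actually present:

  IS              cellIS;
  const PetscInt *cells;
  PetscInt        n, i;

  /* all cells carrying the Gmsh physical-group id 0 ("volume" above) */
  DMGetStratumSize(dm, "Cell Sets", 0, &n);
  DMGetStratumIS(dm, "Cell Sets", 0, &cellIS);
  if (cellIS) {
    ISGetIndices(cellIS, &cells);
    for (i = 0; i < n; ++i) {
      /* cells[i] is a DMPlex cell point belonging to the group */
    }
    ISRestoreIndices(cellIS, &cells);
    ISDestroy(&cellIS);
  }
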
URL: From nathaniel.collier at gmail.com Tue Aug 27 06:27:50 2019 From: nathaniel.collier at gmail.com (Nathan Collier) Date: Tue, 27 Aug 2019 07:27:50 -0400 Subject: [petsc-users] DMPlex for cell centred finite volume In-Reply-To: References: Message-ID: Here is a example of what I think you need: https://github.com/TDycores-Project/toy-problems/blob/master/TracyProblem/Richards.c It is written to solve Richard's equation using a simple two-point flux, but it may be helpful to look at the setup and residual loop. Nate On Tue, Aug 27, 2019 at 6:55 AM Matthew Knepley via petsc-users < petsc-users at mcs.anl.gov> wrote: > On Tue, Aug 27, 2019 at 4:54 AM Edoardo alinovi > wrote: > >> Hello Matt, >> >> Thanks for your kind replay. Using DMplex and changing the data structure >> is not a straightforward task as you can imagine. Up to now I am in the >> preliminaries and I am trying to understand how to reuse most of all my >> subroutines (I am using fortran) such as fluxes and coeffs assembly, >> numerical schemes and so o in order to avoid to rewrite all the code :p I >> will start with your first two suggestions and see if it is an affordable >> task :) >> >> For the face value, I am not interested in Reimann solvers. Being >> incompressible I am using Central difference, QUICK, SOU, Upwind or >> TVD-family schemes, but maybe I can use my stuff to get it. >> >> I think I can be happy enough having the ghost cell centers available and >> some way to deal with the BC (I guess this will be the most tricky part). >> Basically I would like to use DMplex at low level just to have a support >> for the parallel mesh management. >> Have you got some example for P0 finite elements and overlap=1? >> > > This is one group already using PETSc in exactly this way for a FV code. > > Maybe it would help if I gave a high-level overview of this usage pattern. > > 1) Hopefully replacing your current mesh topology/geometry with Plex is > not hard. Let me know if you have any questions about this. > > 2) If you give overlap=1 on distribution, then support(face) for any > regular face gives the two cells on either side. You > probably want to screen out faces between two ghost cells. There is a > DMLabel for doing that. > > 3) A PetscSection just gives the data layout over a mesh. The idea here is > to make it describe _exactly_ the layout you already have. > The DMGetGlobalVector() and DMGetLocalVector() should produce vectors > in your existing order and you can reuse all your code. > > Thanks, > > Matt > > >> Thank you! :) >> >> Il giorno mar 27 ago 2019 alle ore 09:33 Matthew Knepley < >> knepley at gmail.com> ha scritto: >> >>> On Tue, Aug 27, 2019 at 4:07 AM Edoardo alinovi via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Hello PETSc users and developers, >>>> I hope you are doing well! Today I have a general question about DMplex >>>> to see if it can be usefull to me or not. >>>> >>>> I have my fancy finite volume solver (cell centered, incompressible NS) >>>> which uses petsc as linear solver engine. Thanks to your suggestions so far >>>> it is now running well :) >>>> >>>> I would like to see if I can do another step incapsulatng in petsc also >>>> the mesh managment part. I think that Dmplex is the way to go since my code >>>> is fully unstructured. >>>> >>>> I have red some papers and lectures by Matt around the web, but still I >>>> have not found an answer to this question: >>>> Can dmplex dial with cell centered data arrangement and provide some >>>> support for basic operation (e. 
g. interpolation between partitions, face >>>> value calculation etc)? >>>> >>> >>> 1) Cell centered data >>> >>> Definitely yes. This is the same data layout as P0 finite elements. You >>> just assign the PetscSection k dofs per cell. >>> >>> 2) Interpolation between partitions >>> >>> I assume you mean ghost values from other parallel partitions. You can >>> do this by using overlap=1 in DMPlexDisrtibute(). >>> >>> 3) Face value calculation >>> >>> You can do Riemann solves by looping over faces, grabbing values from >>> the neighboring cells, doing the calculation, and >>> updating cell values. We carry out this kind of computation in TS ex11. >>> That example attempts to do everything, so it is >>> messy, but we can help you understand what it is doing in any part that >>> is unclear. >>> >>> 4) FV boundary conditions >>> >>> You can use DMPlexConstructGhostCells() to put artificial cells around >>> the boundary that you use to prescribe fluxes. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thank you very much! >>>> >>>> Edo >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From edoardo.alinovi at gmail.com Tue Aug 27 06:28:57 2019 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Tue, 27 Aug 2019 12:28:57 +0100 Subject: [petsc-users] DMPlex for cell centred finite volume In-Reply-To: References: Message-ID: Sounds good, i will take a look. Thanks! On Tue, 27 Aug 2019, 12:28 Nathan Collier, wrote: > Here is a example of what I think you need: > > > https://github.com/TDycores-Project/toy-problems/blob/master/TracyProblem/Richards.c > > It is written to solve Richard's equation using a simple two-point flux, > but it may be helpful to look at the setup and residual loop. > > Nate > > > On Tue, Aug 27, 2019 at 6:55 AM Matthew Knepley via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> On Tue, Aug 27, 2019 at 4:54 AM Edoardo alinovi < >> edoardo.alinovi at gmail.com> wrote: >> >>> Hello Matt, >>> >>> Thanks for your kind replay. Using DMplex and changing the data >>> structure is not a straightforward task as you can imagine. Up to now I am >>> in the preliminaries and I am trying to understand how to reuse most of all >>> my subroutines (I am using fortran) such as fluxes and coeffs assembly, >>> numerical schemes and so o in order to avoid to rewrite all the code :p I >>> will start with your first two suggestions and see if it is an affordable >>> task :) >>> >>> For the face value, I am not interested in Reimann solvers. Being >>> incompressible I am using Central difference, QUICK, SOU, Upwind or >>> TVD-family schemes, but maybe I can use my stuff to get it. >>> >>> I think I can be happy enough having the ghost cell centers available >>> and some way to deal with the BC (I guess this will be the most tricky >>> part). Basically I would like to use DMplex at low level just to have a >>> support for the parallel mesh management. >>> Have you got some example for P0 finite elements and overlap=1? 
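As a rough illustration of the face loop discussed in this exchange (support(face) giving the two neighbouring cells, overlap faces screened out), under the same assumptions as the sketch further above; X is taken to be the global solution vector and the actual flux evaluation is only a placeholder:

  Vec      locX;
  DMLabel  ghostLabel;
  PetscInt fStart, fEnd, f;

  /* fill a local vector so that overlap (ghost) cell values are available */
  DMGetLocalVector(dm, &locX);
  DMGlobalToLocalBegin(dm, X, INSERT_VALUES, locX);
  DMGlobalToLocalEnd(dm, X, INSERT_VALUES, locX);

  /* skipping faces with a nonnegative value in the "ghost" label mirrors
     what TS ex11 does; those faces are handled by another rank or belong
     to the artificial boundary cells */
  DMGetLabel(dm, "ghost", &ghostLabel);

  DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd);    /* height 1 = faces */
  for (f = fStart; f < fEnd; ++f) {
    const PetscInt *cells;
    PetscInt        nsupp, ghost = -1;

    if (ghostLabel) DMLabelGetValue(ghostLabel, f, &ghost);
    if (ghost >= 0) continue;
    DMPlexGetSupportSize(dm, f, &nsupp);
    if (nsupp != 2) continue;                       /* skip degenerate faces */
    DMPlexGetSupport(dm, f, &cells);                /* cells[0], cells[1] */
    /* compute the face value / flux from the two cell-centred states here
       (central differencing, QUICK, TVD, ...) and accumulate it into the
       residual contributions of cells[0] and cells[1] */
  }
  DMRestoreLocalVector(dm, &locX);
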
>>> >> >> This is one group already using PETSc in exactly this way for a FV code. >> >> Maybe it would help if I gave a high-level overview of this usage pattern. >> >> 1) Hopefully replacing your current mesh topology/geometry with Plex is >> not hard. Let me know if you have any questions about this. >> >> 2) If you give overlap=1 on distribution, then support(face) for any >> regular face gives the two cells on either side. You >> probably want to screen out faces between two ghost cells. There is a >> DMLabel for doing that. >> >> 3) A PetscSection just gives the data layout over a mesh. The idea here >> is to make it describe _exactly_ the layout you already have. >> The DMGetGlobalVector() and DMGetLocalVector() should produce vectors >> in your existing order and you can reuse all your code. >> >> Thanks, >> >> Matt >> >> >>> Thank you! :) >>> >>> Il giorno mar 27 ago 2019 alle ore 09:33 Matthew Knepley < >>> knepley at gmail.com> ha scritto: >>> >>>> On Tue, Aug 27, 2019 at 4:07 AM Edoardo alinovi via petsc-users < >>>> petsc-users at mcs.anl.gov> wrote: >>>> >>>>> Hello PETSc users and developers, >>>>> I hope you are doing well! Today I have a general question about >>>>> DMplex to see if it can be usefull to me or not. >>>>> >>>>> I have my fancy finite volume solver (cell centered, incompressible >>>>> NS) which uses petsc as linear solver engine. Thanks to your suggestions so >>>>> far it is now running well :) >>>>> >>>>> I would like to see if I can do another step incapsulatng in petsc >>>>> also the mesh managment part. I think that Dmplex is the way to go since my >>>>> code is fully unstructured. >>>>> >>>>> I have red some papers and lectures by Matt around the web, but still >>>>> I have not found an answer to this question: >>>>> Can dmplex dial with cell centered data arrangement and provide some >>>>> support for basic operation (e. g. interpolation between partitions, face >>>>> value calculation etc)? >>>>> >>>> >>>> 1) Cell centered data >>>> >>>> Definitely yes. This is the same data layout as P0 finite elements. You >>>> just assign the PetscSection k dofs per cell. >>>> >>>> 2) Interpolation between partitions >>>> >>>> I assume you mean ghost values from other parallel partitions. You can >>>> do this by using overlap=1 in DMPlexDisrtibute(). >>>> >>>> 3) Face value calculation >>>> >>>> You can do Riemann solves by looping over faces, grabbing values from >>>> the neighboring cells, doing the calculation, and >>>> updating cell values. We carry out this kind of computation in TS ex11. >>>> That example attempts to do everything, so it is >>>> messy, but we can help you understand what it is doing in any part that >>>> is unclear. >>>> >>>> 4) FV boundary conditions >>>> >>>> You can use DMPlexConstructGhostCells() to put artificial cells around >>>> the boundary that you use to prescribe fluxes. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thank you very much! >>>>> >>>>> Edo >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Tue Aug 27 15:18:28 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 27 Aug 2019 14:18:28 -0600 Subject: [petsc-users] Unable to compile super_dist using an intel compiler Message-ID: Hi All, I was trying to compile PETSc with "--download-superlu_dist" using an intel compiler. I have explored with different options, but did not get PETSc built successfully so far. Any help would be appreciated. The log file is attached. Thanks, Fande, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log.zip Type: application/zip Size: 443065 bytes Desc: not available URL: From balay at mcs.anl.gov Tue Aug 27 15:54:16 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 27 Aug 2019 20:54:16 +0000 Subject: [petsc-users] Unable to compile super_dist using an intel compiler In-Reply-To: References: Message-ID: On Tue, 27 Aug 2019, Fande Kong via petsc-users wrote: > Hi All, > > I was trying to compile PETSc with "--download-superlu_dist" using an intel > compiler. I have explored with different options, but did not get PETSc > built successfully so far. Any help would be appreciated. > > The log file is attached. I attempted one and that worked. Can you try the same on your machine and see if it still gives errors? Satish ----- -bash-4.2$ ./configure --with-cc=mpiicc --with-fc=mpiifort --with-cxx=mpiicpc --with-debugging=no -with-blas-lapack-dir=$MKLROOT --with-cxx-dialect=C++11 --download-superlu_dist=1 --download-metis --download-parmetis --download-cmake =============================================================================== Configuring PETSc to compile on your system =============================================================================== =============================================================================== ***** WARNING: Using default optimization C flags -g -O3 You might consider manually setting optimal optimization flags for your system with COPTFLAGS="optimization flags" see config/examples/arch-*-opt.py for examples =============================================================================== =============================================================================== ***** WARNING: Using default C++ optimization flags -g -O3 You might consider manually setting optimal optimization flags for your system with CXXOPTFLAGS="optimization flags" see config/examples/arch-*-opt.py for examples =============================================================================== =============================================================================== ***** WARNING: Using default FORTRAN optimization flags -g -O3 You might consider manually setting optimal optimization flags for your system with FOPTFLAGS="optimization flags" see config/examples/arch-*-opt.py for examples =============================================================================== =============================================================================== It appears you do not have valgrind installed on your system. We HIGHLY recommend you install it from www.valgrind.org Or install valgrind-devel or equivalent using your package manager. 
Then rerun ./configure =============================================================================== =============================================================================== Trying to download git://https://bitbucket.org/petsc/pkg-sowing.git for SOWING =============================================================================== =============================================================================== Running configure on SOWING; this may take several minutes =============================================================================== =============================================================================== Running make on SOWING; this may take several minutes =============================================================================== =============================================================================== Running make install on SOWING; this may take several minutes =============================================================================== =============================================================================== Trying to download https://cmake.org/files/v3.9/cmake-3.9.6.tar.gz for CMAKE =============================================================================== =============================================================================== Running configure on CMAKE; this may take several minutes =============================================================================== =============================================================================== Running make on CMAKE; this may take several minutes =============================================================================== =============================================================================== Running make install on CMAKE; this may take several minutes =============================================================================== =============================================================================== Trying to download git://https://bitbucket.org/petsc/pkg-metis.git for METIS =============================================================================== =============================================================================== Configuring METIS with cmake, this may take several minutes =============================================================================== =============================================================================== Compiling and installing METIS; this may take several minutes =============================================================================== =============================================================================== Trying to download git://https://bitbucket.org/petsc/pkg-parmetis.git for PARMETIS =============================================================================== =============================================================================== Configuring PARMETIS with cmake, this may take several minutes =============================================================================== =============================================================================== Compiling and installing PARMETIS; this may take several minutes =============================================================================== =============================================================================== Trying to download git://https://github.com/xiaoyeli/superlu_dist for SUPERLU_DIST =============================================================================== 
=============================================================================== Configuring SUPERLU_DIST with cmake, this may take several minutes =============================================================================== =============================================================================== Compiling and installing SUPERLU_DIST; this may take several minutes =============================================================================== Compilers: C Compiler: mpiicc -fPIC -wd1572 -g -O3 C++ Compiler: mpiicpc -wd1572 -g -O3 -fPIC -std=c++11 Fortran Compiler: mpiifort -fPIC -g -O3 Linkers: Shared linker: mpiicc -shared -fPIC -wd1572 -g -O3 Dynamic linker: mpiicc -shared -fPIC -wd1572 -g -O3 make: BLAS/LAPACK: -Wl,-rpath,/home/intel/19u3/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 -L/home/intel/19u3/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_def -lpthread MPI: cmake: X: Library: -lX11 pthread: metis: Includes: -I/home/balay/petsc/arch-linux2-c-opt/include Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib -L/home/balay/petsc/arch-linux2-c-opt/lib -lmetis parmetis: Includes: -I/home/balay/petsc/arch-linux2-c-opt/include Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib -L/home/balay/petsc/arch-linux2-c-opt/lib -lparmetis SuperLU_DIST: Includes: -I/home/balay/petsc/arch-linux2-c-opt/include Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib -L/home/balay/petsc/arch-linux2-c-opt/lib -lsuperlu_dist Arch: mkl_sparse: mkl_sparse_optimize: sowing: PETSc: PETSC_ARCH: arch-linux2-c-opt PETSC_DIR: /home/balay/petsc Scalar type: real Precision: double Clanguage: C Integer size: 32 shared libraries: enabled Memory alignment: 16 xxx=========================================================================xxx Configure stage complete. Now build PETSc libraries with (gnumake build): make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux2-c-opt all xxx=========================================================================xxx -bash-4.2$ uname -a Linux isdp001.cels.anl.gov 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux -bash-4.2$ From fdkong.jd at gmail.com Tue Aug 27 16:39:25 2019 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 27 Aug 2019 15:39:25 -0600 Subject: [petsc-users] Unable to compile super_dist using an intel compiler In-Reply-To: References: Message-ID: No, I hit exactly the same error using your script. Thanks, Fande, On Tue, Aug 27, 2019 at 2:54 PM Balay, Satish wrote: > On Tue, 27 Aug 2019, Fande Kong via petsc-users wrote: > > > Hi All, > > > > I was trying to compile PETSc with "--download-superlu_dist" using an > intel > > compiler. I have explored with different options, but did not get PETSc > > built successfully so far. Any help would be appreciated. > > > > The log file is attached. > > I attempted one and that worked. > > Can you try the same on your machine and see if it still gives errors? 
> > Satish > > ----- > > -bash-4.2$ ./configure --with-cc=mpiicc --with-fc=mpiifort > --with-cxx=mpiicpc --with-debugging=no -with-blas-lapack-dir=$MKLROOT > --with-cxx-dialect=C++11 --download-superlu_dist=1 --download-metis > --download-parmetis --download-cmake > > =============================================================================== > Configuring PETSc to compile on your system > > > =============================================================================== > =============================================================================== > ***** WARNING: Using default optimization C > flags -g -O3 You might > consider manually setting optimal optimization flags for your system with > COPTFLAGS="optimization flags" see > config/examples/arch-*-opt.py for examples > =============================================================================== > > =============================================================================== > ***** WARNING: Using default C++ optimization > flags -g -O3 You might > consider manually setting optimal optimization flags for your system with > CXXOPTFLAGS="optimization flags" see > config/examples/arch-*-opt.py for examples > =============================================================================== > > =============================================================================== > ***** WARNING: Using default FORTRAN > optimization flags -g -O3 You > might consider manually setting optimal optimization flags for your system > with FOPTFLAGS="optimization flags" see > config/examples/arch-*-opt.py for examples > =============================================================================== > > =============================================================================== > It appears you do not have valgrind installed > on your system. We HIGHLY > recommend you install it from www.valgrind.org > Or install valgrind-devel or equivalent using your > package manager. 
Then rerun > ./configure > > =============================================================================== > > =============================================================================== > Trying to download git:// > https://bitbucket.org/petsc/pkg-sowing.git for SOWING > =============================================================================== > > =============================================================================== > Running configure on SOWING; this may take > several minutes > =============================================================================== > > =============================================================================== > Running make on SOWING; this may take several > minutes > =============================================================================== > > =============================================================================== > Running make install on SOWING; this may take > several minutes > =============================================================================== > > =============================================================================== > Trying to download > https://cmake.org/files/v3.9/cmake-3.9.6.tar.gz for CMAKE > > =============================================================================== > > =============================================================================== > Running configure on CMAKE; this may take > several minutes > =============================================================================== > > =============================================================================== > Running make on CMAKE; this may take several > minutes > =============================================================================== > > =============================================================================== > Running make install on CMAKE; this may take > several minutes > =============================================================================== > > =============================================================================== > Trying to download git:// > https://bitbucket.org/petsc/pkg-metis.git for METIS > =============================================================================== > > =============================================================================== > Configuring METIS with cmake, this may take > several minutes > =============================================================================== > > =============================================================================== > Compiling and installing METIS; this may take > several minutes > =============================================================================== > > =============================================================================== > Trying to download git:// > https://bitbucket.org/petsc/pkg-parmetis.git for PARMETIS > =============================================================================== > > =============================================================================== > Configuring PARMETIS with cmake, this may > take several minutes > =============================================================================== > > =============================================================================== > Compiling and installing PARMETIS; this may > take several minutes > =============================================================================== > > =============================================================================== > Trying to download git:// > 
https://github.com/xiaoyeli/superlu_dist for SUPERLU_DIST > =============================================================================== > > =============================================================================== > Configuring SUPERLU_DIST with cmake, this may > take several minutes > =============================================================================== > > =============================================================================== > Compiling and installing SUPERLU_DIST; this > may take several minutes > =============================================================================== > Compilers: > > C Compiler: mpiicc -fPIC -wd1572 -g -O3 > C++ Compiler: mpiicpc -wd1572 -g -O3 -fPIC -std=c++11 > Fortran Compiler: mpiifort -fPIC -g -O3 > Linkers: > Shared linker: mpiicc -shared -fPIC -wd1572 -g -O3 > Dynamic linker: mpiicc -shared -fPIC -wd1572 -g -O3 > make: > BLAS/LAPACK: > -Wl,-rpath,/home/intel/19u3/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 > -L/home/intel/19u3/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_def -lpthread > MPI: > cmake: > X: > Library: -lX11 > pthread: > metis: > Includes: -I/home/balay/petsc/arch-linux2-c-opt/include > Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib > -L/home/balay/petsc/arch-linux2-c-opt/lib -lmetis > parmetis: > Includes: -I/home/balay/petsc/arch-linux2-c-opt/include > Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib > -L/home/balay/petsc/arch-linux2-c-opt/lib -lparmetis > SuperLU_DIST: > Includes: -I/home/balay/petsc/arch-linux2-c-opt/include > Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib > -L/home/balay/petsc/arch-linux2-c-opt/lib -lsuperlu_dist > Arch: > mkl_sparse: > mkl_sparse_optimize: > sowing: > PETSc: > PETSC_ARCH: arch-linux2-c-opt > PETSC_DIR: /home/balay/petsc > Scalar type: real > Precision: double > Clanguage: C > Integer size: 32 > shared libraries: enabled > Memory alignment: 16 > > xxx=========================================================================xxx > Configure stage complete. Now build PETSc libraries with (gnumake build): > make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux2-c-opt all > > xxx=========================================================================xxx > -bash-4.2$ uname -a > Linux isdp001.cels.anl.gov 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 > 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > -bash-4.2$ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log 2.zip Type: application/zip Size: 443065 bytes Desc: not available URL: From balay at mcs.anl.gov Tue Aug 27 16:45:35 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Tue, 27 Aug 2019 21:45:35 +0000 Subject: [petsc-users] Unable to compile super_dist using an intel compiler In-Reply-To: References: Message-ID: attached zip file has the same/old configure.log Can you use PETSC_ARCH=arch-test - so that the old build files are not reused, and resend the new configure.log Satish On Tue, 27 Aug 2019, Fande Kong via petsc-users wrote: > No, I hit exactly the same error using your script. > > Thanks, > > Fande, > > On Tue, Aug 27, 2019 at 2:54 PM Balay, Satish wrote: > > > On Tue, 27 Aug 2019, Fande Kong via petsc-users wrote: > > > > > Hi All, > > > > > > I was trying to compile PETSc with "--download-superlu_dist" using an > > intel > > > compiler. 
I have explored with different options, but did not get PETSc > > > built successfully so far. Any help would be appreciated. > > > > > > The log file is attached. > > > > I attempted one and that worked. > > > > Can you try the same on your machine and see if it still gives errors? > > > > Satish > > > > ----- > > > > -bash-4.2$ ./configure --with-cc=mpiicc --with-fc=mpiifort > > --with-cxx=mpiicpc --with-debugging=no -with-blas-lapack-dir=$MKLROOT > > --with-cxx-dialect=C++11 --download-superlu_dist=1 --download-metis > > --download-parmetis --download-cmake > > > > =============================================================================== > > Configuring PETSc to compile on your system > > > > > > =============================================================================== > > =============================================================================== > > ***** WARNING: Using default optimization C > > flags -g -O3 You might > > consider manually setting optimal optimization flags for your system with > > COPTFLAGS="optimization flags" see > > config/examples/arch-*-opt.py for examples > > =============================================================================== > > > > =============================================================================== > > ***** WARNING: Using default C++ optimization > > flags -g -O3 You might > > consider manually setting optimal optimization flags for your system with > > CXXOPTFLAGS="optimization flags" see > > config/examples/arch-*-opt.py for examples > > =============================================================================== > > > > =============================================================================== > > ***** WARNING: Using default FORTRAN > > optimization flags -g -O3 You > > might consider manually setting optimal optimization flags for your system > > with FOPTFLAGS="optimization flags" see > > config/examples/arch-*-opt.py for examples > > =============================================================================== > > > > =============================================================================== > > It appears you do not have valgrind installed > > on your system. We HIGHLY > > recommend you install it from www.valgrind.org > > Or install valgrind-devel or equivalent using your > > package manager. 
Then rerun > > ./configure > > > > =============================================================================== > > > > =============================================================================== > > Trying to download git:// > > https://bitbucket.org/petsc/pkg-sowing.git for SOWING > > =============================================================================== > > > > =============================================================================== > > Running configure on SOWING; this may take > > several minutes > > =============================================================================== > > > > =============================================================================== > > Running make on SOWING; this may take several > > minutes > > =============================================================================== > > > > =============================================================================== > > Running make install on SOWING; this may take > > several minutes > > =============================================================================== > > > > =============================================================================== > > Trying to download > > https://cmake.org/files/v3.9/cmake-3.9.6.tar.gz for CMAKE > > > > =============================================================================== > > > > =============================================================================== > > Running configure on CMAKE; this may take > > several minutes > > =============================================================================== > > > > =============================================================================== > > Running make on CMAKE; this may take several > > minutes > > =============================================================================== > > > > =============================================================================== > > Running make install on CMAKE; this may take > > several minutes > > =============================================================================== > > > > =============================================================================== > > Trying to download git:// > > https://bitbucket.org/petsc/pkg-metis.git for METIS > > =============================================================================== > > > > =============================================================================== > > Configuring METIS with cmake, this may take > > several minutes > > =============================================================================== > > > > =============================================================================== > > Compiling and installing METIS; this may take > > several minutes > > =============================================================================== > > > > =============================================================================== > > Trying to download git:// > > https://bitbucket.org/petsc/pkg-parmetis.git for PARMETIS > > =============================================================================== > > > > =============================================================================== > > Configuring PARMETIS with cmake, this may > > take several minutes > > =============================================================================== > > > > =============================================================================== > > Compiling and installing PARMETIS; this may > > take several minutes > > =============================================================================== 
> > > > =============================================================================== > > Trying to download git:// > > https://github.com/xiaoyeli/superlu_dist for SUPERLU_DIST > > =============================================================================== > > > > =============================================================================== > > Configuring SUPERLU_DIST with cmake, this may > > take several minutes > > =============================================================================== > > > > =============================================================================== > > Compiling and installing SUPERLU_DIST; this > > may take several minutes > > =============================================================================== > > Compilers: > > > > C Compiler: mpiicc -fPIC -wd1572 -g -O3 > > C++ Compiler: mpiicpc -wd1572 -g -O3 -fPIC -std=c++11 > > Fortran Compiler: mpiifort -fPIC -g -O3 > > Linkers: > > Shared linker: mpiicc -shared -fPIC -wd1572 -g -O3 > > Dynamic linker: mpiicc -shared -fPIC -wd1572 -g -O3 > > make: > > BLAS/LAPACK: > > -Wl,-rpath,/home/intel/19u3/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 > > -L/home/intel/19u3/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64 > > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_def -lpthread > > MPI: > > cmake: > > X: > > Library: -lX11 > > pthread: > > metis: > > Includes: -I/home/balay/petsc/arch-linux2-c-opt/include > > Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib > > -L/home/balay/petsc/arch-linux2-c-opt/lib -lmetis > > parmetis: > > Includes: -I/home/balay/petsc/arch-linux2-c-opt/include > > Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib > > -L/home/balay/petsc/arch-linux2-c-opt/lib -lparmetis > > SuperLU_DIST: > > Includes: -I/home/balay/petsc/arch-linux2-c-opt/include > > Library: -Wl,-rpath,/home/balay/petsc/arch-linux2-c-opt/lib > > -L/home/balay/petsc/arch-linux2-c-opt/lib -lsuperlu_dist > > Arch: > > mkl_sparse: > > mkl_sparse_optimize: > > sowing: > > PETSc: > > PETSC_ARCH: arch-linux2-c-opt > > PETSC_DIR: /home/balay/petsc > > Scalar type: real > > Precision: double > > Clanguage: C > > Integer size: 32 > > shared libraries: enabled > > Memory alignment: 16 > > > > xxx=========================================================================xxx > > Configure stage complete. Now build PETSc libraries with (gnumake build): > > make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux2-c-opt all > > > > xxx=========================================================================xxx > > -bash-4.2$ uname -a > > Linux isdp001.cels.anl.gov 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 > > 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > > -bash-4.2$ > > > > > From bsmith at mcs.anl.gov Tue Aug 27 20:57:07 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 28 Aug 2019 01:57:07 +0000 Subject: [petsc-users] Unable to compile super_dist using an intel compiler In-Reply-To: References: Message-ID: ls -ld /apps/local/easybuild/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/include > On Aug 27, 2019, at 3:18 PM, Fande Kong via petsc-users wrote: > > Hi All, > > I was trying to compile PETSc with "--download-superlu_dist" using an intel compiler. I have explored with different options, but did not get PETSc built successfully so far. Any help would be appreciated. > > The log file is attached. > > Thanks, > > Fande, > From bsmith at mcs.anl.gov Tue Aug 27 21:13:23 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Wed, 28 Aug 2019 02:13:23 +0000 Subject: [petsc-users] Unable to compile super_dist using an intel compiler In-Reply-To: References: Message-ID: <0E1E502D-3DB0-4284-8CDD-4EDA842C9DA4@anl.gov> CMake likes to specifically check that all include directories passed to it exist. PETSc does not foreach(dir ${TPL_PARMETIS_INCLUDE_DIRS}) if (NOT EXISTS ${dir}) message(FATAL_ERROR "PARMETIS include directory not found: ${dir}") endif() set(CMAKE_C_FLAGS "-I${dir} ${CMAKE_C_FLAGS}") endforeach() Given configure's habit of printing things out of order it is not possible to say decisively where the include path first appears but it does appear here when checking for Fortran include files. Executing: mpiifort -o /tmp/petsc-r1P209/config.compilers/conftest -v -fPIC -g -O3 /tmp/petsc-r1P209/config.compilers/conftest.o -lstdc++ -ldl stdout: mpiifort for the Intel(R) MPI Library 2018 Update 3 for Linux* Copyright(C) 2003-2018, Intel Corporation. All rights reserved. Possible ERROR while running linker: stdout: mpiifort for the Intel(R) MPI Library 2018 Update 3 for Linux* Copyright(C) 2003-2018, Intel Corporation. All rights reserved.stderr: ifort version 18.0.3 /apps/local/easybuild/software/ifort/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/bin/intel64/fortcom -mGLOB_em64t=TRUE -mP1OPT_version=18.0-intel64 -mGLOB_diag_file=/tmp/petsc-r1P209/config.compilers/conftest.diag -mGLOB_long_size_64 -mGLOB_routine_pointer_size_64 -mGLOB_source_language=GLOB_SOURCE_LANGUAGE_F90 -mP2OPT_static_promotion -mP1OPT_print_version=FALSE -mCG_use_gas_got_workaround=F -mP2OPT_align_option_used=TRUE -mGLOB_gcc_version=730 "-mGLOB_options_string=-I/apps/local/easybuild/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/include -I/apps/local/easybuild/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/include It gets into the BLAS/LAPACK includes at Checking for header files ['mkl.h', 'mkl_spblas.h'] in ['/apps/local/easybuild/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/intel64/include'] It is in your CPATH=/apps/local/easybuild/software/imkl/2018.3.222-iimpi-2018.03/mkl/include/fftw:/apps/local/easybuild/software/imkl/2018.3.222-iimpi-2018.03/mkl/include:/apps/local/easybuild/software/impi/2018.3.222-iccifort-2018.3.222-GCC-7.3.0-2.30/include64:/apps/local/easybuild/software/ifort/2018.3.222-GCC-7.3.0-2.30/include:/apps/local/easybuild/software/icc/2018.3.222-GCC-7.3.0-2.30/compilers_and_libraries_2018.3.222/linux/tbb/include:/apps/local/easybuild/software/binutils/2.30-GCCcore-7.3.0/include:/apps/local/easybuild/software/GCCcore/7.3.0/include:/home/kongf/workhome/lemhi/tools/include:: > On Aug 27, 2019, at 3:18 PM, Fande Kong via petsc-users wrote: > > Hi All, > > I was trying to compile PETSc with "--download-superlu_dist" using an intel compiler. I have explored with different options, but did not get PETSc built successfully so far. Any help would be appreciated. > > The log file is attached. 
> > Thanks, > > Fande, > From Moritz.Huck at rwth-aachen.de Wed Aug 28 02:26:19 2019 From: Moritz.Huck at rwth-aachen.de (Huck, Moritz) Date: Wed, 28 Aug 2019 07:26:19 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> <5A251CF0-D34E-4823-B0CA-695CC21AC1B5@pnnl.gov> <002a46ca0d73467aa4a7a4f9dfb503ea@rwth-aachen.de>, Message-ID: <5643e5d1bb8b4dada80ec8e76c98d5cd@rwth-aachen.de> Hi, (since I'm using petsc4py and wasnot able to hook gdb correctly up, I am "debbuging" with prints) I am using TS_EQ_DAE_IMPLICIT_INDEX1 as equation type. The out of bounds values occur inside the SNES as well after a step has finished. The occur first after a timestep and then "propgate" into the SNES. The problem arises with bt, l2 or basic as linesearch. It seems to occur with ARKIMEX(3,4,5) but not with ARKIMEX(L2,A2) or BDF (but these have to use much lower time steps), for the later SNESVI also works for bounding . @Shri the event solution seems not work for me, if an lower bound crossing is detected the solver reduces the time step to a small value and doesnt reach the crossing in a reasonable time frame. Best Regards, Moritz ________________________________________ Von: Smith, Barry F. Gesendet: Dienstag, 13. August 2019 05:58:29 An: Huck, Moritz Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov Betreff: Re: [petsc-users] Problem with TS and SNES VI > On Aug 12, 2019, at 10:25 AM, Huck, Moritz wrote: > > Hi, > at the moment I am trying the Precheck version (I will try the event one afterwards). > My precheckfunction is (pseudo code): > precheckfunction(Vec X,Vec Y,PetscBool *changed){ > if any(X+Y *changed=True > Y[where(X+Y } > Inside precheck around 10-20 occurences of X+Y In my understanding this should not happen, since the precheck should be called before the IFunction call. For the nonlinear system solve once the precheck is done the new nonlinear solution approximation is computed via a line search X = X + lamba Y where lambda > 0 and for most line searches lambda <=1 (for example the SNESLINESEARCHBT will always result in lambda <=1, I am not sure about the other line searchs) X_i = X_i + lambda (lowerbound - X_i) = X_i - lambda X_i + lambda lowerbound = (1 - lambda) X_i + lambda lowerbound => (1 - lambda) lowerbound + lambda lowerbound = lowerbound Thus it seems you are correct, each step that the line search tries should satisfy the bounds. Possible issues: 1) the line search produces lambda > 1. Make sure you use SNESLINESEARCHBT ??? Here you would need to determine exactly when in the algorithm the IFunction is having as input X < lower bound. Somewhere in the ARKIMEX integrator? Are you using fully implicit? You might need to use fully implicit in order to enforce the bound? What I would do is run in the debugger and have it stop inside IFunction when the lower bound is not satisfied. Then do bt to see where the code is, in what part of the algorithms. If inside the line search you'll need to poke around at the values to see why the step could produce something below the bound which in theory it shouldn't Good luck Barry > > ________________________________________ > Von: Abhyankar, Shrirang G > Gesendet: Donnerstag, 8. August 2019 19:16:12 > An: Huck, Moritz; Smith, Barry F. > Cc: petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Moritz, > I think your case will also work with using TSEvent. 
I think your problem is similar, correct me if I am wrong, to my application where I need to constrain the states within some limits, lb \le x. I use events to handle this, where I use two event functions: > (i) x ? lb = 0. if x > lb & > (ii) \dot{x} = 0 x = lb > > The first event function is used to detect when x hits the limit lb. Once it hits the limit, the differential equation for x is changed to (x-lb = 0) in the model to hold x at limit lb. For releasing x, there is an event function on the derivative of x, \dot{x}, and x is released on detection of the condition \dot{x} > 0. This is done through the event function \dot{x} = 0 with a positive zero crossing. > > An example of how the above works is in the example src/ts/examples/tutorials/power_grid/stability_9bus/ex9bus.c. In this example, there is an event function that first checks whether the state VR has hit the upper limit VRMAX. Once it does so, the flag VRatmax is set by the post-event function. The event function is then switched to the \dot{VR} > if (!VRatmax[i])) > fvalue[2+2*i] = VRMAX[i] - VR; > } else { > fvalue[2+2*i] = (VR - KA[i]*RF + KA[i]*KF[i]*Efd/TF[i] - KA[i]*(Vref[i] - Vm))/TA[i]; > } > > You can either try TSEvent or what Barry suggested SNESLineSearchSetPreCheck(), or both. > > Thanks, > Shri > > > From: "Huck, Moritz" > Date: Wednesday, August 7, 2019 at 8:46 AM > To: "Smith, Barry F." > Cc: "Abhyankar, Shrirang G" , "petsc-users at mcs.anl.gov" > Subject: AW: [petsc-users] Problem with TS and SNES VI > > Thank you for your response. > The sizes are only allowed to go down to a certain value. > The non-physical values do also occur during the function evaluations (IFunction). > > I will try to implment your suggestions with SNESLineSearchSetPreCheck. This would mean I dont have to use SNESVISetVariableBounds at all, right? > ________________________________________ > Von: Smith, Barry F. > > Gesendet: Dienstag, 6. August 2019 17:47:13 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Thanks, very useful. > > Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? > > Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? > > If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use > SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. > > For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. > > Good luck and let us know how it goes > > Barry > > > > On Aug 6, 2019, at 9:24 AM, Huck, Moritz > wrote: > > At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. > Unphysical values are e.g. particle sizes below zero. > > My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. > > The model does not change in the senes that there are new equations. 
If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. > ________________________________________ > Von: Smith, Barry F. > > Gesendet: Dienstag, 6. August 2019 15:51:16 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? > > Is this within a stage or at the actual time-step after the stage? > > Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? > > Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). > > This information can help us determine what approach you should take. > > Thanks > > Barry > > > On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users > wrote: > > Hi, > I think I am missing something here. > How would events help to constrain the states. > Do you mean to use the event to "pause" to integration an adjust the state manually? > Or are the events to enforce smaller timesteps when the state come close to the constraints? > > Thank you, > Moritz > ________________________________________ > Von: Abhyankar, Shrirang G > > Gesendet: Montag, 5. August 2019 17:21:41 > An: Huck, Moritz; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > > For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. > https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. > > An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html > > A brief intro to TSEvent can be found here. > > Thanks, > Shri > > > From: petsc-users > on behalf of "Huck, Moritz via petsc-users" > > Reply-To: "Huck, Moritz" > > Date: Monday, August 5, 2019 at 5:18 AM > To: "petsc-users at mcs.anl.gov" > > Subject: [petsc-users] Problem with TS and SNES VI > > Hi, > I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. > The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). > But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. > Are there some tolerances I have to set for VI or something like this? > > Best Regards, > Moritz From cpraveen at gmail.com Wed Aug 28 08:09:22 2019 From: cpraveen at gmail.com (Praveen C) Date: Wed, 28 Aug 2019 18:39:22 +0530 Subject: [petsc-users] FVM using dmplex for linear advection Message-ID: Dear all I am trying to write a simple first order upwind FVM to solve linear advection in 2d on triangular meshes using dmplex. Based on some suggestions in recent discussions, I cooked up this simple example https://github.com/cpraveen/cfdlab/tree/master/petsc/dmplex_convect2d A gaussian profile must rotate around origin but in my code, there is some errors developing in just one or two cells, see picture. 
I am looking for any suggestions/hints/tips on how to identify the problem. Since I have never written a dmplex code before, I am not able to ask a specific question. Thanks praveen -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: visit0000.png Type: image/png Size: 117214 bytes Desc: not available URL: From jed at jedbrown.org Wed Aug 28 08:14:39 2019 From: jed at jedbrown.org (Jed Brown) Date: Wed, 28 Aug 2019 07:14:39 -0600 Subject: [petsc-users] FVM using dmplex for linear advection In-Reply-To: References: Message-ID: <87k1axqvnk.fsf@jedbrown.org> Try to reduce the problem. You may also compare with src/ts/examples/tutorials/ex11.c, which includes a finite-volume advection solver (with or without slope reconstruction/limiting). Praveen C via petsc-users writes: > Dear all > > I am trying to write a simple first order upwind FVM to solve linear advection in 2d on triangular meshes using dmplex. > > Based on some suggestions in recent discussions, I cooked up this simple example > > https://github.com/cpraveen/cfdlab/tree/master/petsc/dmplex_convect2d > > A gaussian profile must rotate around origin but in my code, there is some errors developing in just one or two cells, see picture. > > I am looking for any suggestions/hints/tips on how to identify the problem. Since I have never written a dmplex code before, I am not able to ask a specific question. > > Thanks > praveen From Andrew.Holm at kratosdefense.com Wed Aug 28 10:38:46 2019 From: Andrew.Holm at kratosdefense.com (Andrew Holm) Date: Wed, 28 Aug 2019 15:38:46 +0000 Subject: [petsc-users] Creating Local Copy of locked read only Vec Message-ID: Hello, We are working with an application using SNES and having a problem with the SNESFunction f called by SNESSetFunction(SNES snes,Vec r,PetscErrorCode (*f)(SNES,Vec,Vec,void*),void *ctx). We would like our SNESFunction(SNES snes,Vec x,Vec f,void *ctx) to be able to modify Vec x each time the function is called, how ever the vector is locked read only. We are running PETSc 3.9.3 with debugging turned on which tells us that Vec x is locked read only. To get around this, we have tried to make a local copy of the Vec x using VecCopy (x, xCopy). This gave us the error: [0]PETSC ERROR: Null argument, when expecting valid pointer [0]PETSC ERROR: Null Object: Parameter # 2 We have also tried using VecDuplicate(vecWithDesiredLayout, &xLocalCopy) and then VecSetValues to fill the xLocalCopy with the values of Vec x. This gave the error: [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Vec is locked read only, argument # 2 Also, we have tried using VecLockPop in PETSc 3.9.3 to overide the the lock however this seems gives a "Vector has been unlocked too many times" error at runtime: [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Vector has been unlocked too many times Any advice on how to make a local, modifiable copy of the locked read only vector? Any and All help is appreciated! Andy Holm Aeroscience Engineer Kratos Defense and Security Solutions 4904 Research Drive Huntsville, AL 35805 From bsmith at mcs.anl.gov Wed Aug 28 11:01:25 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) 
Date: Wed, 28 Aug 2019 16:01:25 +0000 Subject: [petsc-users] Problem with TS and SNES VI In-Reply-To: <5643e5d1bb8b4dada80ec8e76c98d5cd@rwth-aachen.de> References: <8dbfce1e921b4ae282a7539dfbf5370b@rwth-aachen.de> <3ac7d73f24074aec8b8c288b45193ecb@rwth-aachen.de> <5A251CF0-D34E-4823-B0CA-695CC21AC1B5@pnnl.gov> <002a46ca0d73467aa4a7a4f9dfb503ea@rwth-aachen.de> <5643e5d1bb8b4dada80ec8e76c98d5cd@rwth-aachen.de> Message-ID: <3633CF86-71B0-4C91-A3EB-A14F22B5098A@mcs.anl.gov> Without more detail it is impossible to understand why things are going wrong. Please attempt to do your debugging with gdb and send all the output, we may have suggestions on how to get it working. With that working you will be able to zoom in immediately at exactly where the problem is. Print statements are not the way to go. Do you have your snes line search prestep code working? Are the values out of bounds when you get in and do you fix them here to not be out of bounds? Do they they get out of bounds later inside SNES? If the prestep is working properly TS should never see out of bounds variables (since your code in SNES detects and removes them). Stick to bt for now. Barry > On Aug 28, 2019, at 2:26 AM, Huck, Moritz wrote: > > Hi, > (since I'm using petsc4py and wasnot able to hook gdb correctly up, I am "debbuging" with prints) > > I am using TS_EQ_DAE_IMPLICIT_INDEX1 as equation type. > The out of bounds values occur inside the SNES as well after a step has finished. > The occur first after a timestep and then "propgate" into the SNES. > The problem arises with bt, l2 or basic as linesearch. > It seems to occur with ARKIMEX(3,4,5) but not with ARKIMEX(L2,A2) or BDF (but these have to use much lower time steps), for the later SNESVI also works for bounding . > > @Shri the event solution seems not work for me, if an lower bound crossing is detected the solver reduces the time step to a small value and doesnt reach the crossing in a reasonable time frame. > > Best Regards, > Moritz > > > > > ________________________________________ > Von: Smith, Barry F. > Gesendet: Dienstag, 13. August 2019 05:58:29 > An: Huck, Moritz > Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov > Betreff: Re: [petsc-users] Problem with TS and SNES VI > >> On Aug 12, 2019, at 10:25 AM, Huck, Moritz wrote: >> >> Hi, >> at the moment I am trying the Precheck version (I will try the event one afterwards). >> My precheckfunction is (pseudo code): >> precheckfunction(Vec X,Vec Y,PetscBool *changed){ >> if any(X+Y> *changed=True >> Y[where(X+Y> } >> Inside precheck around 10-20 occurences of X+Y > In what IFunction calls are you getting all these occurrences? > >> In my understanding this should not happen, since the precheck should be called before the IFunction call. > > For the nonlinear system solve once the precheck is done the new nonlinear solution approximation is computed via a line search > > X = X + lamba Y where lambda > 0 and for most line searches lambda <=1 (for example the SNESLINESEARCHBT will always result in lambda <=1, I am not sure about the other line searchs) > > X_i = X_i + lambda (lowerbound - X_i) = X_i - lambda X_i + lambda lowerbound = (1 - lambda) X_i + lambda lowerbound => (1 - lambda) lowerbound + lambda lowerbound = lowerbound > > Thus it seems you are correct, each step that the line search tries should satisfy the bounds. > > Possible issues: > 1) the line search produces lambda > 1. Make sure you use SNESLINESEARCHBT > > ??? 
Here you would need to determine exactly when in the algorithm the IFunction is having as input X < lower bound. Somewhere in the ARKIMEX integrator? Are you using fully implicit? You might need to use fully implicit in order to enforce the bound? > > What I would do is run in the debugger and have it stop inside IFunction when the lower bound is not satisfied. Then do bt to see where the code is, in what part of the algorithms. If inside the line search you'll need to poke around at the values to see why the step could produce something below the bound which in theory it shouldn't > > Good luck > > Barry > > > >> >> ________________________________________ >> Von: Abhyankar, Shrirang G >> Gesendet: Donnerstag, 8. August 2019 19:16:12 >> An: Huck, Moritz; Smith, Barry F. >> Cc: petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >> Moritz, >> I think your case will also work with using TSEvent. I think your problem is similar, correct me if I am wrong, to my application where I need to constrain the states within some limits, lb \le x. I use events to handle this, where I use two event functions: >> (i) x ? lb = 0. if x > lb & >> (ii) \dot{x} = 0 x = lb >> >> The first event function is used to detect when x hits the limit lb. Once it hits the limit, the differential equation for x is changed to (x-lb = 0) in the model to hold x at limit lb. For releasing x, there is an event function on the derivative of x, \dot{x}, and x is released on detection of the condition \dot{x} > 0. This is done through the event function \dot{x} = 0 with a positive zero crossing. >> >> An example of how the above works is in the example src/ts/examples/tutorials/power_grid/stability_9bus/ex9bus.c. In this example, there is an event function that first checks whether the state VR has hit the upper limit VRMAX. Once it does so, the flag VRatmax is set by the post-event function. The event function is then switched to the \dot{VR} >> if (!VRatmax[i])) >> fvalue[2+2*i] = VRMAX[i] - VR; >> } else { >> fvalue[2+2*i] = (VR - KA[i]*RF + KA[i]*KF[i]*Efd/TF[i] - KA[i]*(Vref[i] - Vm))/TA[i]; >> } >> >> You can either try TSEvent or what Barry suggested SNESLineSearchSetPreCheck(), or both. >> >> Thanks, >> Shri >> >> >> From: "Huck, Moritz" >> Date: Wednesday, August 7, 2019 at 8:46 AM >> To: "Smith, Barry F." >> Cc: "Abhyankar, Shrirang G" , "petsc-users at mcs.anl.gov" >> Subject: AW: [petsc-users] Problem with TS and SNES VI >> >> Thank you for your response. >> The sizes are only allowed to go down to a certain value. >> The non-physical values do also occur during the function evaluations (IFunction). >> >> I will try to implment your suggestions with SNESLineSearchSetPreCheck. This would mean I dont have to use SNESVISetVariableBounds at all, right? >> ________________________________________ >> Von: Smith, Barry F. > >> Gesendet: Dienstag, 6. August 2019 17:47:13 >> An: Huck, Moritz >> Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >> Thanks, very useful. >> >> Are the non-physical values appearing in the nonlinear solver ? Or just at the time-step? >> >> Do you check for non-physical values each time you do a function evaluation needed by SNES/TS? >> >> If the non-physical values are an artifact of the steps taken in the nonlinear solver in SNES then the correct solution is to use >> SNESLineSearchSetPreCheck() what you do is change the step so the resulting solutions are physical. 
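As a rough illustration of the SNESLineSearchSetPreCheck() approach discussed above, a minimal C sketch follows. It is not code from this thread: the lower bound lb, the assumption that every solution component shares that bound, and the update convention X_new = X - lambda*Y (the convention of SNES NEWTONLS with the bt line search, where lambda <= 1) are assumptions to verify against the PETSc version in use; petsc4py users would register the equivalent callback through the Python bindings.

#include <petscsnes.h>

/* Sketch only: clamp the proposed step Y so the full step X - Y stays at or above
   a lower bound lb.  Assumes the update X_new = X - lambda*Y with lambda <= 1
   (NEWTONLS with the bt line search); check the sign convention for your version. */
static PetscErrorCode PreCheckClampToLowerBound(SNESLineSearch linesearch, Vec X, Vec Y, PetscBool *changed, void *ctx)
{
  const PetscReal    lb = 1.0;            /* hypothetical lower bound, problem dependent */
  const PetscScalar *x;
  PetscScalar       *y;
  PetscInt           i, n;
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  *changed = PETSC_FALSE;
  ierr = VecGetLocalSize(X, &n);CHKERRQ(ierr);
  ierr = VecGetArrayRead(X, &x);CHKERRQ(ierr);
  ierr = VecGetArray(Y, &y);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    if (PetscRealPart(x[i] - y[i]) < lb) {  /* the full step would violate the bound */
      y[i]     = x[i] - lb;                 /* shorten this component so X - Y lands on lb */
      *changed = PETSC_TRUE;
    }
  }
  ierr = VecRestoreArray(Y, &y);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(X, &x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Registration, e.g. after TSGetSNES(ts, &snes) in the setup code. */
static PetscErrorCode AttachPreCheck(SNES snes)
{
  SNESLineSearch linesearch;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESGetLineSearch(snes, &linesearch);CHKERRQ(ierr);
  ierr = SNESLineSearchSetPreCheck(linesearch, PreCheckClampToLowerBound, NULL);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If the current iterate already satisfies the bound and lambda stays in (0,1], the argument sketched earlier in the thread shows every trial point X - lambda*Y also satisfies it, so IFunction should no longer see out-of-bounds inputs coming from the line search itself.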
>> >> For you case where the sizes go negative I am not sure what to do. Are the sizes allowed to go to zero? If so then adjust the step so that the sizes that go to negative values just go to zero. If they are suppose to be always positive then you need to pick some tolerance (say epsilon) and adjust the step so they are of size epsilon. Note you don't scale the entire step vector by a small number to satisfy the constraint you change each entry in the step as needed to satisfy the constraints. >> >> Good luck and let us know how it goes >> >> Barry >> >> >> >> On Aug 6, 2019, at 9:24 AM, Huck, Moritz > wrote: >> >> At the moment I output only the values at the actual time-step (with the poststep functionality), I dont know the values during the stages. >> Unphysical values are e.g. particle sizes below zero. >> >> My model as no explicit inequalities, the only handling of the constraints is done by setting SNES VI. >> >> The model does not change in the senes that there are new equations. If have put in an conditional that xdot is calculated to be positive of x is on or below the lower bound. >> ________________________________________ >> Von: Smith, Barry F. > >> Gesendet: Dienstag, 6. August 2019 15:51:16 >> An: Huck, Moritz >> Cc: Abhyankar, Shrirang G; petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >> Could you explain in a bit more detail what you mean by "some states go to unphysical values" ? >> >> Is this within a stage or at the actual time-step after the stage? >> >> Does you model explicitly have these bounds on the solution; i.e. it is imposed as a variational inequality or does the model not explicitly have the constraints because its "analytic" solution just naturally stays in the physical region anyways? But numerical it can go out? >> >> Or, is your model suppose to "change" at a certain time, which you don't know in advance when the solution goes out of some predefined bounds" (this is where the event is designed for). >> >> This information can help us determine what approach you should take. >> >> Thanks >> >> Barry >> >> >> On Aug 6, 2019, at 2:12 AM, Huck, Moritz via petsc-users > wrote: >> >> Hi, >> I think I am missing something here. >> How would events help to constrain the states. >> Do you mean to use the event to "pause" to integration an adjust the state manually? >> Or are the events to enforce smaller timesteps when the state come close to the constraints? >> >> Thank you, >> Moritz >> ________________________________________ >> Von: Abhyankar, Shrirang G > >> Gesendet: Montag, 5. August 2019 17:21:41 >> An: Huck, Moritz; petsc-users at mcs.anl.gov >> Betreff: Re: [petsc-users] Problem with TS and SNES VI >> >> For problems with constraints on the states, I would recommend trying the event functionality, TSEvent, that allows detection and location of discrete events, such as one that you have in your problem. >> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetEventHandler.html. >> >> An example using TSEvent functionality: https://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/ex40.c.html >> >> A brief intro to TSEvent can be found here. 
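To make the TSEvent suggestion above more concrete, here is a minimal, hypothetical C sketch of the one-sided version of the two-event-function idea. The context struct EventCtx, the watched component comp, the bound lb and the direction/terminate settings are illustrative assumptions, not code from ex9bus.c or ex40.c; those examples and the TSSetEventHandler() manual page remain the authoritative reference.

#include <petscts.h>

/* Hypothetical context: which (local) component to watch and its lower bound. */
typedef struct {
  PetscInt  comp;
  PetscReal lb;
  PetscBool at_lb;     /* set by the post-event function once the bound is reached */
} EventCtx;

/* Event function: while x is free, fire when x - lb crosses zero from above.
   Once x is held at the bound, a full model would instead watch the sign of xdot
   (as in ex9bus.c); here a constant positive value simply disables the event. */
static PetscErrorCode EventFunction(TS ts, PetscReal t, Vec U, PetscScalar fvalue[], void *ctx)
{
  EventCtx          *ec = (EventCtx*)ctx;
  const PetscScalar *u;
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  ierr = VecGetArrayRead(U, &u);CHKERRQ(ierr);
  if (ec->at_lb) fvalue[0] = 1.0;
  else           fvalue[0] = u[ec->comp] - ec->lb;
  ierr = VecRestoreArrayRead(U, &u);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Post-event function: record that the bound was hit; the IFunction would then
   switch the equation for this component to x - lb = 0 to hold it at the bound. */
static PetscErrorCode PostEventFunction(TS ts, PetscInt nevents, PetscInt event_list[], PetscReal t, Vec U, PetscBool forwardsolve, void *ctx)
{
  EventCtx *ec = (EventCtx*)ctx;

  PetscFunctionBeginUser;
  if (nevents) ec->at_lb = PETSC_TRUE;
  PetscFunctionReturn(0);
}

static PetscErrorCode AttachEvent(TS ts, EventCtx *ec)
{
  PetscInt       direction[1] = {-1};          /* detect the value crossing zero from above */
  PetscBool      terminate[1] = {PETSC_FALSE};
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = TSSetEventHandler(ts, 1, direction, terminate, EventFunction, PostEventFunction, ec);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The releasing condition on \dot{x} described above would be a second entry in fvalue with a positive crossing direction; it is left out here only to keep the sketch short.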
>> >> Thanks, >> Shri >> >> >> From: petsc-users > on behalf of "Huck, Moritz via petsc-users" > >> Reply-To: "Huck, Moritz" > >> Date: Monday, August 5, 2019 at 5:18 AM >> To: "petsc-users at mcs.anl.gov" > >> Subject: [petsc-users] Problem with TS and SNES VI >> >> Hi, >> I am trying to solve a DAE with the ARKIMEX solver, which works mostly fine. >> The problem arises when some states go to unphysical values. I try to constrain my states with SNESVISetVariableBounds (through the petsc4py interface). >> But TS seems not respect this e.g. I have a state with is usually between 1 and 1e3 for which I set a lower bound of 1, but the state goes t0 -0.8 at some points. >> Are there some tolerances I have to set for VI or something like this? >> >> Best Regards, >> Moritz > From bsmith at mcs.anl.gov Wed Aug 28 11:13:59 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 28 Aug 2019 16:13:59 +0000 Subject: [petsc-users] Creating Local Copy of locked read only Vec In-Reply-To: References: Message-ID: <82153D01-2606-4546-A790-316624B220BB@anl.gov> You should be able to do VecDuplicate() and then VecCopy() into the duplicate but there may be a bug with the lock being passed through the VecDuplicate() we'll check this and get back to your soon. Barry > On Aug 28, 2019, at 10:38 AM, Andrew Holm via petsc-users wrote: > > Hello, > > We are working with an application using SNES and having a problem with the SNESFunction f called by SNESSetFunction(SNES snes,Vec r,PetscErrorCode (*f)(SNES,Vec,Vec,void*),void *ctx). > > We would like our SNESFunction(SNES snes,Vec x,Vec f,void *ctx) to be able to modify Vec x each time the function is called, how ever the vector is locked read only. We are running PETSc 3.9.3 with debugging turned on which tells us that Vec x is locked read only. > > To get around this, we have tried to make a local copy of the Vec x using VecCopy (x, xCopy). This gave us the error: > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 2 > > > We have also tried using VecDuplicate(vecWithDesiredLayout, &xLocalCopy) and then VecSetValues to fill the xLocalCopy with the values of Vec x. This gave the error: > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Vec is locked read only, argument # 2 > > > Also, we have tried using VecLockPop in PETSc 3.9.3 to overide the the lock however this seems gives a "Vector has been unlocked too many times" error at runtime: > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Vector has been unlocked too many times > > > Any advice on how to make a local, modifiable copy of the locked read only vector? Any and All help is appreciated! > > Andy Holm > Aeroscience Engineer > Kratos Defense and Security Solutions > 4904 Research Drive > Huntsville, AL 35805 From skavou1 at lsu.edu Wed Aug 28 13:23:57 2019 From: skavou1 at lsu.edu (Sepideh Kavousi) Date: Wed, 28 Aug 2019 18:23:57 +0000 Subject: [petsc-users] Caught signal number 15 Terminate Message-ID: Dear Petsc users, I have a code which is very strange that works for some cases and for some cases it gives an error. 
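(Referring back to the locked read-only Vec thread above.) Barry's VecDuplicate()/VecCopy() suggestion could look like the minimal sketch below inside a SNES residual callback; FormFunction and xCopy are illustrative names, and whether the read-only lock is inadvertently carried through VecDuplicate() in PETSc 3.9.3 is exactly the open question in that thread. The "Null Object: Parameter # 2" error reported there is typically what VecCopy() produces when the destination vector was never created, which is why the VecDuplicate() call has to come first.

#include <petscsnes.h>

/* Sketch: work on a writable copy of the (read-only) SNES iterate x. */
static PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ctx)
{
  Vec            xCopy;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(x, &xCopy);CHKERRQ(ierr);  /* same layout as x, values not copied */
  ierr = VecCopy(x, xCopy);CHKERRQ(ierr);        /* copy the current iterate */
  /* ... modify xCopy freely and evaluate the residual into f ... */
  ierr = VecDestroy(&xCopy);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Duplicating once outside the callback and reusing the work vector would avoid the repeated allocation; note that changes made to the copy are local to the residual evaluation and are not seen by SNES.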
I run the following command: mpiexec -np 20 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -malloc_dump the error is as following and nothing is written in valgrid log files. Can someone please help me on how to solve the problem? the system size is 600x1200 and I have two degrees of freedom per grid point. Can this be due to the fact that I need to assign a larger number of processors? How should I know if the problem is related to this or anything else? Thanks, Sepideh . . . . . . [ 0]16 bytes PetscStrallocpy() line 197 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/str.c [0] PetscStrallocpy() line 197 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/str.c [0] PetscClassRegLogRegister() line 242 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c [0] PetscClassIdRegister() line 2042 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/plog.c [0] PetscSysInitializePackage() line 45 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/classes/viewer/interface/dlregispetsc.c [0] DMCreate() line 36 in /home/skavou1/Downloads/petsc-3.7.3/src/dm/interface/dm.c [0] DMDACreate() line 460 in /home/skavou1/Downloads/petsc-3.7.3/src/dm/impls/da/dacreate.c [0] DMDACreate2d() line 854 in /home/skavou1/Downloads/petsc-3.7.3/src/dm/impls/da/da2.c [ 0]10032 bytes PetscSegBufferCreate() line 68 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c [ 0]16 bytes PetscSegBufferCreate() line 67 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c [ 0]16 bytes PetscOptionsHelpPrintedCreate() line 50 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/classes/viewer/interface/viewreg.c [ 0]3200 bytes ClassPerfLogCreate() line 121 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c [ 0]16 bytes ClassPerfLogCreate() line 116 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c [ 0]16 bytes EventPerfLogCreate() line 92 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/eventlog.c [ 0]16 bytes PetscStrallocpy() line 197 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/str.c [ 0]1600 bytes PetscClassRegLogCreate() line 36 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c [ 0]16 bytes PetscClassRegLogCreate() line 31 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c [ 0]16 bytes EventRegLogCreate() line 34 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/eventlog.c [ 0]1280 bytes PetscStageLogCreate() line 635 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stagelog.c [ 0]512 bytes PetscIntStackCreate() line 177 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stack.c [ 0]16 bytes PetscIntStackCreate() line 172 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stack.c [ 0]48 bytes PetscStageLogCreate() line 628 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stagelog.c [ 0]10032 bytes PetscSegBufferCreate() line 68 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c [ 0]16 bytes PetscSegBufferCreate() line 67 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c [ 0]32 bytes PetscPushSignalHandler() line 307 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/error/signal.c [0]PETSC ERROR: Memory requested 414030660 [0]PETSC ERROR: See 
http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 MatFDColoringSetUpBlocked_AIJ_Private() line 122 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/memory/mtr.c [0]PETSC ERROR: #3 MatFDColoringSetUpBlocked_AIJ_Private() line 122 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: #4 MatFDColoringSetUp_SeqXAIJ() line 277 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: #5 MatFDColoringSetUp() line 252 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c [0]PETSC ERROR: #6 SNESComputeJacobianDefaultColor() line 76 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c [0]PETSC ERROR: #7 SNESComputeJacobian() line 2312 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: #8 SNESSolve_KSPONLY() line 38 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: #9 SNESSolve() line 4005 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: #10 TS_SNESSolve() line 160 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: #11 TSBDF_Restart() line 181 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: #12 TSStep_BDF() line 216 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: #13 TSStep() line 3733 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: #14 TSSolve() line 3982 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: #15 main() line 267 in /data/skavou1/Ni-Nb/Fig-3/one.c [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -ksp_gmres_restart 1001 [0]PETSC ERROR: -ksp_type gmres [0]PETSC ERROR: -malloc_dump [0]PETSC ERROR: -pc_type bjacobi [0]PETSC ERROR: -snes_fd_color [0]PETSC ERROR: -snes_linesearch_type l2 [0]PETSC ERROR: -snes_type ksponly [0]PETSC ERROR: -sub_ksp_type preonly [0]PETSC ERROR: -sub_pc_type ilu [0]PETSC ERROR: -ts_bdf_adapt [0]PETSC ERROR: -ts_max_snes_failures -1 [0]PETSC ERROR: -ts_monitor [0]PETSC ERROR: -ts_type bdf [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 55) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 55) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the 
function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] MatFDColoringSetUpBlocked_AIJ_Private line 67 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] MatFDColoringSetUpBlocked_AIJ_Private line 67 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] MatFDColoringSetUpBlocked_AIJ_Private line 67 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] MatFDColoringSetUpBlocked_AIJ_Private line 67 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: PetscMallocValidate: error detected at PetscSignalHandlerDefault() line 149 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/error/signal.c [0]PETSC ERROR: Memory [id=0(414029056)] at address 0x2cfac680 already freed [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Memory corruption: http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind [0]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack [0]PETSC ERROR: #1 PetscMallocValidate() line 144 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/memory/mtr.c [0]PETSC ERROR: #2 PetscSignalHandlerDefault() line 149 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/error/signal.c application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0 -------------------------------------------------------------------------- mpiexec noticed that process rank 15 with PID 0 on node kratos exited on signal 9 (Killed). -------------------------------------------------------------------------- (base) skavou1 at kratos:/data/skavou1/Ni-Nb/Fig-3$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 28 15:07:38 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 28 Aug 2019 20:07:38 +0000 Subject: [petsc-users] Caught signal number 15 Terminate In-Reply-To: References: Message-ID: Process 0 tried to allocate 414,030,660 bytes which is not a huge amount of memory, this failed and MPI then shut down the other processes with a kill -15 Either your program or it in combination with other people's is using up the memory with this process. Not really anything to do with PETSc directly. As the program runs you can get a rough of how much memory it is using with top on linux or Activity Monitor on Apple. PETSc also has limited ways of tracking memory usage with one being the option you are using that prints how much is allocated in each function. The master branch has even more mays which can indicate in which method the various memory is being allocated. But in your case you just have to run smaller jobs or get more memory in your machines. A 600x1200 grid is not huge but ... Barry > On Aug 28, 2019, at 1:23 PM, Sepideh Kavousi via petsc-users wrote: > > Dear Petsc users, > I have a code which is very strange that works for some cases and for some cases it gives an error. I run the following command: > > mpiexec -np 20 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./one.out -ts_monitor -snes_fd_color -ts_max_snes_failures -1 -ts_type bdf -ts_bdf_adapt -pc_type bjacobi -snes_linesearch_type l2 -snes_type ksponly -ksp_type gmres -ksp_gmres_restart 1001 -sub_pc_type ilu -sub_ksp_type preonly -malloc_dump > > the error is as following and nothing is written in valgrid log files. Can someone please help me on how to solve the problem? the system size is 600x1200 and I have two degrees of freedom per grid point. Can this be due to the fact that I need to assign a larger number of processors? > How should I know if the problem is related to this or anything else? > > Thanks, > Sepideh > > . > . > . > . > . > . 
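As a rough illustration of the memory tracking mentioned above, the small C fragment below queries how much memory the current process and PETSc's own allocator are using; it could be called from a TS monitor or between solves to see whether usage climbs toward the limit before the MatFDColoringSetUp() allocation fails. PetscMemoryGetCurrentUsage() and PetscMallocGetCurrentUsage() are existing PETSc routines, but the exact numbers they report are platform dependent, and the malloc figure is only meaningful when PETSc's tracking malloc is active (the default in debug builds).

#include <petscsys.h>

/* Sketch: print per-process memory use (rank 0 only; PetscSynchronizedPrintf()
   would show every rank). */
static PetscErrorCode ReportMemory(MPI_Comm comm, const char *label)
{
  PetscLogDouble rss, mal;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);   /* resident set size of this process */
  ierr = PetscMallocGetCurrentUsage(&mal);CHKERRQ(ierr);   /* bytes currently obtained via PetscMalloc */
  ierr = PetscPrintf(comm, "%s: rss %g bytes, PetscMalloc %g bytes\n", label, rss, mal);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}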
> [ 0]16 bytes PetscStrallocpy() line 197 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/str.c > [0] PetscStrallocpy() line 197 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/str.c > [0] PetscClassRegLogRegister() line 242 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c > [0] PetscClassIdRegister() line 2042 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/plog.c > [0] PetscSysInitializePackage() line 45 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/classes/viewer/interface/dlregispetsc.c > [0] DMCreate() line 36 in /home/skavou1/Downloads/petsc-3.7.3/src/dm/interface/dm.c > [0] DMDACreate() line 460 in /home/skavou1/Downloads/petsc-3.7.3/src/dm/impls/da/dacreate.c > [0] DMDACreate2d() line 854 in /home/skavou1/Downloads/petsc-3.7.3/src/dm/impls/da/da2.c > [ 0]10032 bytes PetscSegBufferCreate() line 68 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c > [ 0]16 bytes PetscSegBufferCreate() line 67 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c > [ 0]16 bytes PetscOptionsHelpPrintedCreate() line 50 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/classes/viewer/interface/viewreg.c > [ 0]3200 bytes ClassPerfLogCreate() line 121 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c > [ 0]16 bytes ClassPerfLogCreate() line 116 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c > [ 0]16 bytes EventPerfLogCreate() line 92 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/eventlog.c > [ 0]16 bytes PetscStrallocpy() line 197 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/str.c > [ 0]1600 bytes PetscClassRegLogCreate() line 36 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c > [ 0]16 bytes PetscClassRegLogCreate() line 31 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/classlog.c > [ 0]16 bytes EventRegLogCreate() line 34 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/eventlog.c > [ 0]1280 bytes PetscStageLogCreate() line 635 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stagelog.c > [ 0]512 bytes PetscIntStackCreate() line 177 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stack.c > [ 0]16 bytes PetscIntStackCreate() line 172 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stack.c > [ 0]48 bytes PetscStageLogCreate() line 628 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/logging/utils/stagelog.c > [ 0]10032 bytes PetscSegBufferCreate() line 68 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c > [ 0]16 bytes PetscSegBufferCreate() line 67 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/utils/segbuffer.c > [ 0]32 bytes PetscPushSignalHandler() line 307 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/error/signal.c > [0]PETSC ERROR: Memory requested 414030660 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 > [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack > [0]PETSC ERROR: #1 MatFDColoringSetUpBlocked_AIJ_Private() line 122 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c > [0]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/memory/mtr.c > [0]PETSC ERROR: #3 MatFDColoringSetUpBlocked_AIJ_Private() line 122 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c > [0]PETSC ERROR: #4 MatFDColoringSetUp_SeqXAIJ() line 277 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c > [0]PETSC ERROR: #5 MatFDColoringSetUp() line 252 in /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c > [0]PETSC ERROR: #6 SNESComputeJacobianDefaultColor() line 76 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c > [0]PETSC ERROR: #7 SNESComputeJacobian() line 2312 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: #8 SNESSolve_KSPONLY() line 38 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c > [0]PETSC ERROR: #9 SNESSolve() line 4005 in /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: #10 TS_SNESSolve() line 160 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: #11 TSBDF_Restart() line 181 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: #12 TSStep_BDF() line 216 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: #13 TSStep() line 3733 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c > [0]PETSC ERROR: #14 TSSolve() line 3982 in /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c > [0]PETSC ERROR: #15 main() line 267 in /data/skavou1/Ni-Nb/Fig-3/one.c > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -ksp_gmres_restart 1001 > [0]PETSC ERROR: -ksp_type gmres > [0]PETSC ERROR: -malloc_dump > [0]PETSC ERROR: -pc_type bjacobi > [0]PETSC ERROR: -snes_fd_color > [0]PETSC ERROR: -snes_linesearch_type l2 > [0]PETSC ERROR: -snes_type ksponly > [0]PETSC ERROR: -sub_ksp_type preonly > [0]PETSC ERROR: -sub_pc_type ilu > [0]PETSC ERROR: -ts_bdf_adapt > [0]PETSC ERROR: -ts_max_snes_failures -1 > [0]PETSC ERROR: -ts_monitor > [0]PETSC ERROR: -ts_type bdf > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > application called MPI_Abort(MPI_COMM_WORLD, 55) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 55) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the 
start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c > [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c > [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c > [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c > [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c > [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 > [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] MatFDColoringSetUp_SeqXAIJ line 184 /home/skavou1/Downloads/petsc-3.7.3/src/mat/impls/aij/seq/fdaij.c > [0]PETSC ERROR: [0] MatFDColoringSetUp line 245 /home/skavou1/Downloads/petsc-3.7.3/src/mat/matfd/fdmatrix.c > [0]PETSC ERROR: [0] SNESComputeJacobianDefaultColor line 62 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snesj2.c > [0]PETSC ERROR: [0] SNES user Jacobian function line 2311 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeJacobian line 2270 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_KSPONLY line 12 /home/skavou1/Downloads/petsc-3.7.3/src/snes/impls/ksponly/ksponly.c > [0]PETSC ERROR: [0] SNESSolve line 3958 /home/skavou1/Downloads/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] TS_SNESSolve line 159 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: [0] TSBDF_Restart line 174 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: [0] TSStep_BDF line 204 /home/skavou1/Downloads/petsc-3.7.3/src/ts/impls/bdf/bdf.c > [0]PETSC ERROR: [0] TSStep line 3712 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c > [0]PETSC ERROR: [0] TSSolve line 3933 /home/skavou1/Downloads/petsc-3.7.3/src/ts/interface/ts.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 > [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: PetscMallocValidate: error detected at PetscSignalHandlerDefault() line 149 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/error/signal.c > [0]PETSC ERROR: Memory [id=0(414029056)] at address 0x2cfac680 already freed > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Memory corruption: http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind > [0]PETSC ERROR: > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./one.out on a one named kratos by skavou1 Wed Aug 28 12:41:38 2019 > [0]PETSC ERROR: Configure options PETSC_ARCH=one --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack > [0]PETSC ERROR: #1 PetscMallocValidate() line 144 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/memory/mtr.c > [0]PETSC ERROR: #2 PetscSignalHandlerDefault() line 149 in /home/skavou1/Downloads/petsc-3.7.3/src/sys/error/signal.c > application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0 > -------------------------------------------------------------------------- > mpiexec noticed that process rank 15 with PID 0 on node kratos exited on signal 9 (Killed). > -------------------------------------------------------------------------- > (base) skavou1 at kratos:/data/skavou1/Ni-Nb/Fig-3$ From bsmith at mcs.anl.gov Wed Aug 28 17:25:29 2019 From: bsmith at mcs.anl.gov (Smith, Barry F.) Date: Wed, 28 Aug 2019 22:25:29 +0000 Subject: [petsc-users] Creating Local Copy of locked read only Vec In-Reply-To: References: Message-ID: I tested ierr = VecLockPush(x); ierr = VecDuplicate(x,&y);CHKERRQ(ierr); PetscScalar *array; VecGetArray(y,&array); VecRestoreArray(y,&array); ierr = VecLockPop(x); And it ran fine with version 3.9.3; please double check that you are not trying to access the locked vector. If the problem persists please send a small standalone code that reproduces the problem. Barry > On Aug 28, 2019, at 10:38 AM, Andrew Holm via petsc-users wrote: > > Hello, > > We are working with an application using SNES and having a problem with the SNESFunction f called by SNESSetFunction(SNES snes,Vec r,PetscErrorCode (*f)(SNES,Vec,Vec,void*),void *ctx). > > We would like our SNESFunction(SNES snes,Vec x,Vec f,void *ctx) to be able to modify Vec x each time the function is called, how ever the vector is locked read only. We are running PETSc 3.9.3 with debugging turned on which tells us that Vec x is locked read only. > > To get around this, we have tried to make a local copy of the Vec x using VecCopy (x, xCopy). This gave us the error: > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Object: Parameter # 2 > > > We have also tried using VecDuplicate(vecWithDesiredLayout, &xLocalCopy) and then VecSetValues to fill the xLocalCopy with the values of Vec x. This gave the error: > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Vec is locked read only, argument # 2 > > > Also, we have tried using VecLockPop in PETSc 3.9.3 to overide the the lock however this seems gives a "Vector has been unlocked too many times" error at runtime: > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Vector has been unlocked too many times > > > Any advice on how to make a local, modifiable copy of the locked read only vector? Any and All help is appreciated! > > Andy Holm > Aeroscience Engineer > Kratos Defense and Security Solutions > 4904 Research Drive > Huntsville, AL 35805 From swarnava89 at gmail.com Wed Aug 28 23:58:48 2019 From: swarnava89 at gmail.com (Swarnava Ghosh) Date: Wed, 28 Aug 2019 21:58:48 -0700 Subject: [petsc-users] Ignore points outside the mesh in DMInterpolationSetUP Message-ID: Hi Petsc team, I am trying to setup an interpolator. Is there any way to force DMInterpolationSetUp to ignore points which are not inside the mesh. 
My code fails with the following error: [1]PETSC ERROR: --------------------- Error Message ------------------------------------\ -------------------------- [1]PETSC ERROR: Petsc has generated inconsistent data [1]PETSC ERROR: Point 0: -5.4387 -2.15493 18.465 not located in mesh [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shoo\ ting. [1]PETSC ERROR: Petsc Release Version 3.8.4, Mar, 24, 2018 [1]PETSC ERROR: ../lib/cgdft on a arch-linux2-c-opt named hpc-22-08 by swarnava Wed Aug \ 28 21:52:24 2019 [1]PETSC ERROR: Configure options --prefix=/software/PETSc/3.8.4-intel --with-cc=mpicc -\ -with-cxx=mpicxx --with-fc=mpif90 --with-blaslapack-dir=/software/Intel/2018.1/compilers\ _and_libraries_2018.1.163/linux/mkl/lib/intel64_lin --with-debugging=no --with-shared-li\ braries=0 --download-metis --download-parmetis --download-superlu_dist [1]PETSC ERROR: #1 DMInterpolationSetUp() line 177 in /groups/hpc-support/install/PETSc/\ petsc-3.8.4_intel_no-debug/src/snes/utils/dmplexsnes.c [1]PETSC ERROR: #2 SetupInterpolator() line 250 in ./src/cgdft_finemesh.cc Sincerely, Swarnava -------------- next part -------------- An HTML attachment was scrubbed... URL: From mpovolot at purdue.edu Thu Aug 29 13:56:12 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Thu, 29 Aug 2019 18:56:12 +0000 Subject: [petsc-users] question about CISS Message-ID: Hello everyone, this is a question about? SLEPc. The problem that I need to solve is as follows. I have a matrix and I need a full spectrum of it (both eigenvalues and eigenvectors). The regular way is to use Lapack, but it is slow. I decided to try the following: a) compute the bounds of the spectrum using Krylov Schur approach. b) divide the complex eigenvalue plane into rectangular areas, then apply CISS to each area in parallel. However, I found that the solver is missing some eigenvalues, even if my rectangles cover the whole spectral area. My question: can this approach work in principle? If yes, how one can set-up CISS solver to not loose the eigenvalues? Thank you, Michael. From sam.guo at cd-adapco.com Thu Aug 29 14:31:44 2019 From: sam.guo at cd-adapco.com (Sam Guo) Date: Thu, 29 Aug 2019 15:31:44 -0400 Subject: [petsc-users] petsc on windows Message-ID: Dear PETSc dev team, I am looking some tips porting petsc to windows. We have our mpi wrapper (so we can switch different mpi). 
I configure petsc using --with-mpi-lib and --with-mpi-include ./configure --with-cc="win32fe cl" --with-fc=0 --download-f2cblaslapack --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include --with-shared-libaries=1 But I got error =============================================================================== Configuring PETSc to compile on your system =============================================================================== TESTING: check from config.libraries(config/BuildSystem/config/libraries.py:154) ******************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): ------------------------------------------------------------------------------- --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] and --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] did not work ******************************************************************************* To fix the configuration error, in config/BuildSystem/config/package.py, I removed self.executeTest(self.checkDependencies) self.executeTest(self.configureLibrary) self.executeTest(self.checkSharedLibrary) To link, I add my mpi wrapper to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: PCC_LINKER_FLAGS = -MD -wd4996 -Z7 /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib I got libpetstc.dll and libpetstc.lib. When I try to test it inside our code, PETSc somehow crates a duplicate of communicator with only 1 MPI process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD to 1 (our MPI_COMM_WORLD), PETSc is hanging. I am wondering if you could give me some tips how to debug this problem. BR, Sam -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 29 14:38:47 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 Aug 2019 15:38:47 -0400 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc dev team, > I am looking some tips porting petsc to windows. We have our mpi > wrapper (so we can switch different mpi). I configure petsc using > --with-mpi-lib and --with-mpi-include > ./configure --with-cc="win32fe cl" --with-fc=0 --download-f2cblaslapack > --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include > --with-shared-libaries=1 > > But I got error > > =============================================================================== > Configuring PETSc to compile on your system > > =============================================================================== > TESTING: check from > config.libraries(config/BuildSystem/config/libraries.py:154) > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] > and > --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] did > not work > > ******************************************************************************* > Your MPI wrapper should pass the tests here. 
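For reference, the failing step is just a link check of a couple of MPI symbols against the wrapper library; a minimal by-hand version of it, mirroring the conftest.c and link line that show up in the configure.log further down (paths as in the configure command above):

  /* conftest.c -- roughly what configure's MPI library test links */
  char MPI_Init();
  char MPI_Comm_create();
  int main() { MPI_Init(); MPI_Comm_create(); return 0; }

  /* by-hand link, analogous to configure's link line:
       win32fe cl -o conftest.exe -MD conftest.c \
         /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib Ws2_32.lib
     If MPI_Init is reported as an unresolved external here as well, the import
     library is not exposing the plain C MPI symbols (name decoration or calling
     convention), which is what the configure test trips over. */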
Send the configure.log > To fix the configuration error, in config/BuildSystem/config/package.py, > I removed > self.executeTest(self.checkDependencies) > self.executeTest(self.configureLibrary) > self.executeTest(self.checkSharedLibrary) > > To link, I add my mpi wrapper > to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > I got libpetstc.dll and libpetstc.lib. When I try to test it inside our > code, PETSc somehow crates a duplicate of communicator with only 1 MPI > process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD to 1 > (our MPI_COMM_WORLD), PETSc is hanging. > We do dup the communicator on entry. Shouldn't that be supported by your wrapper? Thanks, Matt > I am wondering if you could give me some tips how to debug this problem. > > BR, > Sam > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Aug 29 14:44:14 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 29 Aug 2019 21:44:14 +0200 Subject: [petsc-users] question about CISS In-Reply-To: References: Message-ID: The CISS solver is supposed to estimate the number of eigenvalues contained in the contour. My impression is that the estimation is less accurate in case of rectangular contours, compared to elliptic ones. But of course, with ellipses it is not possible to fully cover the complex plane unless there is some overlap. Jose > El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users escribi?: > > Hello everyone, > > this is a question about SLEPc. > > The problem that I need to solve is as follows. > > I have a matrix and I need a full spectrum of it (both eigenvalues and > eigenvectors). > > The regular way is to use Lapack, but it is slow. I decided to try the > following: > > a) compute the bounds of the spectrum using Krylov Schur approach. > > b) divide the complex eigenvalue plane into rectangular areas, then > apply CISS to each area in parallel. > > However, I found that the solver is missing some eigenvalues, even if my > rectangles cover the whole spectral area. > > My question: can this approach work in principle? If yes, how one can > set-up CISS solver to not loose the eigenvalues? > > Thank you, > > Michael. > From mpovolot at purdue.edu Thu Aug 29 14:55:18 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Thu, 29 Aug 2019 19:55:18 +0000 Subject: [petsc-users] question about CISS In-Reply-To: References: Message-ID: Thank you, Jose, what about rings? Are they better than rectangles? Michael. On 08/29/2019 03:44 PM, Jose E. Roman wrote: > The CISS solver is supposed to estimate the number of eigenvalues contained in the contour. My impression is that the estimation is less accurate in case of rectangular contours, compared to elliptic ones. But of course, with ellipses it is not possible to fully cover the complex plane unless there is some overlap. > > Jose > > >> El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users escribi?: >> >> Hello everyone, >> >> this is a question about SLEPc. >> >> The problem that I need to solve is as follows. >> >> I have a matrix and I need a full spectrum of it (both eigenvalues and >> eigenvectors). 
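To illustrate the point about elliptic (possibly overlapping) regions, a minimal SLEPc sketch of attaching one such region to a CISS solve; the matrix A, the ierr/CHKERRQ error handling and the center/radius values are assumptions or placeholders, and the calls are the SLEPc 3.10-era RG interface:

  EPS eps;
  RG  rg;

  ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr);
  ierr = EPSSetProblemType(eps,EPS_NHEP);CHKERRQ(ierr);        /* non-symmetric problem */
  ierr = EPSSetType(eps,EPSCISS);CHKERRQ(ierr);
  ierr = EPSGetRG(eps,&rg);CHKERRQ(ierr);
  ierr = RGSetType(rg,RGELLIPSE);CHKERRQ(ierr);
  ierr = RGEllipseSetParameters(rg,0.0,1.0,1.0);CHKERRQ(ierr); /* placeholder center, radius, vscale */
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  ierr = EPSSolve(eps);CHKERRQ(ierr);

The same region can be chosen at run time with -eps_type ciss -rg_type ellipse -rg_ellipse_center ... -rg_ellipse_radius ..., one solve per tile of the plane, with neighbouring ellipses overlapping slightly as suggested.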
>> >> The regular way is to use Lapack, but it is slow. I decided to try the >> following: >> >> a) compute the bounds of the spectrum using Krylov Schur approach. >> >> b) divide the complex eigenvalue plane into rectangular areas, then >> apply CISS to each area in parallel. >> >> However, I found that the solver is missing some eigenvalues, even if my >> rectangles cover the whole spectral area. >> >> My question: can this approach work in principle? If yes, how one can >> set-up CISS solver to not loose the eigenvalues? >> >> Thank you, >> >> Michael. >> From sam.guo at cd-adapco.com Thu Aug 29 15:01:51 2019 From: sam.guo at cd-adapco.com (Sam Guo) Date: Thu, 29 Aug 2019 16:01:51 -0400 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: Thanks for the quick response. Attached please find the configure.log containing the configure error. Regarding our dup, our wrapper does support it. In fact, everything works fine on Linux. I suspect on windows, PETSc picks the system mpi.h somehow. I am investigating it. Thanks, Sam On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley wrote: > On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc dev team, >> I am looking some tips porting petsc to windows. We have our mpi >> wrapper (so we can switch different mpi). I configure petsc using >> --with-mpi-lib and --with-mpi-include >> ./configure --with-cc="win32fe cl" --with-fc=0 --download-f2cblaslapack >> --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include >> --with-shared-libaries=1 >> >> But I got error >> >> =============================================================================== >> Configuring PETSc to compile on your system >> >> =============================================================================== >> TESTING: check from >> config.libraries(config/BuildSystem/config/libraries.py:154) >> ******************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for >> details): >> >> ------------------------------------------------------------------------------- >> --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] >> and >> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] did >> not work >> >> ******************************************************************************* >> > > Your MPI wrapper should pass the tests here. Send the configure.log > > >> To fix the configuration error, in config/BuildSystem/config/package.py, >> I removed >> self.executeTest(self.checkDependencies) >> self.executeTest(self.configureLibrary) >> self.executeTest(self.checkSharedLibrary) >> >> To link, I add my mpi wrapper >> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: >> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 >> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >> >> I got libpetstc.dll and libpetstc.lib. When I try to test it inside our >> code, PETSc somehow crates a duplicate of communicator with only 1 MPI >> process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD to 1 >> (our MPI_COMM_WORLD), PETSc is hanging. >> > > We do dup the communicator on entry. Shouldn't that be supported by your > wrapper? > > Thanks, > > Matt > > >> I am wondering if you could give me some tips how to debug this problem. 
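One way to check the suspicion about the system mpi.h is to compare the header seen at compile time with the library actually linked; a minimal sketch, assuming the wrapper provides MPI-3's MPI_Get_library_version:

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc,char **argv)
  {
    char version[MPI_MAX_LIBRARY_VERSION_STRING];
    int  len;

    MPI_Init(&argc,&argv);
    MPI_Get_library_version(version,&len);
    /* MPI_VERSION/MPI_SUBVERSION come from whichever mpi.h was included at compile
       time; the string comes from the MPI library resolved at link/run time */
    printf("header: MPI %d.%d   library: %s\n",MPI_VERSION,MPI_SUBVERSION,version);
    MPI_Finalize();
    return 0;
  }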
>> >> BR, >> Sam >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 108 bytes Desc: not available URL: From jroman at dsic.upv.es Thu Aug 29 15:14:33 2019 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 29 Aug 2019 22:14:33 +0200 Subject: [petsc-users] question about CISS In-Reply-To: References: Message-ID: <67F9732B-55A2-41DB-8D05-862C40142B2D@dsic.upv.es> I am not an expert in contour integral eigensolvers. I think difficulties come with corners, so ellipses are the best choice. I don't think ring regions are relevant here. Have you considered using ScaLAPACK. Some time ago we were able to address problems of size up to 400k https://doi.org/10.1017/jfm.2016.208 Jose > El 29 ago 2019, a las 21:55, Povolotskyi, Mykhailo escribi?: > > Thank you, Jose, > > what about rings? Are they better than rectangles? > > Michael. > > > On 08/29/2019 03:44 PM, Jose E. Roman wrote: >> The CISS solver is supposed to estimate the number of eigenvalues contained in the contour. My impression is that the estimation is less accurate in case of rectangular contours, compared to elliptic ones. But of course, with ellipses it is not possible to fully cover the complex plane unless there is some overlap. >> >> Jose >> >> >>> El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users escribi?: >>> >>> Hello everyone, >>> >>> this is a question about SLEPc. >>> >>> The problem that I need to solve is as follows. >>> >>> I have a matrix and I need a full spectrum of it (both eigenvalues and >>> eigenvectors). >>> >>> The regular way is to use Lapack, but it is slow. I decided to try the >>> following: >>> >>> a) compute the bounds of the spectrum using Krylov Schur approach. >>> >>> b) divide the complex eigenvalue plane into rectangular areas, then >>> apply CISS to each area in parallel. >>> >>> However, I found that the solver is missing some eigenvalues, even if my >>> rectangles cover the whole spectral area. >>> >>> My question: can this approach work in principle? If yes, how one can >>> set-up CISS solver to not loose the eigenvalues? >>> >>> Thank you, >>> >>> Michael. >>> > From sam.guo at cd-adapco.com Thu Aug 29 15:16:18 2019 From: sam.guo at cd-adapco.com (Sam Guo) Date: Thu, 29 Aug 2019 16:16:18 -0400 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: Sorry, attachment again. On Thu, Aug 29, 2019 at 4:01 PM Sam Guo wrote: > Thanks for the quick response. Attached please find the configure.log > containing the configure error. > > Regarding our dup, our wrapper does support it. In fact, everything works > fine on Linux. I suspect on windows, PETSc picks the system mpi.h somehow. > I am investigating it. > > Thanks, > Sam > > On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley wrote: > >> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear PETSc dev team, >>> I am looking some tips porting petsc to windows. We have our mpi >>> wrapper (so we can switch different mpi). 
I configure petsc using >>> --with-mpi-lib and --with-mpi-include >>> ./configure --with-cc="win32fe cl" --with-fc=0 --download-f2cblaslapack >>> --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >>> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include >>> --with-shared-libaries=1 >>> >>> But I got error >>> >>> =============================================================================== >>> Configuring PETSc to compile on your system >>> >>> =============================================================================== >>> TESTING: check from >>> config.libraries(config/BuildSystem/config/libraries.py:154) >>> ******************************************************************************* >>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>> for details): >>> >>> ------------------------------------------------------------------------------- >>> --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] >>> and >>> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] >>> did not work >>> >>> ******************************************************************************* >>> >> >> Your MPI wrapper should pass the tests here. Send the configure.log >> >> >>> To fix the configuration error, in >>> config/BuildSystem/config/package.py, I removed >>> self.executeTest(self.checkDependencies) >>> self.executeTest(self.configureLibrary) >>> self.executeTest(self.checkSharedLibrary) >>> >>> To link, I add my mpi wrapper >>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: >>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 >>> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >>> >>> I got libpetstc.dll and libpetstc.lib. When I try to test it inside our >>> code, PETSc somehow crates a duplicate of communicator with only 1 MPI >>> process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD to 1 >>> (our MPI_COMM_WORLD), PETSc is hanging. >>> >> >> We do dup the communicator on entry. Shouldn't that be supported by your >> wrapper? >> >> Thanks, >> >> Matt >> >> >>> I am wondering if you could give me some tips how to debug this problem. >>> >>> BR, >>> Sam >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1030790 bytes Desc: not available URL: From mpovolot at purdue.edu Thu Aug 29 15:20:10 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Thu, 29 Aug 2019 20:20:10 +0000 Subject: [petsc-users] question about CISS In-Reply-To: <67F9732B-55A2-41DB-8D05-862C40142B2D@dsic.upv.es> References: <67F9732B-55A2-41DB-8D05-862C40142B2D@dsic.upv.es> Message-ID: <5326c756-2e7b-6e85-4461-25cb308c0693@purdue.edu> Thank you for suggestion. Is it interfaced to SLEPC? On 08/29/2019 04:14 PM, Jose E. Roman wrote: > I am not an expert in contour integral eigensolvers. I think difficulties come with corners, so ellipses are the best choice. I don't think ring regions are relevant here. > > Have you considered using ScaLAPACK. 
Some time ago we were able to address problems of size up to 400k https://doi.org/10.1017/jfm.2016.208 > > Jose > > >> El 29 ago 2019, a las 21:55, Povolotskyi, Mykhailo escribi?: >> >> Thank you, Jose, >> >> what about rings? Are they better than rectangles? >> >> Michael. >> >> >> On 08/29/2019 03:44 PM, Jose E. Roman wrote: >>> The CISS solver is supposed to estimate the number of eigenvalues contained in the contour. My impression is that the estimation is less accurate in case of rectangular contours, compared to elliptic ones. But of course, with ellipses it is not possible to fully cover the complex plane unless there is some overlap. >>> >>> Jose >>> >>> >>>> El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users escribi?: >>>> >>>> Hello everyone, >>>> >>>> this is a question about SLEPc. >>>> >>>> The problem that I need to solve is as follows. >>>> >>>> I have a matrix and I need a full spectrum of it (both eigenvalues and >>>> eigenvectors). >>>> >>>> The regular way is to use Lapack, but it is slow. I decided to try the >>>> following: >>>> >>>> a) compute the bounds of the spectrum using Krylov Schur approach. >>>> >>>> b) divide the complex eigenvalue plane into rectangular areas, then >>>> apply CISS to each area in parallel. >>>> >>>> However, I found that the solver is missing some eigenvalues, even if my >>>> rectangles cover the whole spectral area. >>>> >>>> My question: can this approach work in principle? If yes, how one can >>>> set-up CISS solver to not loose the eigenvalues? >>>> >>>> Thank you, >>>> >>>> Michael. >>>> From knepley at gmail.com Thu Aug 29 15:28:21 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 Aug 2019 16:28:21 -0400 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: On Thu, Aug 29, 2019 at 4:02 PM Sam Guo wrote: > Thanks for the quick response. Attached please find the configure.log > containing the configure error. > Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o -I/tmp/petsc-6DsCEk/config.compilers -I/tmp/petsc-6DsCEk/config.setCompilers -I/tmp/petsc-6DsCEk/config.utilities.closure -I/tmp/petsc-6DsCEk/config.headers -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails -I/tmp/petsc-6DsCEk/config.types -I/tmp/petsc-6DsCEk/config.atomics -I/tmp/petsc-6DsCEk/config.functions -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros -I/tmp/petsc-6DsCEk/config.utilities.missing -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 /tmp/petsc-6DsCEk/config.libraries/conftest.c stdout: conftest.c Successful compile: Source: #include "confdefs.h" #include "conffix.h" /* Override any gcc2 internal prototype to avoid an error. 
*/ char MPI_Init(); static void _check_MPI_Init() { MPI_Init(); } char MPI_Comm_create(); static void _check_MPI_Comm_create() { MPI_Comm_create(); } int main() { _check_MPI_Init(); _check_MPI_Comm_create();; return 0; } Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD -wd4996 -Z7 /tmp/petsc-6DsCEk/config.libraries/conftest.o /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib Ws2_32.lib stdout: LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or not built by the last incremental link; performing full link conftest.obj : error LNK2019: unresolved external symbol MPI_Init referenced in function _check_MPI_Init conftest.obj : error LNK2019: unresolved external symbol MPI_Comm_create referenced in function _check_MPI_Comm_create C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error LNK1120: 2 unresolved externals Possible ERROR while running linker: exit code 2 stdout: LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or not built by the last incremental link; performing full link conftest.obj : error LNK2019: unresolved external symbol MPI_Init referenced in function _check_MPI_Init conftest.obj : error LNK2019: unresolved external symbol MPI_Comm_create referenced in function _check_MPI_Comm_create C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error LNK1120: 2 unresolved externals The link is definitely failing. Does it work if you do it by hand? Thanks, Matt > Regarding our dup, our wrapper does support it. In fact, everything works > fine on Linux. I suspect on windows, PETSc picks the system mpi.h somehow. > I am investigating it. > > Thanks, > Sam > > On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley wrote: > >> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < >> petsc-users at mcs.anl.gov> wrote: >> >>> Dear PETSc dev team, >>> I am looking some tips porting petsc to windows. We have our mpi >>> wrapper (so we can switch different mpi). I configure petsc using >>> --with-mpi-lib and --with-mpi-include >>> ./configure --with-cc="win32fe cl" --with-fc=0 --download-f2cblaslapack >>> --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >>> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include >>> --with-shared-libaries=1 >>> >>> But I got error >>> >>> =============================================================================== >>> Configuring PETSc to compile on your system >>> >>> =============================================================================== >>> TESTING: check from >>> config.libraries(config/BuildSystem/config/libraries.py:154) >>> ******************************************************************************* >>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>> for details): >>> >>> ------------------------------------------------------------------------------- >>> --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] >>> and >>> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] >>> did not work >>> >>> ******************************************************************************* >>> >> >> Your MPI wrapper should pass the tests here. 
Send the configure.log >> >> >>> To fix the configuration error, in >>> config/BuildSystem/config/package.py, I removed >>> self.executeTest(self.checkDependencies) >>> self.executeTest(self.configureLibrary) >>> self.executeTest(self.checkSharedLibrary) >>> >>> To link, I add my mpi wrapper >>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: >>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 >>> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >>> >>> I got libpetstc.dll and libpetstc.lib. When I try to test it inside our >>> code, PETSc somehow crates a duplicate of communicator with only 1 MPI >>> process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD to 1 >>> (our MPI_COMM_WORLD), PETSc is hanging. >>> >> >> We do dup the communicator on entry. Shouldn't that be supported by your >> wrapper? >> >> Thanks, >> >> Matt >> >> >>> I am wondering if you could give me some tips how to debug this problem. >>> >>> BR, >>> Sam >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Aug 29 15:29:44 2019 From: jed at jedbrown.org (Jed Brown) Date: Thu, 29 Aug 2019 14:29:44 -0600 Subject: [petsc-users] question about CISS In-Reply-To: <5326c756-2e7b-6e85-4461-25cb308c0693@purdue.edu> References: <67F9732B-55A2-41DB-8D05-862C40142B2D@dsic.upv.es> <5326c756-2e7b-6e85-4461-25cb308c0693@purdue.edu> Message-ID: <87imqfr9zb.fsf@jedbrown.org> Elemental also has distributed-memory eigensolvers that should be at least as good as ScaLAPACK's. There is support for Elemental in PETSc, but not yet in SLEPc. "Povolotskyi, Mykhailo via petsc-users" writes: > Thank you for suggestion. > > Is it interfaced to SLEPC? > > > On 08/29/2019 04:14 PM, Jose E. Roman wrote: >> I am not an expert in contour integral eigensolvers. I think difficulties come with corners, so ellipses are the best choice. I don't think ring regions are relevant here. >> >> Have you considered using ScaLAPACK. Some time ago we were able to address problems of size up to 400k https://doi.org/10.1017/jfm.2016.208 >> >> Jose >> >> >>> El 29 ago 2019, a las 21:55, Povolotskyi, Mykhailo escribi?: >>> >>> Thank you, Jose, >>> >>> what about rings? Are they better than rectangles? >>> >>> Michael. >>> >>> >>> On 08/29/2019 03:44 PM, Jose E. Roman wrote: >>>> The CISS solver is supposed to estimate the number of eigenvalues contained in the contour. My impression is that the estimation is less accurate in case of rectangular contours, compared to elliptic ones. But of course, with ellipses it is not possible to fully cover the complex plane unless there is some overlap. >>>> >>>> Jose >>>> >>>> >>>>> El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users escribi?: >>>>> >>>>> Hello everyone, >>>>> >>>>> this is a question about SLEPc. >>>>> >>>>> The problem that I need to solve is as follows. >>>>> >>>>> I have a matrix and I need a full spectrum of it (both eigenvalues and >>>>> eigenvectors). >>>>> >>>>> The regular way is to use Lapack, but it is slow. 
I decided to try the >>>>> following: >>>>> >>>>> a) compute the bounds of the spectrum using Krylov Schur approach. >>>>> >>>>> b) divide the complex eigenvalue plane into rectangular areas, then >>>>> apply CISS to each area in parallel. >>>>> >>>>> However, I found that the solver is missing some eigenvalues, even if my >>>>> rectangles cover the whole spectral area. >>>>> >>>>> My question: can this approach work in principle? If yes, how one can >>>>> set-up CISS solver to not loose the eigenvalues? >>>>> >>>>> Thank you, >>>>> >>>>> Michael. >>>>> From mpovolot at purdue.edu Thu Aug 29 15:32:10 2019 From: mpovolot at purdue.edu (Povolotskyi, Mykhailo) Date: Thu, 29 Aug 2019 20:32:10 +0000 Subject: [petsc-users] question about CISS In-Reply-To: References: <67F9732B-55A2-41DB-8D05-862C40142B2D@dsic.upv.es> <5326c756-2e7b-6e85-4461-25cb308c0693@purdue.edu> <87imqfr9zb.fsf@jedbrown.org> Message-ID: <8072d037-32d1-3723-26ff-a48b54d1dca4@purdue.edu> It is not a symmetric matrix On 08/29/2019 04:30 PM, Matthew Knepley wrote: On Thu, Aug 29, 2019 at 4:29 PM Jed Brown via petsc-users > wrote: Elemental also has distributed-memory eigensolvers that should be at least as good as ScaLAPACK's. There is support for Elemental in PETSc, but not yet in SLEPc. Also if its symmetric, isn't https://elpa.mpcdf.mpg.de/ fairly scalable? Matt "Povolotskyi, Mykhailo via petsc-users" > writes: > Thank you for suggestion. > > Is it interfaced to SLEPC? > > > On 08/29/2019 04:14 PM, Jose E. Roman wrote: >> I am not an expert in contour integral eigensolvers. I think difficulties come with corners, so ellipses are the best choice. I don't think ring regions are relevant here. >> >> Have you considered using ScaLAPACK. Some time ago we were able to address problems of size up to 400k https://doi.org/10.1017/jfm.2016.208 >> >> Jose >> >> >>> El 29 ago 2019, a las 21:55, Povolotskyi, Mykhailo > escribi?: >>> >>> Thank you, Jose, >>> >>> what about rings? Are they better than rectangles? >>> >>> Michael. >>> >>> >>> On 08/29/2019 03:44 PM, Jose E. Roman wrote: >>>> The CISS solver is supposed to estimate the number of eigenvalues contained in the contour. My impression is that the estimation is less accurate in case of rectangular contours, compared to elliptic ones. But of course, with ellipses it is not possible to fully cover the complex plane unless there is some overlap. >>>> >>>> Jose >>>> >>>> >>>>> El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users > escribi?: >>>>> >>>>> Hello everyone, >>>>> >>>>> this is a question about SLEPc. >>>>> >>>>> The problem that I need to solve is as follows. >>>>> >>>>> I have a matrix and I need a full spectrum of it (both eigenvalues and >>>>> eigenvectors). >>>>> >>>>> The regular way is to use Lapack, but it is slow. I decided to try the >>>>> following: >>>>> >>>>> a) compute the bounds of the spectrum using Krylov Schur approach. >>>>> >>>>> b) divide the complex eigenvalue plane into rectangular areas, then >>>>> apply CISS to each area in parallel. >>>>> >>>>> However, I found that the solver is missing some eigenvalues, even if my >>>>> rectangles cover the whole spectral area. >>>>> >>>>> My question: can this approach work in principle? If yes, how one can >>>>> set-up CISS solver to not loose the eigenvalues? >>>>> >>>>> Thank you, >>>>> >>>>> Michael. >>>>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 29 15:30:59 2019 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 29 Aug 2019 16:30:59 -0400 Subject: [petsc-users] question about CISS In-Reply-To: <87imqfr9zb.fsf@jedbrown.org> References: <67F9732B-55A2-41DB-8D05-862C40142B2D@dsic.upv.es> <5326c756-2e7b-6e85-4461-25cb308c0693@purdue.edu> <87imqfr9zb.fsf@jedbrown.org> Message-ID: On Thu, Aug 29, 2019 at 4:29 PM Jed Brown via petsc-users < petsc-users at mcs.anl.gov> wrote: > Elemental also has distributed-memory eigensolvers that should be at > least as good as ScaLAPACK's. There is support for Elemental in PETSc, > but not yet in SLEPc. > Also if its symmetric, isn't https://elpa.mpcdf.mpg.de/ fairly scalable? Matt > "Povolotskyi, Mykhailo via petsc-users" writes: > > > Thank you for suggestion. > > > > Is it interfaced to SLEPC? > > > > > > On 08/29/2019 04:14 PM, Jose E. Roman wrote: > >> I am not an expert in contour integral eigensolvers. I think > difficulties come with corners, so ellipses are the best choice. I don't > think ring regions are relevant here. > >> > >> Have you considered using ScaLAPACK. Some time ago we were able to > address problems of size up to 400k https://doi.org/10.1017/jfm.2016.208 > >> > >> Jose > >> > >> > >>> El 29 ago 2019, a las 21:55, Povolotskyi, Mykhailo < > mpovolot at purdue.edu> escribi?: > >>> > >>> Thank you, Jose, > >>> > >>> what about rings? Are they better than rectangles? > >>> > >>> Michael. > >>> > >>> > >>> On 08/29/2019 03:44 PM, Jose E. Roman wrote: > >>>> The CISS solver is supposed to estimate the number of eigenvalues > contained in the contour. My impression is that the estimation is less > accurate in case of rectangular contours, compared to elliptic ones. But of > course, with ellipses it is not possible to fully cover the complex plane > unless there is some overlap. > >>>> > >>>> Jose > >>>> > >>>> > >>>>> El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users < > petsc-users at mcs.anl.gov> escribi?: > >>>>> > >>>>> Hello everyone, > >>>>> > >>>>> this is a question about SLEPc. > >>>>> > >>>>> The problem that I need to solve is as follows. > >>>>> > >>>>> I have a matrix and I need a full spectrum of it (both eigenvalues > and > >>>>> eigenvectors). > >>>>> > >>>>> The regular way is to use Lapack, but it is slow. I decided to try > the > >>>>> following: > >>>>> > >>>>> a) compute the bounds of the spectrum using Krylov Schur approach. > >>>>> > >>>>> b) divide the complex eigenvalue plane into rectangular areas, then > >>>>> apply CISS to each area in parallel. > >>>>> > >>>>> However, I found that the solver is missing some eigenvalues, even > if my > >>>>> rectangles cover the whole spectral area. > >>>>> > >>>>> My question: can this approach work in principle? If yes, how one can > >>>>> set-up CISS solver to not loose the eigenvalues? > >>>>> > >>>>> Thank you, > >>>>> > >>>>> Michael. > >>>>> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
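For the CISS discussion above, a minimal sketch of how a single elliptic region can be set up with the SLEPc 3.10-era C API. The toy diagonal matrix, the region centre/radius and the printed message are placeholders rather than anything taken from the thread; with real scalars the ellipse has to stay symmetric about the real axis, and the inner linear solves fall back to PETSc's sequential LU unless a parallel direct solver is configured.

#include <slepceps.h>

int main(int argc,char **argv)
{
  Mat            A;
  EPS            eps;
  RG             rg;
  PetscInt       i,n=30,Istart,Iend,nconv;
  PetscErrorCode ierr;

  ierr = SlepcInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;

  /* Toy diagonal test matrix; in the application A comes from the problem. */
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart;i<Iend;i++) {
    ierr = MatSetValue(A,i,i,0.1*(i-n/2),INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* CISS restricted to one elliptic region of the complex plane. */
  ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,A,NULL);CHKERRQ(ierr);
  ierr = EPSSetProblemType(eps,EPS_NHEP);CHKERRQ(ierr);
  ierr = EPSSetType(eps,EPSCISS);CHKERRQ(ierr);
  ierr = EPSGetRG(eps,&rg);CHKERRQ(ierr);
  ierr = RGSetType(rg,RGELLIPSE);CHKERRQ(ierr);
  /* centre 0, radius 1, vertical scale 1: placeholder region parameters */
  ierr = RGEllipseSetParameters(rg,0.0,1.0,1.0);CHKERRQ(ierr);
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);

  /* Run on one process unless a parallel direct solver is configured for
     the factorizations inside CISS. */
  ierr = EPSSolve(eps);CHKERRQ(ierr);
  ierr = EPSGetConverged(eps,&nconv);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"eigenvalues found inside the region: %D\n",nconv);CHKERRQ(ierr);

  ierr = EPSDestroy(&eps);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = SlepcFinalize();
  return ierr;
}

Covering the whole spectral area then amounts to repeating this setup over several sets of region parameters, with some overlap between neighbouring ellipses as suggested earlier in the thread.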
URL: From sam.guo at cd-adapco.com Thu Aug 29 15:31:54 2019 From: sam.guo at cd-adapco.com (Sam Guo) Date: Thu, 29 Aug 2019 13:31:54 -0700 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: I can link when I add my wrapper to PCC_LINKER_FLAGS = -MD -wd4996 -Z7 /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib (I don't understand why configure does not include my wrapper) On Thu, Aug 29, 2019 at 1:28 PM Matthew Knepley wrote: > On Thu, Aug 29, 2019 at 4:02 PM Sam Guo wrote: > >> Thanks for the quick response. Attached please find the configure.log >> containing the configure error. >> > > Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o > -I/tmp/petsc-6DsCEk/config.compilers > -I/tmp/petsc-6DsCEk/config.setCompilers > -I/tmp/petsc-6DsCEk/config.utilities.closure > -I/tmp/petsc-6DsCEk/config.headers > -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails > -I/tmp/petsc-6DsCEk/config.types -I/tmp/petsc-6DsCEk/config.atomics > -I/tmp/petsc-6DsCEk/config.functions > -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros > -I/tmp/petsc-6DsCEk/config.utilities.missing > -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes > -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 > /tmp/petsc-6DsCEk/config.libraries/conftest.c > stdout: conftest.c > Successful compile: > Source: > #include "confdefs.h" > #include "conffix.h" > /* Override any gcc2 internal prototype to avoid an error. */ > char MPI_Init(); > static void _check_MPI_Init() { MPI_Init(); } > char MPI_Comm_create(); > static void _check_MPI_Comm_create() { MPI_Comm_create(); } > > int main() { > _check_MPI_Init(); > _check_MPI_Comm_create();; > return 0; > } > Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD -wd4996 -Z7 > /tmp/petsc-6DsCEk/config.libraries/conftest.o > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > Ws2_32.lib > stdout: > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or not > built by the last incremental link; performing full link > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > referenced in function _check_MPI_Init > conftest.obj : error LNK2019: unresolved external symbol MPI_Comm_create > referenced in function _check_MPI_Comm_create > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error LNK1120: > 2 unresolved externals > Possible ERROR while running linker: exit code 2 > stdout: > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or not > built by the last incremental link; performing full link > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > referenced in function _check_MPI_Init > conftest.obj : error LNK2019: unresolved external symbol MPI_Comm_create > referenced in function _check_MPI_Comm_create > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error LNK1120: > 2 unresolved externals > > The link is definitely failing. Does it work if you do it by hand? > > Thanks, > > Matt > > >> Regarding our dup, our wrapper does support it. In fact, everything works >> fine on Linux. I suspect on windows, PETSc picks the system mpi.h somehow. >> I am investigating it. 
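On the communicator question raised in this thread, a minimal sketch of how an application normally hands its own communicator to PETSc: PETSC_COMM_WORLD is assigned before PetscInitialize(), and PETSc then duplicates that communicator on entry. The app_comm variable is only a stand-in for whatever communicator the host code owns; this is not the poster's code.

/* Sketch only: hand an application communicator to PETSc. The assignment
   to PETSC_COMM_WORLD must happen before PetscInitialize(); PETSc dups the
   communicator internally after that. */
#include <petscsys.h>

int main(int argc,char **argv)
{
  PetscErrorCode ierr;
  MPI_Comm       app_comm;

  MPI_Init(&argc,&argv);
  /* In a real application app_comm comes from the host code; here it is a
     plain duplicate of MPI_COMM_WORLD just to make the sketch runnable. */
  MPI_Comm_dup(MPI_COMM_WORLD,&app_comm);

  PETSC_COMM_WORLD = app_comm;   /* before PetscInitialize */
  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;

  /* ... use PETSc as usual ... */

  ierr = PetscFinalize();
  MPI_Comm_free(&app_comm);
  MPI_Finalize();
  return ierr;
}

If PETSc is left to initialize MPI itself, the assignment is simply omitted and PetscInitialize calls MPI_Init.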
>> >> Thanks, >> Sam >> >> On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley >> wrote: >> >>> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < >>> petsc-users at mcs.anl.gov> wrote: >>> >>>> Dear PETSc dev team, >>>> I am looking some tips porting petsc to windows. We have our mpi >>>> wrapper (so we can switch different mpi). I configure petsc using >>>> --with-mpi-lib and --with-mpi-include >>>> ./configure --with-cc="win32fe cl" --with-fc=0 >>>> --download-f2cblaslapack >>>> --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >>>> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include >>>> --with-shared-libaries=1 >>>> >>>> But I got error >>>> >>>> =============================================================================== >>>> Configuring PETSc to compile on your system >>>> >>>> =============================================================================== >>>> TESTING: check from >>>> config.libraries(config/BuildSystem/config/libraries.py:154) >>>> ******************************************************************************* >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >>>> for details): >>>> >>>> ------------------------------------------------------------------------------- >>>> --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] >>>> and >>>> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] >>>> did not work >>>> >>>> ******************************************************************************* >>>> >>> >>> Your MPI wrapper should pass the tests here. Send the configure.log >>> >>> >>>> To fix the configuration error, in >>>> config/BuildSystem/config/package.py, I removed >>>> self.executeTest(self.checkDependencies) >>>> self.executeTest(self.configureLibrary) >>>> self.executeTest(self.checkSharedLibrary) >>>> >>>> To link, I add my mpi wrapper >>>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: >>>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 >>>> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >>>> >>>> I got libpetstc.dll and libpetstc.lib. When I try to test it inside our >>>> code, PETSc somehow crates a duplicate of communicator with only 1 MPI >>>> process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD to 1 >>>> (our MPI_COMM_WORLD), PETSc is hanging. >>>> >>> >>> We do dup the communicator on entry. Shouldn't that be supported by your >>> wrapper? >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> I am wondering if you could give me some tips how to debug this problem. >>>> >>>> BR, >>>> Sam >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Aug 29 15:36:13 2019 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Thu, 29 Aug 2019 22:36:13 +0200 Subject: [petsc-users] question about CISS In-Reply-To: <5326c756-2e7b-6e85-4461-25cb308c0693@purdue.edu> References: <67F9732B-55A2-41DB-8D05-862C40142B2D@dsic.upv.es> <5326c756-2e7b-6e85-4461-25cb308c0693@purdue.edu> Message-ID: <5C807185-5413-4F57-B0A3-B10FDFF816EF@dsic.upv.es> > El 29 ago 2019, a las 22:20, Povolotskyi, Mykhailo escribi?: > > Thank you for suggestion. > > Is it interfaced to SLEPC? No, could be a future project... > > > On 08/29/2019 04:14 PM, Jose E. Roman wrote: >> I am not an expert in contour integral eigensolvers. I think difficulties come with corners, so ellipses are the best choice. I don't think ring regions are relevant here. >> >> Have you considered using ScaLAPACK. Some time ago we were able to address problems of size up to 400k https://doi.org/10.1017/jfm.2016.208 >> >> Jose >> >> >>> El 29 ago 2019, a las 21:55, Povolotskyi, Mykhailo escribi?: >>> >>> Thank you, Jose, >>> >>> what about rings? Are they better than rectangles? >>> >>> Michael. >>> >>> >>> On 08/29/2019 03:44 PM, Jose E. Roman wrote: >>>> The CISS solver is supposed to estimate the number of eigenvalues contained in the contour. My impression is that the estimation is less accurate in case of rectangular contours, compared to elliptic ones. But of course, with ellipses it is not possible to fully cover the complex plane unless there is some overlap. >>>> >>>> Jose >>>> >>>> >>>>> El 29 ago 2019, a las 20:56, Povolotskyi, Mykhailo via petsc-users escribi?: >>>>> >>>>> Hello everyone, >>>>> >>>>> this is a question about SLEPc. >>>>> >>>>> The problem that I need to solve is as follows. >>>>> >>>>> I have a matrix and I need a full spectrum of it (both eigenvalues and >>>>> eigenvectors). >>>>> >>>>> The regular way is to use Lapack, but it is slow. I decided to try the >>>>> following: >>>>> >>>>> a) compute the bounds of the spectrum using Krylov Schur approach. >>>>> >>>>> b) divide the complex eigenvalue plane into rectangular areas, then >>>>> apply CISS to each area in parallel. >>>>> >>>>> However, I found that the solver is missing some eigenvalues, even if my >>>>> rectangles cover the whole spectral area. >>>>> >>>>> My question: can this approach work in principle? If yes, how one can >>>>> set-up CISS solver to not loose the eigenvalues? >>>>> >>>>> Thank you, >>>>> >>>>> Michael. >>>>> > From balay at mcs.anl.gov Thu Aug 29 17:28:25 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Thu, 29 Aug 2019 22:28:25 +0000 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > I can link when I add my wrapper to > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib I don't understand what you mean here. Add PCC_LINKER_FLAGS to where? This is a variable in configure generated makefile Since PETSc is not built [as configure failed] - there should be no configure generated makefiles. > (I don't understand why configure does not include my wrapper) Well the compiler gives the error below. Can you try to compile manually [i.e without PETSc or any petsc makefiles] a simple MPI code - say cpi.c from MPICH and see if it works? [and copy/paste the log from this compile attempt. Satish > > > On Thu, Aug 29, 2019 at 1:28 PM Matthew Knepley wrote: > > > On Thu, Aug 29, 2019 at 4:02 PM Sam Guo wrote: > > > >> Thanks for the quick response. Attached please find the configure.log > >> containing the configure error. 
> >> > > > > Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > > -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o > > -I/tmp/petsc-6DsCEk/config.compilers > > -I/tmp/petsc-6DsCEk/config.setCompilers > > -I/tmp/petsc-6DsCEk/config.utilities.closure > > -I/tmp/petsc-6DsCEk/config.headers > > -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails > > -I/tmp/petsc-6DsCEk/config.types -I/tmp/petsc-6DsCEk/config.atomics > > -I/tmp/petsc-6DsCEk/config.functions > > -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros > > -I/tmp/petsc-6DsCEk/config.utilities.missing > > -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes > > -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 > > /tmp/petsc-6DsCEk/config.libraries/conftest.c > > stdout: conftest.c > > Successful compile: > > Source: > > #include "confdefs.h" > > #include "conffix.h" > > /* Override any gcc2 internal prototype to avoid an error. */ > > char MPI_Init(); > > static void _check_MPI_Init() { MPI_Init(); } > > char MPI_Comm_create(); > > static void _check_MPI_Comm_create() { MPI_Comm_create(); } > > > > int main() { > > _check_MPI_Init(); > > _check_MPI_Comm_create();; > > return 0; > > } > > Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > > -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD -wd4996 -Z7 > > /tmp/petsc-6DsCEk/config.libraries/conftest.o > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > Ws2_32.lib > > stdout: > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or not > > built by the last incremental link; performing full link > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > referenced in function _check_MPI_Init > > conftest.obj : error LNK2019: unresolved external symbol MPI_Comm_create > > referenced in function _check_MPI_Comm_create > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error LNK1120: > > 2 unresolved externals > > Possible ERROR while running linker: exit code 2 > > stdout: > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or not > > built by the last incremental link; performing full link > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > referenced in function _check_MPI_Init > > conftest.obj : error LNK2019: unresolved external symbol MPI_Comm_create > > referenced in function _check_MPI_Comm_create > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error LNK1120: > > 2 unresolved externals > > > > The link is definitely failing. Does it work if you do it by hand? > > > > Thanks, > > > > Matt > > > > > >> Regarding our dup, our wrapper does support it. In fact, everything works > >> fine on Linux. I suspect on windows, PETSc picks the system mpi.h somehow. > >> I am investigating it. > >> > >> Thanks, > >> Sam > >> > >> On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley > >> wrote: > >> > >>> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < > >>> petsc-users at mcs.anl.gov> wrote: > >>> > >>>> Dear PETSc dev team, > >>>> I am looking some tips porting petsc to windows. We have our mpi > >>>> wrapper (so we can switch different mpi). 
I configure petsc using > >>>> --with-mpi-lib and --with-mpi-include > >>>> ./configure --with-cc="win32fe cl" --with-fc=0 > >>>> --download-f2cblaslapack > >>>> --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > >>>> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include > >>>> --with-shared-libaries=1 > >>>> > >>>> But I got error > >>>> > >>>> =============================================================================== > >>>> Configuring PETSc to compile on your system > >>>> > >>>> =============================================================================== > >>>> TESTING: check from > >>>> config.libraries(config/BuildSystem/config/libraries.py:154) > >>>> ******************************************************************************* > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > >>>> for details): > >>>> > >>>> ------------------------------------------------------------------------------- > >>>> --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] > >>>> and > >>>> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] > >>>> did not work > >>>> > >>>> ******************************************************************************* > >>>> > >>> > >>> Your MPI wrapper should pass the tests here. Send the configure.log > >>> > >>> > >>>> To fix the configuration error, in > >>>> config/BuildSystem/config/package.py, I removed > >>>> self.executeTest(self.checkDependencies) > >>>> self.executeTest(self.configureLibrary) > >>>> self.executeTest(self.checkSharedLibrary) > >>>> > >>>> To link, I add my mpi wrapper > >>>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > >>>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > >>>> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > >>>> > >>>> I got libpetstc.dll and libpetstc.lib. When I try to test it inside our > >>>> code, PETSc somehow crates a duplicate of communicator with only 1 MPI > >>>> process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD to 1 > >>>> (our MPI_COMM_WORLD), PETSc is hanging. > >>>> > >>> > >>> We do dup the communicator on entry. Shouldn't that be supported by your > >>> wrapper? > >>> > >>> Thanks, > >>> > >>> Matt > >>> > >>> > >>>> I am wondering if you could give me some tips how to debug this problem. > >>>> > >>>> BR, > >>>> Sam > >>>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > >>> experiments is infinitely more interesting than any results to which their > >>> experiments lead. > >>> -- Norbert Wiener > >>> > >>> https://www.cse.buffalo.edu/~knepley/ > >>> > >>> > >> > > > > -- > > What most experimenters take for granted before they begin their > > experiments is infinitely more interesting than any results to which their > > experiments lead. > > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > From sam.guo at cd-adapco.com Thu Aug 29 17:46:47 2019 From: sam.guo at cd-adapco.com (Sam Guo) Date: Thu, 29 Aug 2019 15:46:47 -0700 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: After I removed following lines inin config/BuildSystem/config/package.py, configuration finished without error. 
self.executeTest(self.checkDependencies) self.executeTest(self.configureLibrary) self.executeTest(self.checkSharedLibrary) I then add my mpi wrapper to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: PCC_LINKER_FLAGS = -MD -wd4996 -Z7 /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib On Thu, Aug 29, 2019 at 3:28 PM Balay, Satish wrote: > On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > > > I can link when I add my wrapper to > > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > I don't understand what you mean here. Add PCC_LINKER_FLAGS to where? This > is a variable in configure generated makefile > > Since PETSc is not built [as configure failed] - there should be no > configure generated makefiles. > > > (I don't understand why configure does not include my wrapper) > > Well the compiler gives the error below. Can you try to compile > manually [i.e without PETSc or any petsc makefiles] a simple MPI code > - say cpi.c from MPICH and see if it works? [and copy/paste the log > from this compile attempt. > > Satish > > > > > > > On Thu, Aug 29, 2019 at 1:28 PM Matthew Knepley > wrote: > > > > > On Thu, Aug 29, 2019 at 4:02 PM Sam Guo wrote: > > > > > >> Thanks for the quick response. Attached please find the configure.log > > >> containing the configure error. > > >> > > > > > > Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe > cl > > > -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o > > > -I/tmp/petsc-6DsCEk/config.compilers > > > -I/tmp/petsc-6DsCEk/config.setCompilers > > > -I/tmp/petsc-6DsCEk/config.utilities.closure > > > -I/tmp/petsc-6DsCEk/config.headers > > > -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails > > > -I/tmp/petsc-6DsCEk/config.types -I/tmp/petsc-6DsCEk/config.atomics > > > -I/tmp/petsc-6DsCEk/config.functions > > > -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros > > > -I/tmp/petsc-6DsCEk/config.utilities.missing > > > -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes > > > -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 > > > /tmp/petsc-6DsCEk/config.libraries/conftest.c > > > stdout: conftest.c > > > Successful compile: > > > Source: > > > #include "confdefs.h" > > > #include "conffix.h" > > > /* Override any gcc2 internal prototype to avoid an error. 
*/ > > > char MPI_Init(); > > > static void _check_MPI_Init() { MPI_Init(); } > > > char MPI_Comm_create(); > > > static void _check_MPI_Comm_create() { MPI_Comm_create(); } > > > > > > int main() { > > > _check_MPI_Init(); > > > _check_MPI_Comm_create();; > > > return 0; > > > } > > > Executing: /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe > cl > > > -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD -wd4996 -Z7 > > > /tmp/petsc-6DsCEk/config.libraries/conftest.o > > > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > Ws2_32.lib > > > stdout: > > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or > not > > > built by the last incremental link; performing full link > > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > > referenced in function _check_MPI_Init > > > conftest.obj : error LNK2019: unresolved external symbol > MPI_Comm_create > > > referenced in function _check_MPI_Comm_create > > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > LNK1120: > > > 2 unresolved externals > > > Possible ERROR while running linker: exit code 2 > > > stdout: > > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found or > not > > > built by the last incremental link; performing full link > > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > > referenced in function _check_MPI_Init > > > conftest.obj : error LNK2019: unresolved external symbol > MPI_Comm_create > > > referenced in function _check_MPI_Comm_create > > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > LNK1120: > > > 2 unresolved externals > > > > > > The link is definitely failing. Does it work if you do it by hand? > > > > > > Thanks, > > > > > > Matt > > > > > > > > >> Regarding our dup, our wrapper does support it. In fact, everything > works > > >> fine on Linux. I suspect on windows, PETSc picks the system mpi.h > somehow. > > >> I am investigating it. > > >> > > >> Thanks, > > >> Sam > > >> > > >> On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley > > >> wrote: > > >> > > >>> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < > > >>> petsc-users at mcs.anl.gov> wrote: > > >>> > > >>>> Dear PETSc dev team, > > >>>> I am looking some tips porting petsc to windows. We have our mpi > > >>>> wrapper (so we can switch different mpi). 
I configure petsc using > > >>>> --with-mpi-lib and --with-mpi-include > > >>>> ./configure --with-cc="win32fe cl" --with-fc=0 > > >>>> --download-f2cblaslapack > > >>>> > --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > >>>> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include > > >>>> --with-shared-libaries=1 > > >>>> > > >>>> But I got error > > >>>> > > >>>> > =============================================================================== > > >>>> Configuring PETSc to compile on your system > > >>>> > > >>>> > =============================================================================== > > >>>> TESTING: check from > > >>>> config.libraries(config/BuildSystem/config/libraries.py:154) > > >>>> > ******************************************************************************* > > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see > configure.log > > >>>> for details): > > >>>> > > >>>> > ------------------------------------------------------------------------------- > > >>>> > --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] > > >>>> and > > >>>> > --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] > > >>>> did not work > > >>>> > > >>>> > ******************************************************************************* > > >>>> > > >>> > > >>> Your MPI wrapper should pass the tests here. Send the configure.log > > >>> > > >>> > > >>>> To fix the configuration error, in > > >>>> config/BuildSystem/config/package.py, I removed > > >>>> self.executeTest(self.checkDependencies) > > >>>> self.executeTest(self.configureLibrary) > > >>>> self.executeTest(self.checkSharedLibrary) > > >>>> > > >>>> To link, I add my mpi wrapper > > >>>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > > >>>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > >>>> > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > >>>> > > >>>> I got libpetstc.dll and libpetstc.lib. When I try to test it inside > our > > >>>> code, PETSc somehow crates a duplicate of communicator with only 1 > MPI > > >>>> process and PETSC_COMM_WORLD is set to 2. If I set PETSC_COMM_WORLD > to 1 > > >>>> (our MPI_COMM_WORLD), PETSc is hanging. > > >>>> > > >>> > > >>> We do dup the communicator on entry. Shouldn't that be supported by > your > > >>> wrapper? > > >>> > > >>> Thanks, > > >>> > > >>> Matt > > >>> > > >>> > > >>>> I am wondering if you could give me some tips how to debug this > problem. > > >>>> > > >>>> BR, > > >>>> Sam > > >>>> > > >>> > > >>> > > >>> -- > > >>> What most experimenters take for granted before they begin their > > >>> experiments is infinitely more interesting than any results to which > their > > >>> experiments lead. > > >>> -- Norbert Wiener > > >>> > > >>> https://www.cse.buffalo.edu/~knepley/ > > >>> > > >>> > > >> > > > > > > -- > > > What most experimenters take for granted before they begin their > > > experiments is infinitely more interesting than any results to which > their > > > experiments lead. > > > -- Norbert Wiener > > > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
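A minimal stand-alone link test, in the spirit of the cpi.c suggestion earlier in the thread, exercising the two symbols the configure probe failed to resolve (MPI_Init and MPI_Comm_create). Compiling it by hand against the wrapper, for instance with something like win32fe cl -MD conftest.c StarMpiWrapper.lib Ws2_32.lib (flags copied from the configure log, so treat the exact command as an assumption), separates a missing export in the wrapper from a problem in PETSc's test harness.

/* Minimal hand-compiled MPI test, independent of PETSc's build system,
   to check that MPI_Init and MPI_Comm_create resolve from the wrapper. */
#include <mpi.h>
#include <stdio.h>

int main(int argc,char **argv)
{
  int       rank,size;
  MPI_Group world_group;
  MPI_Comm  dup_comm,sub_comm;

  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&size);

  /* Exercise the symbols the configure probe could not resolve. */
  MPI_Comm_dup(MPI_COMM_WORLD,&dup_comm);
  MPI_Comm_group(MPI_COMM_WORLD,&world_group);
  MPI_Comm_create(MPI_COMM_WORLD,world_group,&sub_comm);

  printf("rank %d of %d: link test passed\n",rank,size);

  if (sub_comm != MPI_COMM_NULL) MPI_Comm_free(&sub_comm);
  MPI_Comm_free(&dup_comm);
  MPI_Group_free(&world_group);
  MPI_Finalize();
  return 0;
}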
URL: From sam.guo at cd-adapco.com Thu Aug 29 17:57:49 2019 From: sam.guo at cd-adapco.com (Sam Guo) Date: Thu, 29 Aug 2019 15:57:49 -0700 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: When I use intel mpi, configuration, compile and test all work fine but I cannot use dll in my application. On Thu, Aug 29, 2019 at 3:46 PM Sam Guo wrote: > After I removed following lines inin config/BuildSystem/config/package.py, > configuration finished without error. > self.executeTest(self.checkDependencies) > self.executeTest(self.configureLibrary) > self.executeTest(self.checkSharedLibrary) > > I then add my mpi wrapper to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > On Thu, Aug 29, 2019 at 3:28 PM Balay, Satish wrote: > >> On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: >> >> > I can link when I add my wrapper to >> > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 >> > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >> >> I don't understand what you mean here. Add PCC_LINKER_FLAGS to where? >> This is a variable in configure generated makefile >> >> Since PETSc is not built [as configure failed] - there should be no >> configure generated makefiles. >> >> > (I don't understand why configure does not include my wrapper) >> >> Well the compiler gives the error below. Can you try to compile >> manually [i.e without PETSc or any petsc makefiles] a simple MPI code >> - say cpi.c from MPICH and see if it works? [and copy/paste the log >> from this compile attempt. >> >> Satish >> >> > >> > >> > On Thu, Aug 29, 2019 at 1:28 PM Matthew Knepley >> wrote: >> > >> > > On Thu, Aug 29, 2019 at 4:02 PM Sam Guo >> wrote: >> > > >> > >> Thanks for the quick response. Attached please find the configure.log >> > >> containing the configure error. >> > >> >> > > >> > > Executing: >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl >> > > -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o >> > > -I/tmp/petsc-6DsCEk/config.compilers >> > > -I/tmp/petsc-6DsCEk/config.setCompilers >> > > -I/tmp/petsc-6DsCEk/config.utilities.closure >> > > -I/tmp/petsc-6DsCEk/config.headers >> > > -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails >> > > -I/tmp/petsc-6DsCEk/config.types -I/tmp/petsc-6DsCEk/config.atomics >> > > -I/tmp/petsc-6DsCEk/config.functions >> > > -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros >> > > -I/tmp/petsc-6DsCEk/config.utilities.missing >> > > -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes >> > > -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.c >> > > stdout: conftest.c >> > > Successful compile: >> > > Source: >> > > #include "confdefs.h" >> > > #include "conffix.h" >> > > /* Override any gcc2 internal prototype to avoid an error. 
*/ >> > > char MPI_Init(); >> > > static void _check_MPI_Init() { MPI_Init(); } >> > > char MPI_Comm_create(); >> > > static void _check_MPI_Comm_create() { MPI_Comm_create(); } >> > > >> > > int main() { >> > > _check_MPI_Init(); >> > > _check_MPI_Comm_create();; >> > > return 0; >> > > } >> > > Executing: >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl >> > > -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD -wd4996 -Z7 >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.o >> > > >> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >> > > Ws2_32.lib >> > > stdout: >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found >> or not >> > > built by the last incremental link; performing full link >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init >> > > referenced in function _check_MPI_Init >> > > conftest.obj : error LNK2019: unresolved external symbol >> MPI_Comm_create >> > > referenced in function _check_MPI_Comm_create >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error >> LNK1120: >> > > 2 unresolved externals >> > > Possible ERROR while running linker: exit code 2 >> > > stdout: >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found >> or not >> > > built by the last incremental link; performing full link >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init >> > > referenced in function _check_MPI_Init >> > > conftest.obj : error LNK2019: unresolved external symbol >> MPI_Comm_create >> > > referenced in function _check_MPI_Comm_create >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error >> LNK1120: >> > > 2 unresolved externals >> > > >> > > The link is definitely failing. Does it work if you do it by hand? >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > >> > >> Regarding our dup, our wrapper does support it. In fact, everything >> works >> > >> fine on Linux. I suspect on windows, PETSc picks the system mpi.h >> somehow. >> > >> I am investigating it. >> > >> >> > >> Thanks, >> > >> Sam >> > >> >> > >> On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley >> > >> wrote: >> > >> >> > >>> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < >> > >>> petsc-users at mcs.anl.gov> wrote: >> > >>> >> > >>>> Dear PETSc dev team, >> > >>>> I am looking some tips porting petsc to windows. We have our mpi >> > >>>> wrapper (so we can switch different mpi). 
I configure petsc using >> > >>>> --with-mpi-lib and --with-mpi-include >> > >>>> ./configure --with-cc="win32fe cl" --with-fc=0 >> > >>>> --download-f2cblaslapack >> > >>>> >> --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >> > >>>> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include >> > >>>> --with-shared-libaries=1 >> > >>>> >> > >>>> But I got error >> > >>>> >> > >>>> >> =============================================================================== >> > >>>> Configuring PETSc to compile on your system >> > >>>> >> > >>>> >> =============================================================================== >> > >>>> TESTING: check from >> > >>>> config.libraries(config/BuildSystem/config/libraries.py:154) >> > >>>> >> ******************************************************************************* >> > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see >> configure.log >> > >>>> for details): >> > >>>> >> > >>>> >> ------------------------------------------------------------------------------- >> > >>>> >> --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] >> > >>>> and >> > >>>> >> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] >> > >>>> did not work >> > >>>> >> > >>>> >> ******************************************************************************* >> > >>>> >> > >>> >> > >>> Your MPI wrapper should pass the tests here. Send the configure.log >> > >>> >> > >>> >> > >>>> To fix the configuration error, in >> > >>>> config/BuildSystem/config/package.py, I removed >> > >>>> self.executeTest(self.checkDependencies) >> > >>>> self.executeTest(self.configureLibrary) >> > >>>> self.executeTest(self.checkSharedLibrary) >> > >>>> >> > >>>> To link, I add my mpi wrapper >> > >>>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: >> > >>>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 >> > >>>> >> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib >> > >>>> >> > >>>> I got libpetstc.dll and libpetstc.lib. When I try to test it >> inside our >> > >>>> code, PETSc somehow crates a duplicate of communicator with only 1 >> MPI >> > >>>> process and PETSC_COMM_WORLD is set to 2. If I set >> PETSC_COMM_WORLD to 1 >> > >>>> (our MPI_COMM_WORLD), PETSc is hanging. >> > >>>> >> > >>> >> > >>> We do dup the communicator on entry. Shouldn't that be supported by >> your >> > >>> wrapper? >> > >>> >> > >>> Thanks, >> > >>> >> > >>> Matt >> > >>> >> > >>> >> > >>>> I am wondering if you could give me some tips how to debug this >> problem. >> > >>>> >> > >>>> BR, >> > >>>> Sam >> > >>>> >> > >>> >> > >>> >> > >>> -- >> > >>> What most experimenters take for granted before they begin their >> > >>> experiments is infinitely more interesting than any results to >> which their >> > >>> experiments lead. >> > >>> -- Norbert Wiener >> > >>> >> > >>> https://www.cse.buffalo.edu/~knepley/ >> > >>> >> > >>> >> > >> >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> > > experiments is infinitely more interesting than any results to which >> their >> > > experiments lead. >> > > -- Norbert Wiener >> > > >> > > https://www.cse.buffalo.edu/~knepley/ >> > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
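Since the DLL cannot be loaded from the application, a small Windows-only diagnostic may help show which libpetsc.dll the loader actually resolves from PATH. The DLL name is taken from later in the thread and may need adjusting to the local build; this is only a sketch, not part of PETSc.

/* Report which libpetsc.dll the Windows loader picks up from PATH.
   Useful when several variants (serial/parallel) are installed. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
  char    path[MAX_PATH];
  HMODULE h = LoadLibraryA("libpetsc.dll");

  if (!h) {
    printf("libpetsc.dll not found on PATH (error %lu)\n",GetLastError());
    return 1;
  }
  if (GetModuleFileNameA(h,path,MAX_PATH)) {
    printf("loader resolved libpetsc.dll to: %s\n",path);
  }
  FreeLibrary(h);
  return 0;
}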
URL: From balay at mcs.anl.gov Thu Aug 29 19:51:47 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 30 Aug 2019 00:51:47 +0000 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: On MS-Windows - you need the location of the DLLs in PATH Or use --with-shared-libraries=0 Satish On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > When I use intel mpi, configuration, compile and test all work fine but I > cannot use dll in my application. > > On Thu, Aug 29, 2019 at 3:46 PM Sam Guo wrote: > > > After I removed following lines inin config/BuildSystem/config/package.py, > > configuration finished without error. > > self.executeTest(self.checkDependencies) > > self.executeTest(self.configureLibrary) > > self.executeTest(self.checkSharedLibrary) > > > > I then add my mpi wrapper to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > > On Thu, Aug 29, 2019 at 3:28 PM Balay, Satish wrote: > > > >> On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > >> > >> > I can link when I add my wrapper to > >> > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > >> > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > >> > >> I don't understand what you mean here. Add PCC_LINKER_FLAGS to where? > >> This is a variable in configure generated makefile > >> > >> Since PETSc is not built [as configure failed] - there should be no > >> configure generated makefiles. > >> > >> > (I don't understand why configure does not include my wrapper) > >> > >> Well the compiler gives the error below. Can you try to compile > >> manually [i.e without PETSc or any petsc makefiles] a simple MPI code > >> - say cpi.c from MPICH and see if it works? [and copy/paste the log > >> from this compile attempt. > >> > >> Satish > >> > >> > > >> > > >> > On Thu, Aug 29, 2019 at 1:28 PM Matthew Knepley > >> wrote: > >> > > >> > > On Thu, Aug 29, 2019 at 4:02 PM Sam Guo > >> wrote: > >> > > > >> > >> Thanks for the quick response. Attached please find the configure.log > >> > >> containing the configure error. > >> > >> > >> > > > >> > > Executing: > >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > >> > > -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o > >> > > -I/tmp/petsc-6DsCEk/config.compilers > >> > > -I/tmp/petsc-6DsCEk/config.setCompilers > >> > > -I/tmp/petsc-6DsCEk/config.utilities.closure > >> > > -I/tmp/petsc-6DsCEk/config.headers > >> > > -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails > >> > > -I/tmp/petsc-6DsCEk/config.types -I/tmp/petsc-6DsCEk/config.atomics > >> > > -I/tmp/petsc-6DsCEk/config.functions > >> > > -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros > >> > > -I/tmp/petsc-6DsCEk/config.utilities.missing > >> > > -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes > >> > > -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 > >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.c > >> > > stdout: conftest.c > >> > > Successful compile: > >> > > Source: > >> > > #include "confdefs.h" > >> > > #include "conffix.h" > >> > > /* Override any gcc2 internal prototype to avoid an error. 
*/ > >> > > char MPI_Init(); > >> > > static void _check_MPI_Init() { MPI_Init(); } > >> > > char MPI_Comm_create(); > >> > > static void _check_MPI_Comm_create() { MPI_Comm_create(); } > >> > > > >> > > int main() { > >> > > _check_MPI_Init(); > >> > > _check_MPI_Comm_create();; > >> > > return 0; > >> > > } > >> > > Executing: > >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > >> > > -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD -wd4996 -Z7 > >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.o > >> > > > >> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > >> > > Ws2_32.lib > >> > > stdout: > >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found > >> or not > >> > > built by the last incremental link; performing full link > >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > >> > > referenced in function _check_MPI_Init > >> > > conftest.obj : error LNK2019: unresolved external symbol > >> MPI_Comm_create > >> > > referenced in function _check_MPI_Comm_create > >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > >> LNK1120: > >> > > 2 unresolved externals > >> > > Possible ERROR while running linker: exit code 2 > >> > > stdout: > >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not found > >> or not > >> > > built by the last incremental link; performing full link > >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > >> > > referenced in function _check_MPI_Init > >> > > conftest.obj : error LNK2019: unresolved external symbol > >> MPI_Comm_create > >> > > referenced in function _check_MPI_Comm_create > >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > >> LNK1120: > >> > > 2 unresolved externals > >> > > > >> > > The link is definitely failing. Does it work if you do it by hand? > >> > > > >> > > Thanks, > >> > > > >> > > Matt > >> > > > >> > > > >> > >> Regarding our dup, our wrapper does support it. In fact, everything > >> works > >> > >> fine on Linux. I suspect on windows, PETSc picks the system mpi.h > >> somehow. > >> > >> I am investigating it. > >> > >> > >> > >> Thanks, > >> > >> Sam > >> > >> > >> > >> On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley > >> > >> wrote: > >> > >> > >> > >>> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < > >> > >>> petsc-users at mcs.anl.gov> wrote: > >> > >>> > >> > >>>> Dear PETSc dev team, > >> > >>>> I am looking some tips porting petsc to windows. We have our mpi > >> > >>>> wrapper (so we can switch different mpi). 
I configure petsc using > >> > >>>> --with-mpi-lib and --with-mpi-include > >> > >>>> ./configure --with-cc="win32fe cl" --with-fc=0 > >> > >>>> --download-f2cblaslapack > >> > >>>> > >> --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > >> > >>>> --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include > >> > >>>> --with-shared-libaries=1 > >> > >>>> > >> > >>>> But I got error > >> > >>>> > >> > >>>> > >> =============================================================================== > >> > >>>> Configuring PETSc to compile on your system > >> > >>>> > >> > >>>> > >> =============================================================================== > >> > >>>> TESTING: check from > >> > >>>> config.libraries(config/BuildSystem/config/libraries.py:154) > >> > >>>> > >> ******************************************************************************* > >> > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see > >> configure.log > >> > >>>> for details): > >> > >>>> > >> > >>>> > >> ------------------------------------------------------------------------------- > >> > >>>> > >> --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] > >> > >>>> and > >> > >>>> > >> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] > >> > >>>> did not work > >> > >>>> > >> > >>>> > >> ******************************************************************************* > >> > >>>> > >> > >>> > >> > >>> Your MPI wrapper should pass the tests here. Send the configure.log > >> > >>> > >> > >>> > >> > >>>> To fix the configuration error, in > >> > >>>> config/BuildSystem/config/package.py, I removed > >> > >>>> self.executeTest(self.checkDependencies) > >> > >>>> self.executeTest(self.configureLibrary) > >> > >>>> self.executeTest(self.checkSharedLibrary) > >> > >>>> > >> > >>>> To link, I add my mpi wrapper > >> > >>>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > >> > >>>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > >> > >>>> > >> /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > >> > >>>> > >> > >>>> I got libpetstc.dll and libpetstc.lib. When I try to test it > >> inside our > >> > >>>> code, PETSc somehow crates a duplicate of communicator with only 1 > >> MPI > >> > >>>> process and PETSC_COMM_WORLD is set to 2. If I set > >> PETSC_COMM_WORLD to 1 > >> > >>>> (our MPI_COMM_WORLD), PETSc is hanging. > >> > >>>> > >> > >>> > >> > >>> We do dup the communicator on entry. Shouldn't that be supported by > >> your > >> > >>> wrapper? > >> > >>> > >> > >>> Thanks, > >> > >>> > >> > >>> Matt > >> > >>> > >> > >>> > >> > >>>> I am wondering if you could give me some tips how to debug this > >> problem. > >> > >>>> > >> > >>>> BR, > >> > >>>> Sam > >> > >>>> > >> > >>> > >> > >>> > >> > >>> -- > >> > >>> What most experimenters take for granted before they begin their > >> > >>> experiments is infinitely more interesting than any results to > >> which their > >> > >>> experiments lead. > >> > >>> -- Norbert Wiener > >> > >>> > >> > >>> https://www.cse.buffalo.edu/~knepley/ > >> > >>> > >> > >>> > >> > >> > >> > > > >> > > -- > >> > > What most experimenters take for granted before they begin their > >> > > experiments is infinitely more interesting than any results to which > >> their > >> > > experiments lead. 
> >> > > -- Norbert Wiener > >> > > > >> > > https://www.cse.buffalo.edu/~knepley/ > >> > > > >> > > > >> > > >> > >> > From a.croucher at auckland.ac.nz Thu Aug 29 23:10:50 2019 From: a.croucher at auckland.ac.nz (Adrian Croucher) Date: Fri, 30 Aug 2019 16:10:50 +1200 Subject: [petsc-users] DMLocalToLocal for DMPlex Message-ID: hi If I have a local vector and I want to update all its ghost values from its non-ghost values, should I use DMLocalToLocalBegin()/End() ? I have tried it and it gives me an error: "This DM does not support local to local maps". The DM is a DMPlex. Is the local-to-local operation not implemented for DMPlex? Or should I be using something else to do this? - Adrian -- Dr Adrian Croucher Senior Research Fellow Department of Engineering Science University of Auckland, New Zealand email: a.croucher at auckland.ac.nz tel: +64 (0)9 923 4611 From knepley at gmail.com Fri Aug 30 06:30:43 2019 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 30 Aug 2019 07:30:43 -0400 Subject: [petsc-users] DMLocalToLocal for DMPlex In-Reply-To: References: Message-ID: On Fri, Aug 30, 2019 at 12:11 AM Adrian Croucher via petsc-users < petsc-users at mcs.anl.gov> wrote: > hi > > If I have a local vector and I want to update all its ghost values from > its non-ghost values, should I use DMLocalToLocalBegin()/End() ? > > I have tried it and it gives me an error: "This DM does not support > local to local maps". > > The DM is a DMPlex. Is the local-to-local operation not implemented for > DMPlex? > Yes, I did not write a direct L2L. Everything must go through L2G->G2L. If its a bottleneck, we could add it. Thanks, Matt > Or should I be using something else to do this? > > - Adrian > > -- > Dr Adrian Croucher > Senior Research Fellow > Department of Engineering Science > University of Auckland, New Zealand > email: a.croucher at auckland.ac.nz > tel: +64 (0)9 923 4611 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Fri Aug 30 13:11:08 2019 From: sam.guo at cd-adapco.com (Sam Guo) Date: Fri, 30 Aug 2019 11:11:08 -0700 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: Thanks a lot for your help. It is my pilot error: I have both serial version and parallel version of petstc. It turns out serial version is always loaded. Now parallel petstc is working. On Thu, Aug 29, 2019 at 5:51 PM Balay, Satish wrote: > On MS-Windows - you need the location of the DLLs in PATH > > Or use --with-shared-libraries=0 > > Satish > > On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > > > When I use intel mpi, configuration, compile and test all work fine but I > > cannot use dll in my application. > > > > On Thu, Aug 29, 2019 at 3:46 PM Sam Guo wrote: > > > > > After I removed following lines inin > config/BuildSystem/config/package.py, > > > configuration finished without error. 
> > > self.executeTest(self.checkDependencies) > > > self.executeTest(self.configureLibrary) > > > self.executeTest(self.checkSharedLibrary) > > > > > > I then add my mpi wrapper to > ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > > > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > > > > On Thu, Aug 29, 2019 at 3:28 PM Balay, Satish > wrote: > > > > > >> On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > > >> > > >> > I can link when I add my wrapper to > > >> > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > >> > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > >> > > >> I don't understand what you mean here. Add PCC_LINKER_FLAGS to where? > > >> This is a variable in configure generated makefile > > >> > > >> Since PETSc is not built [as configure failed] - there should be no > > >> configure generated makefiles. > > >> > > >> > (I don't understand why configure does not include my wrapper) > > >> > > >> Well the compiler gives the error below. Can you try to compile > > >> manually [i.e without PETSc or any petsc makefiles] a simple MPI code > > >> - say cpi.c from MPICH and see if it works? [and copy/paste the log > > >> from this compile attempt. > > >> > > >> Satish > > >> > > >> > > > >> > > > >> > On Thu, Aug 29, 2019 at 1:28 PM Matthew Knepley > > >> wrote: > > >> > > > >> > > On Thu, Aug 29, 2019 at 4:02 PM Sam Guo > > >> wrote: > > >> > > > > >> > >> Thanks for the quick response. Attached please find the > configure.log > > >> > >> containing the configure error. > > >> > >> > > >> > > > > >> > > Executing: > > >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > > >> > > -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o > > >> > > -I/tmp/petsc-6DsCEk/config.compilers > > >> > > -I/tmp/petsc-6DsCEk/config.setCompilers > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.closure > > >> > > -I/tmp/petsc-6DsCEk/config.headers > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails > > >> > > -I/tmp/petsc-6DsCEk/config.types > -I/tmp/petsc-6DsCEk/config.atomics > > >> > > -I/tmp/petsc-6DsCEk/config.functions > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.missing > > >> > > -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes > > >> > > -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 > > >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.c > > >> > > stdout: conftest.c > > >> > > Successful compile: > > >> > > Source: > > >> > > #include "confdefs.h" > > >> > > #include "conffix.h" > > >> > > /* Override any gcc2 internal prototype to avoid an error. 
*/ > > >> > > char MPI_Init(); > > >> > > static void _check_MPI_Init() { MPI_Init(); } > > >> > > char MPI_Comm_create(); > > >> > > static void _check_MPI_Comm_create() { MPI_Comm_create(); } > > >> > > > > >> > > int main() { > > >> > > _check_MPI_Init(); > > >> > > _check_MPI_Comm_create();; > > >> > > return 0; > > >> > > } > > >> > > Executing: > > >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > > >> > > -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD > -wd4996 -Z7 > > >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.o > > >> > > > > >> > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > >> > > Ws2_32.lib > > >> > > stdout: > > >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not > found > > >> or not > > >> > > built by the last incremental link; performing full link > > >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > >> > > referenced in function _check_MPI_Init > > >> > > conftest.obj : error LNK2019: unresolved external symbol > > >> MPI_Comm_create > > >> > > referenced in function _check_MPI_Comm_create > > >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > > >> LNK1120: > > >> > > 2 unresolved externals > > >> > > Possible ERROR while running linker: exit code 2 > > >> > > stdout: > > >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not > found > > >> or not > > >> > > built by the last incremental link; performing full link > > >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > >> > > referenced in function _check_MPI_Init > > >> > > conftest.obj : error LNK2019: unresolved external symbol > > >> MPI_Comm_create > > >> > > referenced in function _check_MPI_Comm_create > > >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > > >> LNK1120: > > >> > > 2 unresolved externals > > >> > > > > >> > > The link is definitely failing. Does it work if you do it by hand? > > >> > > > > >> > > Thanks, > > >> > > > > >> > > Matt > > >> > > > > >> > > > > >> > >> Regarding our dup, our wrapper does support it. In fact, > everything > > >> works > > >> > >> fine on Linux. I suspect on windows, PETSc picks the system mpi.h > > >> somehow. > > >> > >> I am investigating it. > > >> > >> > > >> > >> Thanks, > > >> > >> Sam > > >> > >> > > >> > >> On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley < > knepley at gmail.com> > > >> > >> wrote: > > >> > >> > > >> > >>> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < > > >> > >>> petsc-users at mcs.anl.gov> wrote: > > >> > >>> > > >> > >>>> Dear PETSc dev team, > > >> > >>>> I am looking some tips porting petsc to windows. We have > our mpi > > >> > >>>> wrapper (so we can switch different mpi). 
I configure petsc > using > > >> > >>>> --with-mpi-lib and --with-mpi-include > > >> > >>>> ./configure --with-cc="win32fe cl" --with-fc=0 > > >> > >>>> --download-f2cblaslapack > > >> > >>>> > > >> > --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > >> > >>>> > --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include > > >> > >>>> --with-shared-libaries=1 > > >> > >>>> > > >> > >>>> But I got error > > >> > >>>> > > >> > >>>> > > >> > =============================================================================== > > >> > >>>> Configuring PETSc to compile on your system > > >> > >>>> > > >> > >>>> > > >> > =============================================================================== > > >> > >>>> TESTING: check from > > >> > >>>> config.libraries(config/BuildSystem/config/libraries.py:154) > > >> > >>>> > > >> > ******************************************************************************* > > >> > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see > > >> configure.log > > >> > >>>> for details): > > >> > >>>> > > >> > >>>> > > >> > ------------------------------------------------------------------------------- > > >> > >>>> > > >> > --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] > > >> > >>>> and > > >> > >>>> > > >> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] > > >> > >>>> did not work > > >> > >>>> > > >> > >>>> > > >> > ******************************************************************************* > > >> > >>>> > > >> > >>> > > >> > >>> Your MPI wrapper should pass the tests here. Send the > configure.log > > >> > >>> > > >> > >>> > > >> > >>>> To fix the configuration error, in > > >> > >>>> config/BuildSystem/config/package.py, I removed > > >> > >>>> self.executeTest(self.checkDependencies) > > >> > >>>> self.executeTest(self.configureLibrary) > > >> > >>>> self.executeTest(self.checkSharedLibrary) > > >> > >>>> > > >> > >>>> To link, I add my mpi wrapper > > >> > >>>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > > >> > >>>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > >> > >>>> > > >> > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > >> > >>>> > > >> > >>>> I got libpetstc.dll and libpetstc.lib. When I try to test it > > >> inside our > > >> > >>>> code, PETSc somehow crates a duplicate of communicator with > only 1 > > >> MPI > > >> > >>>> process and PETSC_COMM_WORLD is set to 2. If I set > > >> PETSC_COMM_WORLD to 1 > > >> > >>>> (our MPI_COMM_WORLD), PETSc is hanging. > > >> > >>>> > > >> > >>> > > >> > >>> We do dup the communicator on entry. Shouldn't that be > supported by > > >> your > > >> > >>> wrapper? > > >> > >>> > > >> > >>> Thanks, > > >> > >>> > > >> > >>> Matt > > >> > >>> > > >> > >>> > > >> > >>>> I am wondering if you could give me some tips how to debug this > > >> problem. > > >> > >>>> > > >> > >>>> BR, > > >> > >>>> Sam > > >> > >>>> > > >> > >>> > > >> > >>> > > >> > >>> -- > > >> > >>> What most experimenters take for granted before they begin their > > >> > >>> experiments is infinitely more interesting than any results to > > >> which their > > >> > >>> experiments lead. 
> > >> > >>> -- Norbert Wiener > > >> > >>> > > >> > >>> https://www.cse.buffalo.edu/~knepley/ > > >> > >>> > > >> > >>> > > >> > >> > > >> > > > > >> > > -- > > >> > > What most experimenters take for granted before they begin their > > >> > > experiments is infinitely more interesting than any results to > which > > >> their > > >> > > experiments lead. > > >> > > -- Norbert Wiener > > >> > > > > >> > > https://www.cse.buffalo.edu/~knepley/ > > >> > > > > >> > > > > >> > > > >> > > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Aug 30 13:34:01 2019 From: balay at mcs.anl.gov (Balay, Satish) Date: Fri, 30 Aug 2019 18:34:01 +0000 Subject: [petsc-users] petsc on windows In-Reply-To: References: Message-ID: Thanks for the update. Yes - having the wrong variant of libpetsc.dll in PATH can cause problems. Satish On Fri, 30 Aug 2019, Sam Guo via petsc-users wrote: > Thanks a lot for your help. It is my pilot error: I have both serial > version and parallel version of petstc. It turns out serial version is > always loaded. Now parallel petstc is working. > > On Thu, Aug 29, 2019 at 5:51 PM Balay, Satish wrote: > > > On MS-Windows - you need the location of the DLLs in PATH > > > > Or use --with-shared-libraries=0 > > > > Satish > > > > On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > > > > > When I use intel mpi, configuration, compile and test all work fine but I > > > cannot use dll in my application. > > > > > > On Thu, Aug 29, 2019 at 3:46 PM Sam Guo wrote: > > > > > > > After I removed following lines inin > > config/BuildSystem/config/package.py, > > > > configuration finished without error. > > > > self.executeTest(self.checkDependencies) > > > > self.executeTest(self.configureLibrary) > > > > self.executeTest(self.checkSharedLibrary) > > > > > > > > I then add my mpi wrapper to > > ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > > > > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > > > > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > > > > > > On Thu, Aug 29, 2019 at 3:28 PM Balay, Satish > > wrote: > > > > > > > >> On Thu, 29 Aug 2019, Sam Guo via petsc-users wrote: > > > >> > > > >> > I can link when I add my wrapper to > > > >> > PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > > >> > > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > >> > > > >> I don't understand what you mean here. Add PCC_LINKER_FLAGS to where? > > > >> This is a variable in configure generated makefile > > > >> > > > >> Since PETSc is not built [as configure failed] - there should be no > > > >> configure generated makefiles. > > > >> > > > >> > (I don't understand why configure does not include my wrapper) > > > >> > > > >> Well the compiler gives the error below. Can you try to compile > > > >> manually [i.e without PETSc or any petsc makefiles] a simple MPI code > > > >> - say cpi.c from MPICH and see if it works? [and copy/paste the log > > > >> from this compile attempt. > > > >> > > > >> Satish > > > >> > > > >> > > > > >> > > > > >> > On Thu, Aug 29, 2019 at 1:28 PM Matthew Knepley > > > >> wrote: > > > >> > > > > >> > > On Thu, Aug 29, 2019 at 4:02 PM Sam Guo > > > >> wrote: > > > >> > > > > > >> > >> Thanks for the quick response. Attached please find the > > configure.log > > > >> > >> containing the configure error. 
> > > >> > >> > > > >> > > > > > >> > > Executing: > > > >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > > > >> > > -c -o /tmp/petsc-6DsCEk/config.libraries/conftest.o > > > >> > > -I/tmp/petsc-6DsCEk/config.compilers > > > >> > > -I/tmp/petsc-6DsCEk/config.setCompilers > > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.closure > > > >> > > -I/tmp/petsc-6DsCEk/config.headers > > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.cacheDetails > > > >> > > -I/tmp/petsc-6DsCEk/config.types > > -I/tmp/petsc-6DsCEk/config.atomics > > > >> > > -I/tmp/petsc-6DsCEk/config.functions > > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.featureTestMacros > > > >> > > -I/tmp/petsc-6DsCEk/config.utilities.missing > > > >> > > -I/tmp/petsc-6DsCEk/PETSc.options.scalarTypes > > > >> > > -I/tmp/petsc-6DsCEk/config.libraries -MD -wd4996 -Z7 > > > >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.c > > > >> > > stdout: conftest.c > > > >> > > Successful compile: > > > >> > > Source: > > > >> > > #include "confdefs.h" > > > >> > > #include "conffix.h" > > > >> > > /* Override any gcc2 internal prototype to avoid an error. */ > > > >> > > char MPI_Init(); > > > >> > > static void _check_MPI_Init() { MPI_Init(); } > > > >> > > char MPI_Comm_create(); > > > >> > > static void _check_MPI_Comm_create() { MPI_Comm_create(); } > > > >> > > > > > >> > > int main() { > > > >> > > _check_MPI_Init(); > > > >> > > _check_MPI_Comm_create();; > > > >> > > return 0; > > > >> > > } > > > >> > > Executing: > > > >> /home/xianzhongg/petsc-3.11.3/lib/petsc/bin/win32fe/win32fe cl > > > >> > > -o /tmp/petsc-6DsCEk/config.libraries/conftest.exe -MD > > -wd4996 -Z7 > > > >> > > /tmp/petsc-6DsCEk/config.libraries/conftest.o > > > >> > > > > > >> > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > >> > > Ws2_32.lib > > > >> > > stdout: > > > >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not > > found > > > >> or not > > > >> > > built by the last incremental link; performing full link > > > >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > > >> > > referenced in function _check_MPI_Init > > > >> > > conftest.obj : error LNK2019: unresolved external symbol > > > >> MPI_Comm_create > > > >> > > referenced in function _check_MPI_Comm_create > > > >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > > > >> LNK1120: > > > >> > > 2 unresolved externals > > > >> > > Possible ERROR while running linker: exit code 2 > > > >> > > stdout: > > > >> > > LINK : C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe not > > found > > > >> or not > > > >> > > built by the last incremental link; performing full link > > > >> > > conftest.obj : error LNK2019: unresolved external symbol MPI_Init > > > >> > > referenced in function _check_MPI_Init > > > >> > > conftest.obj : error LNK2019: unresolved external symbol > > > >> MPI_Comm_create > > > >> > > referenced in function _check_MPI_Comm_create > > > >> > > C:\cygwin64\tmp\PE81BA~1\CONFIG~1.LIB\conftest.exe : fatal error > > > >> LNK1120: > > > >> > > 2 unresolved externals > > > >> > > > > > >> > > The link is definitely failing. Does it work if you do it by hand? > > > >> > > > > > >> > > Thanks, > > > >> > > > > > >> > > Matt > > > >> > > > > > >> > > > > > >> > >> Regarding our dup, our wrapper does support it. In fact, > > everything > > > >> works > > > >> > >> fine on Linux. I suspect on windows, PETSc picks the system mpi.h > > > >> somehow. > > > >> > >> I am investigating it. 
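
For anyone reproducing this outside of configure: the failing check above can be repeated by hand with a small stand-alone MPI program, along the lines of Satish's cpi.c suggestion. The sketch below is only an example; the include path, library path, and win32fe invocation are copied from this thread and will differ on another setup ("win32fe" stands for the full path used in the log). Building against the wrapper's own mpi.h (the -I below) also rules out the system mpi.h being picked up. If the hand link fails with the same LNK2019 errors, the wrapper's .lib most likely does not expose MPI_Init with plain C linkage (dumpbin /symbols on the .lib shows what names are actually in it).

    /* mpi_link_check.c - minimal hand-built test of the MPI wrapper,
       patterned after the conftest source above (example only) */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
      int rank, size;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      printf("rank %d of %d\n", rank, size);
      MPI_Finalize();
      return 0;
    }

Compile and link it the same two-step way configure does (paths as used in this thread, adjust as needed):

    win32fe cl -c -o mpi_link_check.o -MD -Z7 \
      -I/home/xianzhongg/dev/star/base/src/mpi/include mpi_link_check.c
    win32fe cl -o mpi_link_check.exe -MD -Z7 mpi_link_check.o \
      /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib Ws2_32.lib
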
> > > >> > >> > > > >> > >> Thanks, > > > >> > >> Sam > > > >> > >> > > > >> > >> On Thu, Aug 29, 2019 at 3:39 PM Matthew Knepley < > > knepley at gmail.com> > > > >> > >> wrote: > > > >> > >> > > > >> > >>> On Thu, Aug 29, 2019 at 3:33 PM Sam Guo via petsc-users < > > > >> > >>> petsc-users at mcs.anl.gov> wrote: > > > >> > >>> > > > >> > >>>> Dear PETSc dev team, > > > >> > >>>> I am looking some tips porting petsc to windows. We have > > our mpi > > > >> > >>>> wrapper (so we can switch different mpi). I configure petsc > > using > > > >> > >>>> --with-mpi-lib and --with-mpi-include > > > >> > >>>> ./configure --with-cc="win32fe cl" --with-fc=0 > > > >> > >>>> --download-f2cblaslapack > > > >> > >>>> > > > >> > > --with-mpi-lib=/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > >> > >>>> > > --with-mpi-include=/home/xianzhongg/dev/star/base/src/mpi/include > > > >> > >>>> --with-shared-libaries=1 > > > >> > >>>> > > > >> > >>>> But I got error > > > >> > >>>> > > > >> > >>>> > > > >> > > =============================================================================== > > > >> > >>>> Configuring PETSc to compile on your system > > > >> > >>>> > > > >> > >>>> > > > >> > > =============================================================================== > > > >> > >>>> TESTING: check from > > > >> > >>>> config.libraries(config/BuildSystem/config/libraries.py:154) > > > >> > >>>> > > > >> > > ******************************************************************************* > > > >> > >>>> UNABLE to CONFIGURE with GIVEN OPTIONS (see > > > >> configure.log > > > >> > >>>> for details): > > > >> > >>>> > > > >> > >>>> > > > >> > > ------------------------------------------------------------------------------- > > > >> > >>>> > > > >> > > --with-mpi-lib=['/home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib'] > > > >> > >>>> and > > > >> > >>>> > > > >> --with-mpi-include=['/home/xianzhongg/dev/star/base/src/mpi/include'] > > > >> > >>>> did not work > > > >> > >>>> > > > >> > >>>> > > > >> > > ******************************************************************************* > > > >> > >>>> > > > >> > >>> > > > >> > >>> Your MPI wrapper should pass the tests here. Send the > > configure.log > > > >> > >>> > > > >> > >>> > > > >> > >>>> To fix the configuration error, in > > > >> > >>>> config/BuildSystem/config/package.py, I removed > > > >> > >>>> self.executeTest(self.checkDependencies) > > > >> > >>>> self.executeTest(self.configureLibrary) > > > >> > >>>> self.executeTest(self.checkSharedLibrary) > > > >> > >>>> > > > >> > >>>> To link, I add my mpi wrapper > > > >> > >>>> to ${PTESTC_ARCH}/lib/petsc/conf/petscvariables: > > > >> > >>>> PCC_LINKER_FLAGS = -MD -wd4996 -Z7 > > > >> > >>>> > > > >> > > /home/xianzhongg/dev/star/lib/win64/intel18.3vc14/lib/StarMpiWrapper.lib > > > >> > >>>> > > > >> > >>>> I got libpetstc.dll and libpetstc.lib. When I try to test it > > > >> inside our > > > >> > >>>> code, PETSc somehow crates a duplicate of communicator with > > only 1 > > > >> MPI > > > >> > >>>> process and PETSC_COMM_WORLD is set to 2. If I set > > > >> PETSC_COMM_WORLD to 1 > > > >> > >>>> (our MPI_COMM_WORLD), PETSc is hanging. > > > >> > >>>> > > > >> > >>> > > > >> > >>> We do dup the communicator on entry. Shouldn't that be > > supported by > > > >> your > > > >> > >>> wrapper? 
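
On the communicator question: PETSC_COMM_WORLD is an MPI_Comm handle rather than a process count, so its numeric value depends on the MPI implementation behind the wrapper. The usual pattern when the application already owns a communicator is to assign it to PETSC_COMM_WORLD after MPI_Init but before PetscInitialize, and let PETSc take its own duplicate. A minimal sketch, with app_comm standing in for whatever the wrapper provides:

    /* Sketch: hand an application-owned communicator to PETSc.
       app_comm is a placeholder; MPI is assumed to be initialized
       by the caller already. */
    #include <petscsys.h>

    int init_petsc_on(MPI_Comm app_comm, int *argc, char ***argv)
    {
      PetscErrorCode ierr;

      PETSC_COMM_WORLD = app_comm;   /* must be set before PetscInitialize */
      ierr = PetscInitialize(argc, argv, NULL, NULL); if (ierr) return (int)ierr;
      /* PETSc now works on its own duplicate of app_comm */
      return 0;
    }

If the hang appears inside that setup, the duplication itself (MPI_Comm_dup plus the communicator attribute calls PETSc makes on entry) is worth exercising in isolation through the wrapper.
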
> > > >> > >>> > > > >> > >>> Thanks, > > > >> > >>> > > > >> > >>> Matt > > > >> > >>> > > > >> > >>> > > > >> > >>>> I am wondering if you could give me some tips how to debug this > > > >> problem. > > > >> > >>>> > > > >> > >>>> BR, > > > >> > >>>> Sam > > > >> > >>>> > > > >> > >>> > > > >> > >>> > > > >> > >>> -- > > > >> > >>> What most experimenters take for granted before they begin their > > > >> > >>> experiments is infinitely more interesting than any results to > > > >> which their > > > >> > >>> experiments lead. > > > >> > >>> -- Norbert Wiener > > > >> > >>> > > > >> > >>> https://www.cse.buffalo.edu/~knepley/ > > > >> > >>> > > > >> > >>> > > > >> > >> > > > >> > > > > > >> > > -- > > > >> > > What most experimenters take for granted before they begin their > > > >> > > experiments is infinitely more interesting than any results to > > which > > > >> their > > > >> > > experiments lead. > > > >> > > -- Norbert Wiener > > > >> > > > > > >> > > https://www.cse.buffalo.edu/~knepley/ > > > >> > > > > > >> > > > > > >> > > > > >> > > > >> > > > > > > > >
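
Since the root cause here turned out to be the wrong libpetsc.dll variant being found first in PATH, a cheap runtime guard is to print the size of PETSC_COMM_WORLD right after PetscInitialize: with a serial build loaded it stays at 1 even under mpiexec -n N. A minimal sketch of such a check:

    /* Sketch: confirm at runtime that the parallel PETSc build is the
       one actually loaded (size should match mpiexec -n N). */
    #include <petscsys.h>

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;
      PetscMPIInt    rank, size;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
      ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
      ierr = MPI_Comm_size(PETSC_COMM_WORLD, &size);CHKERRQ(ierr);
      ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD, "rank %d of %d\n", rank, size);CHKERRQ(ierr);
      ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }

On the packaging side, the advice from this thread applies generally: either make sure the directory of the intended libpetsc.dll comes first in PATH, or configure with --with-shared-libraries=0 so the question does not arise.
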