From kandanovian at gmail.com Mon Aug 1 02:33:49 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Mon, 1 Aug 2016 09:33:49 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Message-ID: Okay, I'll let you know when I make the pull request. Thanks. 2016-07-29 16:49 GMT+02:00 Barry Smith : > >> On Jul 28, 2016, at 2:35 AM, Tim Steinhoff wrote: >> >> 2016-07-27 21:42 GMT+02:00 Barry Smith : >>> >>> Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle >>> of petscinitialize_() is the code fragment >>> >>> PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); >>> FIXCHAR(filename,len,t1); >>> *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); >>> >>> We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. >> >> Thanks Barry. It would be really nice if PETSc comes with that feature >> in future, because I would prefer not to make any changes to the PETSc >> code that disappear with every new PETSc release. > > Understood. You could make a pull request with your changes https://bitbucket.org/petsc/petsc/wiki/pull-request-instructions-git otherwise I will add it but it will take a few days since I am backlogged. > > Barry > >> >>> >>> Barry >>> >>> Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. >> That would work, but we have a rather large fortran code without any >> C. So, for now we will probably stick to your first approach and keep >> our code fotran only. >> >> Thanks again, >> Volker >> >> >>> >>> >>>> On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: >>>> >>>> 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >>>>> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >>>>> wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>>>>> (PetscInitialize) ignore all arguments passed by the command line? >>>>>> Since our code is controlled by command line arguements as well, it >>>>>> leads to a mess, when those arguments are read twice. >>>>> >>>>> >>>>> 1) You can use PetscInitializeNoArguments() >>>> >>>> Thanks! I thought that function was for C/C++ only. >>>> >>>>> >>>>> 2) What goes wrong? PETSc should just ignore any options it does not >>>>> recognize. >>>> >>>> >>>> The problem is that our code uses the same or similar argument names >>>> as PETSc does and our end user should not have access to all petsc >>>> options. >>>> >>>> >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> >>>>>> Thanks and kind regards, >>>>>> >>>>>> Volker >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments >>>>> is infinitely more interesting than any results to which their experiments >>>>> lead. 
>>>>> -- Norbert Wiener >>> > From C.Klaij at marin.nl Mon Aug 1 03:00:02 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 1 Aug 2016 08:00:02 +0000 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: References: <1469695134232.97712@marin.nl>, Message-ID: <1470038402343.32500@marin.nl> Matt, Barry Thanks for your replies! I've added a call to MatNestSetSubMats() but something's still wrong, see below. Chris $ cat mattry.F90 program mattry use petscksp implicit none #include PetscInt :: n=4 ! setting 4 cells per process PetscErrorCode :: ierr PetscInt :: size,rank,i Mat :: A,A02 MatType :: type IS :: isg0,isg1,isg2 IS :: isl0,isl1,isl2 ISLocalToGlobalMapping :: map integer, allocatable, dimension(:) :: idx call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) ! local index sets for 3 fields allocate(idx(n)) idx=(/ (i-1, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! global index sets for 3 fields allocate(idx(n)) idx=(/ (i-1+rank*3*n, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); CHKERRQ(ierr) ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! local-to-global mapping allocate(idx(3*n)) idx=(/ (i-1+rank*3*n, i=1,3*n) /) call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! create the 3-by-3 block matrix call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) call MatSetUp(A,ierr); CHKERRQ(ierr) call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) call MatSetFromOptions(A,ierr); CHKERRQ(ierr) ! setup nest call MatGetType(A,type,ierr); CHKERRQ(ierr) if (type.eq."nest") then call MatNestSetSubMats(A,3,(/isg0,isg1,isg2/),3,(/isg0,isg1,isg2/),PETSC_NULL_OBJECT,ierr); CHKERRQ(ierr) end if ! set diagonal of block A02 to 0.65 call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) do i=1,n call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) end do call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) ! 
verify call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) call PetscFinalize(ierr) end program mattry $ mpiexec -n 2 ./mattry -A_mat_type nest [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 09:54:07 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [0]PETSC ERROR: #1 PetscObjectReference() line 534 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c [0]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 09:54:07 2016 [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [1]PETSC ERROR: #1 PetscObjectReference() line 534 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c [1]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 64. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [lin0322.marin.local:14326] 1 more process has sent help message help-mpi-api.txt / mpi-abort [lin0322.marin.local:14326] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages $ dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | C.Klaij at marin.nl | www.marin.nl [LinkedIn] [YouTube] [Twitter] [Facebook] MARIN news: Joint Industry Project LifeLine kicks off -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image2bc716.PNG Type: image/png Size: 293 bytes Desc: image2bc716.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageaadb69.PNG Type: image/png Size: 331 bytes Desc: imageaadb69.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagedd86d2.PNG Type: image/png Size: 333 bytes Desc: imagedd86d2.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imaged1a140.PNG Type: image/png Size: 253 bytes Desc: imaged1a140.PNG URL: From kandanovian at gmail.com Mon Aug 1 04:36:35 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Mon, 1 Aug 2016 11:36:35 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Message-ID: I made a pull request. Cheers, Volker 2016-07-29 16:49 GMT+02:00 Barry Smith : > >> On Jul 28, 2016, at 2:35 AM, Tim Steinhoff wrote: >> >> 2016-07-27 21:42 GMT+02:00 Barry Smith : >>> >>> Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle >>> of petscinitialize_() is the code fragment >>> >>> PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); >>> FIXCHAR(filename,len,t1); >>> *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); >>> >>> We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. >> >> Thanks Barry. It would be really nice if PETSc comes with that feature >> in future, because I would prefer not to make any changes to the PETSc >> code that disappear with every new PETSc release. > > Understood. You could make a pull request with your changes https://bitbucket.org/petsc/petsc/wiki/pull-request-instructions-git otherwise I will add it but it will take a few days since I am backlogged. > > Barry > >> >>> >>> Barry >>> >>> Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. >> That would work, but we have a rather large fortran code without any >> C. So, for now we will probably stick to your first approach and keep >> our code fotran only. >> >> Thanks again, >> Volker >> >> >>> >>> >>>> On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: >>>> >>>> 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >>>>> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >>>>> wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>>>>> (PetscInitialize) ignore all arguments passed by the command line? >>>>>> Since our code is controlled by command line arguements as well, it >>>>>> leads to a mess, when those arguments are read twice. >>>>> >>>>> >>>>> 1) You can use PetscInitializeNoArguments() >>>> >>>> Thanks! I thought that function was for C/C++ only. >>>> >>>>> >>>>> 2) What goes wrong? PETSc should just ignore any options it does not >>>>> recognize. 
>>>> >>>> >>>> The problem is that our code uses the same or similar argument names >>>> as PETSc does and our end user should not have access to all petsc >>>> options. >>>> >>>> >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>>> >>>>>> Thanks and kind regards, >>>>>> >>>>>> Volker >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments >>>>> is infinitely more interesting than any results to which their experiments >>>>> lead. >>>>> -- Norbert Wiener >>> > From knepley at gmail.com Mon Aug 1 08:09:11 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Aug 2016 08:09:11 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1470038402343.32500@marin.nl> References: <1469695134232.97712@marin.nl> <1470038402343.32500@marin.nl> Message-ID: On Mon, Aug 1, 2016 at 3:00 AM, Klaij, Christiaan wrote: > Matt, Barry > > Thanks for your replies! I've added a call to MatNestSetSubMats() > but something's still wrong, see below. > > Chris > > > $ cat mattry.F90 > program mattry > > use petscksp > implicit none > #include > > PetscInt :: n=4 ! setting 4 cells per process > > PetscErrorCode :: ierr > PetscInt :: size,rank,i > Mat :: A,A02 > MatType :: type > IS :: isg0,isg1,isg2 > IS :: isl0,isl1,isl2 > ISLocalToGlobalMapping :: map > > integer, allocatable, dimension(:) :: idx > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) > > ! local index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) > ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! global index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1+rank*3*n, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); > CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); > CHKERRQ(ierr) > ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! local-to-global mapping > allocate(idx(3*n)) > idx=(/ (i-1+rank*3*n, i=1,3*n) /) > call > ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); > CHKERRQ(ierr) > ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); > CHKERRQ(ierr) > deallocate(idx) > > ! create the 3-by-3 block matrix > call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) > call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) > call MatSetUp(A,ierr); CHKERRQ(ierr) > call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) > call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) > call MatSetFromOptions(A,ierr); CHKERRQ(ierr) > > ! setup nest > call MatGetType(A,type,ierr); CHKERRQ(ierr) > if (type.eq."nest") then > 1) Get rid of this protection, you do not need it. 2) It looks like the (/ /) notation does not work here. Can you put them in proper array? Thanks, Matt > call > MatNestSetSubMats(A,3,(/isg0,isg1,isg2/),3,(/isg0,isg1,isg2/),PETSC_NULL_OBJECT,ierr); > CHKERRQ(ierr) > end if > > ! 
set diagonal of block A02 to 0.65 > call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > do i=1,n > call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); > CHKERRQ(ierr) > end do > call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > ! verify > call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); > CHKERRQ(ierr) > call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) > > call PetscFinalize(ierr) > > end program mattry > > > $ mpiexec -n 2 ./mattry -A_mat_type nest > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Corrupt argument: > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: > ./mattry > on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 > 09:54:07 2016 > [0]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [0]PETSC ERROR: #1 PetscObjectReference() line 534 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c > [0]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Corrupt argument: > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [1]PETSC ERROR: > ./mattry > on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 > 09:54:07 2016 > [1]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [1]PETSC ERROR: #1 PetscObjectReference() line 534 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c > [1]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 64. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > [lin0322.marin.local:14326] 1 more process has sent help message > help-mpi-api.txt / mpi-abort > [lin0322.marin.local:14326] Set MCA parameter "orte_base_help_aggregate" > to 0 to see all help / error messages > $ > > > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | C.Klaij at marin.nl | www.marin.nl > > [image: LinkedIn] [image: > YouTube] [image: Twitter] > [image: Facebook] > > MARIN news: Joint Industry Project LifeLine kicks off > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imaged1a140.PNG Type: image/png Size: 253 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagedd86d2.PNG Type: image/png Size: 333 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imageaadb69.PNG Type: image/png Size: 331 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image2bc716.PNG Type: image/png Size: 293 bytes Desc: not available URL: From C.Klaij at marin.nl Mon Aug 1 08:59:12 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 1 Aug 2016 13:59:12 +0000 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: References: <1469695134232.97712@marin.nl> <1470038402343.32500@marin.nl>, Message-ID: <1470059952301.68773@marin.nl> Matt, Thanks for your suggestions. Here's the outcome: 1) without the "if type=nest" protection, I get a "Cannot locate function MatNestSetSubMats_C" error when using type mpiaij, see below. 2) with the isg in a proper array, I get the same "Invalid Pointer to Object" error, see below. Chris $ cat mattry.F90 program mattry use petscksp implicit none #include PetscInt :: n=4 ! setting 4 cells per process PetscErrorCode :: ierr PetscInt :: size,rank,i Mat :: A,A02 MatType :: type IS :: isg(3) IS :: isl(3) ISLocalToGlobalMapping :: map integer, allocatable, dimension(:) :: idx call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) ! local index sets for 3 fields allocate(idx(n)) idx=(/ (i-1, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl(1),ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl(2),ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl(3),ierr);CHKERRQ(ierr) ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! global index sets for 3 fields allocate(idx(n)) idx=(/ (i-1+rank*3*n, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg(1),ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg(2),ierr); CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg(3),ierr); CHKERRQ(ierr) ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! 
local-to-global mapping allocate(idx(3*n)) idx=(/ (i-1+rank*3*n, i=1,3*n) /) call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! create the 3-by-3 block matrix call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) call MatSetUp(A,ierr); CHKERRQ(ierr) call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) call MatSetFromOptions(A,ierr); CHKERRQ(ierr) ! setup nest call MatNestSetSubMats(A,3,isg,3,isg,PETSC_NULL_OBJECT,ierr); CHKERRQ(ierr) ! set diagonal of block A02 to 0.65 call MatGetLocalSubmatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) do i=1,n call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) end do call MatRestoreLocalSubMatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) ! verify call MatGetSubmatrix(A,isg(1),isg(3),MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) call PetscFinalize(ierr) end program mattry $ mpiexec -n 2 ./mattry -A_mat_type mpiaij [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Cannot locate function MatNestSetSubMats_C in object [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [0]PETSC ERROR: #1 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 56. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. 
[1]PETSC ERROR: [1] PetscSleep line 35 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/utils/psleep.c [1]PETSC ERROR: [1] PetscTraceBackErrorHandler line 193 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/errtrace.c [1]PETSC ERROR: [1] PetscError line 364 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/err.c [1]PETSC ERROR: [1] MatNestSetSubMats line 1092 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [1]PETSC ERROR: #1 User provided function() line 0 in unknown file [lin0322.marin.local:19455] 1 more process has sent help message help-mpi-api.txt / mpi-abort [lin0322.marin.local:19455] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages $ mpiexec -n 2 ./mattry -A_mat_type nest [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [1]PETSC ERROR: #1 PetscObjectReference() line 534 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c [1]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [0]PETSC ERROR: #1 PetscObjectReference() line 534 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c [0]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 64. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [lin0322.marin.local:19465] 1 more process has sent help message help-mpi-api.txt / mpi-abort [lin0322.marin.local:19465] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages $ dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm From knepley at gmail.com Mon Aug 1 09:15:09 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Aug 2016 09:15:09 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1470059952301.68773@marin.nl> References: <1469695134232.97712@marin.nl> <1470038402343.32500@marin.nl> <1470059952301.68773@marin.nl> Message-ID: On Mon, Aug 1, 2016 at 8:59 AM, Klaij, Christiaan wrote: > Matt, > > Thanks for your suggestions. Here's the outcome: > > 1) without the "if type=nest" protection, I get a "Cannot > locate function MatNestSetSubMats_C" error when using > type mpiaij, see below. > That is a bug. It should be using PetscTryMethod() there, not PetscUseMethod(). We will fix it. That way it can be called for any matrix type, which is the intention. > 2) with the isg in a proper array, I get the same "Invalid > Pointer to Object" error, see below. > Can you send a small example that we can run? It obviously should work this way. 
Thanks, Matt > Chris > > > $ cat mattry.F90 > program mattry > > use petscksp > implicit none > #include > > PetscInt :: n=4 ! setting 4 cells per process > > PetscErrorCode :: ierr > PetscInt :: size,rank,i > Mat :: A,A02 > MatType :: type > IS :: isg(3) > IS :: isl(3) > ISLocalToGlobalMapping :: map > > integer, allocatable, dimension(:) :: idx > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) > > ! local index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl(1),ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl(2),ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl(3),ierr);CHKERRQ(ierr) > ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! global index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1+rank*3*n, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg(1),ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg(2),ierr); > CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg(3),ierr); > CHKERRQ(ierr) > ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! local-to-global mapping > allocate(idx(3*n)) > idx=(/ (i-1+rank*3*n, i=1,3*n) /) > call > ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); > CHKERRQ(ierr) > ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); > CHKERRQ(ierr) > deallocate(idx) > > ! create the 3-by-3 block matrix > call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) > call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) > call MatSetUp(A,ierr); CHKERRQ(ierr) > call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) > call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) > call MatSetFromOptions(A,ierr); CHKERRQ(ierr) > > ! setup nest > call MatNestSetSubMats(A,3,isg,3,isg,PETSC_NULL_OBJECT,ierr); > CHKERRQ(ierr) > > ! set diagonal of block A02 to 0.65 > call MatGetLocalSubmatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) > do i=1,n > call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); > CHKERRQ(ierr) > end do > call MatRestoreLocalSubMatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > ! verify > call MatGetSubmatrix(A,isg(1),isg(3),MAT_INITIAL_MATRIX,A02,ierr); > CHKERRQ(ierr) > call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) > > call PetscFinalize(ierr) > > end program mattry > > > $ mpiexec -n 2 ./mattry -A_mat_type mpiaij > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Cannot locate function MatNestSetSubMats_C in object > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./mattry > > > on a linux_64bit_debug > named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 > [0]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [0]PETSC ERROR: #1 MatNestSetSubMats() line 1105 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 56. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] PetscSleep line 35 > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/utils/psleep.c > [1]PETSC ERROR: [1] PetscTraceBackErrorHandler line 193 > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/errtrace.c > [1]PETSC ERROR: [1] PetscError line 364 > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/err.c > [1]PETSC ERROR: [1] MatNestSetSubMats line 1092 > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [1]PETSC ERROR: ./mattry > > > on a linux_64bit_debug > named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 > [1]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > [lin0322.marin.local:19455] 1 more process has sent help message > help-mpi-api.txt / mpi-abort > [lin0322.marin.local:19455] Set MCA parameter "orte_base_help_aggregate" > to 0 to see all help / error messages > > > $ mpiexec -n 2 ./mattry -A_mat_type nest > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Corrupt argument: > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Corrupt argument: > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [1]PETSC ERROR: ./mattry > > > on a linux_64bit_debug > named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 > [1]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [1]PETSC ERROR: #1 PetscObjectReference() line 534 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c > [1]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./mattry > > > on a linux_64bit_debug > named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 > [0]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [0]PETSC ERROR: #1 PetscObjectReference() line 534 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c > [0]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 64. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. 
> You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [lin0322.marin.local:19465] 1 more process has sent help message > help-mpi-api.txt / mpi-abort > [lin0322.marin.local:19465] Set MCA parameter "orte_base_help_aggregate" > to 0 to see all help / error messages > $ > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: > http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Mon Aug 1 09:36:05 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Mon, 1 Aug 2016 14:36:05 +0000 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: References: <1469695134232.97712@marin.nl> <1470038402343.32500@marin.nl> <1470059952301.68773@marin.nl>, Message-ID: <1470062165758.74421@marin.nl> Matt, 1) great! 2) ??? that's precisely why I paste the output of "cat mattry.F90" in the emails, so you have a small example that produces the errors I mention. Now I'm also attaching it to this email. Thanks, Chris dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | C.Klaij at marin.nl | www.marin.nl [LinkedIn] [YouTube] [Twitter] [Facebook] MARIN news: Vice Admiraal De Waard maakt virtuele proefvaart op MARIN?s FSSS (Dutch only) ________________________________ From: Matthew Knepley Sent: Monday, August 01, 2016 4:15 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] block matrix without MatCreateNest On Mon, Aug 1, 2016 at 8:59 AM, Klaij, Christiaan > wrote: Matt, Thanks for your suggestions. Here's the outcome: 1) without the "if type=nest" protection, I get a "Cannot locate function MatNestSetSubMats_C" error when using type mpiaij, see below. That is a bug. It should be using PetscTryMethod() there, not PetscUseMethod(). We will fix it. That way it can be called for any matrix type, which is the intention. 2) with the isg in a proper array, I get the same "Invalid Pointer to Object" error, see below. Can you send a small example that we can run? It obviously should work this way. Thanks, Matt Chris $ cat mattry.F90 program mattry use petscksp implicit none #include PetscInt :: n=4 ! setting 4 cells per process PetscErrorCode :: ierr PetscInt :: size,rank,i Mat :: A,A02 MatType :: type IS :: isg(3) IS :: isl(3) ISLocalToGlobalMapping :: map integer, allocatable, dimension(:) :: idx call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) ! local index sets for 3 fields allocate(idx(n)) idx=(/ (i-1, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl(1),ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl(2),ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl(3),ierr);CHKERRQ(ierr) ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! 
global index sets for 3 fields allocate(idx(n)) idx=(/ (i-1+rank*3*n, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg(1),ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg(2),ierr); CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg(3),ierr); CHKERRQ(ierr) ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! local-to-global mapping allocate(idx(3*n)) idx=(/ (i-1+rank*3*n, i=1,3*n) /) call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! create the 3-by-3 block matrix call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) call MatSetUp(A,ierr); CHKERRQ(ierr) call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) call MatSetFromOptions(A,ierr); CHKERRQ(ierr) ! setup nest call MatNestSetSubMats(A,3,isg,3,isg,PETSC_NULL_OBJECT,ierr); CHKERRQ(ierr) ! set diagonal of block A02 to 0.65 call MatGetLocalSubmatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) do i=1,n call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) end do call MatRestoreLocalSubMatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) ! verify call MatGetSubmatrix(A,isg(1),isg(3),MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) call PetscFinalize(ierr) end program mattry $ mpiexec -n 2 ./mattry -A_mat_type mpiaij [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Cannot locate function MatNestSetSubMats_C in object [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [0]PETSC ERROR: #1 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 56. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. 
-------------------------------------------------------------------------- [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] PetscSleep line 35 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/utils/psleep.c [1]PETSC ERROR: [1] PetscTraceBackErrorHandler line 193 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/errtrace.c [1]PETSC ERROR: [1] PetscError line 364 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/err.c [1]PETSC ERROR: [1] MatNestSetSubMats line 1092 /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [1]PETSC ERROR: #1 User provided function() line 0 in unknown file [lin0322.marin.local:19455] 1 more process has sent help message help-mpi-api.txt / mpi-abort [lin0322.marin.local:19455] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages $ mpiexec -n 2 ./mattry -A_mat_type nest [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Corrupt argument: http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [1]PETSC ERROR: #1 PetscObjectReference() line 534 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c [1]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [0]PETSC ERROR: #1 PetscObjectReference() line 534 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c [0]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 64. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [lin0322.marin.local:19465] 1 more process has sent help message help-mpi-api.txt / mpi-abort [lin0322.marin.local:19465] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages $ dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagedaaefa.PNG Type: image/png Size: 293 bytes Desc: imagedaaefa.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagedf4413.PNG Type: image/png Size: 331 bytes Desc: imagedf4413.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagee68916.PNG Type: image/png Size: 333 bytes Desc: imagee68916.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image049e4b.PNG Type: image/png Size: 253 bytes Desc: image049e4b.PNG URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: mattry.F90 URL: From jshen25 at jhu.edu Mon Aug 1 12:52:31 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Mon, 1 Aug 2016 13:52:31 -0400 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: Message-ID: Hi Barry, Thanks for your reply. Firstly, as you suggested, I checked my program under valgrind. The results for both sequential and parallel cases showed there are no memory errors detected. Second, I coded a sequential program without using PETSC to generate the global matrix of small mesh for the same problem. I then checked the matrix both from petsc(sequential and parallel) and serial code, and they are same. The way I assembled the global matrix in parallel is first distributing the nodes and elements into processes, then I loop with elements on the calling process to put the element stiffness into the global. Since the nodes and elements in cantilever beam are numbered successively, the connectivity is simple. I didn't use any partition tools to optimize mesh. It's also easy to determine the preallocation d_nnz and o_nnz since each node only connects the left and right nodes except for beginning and end, the maximum nonzeros in each row is 6. The MatSetValue process is shown as follows: do iEL = idElStart, idElEnd g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) end do where idElStart and idElEnd are the global number of first element and end element that the process owns, g_EL is the global index for DOF in element iEL, SE is the element stiffness which is same for all elements. >From above assembling, most of the elements are assembled within own process while there are few elements crossing two processes. The BC for my problem(cantilever under end point load) is to fix the first two DOF, so I called the MatZeroRowsColumns to set the first two rows and columns into zero with diagonal equal to one, without changing the RHS. Now some new issues show up : I run with -ksp_monitor_true_residual and -ksp_converged_reason, the monitor showed two different residues, one is the residue I can set(preconditioned, unpreconditioned, natural), the other is called true residue. ?? I initially thought the true residue is same as unpreconditioned based on definition. But it seems not true. Is it the norm of the residue (b-Ax) between computed RHS and true RHS? But, how to understand unprecondition residue since its definition is b-Ax as well? Can I set the true residue as my converging criteria? I found the accuracy of large mesh in my problem didn't necessary depend on the tolerance I set, either preconditioned or unpreconditioned, sometimes, it showed converged while the solution is not correct. But the true residue looks reflecting the true convergence very well, if the true residue is diverging, no matter what the first residue says, the results are bad! For the preconditioner concerns, actually, I used BJACOBI before I sent the first email, since the JACOBI or PBJACOBI didn't even converge when the size was large. But BJACOBI also didn't perform well in the paralleliztion for large mesh as posed in my last email, while it's fine for small size (below 10k elements) Yesterday, I tried the ASM with CG using the runtime option: -pc_type asm -pc_asm_type basic -sub_pc_type lu (default is ilu). For 15k elements mesh, I am now able to get the correct answer with 1-3, 6 and more processes, using either -sub_pc_type lu or ilu. 
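
For reference, a minimal sketch of how those runtime options (CG with basic ASM and an LU/ILU subdomain factorization) map onto explicit KSP/PC calls is given below. It is only a sketch: it assumes the same use/include preamble as in mattry.F90 earlier in this digest, that the matrix A and the vectors b, x are already assembled, and the routine name solve_with_asm is illustrative. The subdomain factorization is deliberately left to the options database so that -sub_pc_type lu or ilu can still be switched at run time.

subroutine solve_with_asm(A,b,x)
  ! same use/include preamble as mattry.F90 above is assumed here
  Mat            :: A
  Vec            :: b,x
  KSP            :: ksp
  PC             :: pc
  PetscErrorCode :: ierr

  call KSPCreate(PETSC_COMM_WORLD,ksp,ierr); CHKERRQ(ierr)
  call KSPSetOperators(ksp,A,A,ierr); CHKERRQ(ierr)
  call KSPSetType(ksp,KSPCG,ierr); CHKERRQ(ierr)
  call KSPGetPC(ksp,pc,ierr); CHKERRQ(ierr)
  call PCSetType(pc,PCASM,ierr); CHKERRQ(ierr)
  call PCASMSetType(pc,PC_ASM_BASIC,ierr); CHKERRQ(ierr)
  ! -sub_pc_type lu/ilu, -ksp_monitor_true_residual, -ksp_converged_reason
  ! are still picked up from the options database here
  call KSPSetFromOptions(ksp,ierr); CHKERRQ(ierr)
  call KSPSolve(ksp,b,x,ierr); CHKERRQ(ierr)
  call KSPDestroy(ksp,ierr); CHKERRQ(ierr)
end subroutine solve_with_asm

Keeping the subdomain solver on the options database makes it easy to compare the lu and ilu subdomain factorizations, or different monitors, without recompiling.
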
Based on all the results I have got, it shows the results varies a lot with different PC and seems ASM is better for large problem. But what is the major factor to produce such difference between different PCs, since it's not just the issue of computational efficiency, but also the accuracy. Also, I noticed for large mesh, the solution is unstable with small number of processes, for the 15k case, the solution is not correct with 4 and 5 processes, however, the solution becomes always correct with more than 6 processes. For the 50k mesh case, more processes are required to show the stability. What do you think about this? Anything wrong? Since the iterative solver in parallel is first computed locally(if this is correct), can it be possible that there are 'good' and 'bad' locals when dividing the global matrix, and the result from 'bad' local will contaminate the global results. But with more processes, such risk is reduced. It is highly appreciated if you could give me some instruction for above questions. Thank you very much. Bests, Jinlei On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith wrote: > > First run under valgrind all the cases to make sure there is not some > use of uninitialized data or overwriting of data. Go to > http://www.mcs.anl.gov/petsc follow the link to FAQ and search for > valgrind (the web server seems to be broken at the moment). > > Second it is possible that your code the assembles the matrices and > vectors is not correctly assembling it for either the sequential or > parallel case. Hence a different number of processes could be generating a > different linear system hence inconsistent results. How are you handling > the parallelism? How do you know the matrix generated in parallel is > identically to that sequentially? > > Simple preconditioners such as pbjacobi will converge slower and slower > with more elements. > > Note that you should run with -ksp_monitor_true_residual and > -ksp_converged_reason to make sure that the iterative solver is even > converging. By default PETSc KSP solvers do not stop with a big error > message if they do not converge so you need make sure they are always > converging. > > Barry > > > > > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: > > > > Dear PETSC developers, > > > > Thank you for developing such a powerful tool for scientific > computations. > > > > I'm currently trying to run a simple cantilever beam FEM to test the > scalability of PETSC on multi-processors. I also want to verify whether > iterative solver or direct solver is more efficient for parallel large FEM > problem. > > > > Problem description, An Euler elementary cantilever beam with point load > at the end along -y direction. Each node has 2 DOF (deflection and > rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based > on the connectivity. Loop with elements in each processor to assemble the > global matrix with same element stiffness matrix. The boundary condition is > set using call > MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); > > > > Based on what I have done, I find the computations work well, i.e the > results are correct compared with theoretical solution, for small mesh size > (small than 5000 elements) using both solvers with different numbers of > processes. > > > > However, there are several confusing issues when I increase the mesh > size to 10000 and more elements with iterative solve(CG + PCBJACOBI) > > > > 1. 
For 10k elements, I can get accurate solution using iterative solver > with uni-processor(i.e. only one process). However, when I use 2-8 > processes, it tells the linear solver converged with different iterations, > but, the results are all different for different processes and erroneous. > The wired thing is when I use >9 processes, the results are correct again. > I am really confused by this. Could you explain me why? If my > parallelization is not correct, why it works for small cases? And I check > the global matrix and RHS vector and didn't see any mallocs during the > process. > > > > 2. For 30k elements, if I use one process, it says: Linear solve did not > converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for large > sparse matrix? If so, is there any stable solver or pc for large problem? > > > > > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can > only get accuracy when the number of elements are below 5000. There must be > something wrong. The way I use the superlu_dist solver is first convert > MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to > PCLU. Do I miss anything else to run SUPER_LU correctly? > > > > > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the > sequential version of the same problem. The results shows that iterative > solver works well for <50k elements, while SUPER_LU only gets right > solution below 5k elements. Can I say iterative solver is better than > SUPER_LU for large problem? How can I improve the solver to copy with very > large problem, such as million by million? Another thing is it's still > doubtable of performance of SUPER_LU. > > > > For the inaccuracy issue, do you think it may be due to the memory? > However, there is no memory error showing during the execution. > > > > I really appreciate someone could resolve those puzzles above for me. My > goal is to replace the current SUPER_LU solver in my parallel CPFEM main > program with the iterative solver using PETSC. > > > > > > Please let me if you would like to see my code in detail. > > > > Thank you very much. > > > > Bests, > > Jinlei > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 1 13:10:34 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Aug 2016 13:10:34 -0500 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: Message-ID: On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: > Hi Barry, > > Thanks for your reply. > > Firstly, as you suggested, I checked my program under valgrind. The > results for both sequential and parallel cases showed there are no memory > errors detected. > > Second, I coded a sequential program without using PETSC to generate the > global matrix of small mesh for the same problem. I then checked the matrix > both from petsc(sequential and parallel) and serial code, and they are same. > The way I assembled the global matrix in parallel is first distributing > the nodes and elements into processes, then I loop with elements on the > calling process to put the element stiffness into the global. Since the > nodes and elements in cantilever beam are numbered successively, the > connectivity is simple. I didn't use any partition tools to optimize mesh. 
> It's also easy to determine the preallocation d_nnz and o_nnz since each > node only connects the left and right nodes except for beginning and end, > the maximum nonzeros in each row is 6. The MatSetValue process is shown as > follows: > do iEL = idElStart, idElEnd > g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) > call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) > end do > where idElStart and idElEnd are the global number of first element and end > element that the process owns, g_EL is the global index for DOF in element > iEL, SE is the element stiffness which is same for all elements. > From above assembling, most of the elements are assembled within own > process while there are few elements crossing two processes. > > The BC for my problem(cantilever under end point load) is to fix the first > two DOF, so I called the MatZeroRowsColumns to set the first two rows and > columns into zero with diagonal equal to one, without changing the RHS. > > Now some new issues show up : > > I run with -ksp_monitor_true_residual and -ksp_converged_reason, the > monitor showed two different residues, one is the residue I can > set(preconditioned, unpreconditioned, natural), the other is called true > residue. > ?? > I initially thought the true residue is same as unpreconditioned based on > definition. But it seems not true. Is it the norm of the residue (b-Ax) > between computed RHS and true RHS? But, how to understand > unprecondition residue since its definition is b-Ax as well? > It is the unpreconditioned residual. You must be misinterpreting. And we could determine exactly if you sent the output with the suggested options. > Can I set the true residue as my converging criteria? > Use right preconditioning. > I found the accuracy of large mesh in my problem didn't necessary depend > on the tolerance I set, either preconditioned or unpreconditioned, > sometimes, it showed converged while the solution is not correct. But the > true residue looks reflecting the true convergence very well, if the true > residue is diverging, no matter what the first residue says, the results > are bad! > Yes, your preconditioner looks singular. Note that BJACOBI has an inner solver, and by default the is GMRES/ILU(0). I think ILU(0) is really ill-conditioned for your problem. > For the preconditioner concerns, actually, I used BJACOBI before I sent > the first email, since the JACOBI or PBJACOBI didn't even converge when the > size was large. > But BJACOBI also didn't perform well in the paralleliztion for large mesh > as posed in my last email, while it's fine for small size (below 10k > elements) > > Yesterday, I tried the ASM with CG using the runtime option: -pc_type > asm -pc_asm_type basic -sub_pc_type lu (default is ilu). > For 15k elements mesh, I am now able to get the correct answer with 1-3, 6 > and more processes, using either -sub_pc_type lu or ilu. > Yes, LU works for your subdomain solver. > Based on all the results I have got, it shows the results varies a lot > with different PC and seems ASM is better for large problem. > Its not ASM so much as an LU subsolver that is better. > But what is the major factor to produce such difference between different > PCs, since it's not just the issue of computational efficiency, but also > the accuracy. > Also, I noticed for large mesh, the solution is unstable with small number > of processes, for the 15k case, the solution is not correct with 4 and 5 > processes, however, the solution becomes always correct with more than 6 > processes. 
For the 50k mesh case, more processes are required to show the > stability. > Yes, partitioning is very important here. Since you do not have a good partition, you can get these wild variations. Thanks, Matt > What do you think about this? Anything wrong? > Since the iterative solver in parallel is first computed locally(if this > is correct), can it be possible that there are 'good' and 'bad' locals when > dividing the global matrix, and the result from 'bad' local will > contaminate the global results. But with more processes, such risk is > reduced. > > It is highly appreciated if you could give me some instruction for above > questions. > > Thank you very much. > > Bests, > Jinlei > > > On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith wrote: > >> >> First run under valgrind all the cases to make sure there is not some >> use of uninitialized data or overwriting of data. Go to >> http://www.mcs.anl.gov/petsc follow the link to FAQ and search for >> valgrind (the web server seems to be broken at the moment). >> >> Second it is possible that your code the assembles the matrices and >> vectors is not correctly assembling it for either the sequential or >> parallel case. Hence a different number of processes could be generating a >> different linear system hence inconsistent results. How are you handling >> the parallelism? How do you know the matrix generated in parallel is >> identically to that sequentially? >> >> Simple preconditioners such as pbjacobi will converge slower and slower >> with more elements. >> >> Note that you should run with -ksp_monitor_true_residual and >> -ksp_converged_reason to make sure that the iterative solver is even >> converging. By default PETSc KSP solvers do not stop with a big error >> message if they do not converge so you need make sure they are always >> converging. >> >> Barry >> >> >> >> > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: >> > >> > Dear PETSC developers, >> > >> > Thank you for developing such a powerful tool for scientific >> computations. >> > >> > I'm currently trying to run a simple cantilever beam FEM to test the >> scalability of PETSC on multi-processors. I also want to verify whether >> iterative solver or direct solver is more efficient for parallel large FEM >> problem. >> > >> > Problem description, An Euler elementary cantilever beam with point >> load at the end along -y direction. Each node has 2 DOF (deflection and >> rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based >> on the connectivity. Loop with elements in each processor to assemble the >> global matrix with same element stiffness matrix. The boundary condition is >> set using call >> MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); >> > >> > Based on what I have done, I find the computations work well, i.e the >> results are correct compared with theoretical solution, for small mesh size >> (small than 5000 elements) using both solvers with different numbers of >> processes. >> > >> > However, there are several confusing issues when I increase the mesh >> size to 10000 and more elements with iterative solve(CG + PCBJACOBI) >> > >> > 1. For 10k elements, I can get accurate solution using iterative solver >> with uni-processor(i.e. only one process). However, when I use 2-8 >> processes, it tells the linear solver converged with different iterations, >> but, the results are all different for different processes and erroneous. >> The wired thing is when I use >9 processes, the results are correct again. 
>> I am really confused by this. Could you explain me why? If my >> parallelization is not correct, why it works for small cases? And I check >> the global matrix and RHS vector and didn't see any mallocs during the >> process. >> > >> > 2. For 30k elements, if I use one process, it says: Linear solve did >> not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for >> large sparse matrix? If so, is there any stable solver or pc for large >> problem? >> > >> > >> > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can >> only get accuracy when the number of elements are below 5000. There must be >> something wrong. The way I use the superlu_dist solver is first convert >> MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to >> PCLU. Do I miss anything else to run SUPER_LU correctly? >> > >> > >> > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the >> sequential version of the same problem. The results shows that iterative >> solver works well for <50k elements, while SUPER_LU only gets right >> solution below 5k elements. Can I say iterative solver is better than >> SUPER_LU for large problem? How can I improve the solver to copy with very >> large problem, such as million by million? Another thing is it's still >> doubtable of performance of SUPER_LU. >> > >> > For the inaccuracy issue, do you think it may be due to the memory? >> However, there is no memory error showing during the execution. >> > >> > I really appreciate someone could resolve those puzzles above for me. >> My goal is to replace the current SUPER_LU solver in my parallel CPFEM >> main program with the iterative solver using PETSC. >> > >> > >> > Please let me if you would like to see my code in detail. >> > >> > Thank you very much. >> > >> > Bests, >> > Jinlei >> > >> > >> > >> > >> > >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Mon Aug 1 14:59:29 2016 From: epscodes at gmail.com (Xiangdong) Date: Mon, 1 Aug 2016 15:59:29 -0400 Subject: [petsc-users] vec norm for local portion of a vector In-Reply-To: <79191996-1FC0-4DFE-B1EB-33A29A5D8CA2@mcs.anl.gov> References: <79191996-1FC0-4DFE-B1EB-33A29A5D8CA2@mcs.anl.gov> Message-ID: On Fri, Jul 29, 2016 at 10:41 AM, Barry Smith wrote: > > > On Jul 27, 2016, at 4:42 PM, Xiangdong wrote: > > > > Hello everyone, > > > > I have a global dmda vector vg. On each processor, if I want to know the > norm of local portion of vg, which function should I call? > > > > So far I am thinking of using DMDAVecGetArray and then write a loop to > compute the norm of this local array. > > > > Is there a simple function available to call? like > *vg->ops->norm_local(vg,NORM_2, &normlocal)? > > There isn't a public interface to this call because it really isn't a > mathematically well defined object; the subdomains in the decomposition of > the array are arbitrary based on the number of processes used. > > Anyways if you want it and it is the NON-overlapping portion then yes, > you can write a little routine (basically just cut and paste VecNorm()) > call it say VecNormLocal() and have it call the function pointer you > indicated above. Note for the 2 norm the norm_local() returns the square of > the norm so you need to take the square root. 
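A minimal sketch of such a VecNormLocal() helper (only an illustration: it needs the private header because the ops table is not public, and the exact behavior of norm_local() for NORM_2 is discussed further in the follow-ups):

#include <petsc/private/vecimpl.h>

/* Sketch of the VecNormLocal() helper described above: call the same
   norm_local kernel that VecNorm() uses, but skip the MPI reduction so
   that only the locally owned (non-overlapping) part is measured.
   The ops table is private, hence the private header. */
PetscErrorCode VecNormLocal(Vec x,NormType type,PetscReal *val)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  PetscValidHeaderSpecific(x,VEC_CLASSID,1);
  if (!x->ops->norm_local) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_SUP,"Vec type has no norm_local");
  ierr = (*x->ops->norm_local)(x,type,val);CHKERRQ(ierr);
  /* For NORM_2, check whether your version returns the norm or its
     square (see the follow-ups in this thread). */
  PetscFunctionReturn(0);
}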
> I am interested in this non-overlapping case. I found that this norm_local() function returns the correct l2 norm, not the square of the norm. I am using the older version 3.5. Are there changes in recent versions such that norm_local() returns the square of the norm? Xiangdong > > If you want the overlapping portion of the vector then you should just > do the DMDAVecGetArray() as you already do. > > Barry > > > > > > Thanks. > > > > Best, > > Xiangdong > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sospinar at unal.edu.co Mon Aug 1 15:41:59 2016 From: sospinar at unal.edu.co (Santiago Ospina De Los Rios) Date: Mon, 1 Aug 2016 15:41:59 -0500 Subject: [petsc-users] Segmentation faults: Derived types Message-ID: Hello there, I'm having problems defining some variables in derived types in Fortran. Before, I had a similar problem with an allocatable "PetscInt" array, but I solved it by just using a non-collective PETSc Vec. Today I'm having trouble with "PetscBool" or "Logical": In the module which defines the variables, I have the following: MODULE ANISOFLOW_Types
> > Backtrace for this error: > #0 0x699E777 > #1 0x699ED7E > #2 0x6F0BCAF > #3 0x4FB2156 > #4 0x4030EA in anisoflow at ANISOFLOW.F90:29 > I think it is maybe related with petsc because the error popped out just in its initialization, so if you know what's going on, I would appreciate to tell me. Santiago Ospina -- -- Att: Santiago Ospina De Los R?os National University of Colombia -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 1 17:28:33 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 1 Aug 2016 17:28:33 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1470062165758.74421@marin.nl> References: <1469695134232.97712@marin.nl> <1470038402343.32500@marin.nl> <1470059952301.68773@marin.nl> <1470062165758.74421@marin.nl> Message-ID: On Mon, Aug 1, 2016 at 9:36 AM, Klaij, Christiaan wrote: > Matt, > > > 1) great! > > > 2) ??? that's precisely why I paste the output of "cat mattry.F90" in > the emails, so you have a small example that produces the errors I mention. > Now I'm also attaching it to this email. > Okay, I have gone through it. You are correct that it is completely broken. The way that MatNest currently works is that it trys to use L2G mappings from individual blocks and then builds a composite L2G map for the whole matrix. This is obviously incompatible with the primary use case, and should be changed to break up the full L2G into one for each block. Jed, can you fix this? I am not sure I know enough about how Nest works. Matt > Thanks, > > Chris > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | C.Klaij at marin.nl | www.marin.nl > > [image: LinkedIn] [image: > YouTube] [image: Twitter] > [image: Facebook] > > MARIN news: Vice Admiraal De Waard maakt virtuele proefvaart op MARIN?s > FSSS (Dutch only) > > > ------------------------------ > *From:* Matthew Knepley > *Sent:* Monday, August 01, 2016 4:15 PM > *To:* Klaij, Christiaan > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] block matrix without MatCreateNest > > On Mon, Aug 1, 2016 at 8:59 AM, Klaij, Christiaan > wrote: > >> Matt, >> >> Thanks for your suggestions. Here's the outcome: >> >> 1) without the "if type=nest" protection, I get a "Cannot >> locate function MatNestSetSubMats_C" error when using >> type mpiaij, see below. >> > > That is a bug. It should be using PetscTryMethod() there, not > PetscUseMethod(). We will fix it. That way > it can be called for any matrix type, which is the intention. > > >> 2) with the isg in a proper array, I get the same "Invalid >> Pointer to Object" error, see below. >> > > Can you send a small example that we can run? It obviously should work > this way. > > Thanks, > > Matt > > >> Chris >> >> >> $ cat mattry.F90 >> program mattry >> >> use petscksp >> implicit none >> #include >> >> PetscInt :: n=4 ! setting 4 cells per process >> >> PetscErrorCode :: ierr >> PetscInt :: size,rank,i >> Mat :: A,A02 >> MatType :: type >> IS :: isg(3) >> IS :: isl(3) >> ISLocalToGlobalMapping :: map >> >> integer, allocatable, dimension(:) :: idx >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) >> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) >> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) >> >> ! 
local index sets for 3 fields >> allocate(idx(n)) >> idx=(/ (i-1, i=1,n) /) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl(1),ierr);CHKERRQ(ierr) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl(2),ierr);CHKERRQ(ierr) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl(3),ierr);CHKERRQ(ierr) >> ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! global index sets for 3 fields >> allocate(idx(n)) >> idx=(/ (i-1+rank*3*n, i=1,n) /) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg(1),ierr);CHKERRQ(ierr) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg(2),ierr); >> CHKERRQ(ierr) >> call >> ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg(3),ierr); >> CHKERRQ(ierr) >> ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! local-to-global mapping >> allocate(idx(3*n)) >> idx=(/ (i-1+rank*3*n, i=1,3*n) /) >> call >> ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); >> CHKERRQ(ierr) >> ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); >> CHKERRQ(ierr) >> deallocate(idx) >> >> ! create the 3-by-3 block matrix >> call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) >> call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); >> CHKERRQ(ierr) >> call MatSetUp(A,ierr); CHKERRQ(ierr) >> call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) >> call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) >> call MatSetFromOptions(A,ierr); CHKERRQ(ierr) >> >> ! setup nest >> call MatNestSetSubMats(A,3,isg,3,isg,PETSC_NULL_OBJECT,ierr); >> CHKERRQ(ierr) >> >> ! set diagonal of block A02 to 0.65 >> call MatGetLocalSubmatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) >> do i=1,n >> call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); >> CHKERRQ(ierr) >> end do >> call MatRestoreLocalSubMatrix(A,isl(1),isl(3),A02,ierr); CHKERRQ(ierr) >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) >> >> ! verify >> call MatGetSubmatrix(A,isg(1),isg(3),MAT_INITIAL_MATRIX,A02,ierr); >> CHKERRQ(ierr) >> call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) >> >> call PetscFinalize(ierr) >> >> end program mattry >> >> >> $ mpiexec -n 2 ./mattry -A_mat_type mpiaij >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: No support for this operation for this object type >> [0]PETSC ERROR: Cannot locate function MatNestSetSubMats_C in object >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [0]PETSC ERROR: ./mattry >> >> >> on a linux_64bit_debug >> named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 >> [0]PETSC ERROR: Configure options >> --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 >> --with-clanguage=c++ --with-x=1 --with-debugging=1 >> --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl >> --with-shared-libraries=0 >> [0]PETSC ERROR: #1 MatNestSetSubMats() line 1105 in >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >> with errorcode 56. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. >> -------------------------------------------------------------------------- >> [1]PETSC ERROR: >> ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >> batch system) has told this process to end >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the function >> [1]PETSC ERROR: is given. >> [1]PETSC ERROR: [1] PetscSleep line 35 >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/utils/psleep.c >> [1]PETSC ERROR: [1] PetscTraceBackErrorHandler line 193 >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/errtrace.c >> [1]PETSC ERROR: [1] PetscError line 364 >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/error/err.c >> [1]PETSC ERROR: [1] MatNestSetSubMats line 1092 >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Signal received >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [1]PETSC ERROR: ./mattry >> >> >> on a linux_64bit_debug >> named lin0322.marin.local by cklaij Mon Aug 1 15:42:10 2016 >> [1]PETSC ERROR: Configure options >> --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 >> --with-clanguage=c++ --with-x=1 --with-debugging=1 >> --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl >> --with-shared-libraries=0 >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [lin0322.marin.local:19455] 1 more process has sent help message >> help-mpi-api.txt / mpi-abort >> [lin0322.marin.local:19455] Set MCA parameter "orte_base_help_aggregate" >> to 0 to see all help / error messages >> >> >> $ mpiexec -n 2 ./mattry -A_mat_type nest >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Corrupt argument: >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [0]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Corrupt argument: >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [1]PETSC ERROR: Invalid Pointer to Object: Parameter # 1 >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [1]PETSC ERROR: ./mattry >> >> >> on a linux_64bit_debug >> named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 >> [1]PETSC ERROR: Configure options >> --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 >> --with-clanguage=c++ --with-x=1 --with-debugging=1 >> --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl >> --with-shared-libraries=0 >> [1]PETSC ERROR: #1 PetscObjectReference() line 534 in >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c >> [1]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [0]PETSC ERROR: ./mattry >> >> >> on a linux_64bit_debug >> named lin0322.marin.local by cklaij Mon Aug 1 15:42:25 2016 >> [0]PETSC ERROR: Configure options >> --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 >> --with-clanguage=c++ --with-x=1 --with-debugging=1 >> --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl >> --with-shared-libraries=0 >> [0]PETSC ERROR: #1 PetscObjectReference() line 534 in >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/sys/objects/inherit.c >> [0]PETSC ERROR: #2 MatNestSetSubMats_Nest() line 1042 in >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: #3 MatNestSetSubMats() line 1105 in >> /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD >> with errorcode 64. 
>> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. >> -------------------------------------------------------------------------- >> [lin0322.marin.local:19465] 1 more process has sent help message >> help-mpi-api.txt / mpi-abort >> [lin0322.marin.local:19465] Set MCA parameter "orte_base_help_aggregate" >> to 0 to see all help / error messages >> $ >> >> >> dr. ir. Christiaan Klaij | CFD Researcher | Research & Development >> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | >> http://www.marin.nl >> >> MARIN news: >> http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagedf4413.PNG Type: image/png Size: 331 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image049e4b.PNG Type: image/png Size: 253 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagedaaefa.PNG Type: image/png Size: 293 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagee68916.PNG Type: image/png Size: 333 bytes Desc: not available URL: From aks084000 at utdallas.edu Mon Aug 1 17:32:58 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Mon, 1 Aug 2016 22:32:58 +0000 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: <5799C3D2.8000407@imperial.ac.uk> References: <0B3B3C93-5C07-4E07-A37E-DEBA9577D3EE@utdallas.edu>, <5799C3D2.8000407@imperial.ac.uk> Message-ID: <089c42002f5c4f29aa0eb301c80f8dd3@utdallas.edu> Lawrence, Thank you for your help, the program works perfectly now. Artur ________________________________________ From: Lawrence Mitchell Sent: Thursday, July 28, 2016 3:35 AM To: Safin, Artur; Barry Smith Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Nested Fieldsplit for custom index sets Dear Artur, On 28/07/16 02:20, Safin, Artur wrote: > Barry, Lawrence, > >> I think the SubKSPs (and therefore SubPCs) are not set up until you call KSPSetUp(ksp) which your code does not do explicitly and is therefore done in KSPSolve. > > I added KSPSetUp(), but unfortunately the issue did not go away. > > > > I have created a MWE that replicates the issue. The program tries to solve a tridiagonal system, where the first fieldsplit partitions the global matrix > > [ P x ] > [ x T ], > > and the nested fieldsplit partitions P into > > [ A x ] > [ x B ]. Two things: 1. Always check the return value from all PETSc calls. This will normally give you a very useful backtrace when something goes wrong. 
That is, annotate all your calls with: PetscErrorCode ierr; ierr = SomePetscFunction(...); CHKERRQ(ierr); If I do this, I see that the call to KSPSetUp fails: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: Unhandled case, must have at least two fields, not 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.7.2-931-g1e46b98 GIT Date: 2016-07-06 16:57:50 -0500 ... [0]PETSC ERROR: #1 PCFieldSplitSetDefaults() line 470 in /data/lmitche1/src/deps/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #2 PCSetUp_FieldSplit() line 487 in /data/lmitche1/src/deps/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #3 PCSetUp() line 968 in /data/lmitche1/src/deps/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #4 KSPSetUp() line 393 in /data/lmitche1/src/deps/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #5 main() line 65 in /homes/lmitche1/tmp/ex.c The reason is you need to call KSPSetUp *after* setting the outermost fieldsplit ISes. If I move the call to KSPSetUp, then things seem to work. I've attached the working code. Cheers, Lawrence $ cat options.txt -pc_type fieldsplit -pc_fieldsplit_type multiplicative -fieldsplit_T_ksp_type bcgs -fieldsplit_P_ksp_type gmres -fieldsplit_P_pc_type fieldsplit -fieldsplit_P_pc_fieldsplit_type multiplicative -fieldsplit_P_fieldsplit_A_ksp_type gmres -fieldsplit_P_fieldsplit_B_pc_type lu -fieldsplit_P_fieldsplit_B_ksp_type preonly -ksp_converged_reason -ksp_monitor_true_residual -ksp_view $ ./ex -options_file options.txt 0 KSP preconditioned resid norm 5.774607007892e+00 true resid norm 1.414213562373e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.921795888956e-01 true resid norm 4.802975385197e-02 ||r(i)||/||b|| 3.396216464745e-02 2 KSP preconditioned resid norm 1.436304589027e-12 true resid norm 2.435255920058e-13 ||r(i)||/||b|| 1.721985974998e-13 Linear solve converged due to CONVERGED_RTOL iterations 2 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_P_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_P_) 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25 package used to perform factorization: petsc total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: seqaij rows=25, cols=25 total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Defined by IS KSP Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1.43836 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25 package used to perform factorization: petsc total: nonzeros=105, allocated nonzeros=105 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: seqaij rows=25, cols=25 total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_) 1 MPI processes type: seqaij rows=50, cols=50 total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Defined by IS KSP Object: (fieldsplit_T_) 1 MPI processes type: bcgs maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_T_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=50, cols=50 package used to perform factorization: petsc total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_T_) 1 MPI processes type: seqaij rows=50, cols=50 total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=100, cols=100 total: nonzeros=298, allocated nonzeros=500 total number of mallocs used during MatSetValues calls =0 not using I-node routines From aks084000 at utdallas.edu Mon Aug 1 23:13:32 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Tue, 2 Aug 2016 04:13:32 +0000 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) Message-ID: Hello, I am running some code that employs gamg preconditioning within a fieldsplit, and for sufficiently large/refined meshes, I am getting the following error: ------------------------------------------------------------------------------------------------------------------------------------------------ Residual norms for fieldsplit_0_ solve. 0 KSP unpreconditioned resid norm 1.019675281087e-08 true resid norm 1.019675281087e-08 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 3.855246547147e-09 true resid norm 3.855246547147e-09 ||r(i)||/||b|| 3.780857120550e-01 2 KSP unpreconditioned resid norm 1.438241386184e-09 true resid norm 1.438241386184e-09 ||r(i)||/||b|| 1.410489606701e-01 3 KSP unpreconditioned resid norm 3.624902894294e-10 true resid norm 3.624902894294e-10 ||r(i)||/||b|| 3.554958094531e-02 4 KSP unpreconditioned resid norm 1.267419175485e-10 true resid norm 1.267419175485e-10 ||r(i)||/||b|| 1.242963518870e-02 5 KSP unpreconditioned resid norm 2.929693449291e-11 true resid norm 2.929693449291e-11 ||r(i)||/||b|| 2.873163156576e-03 6 KSP unpreconditioned resid norm 9.520263854387e-12 true resid norm 9.520263854423e-12 ||r(i)||/||b|| 9.336564326903e-04 7 KSP unpreconditioned resid norm 1.679490979841e-12 true resid norm 1.679490979825e-12 ||r(i)||/||b|| 1.647084136466e-04 8 KSP unpreconditioned resid norm 3.608932906029e-13 true resid norm 3.608932905928e-13 ||r(i)||/||b|| 3.539296257217e-05 9 KSP unpreconditioned resid norm 9.297426160279e-14 true resid norm 9.297426159708e-14 ||r(i)||/||b|| 9.118026426799e-06 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: !(matA_1 && !matA_1->compressedrow.use) [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./main_3D on a x86_64 named artur-ubuntu by artur Mon Aug 1 22:08:45 2016 [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack [0]PETSC ERROR: #1 smoothAggs() line 354 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c [0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 998 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c [0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #7 PCApply_FieldSplit() line 1016 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #8 PCApply() line 482 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #9 KSP_PCApply() line 244 in /home/artur/Rorsrach/Packages/petsc-3.7.3/include/petsc/private/kspimpl.h [0]PETSC ERROR: #10 KSPInitialResidual() line 69 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itres.c [0]PETSC ERROR: #11 KSPSolve_GMRES() line 239 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/impls/gmres/gmres.c [0]PETSC ERROR: #12 KSPSolve() line 656 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #13 solve() line 765 in /home/artur/Desktop/Preconditioned/MI_3D/Cpp/MorseI_PML.cpp terminate called after throwing an instance of 'std::runtime_error' what(): Error detected in C PETSc [artur-ubuntu:07250] *** Process received signal *** [artur-ubuntu:07250] Signal: Aborted (6) [artur-ubuntu:07250] Signal code: (-6) [artur-ubuntu:07250] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7fe413815cb0] [artur-ubuntu:07250] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7fe413815c37] [artur-ubuntu:07250] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7fe413819028] [artur-ubuntu:07250] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) [0x7fe413f0a535] [artur-ubuntu:07250] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e6d6) [0x7fe413f086d6] [artur-ubuntu:07250] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e703) [0x7fe413f08703] [artur-ubuntu:07250] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e922) [0x7fe413f08922] [artur-ubuntu:07250] [ 7] /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(+0x18d9ec) [0x7fe414c419ec] [artur-ubuntu:07250] [ 8] /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(PetscError+0x45b) [0x7fe414c41e94] [artur-ubuntu:07250] [ 9] ./main_3D(_ZN10MorseI_PMLILi3EE5solveEv+0x1cf0) [0x430d00] [artur-ubuntu:07250] [10] ./main_3D(_ZN10MorseI_PMLILi3EE3runEv+0xd9) [0x435bd9] [artur-ubuntu:07250] [11] ./main_3D(main+0x6c) [0x41a19c] [artur-ubuntu:07250] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe413800f45] [artur-ubuntu:07250] [13] ./main_3D() [0x41a223] [artur-ubuntu:07250] *** End of error message *** 
------------------------------------------------------------------------------------------------------------------------------------------------ The problem specifically appears when I attempt to precondition fieldsplit_1 with gamg (no problems with gamg in fieldsplit_0 though for some reason). I am curious if someone can explain what this error actually means; this comes from line 354 in http://www.mcs.anl.gov/petsc/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c.html Thanks, Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: From C.Klaij at marin.nl Tue Aug 2 02:25:06 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 2 Aug 2016 07:25:06 +0000 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: References: <1469695134232.97712@marin.nl> <1470038402343.32500@marin.nl> <1470059952301.68773@marin.nl> <1470062165758.74421@marin.nl>, Message-ID: <1470122706553.28463@marin.nl> Thanks for your help! Going from individual blocks to a whole matrix makes perfect sense if the blocks are readily available or needed as fully functional matrices. Don't change that! Maybe add the opposite? I'm surprised it's broken though: on this mailing list several petsc developers have stated on several occasions (and not just to me) things like "you should never have a matnest", "you should have a mat then change the type at runtime", "snes ex70 is not the intended use" and so on. I fully appreciate the benefit of having a format-independent assembly and switching mat type from aij to nest depending on the preconditioner. And given the manual and the statements on this list, I thought this would be standard practice and therefore thoroughly tested. But now I get the impression it has never worked... Chris > From: Matthew Knepley > Sent: Tuesday, August 02, 2016 12:28 AM > To: Klaij, Christiaan > Cc: petsc-users at mcs.anl.gov; Jed Brown > Subject: Re: [petsc-users] block matrix without MatCreateNest > > On Mon, Aug 1, 2016 at 9:36 AM, Klaij, Christiaan wrote: > > Matt, > > > 1) great! > > > 2) ??? that's precisely why I paste the output of "cat mattry.F90" in the emails, so you have a small example that produces the errors I mention. Now I'm also attaching it to this email. > > Okay, I have gone through it. You are correct that it is completely broken. > > The way that MatNest currently works is that it trys to use L2G mappings from individual blocks > and then builds a composite L2G map for the whole matrix. This is obviously incompatible with > the primary use case, and should be changed to break up the full L2G into one for each block. > > Jed, can you fix this? I am not sure I know enough about how Nest works. > > Matt > > Thanks, > > Chris dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | C.Klaij at marin.nl | www.marin.nl [LinkedIn] [YouTube] [Twitter] [Facebook] MARIN news: Ship design in EU project Holiship -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagece5c8d.PNG Type: image/png Size: 293 bytes Desc: imagece5c8d.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagec9421a.PNG Type: image/png Size: 331 bytes Desc: imagec9421a.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image257f89.PNG Type: image/png Size: 333 bytes Desc: image257f89.PNG URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagec2e203.PNG Type: image/png Size: 253 bytes Desc: imagec2e203.PNG URL: From bsmith at mcs.anl.gov Tue Aug 2 04:20:48 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 2 Aug 2016 04:20:48 -0500 Subject: [petsc-users] vec norm for local portion of a vector In-Reply-To: References: <79191996-1FC0-4DFE-B1EB-33A29A5D8CA2@mcs.anl.gov> Message-ID: <8D84070D-63F2-40F2-AE1F-B57DE78F7A31@mcs.anl.gov> You are correct, I was wrong. It returns the 2 norm (not the square of the 2 norm). Barry > On Aug 1, 2016, at 2:59 PM, Xiangdong wrote: > > > > On Fri, Jul 29, 2016 at 10:41 AM, Barry Smith wrote: > > > On Jul 27, 2016, at 4:42 PM, Xiangdong wrote: > > > > Hello everyone, > > > > I have a global dmda vector vg. On each processor, if I want to know the norm of local portion of vg, which function should I call? > > > > So far I am thinking of using DMDAVecGetArray and then write a loop to compute the norm of this local array. > > > > Is there a simple function available to call? like *vg->ops->norm_local(vg,NORM_2, &normlocal)? > > There isn't a public interface to this call because it really isn't a mathematically well defined object; the subdomains in the decomposition of the array are arbitrary based on the number of processes used. > > Anyways if you want it and it is the NON-overlapping portion then yes, you can write a little routine (basically just cut and paste VecNorm()) call it say VecNormLocal() and have it call the function pointer you indicated above. Note for the 2 norm the norm_local() returns the square of the norm so you need to take the square root. > > I am interested in this non-overlapping case. I found that this norm_local() function returns the correct l2 norm, not the square of norm. I am using old version 3.5. Are there changes in recent version such that norm_local() returns the square of the norm? > > Xiangdong > > > If you want the overlapping portion of the vector then you should just do the DMDAVecGetArray() as you already do. > > Barry > > > > > > > Thanks. > > > > Best, > > Xiangdong > > From patrick.sanan at gmail.com Tue Aug 2 05:18:37 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 2 Aug 2016 12:18:37 +0200 Subject: [petsc-users] vec norm for local portion of a vector In-Reply-To: <79191996-1FC0-4DFE-B1EB-33A29A5D8CA2@mcs.anl.gov> References: <79191996-1FC0-4DFE-B1EB-33A29A5D8CA2@mcs.anl.gov> Message-ID: There is also VecGetLocalVector() and VecRestoreLocalVector() . You should be able to use those to (copy-free, hopefully) obtain the local entries of a global vector as a local vector which you could call the usual VecNorm() on. On Fri, Jul 29, 2016 at 4:41 PM, Barry Smith wrote: > >> On Jul 27, 2016, at 4:42 PM, Xiangdong wrote: >> >> Hello everyone, >> >> I have a global dmda vector vg. On each processor, if I want to know the norm of local portion of vg, which function should I call? >> >> So far I am thinking of using DMDAVecGetArray and then write a loop to compute the norm of this local array. >> >> Is there a simple function available to call? like *vg->ops->norm_local(vg,NORM_2, &normlocal)? > > There isn't a public interface to this call because it really isn't a mathematically well defined object; the subdomains in the decomposition of the array are arbitrary based on the number of processes used. 
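A minimal sketch of that copy-free VecGetLocalVector()/VecRestoreLocalVector() approach (the function name is only illustrative, and it assumes a PETSc version recent enough to provide these routines; it measures only the locally owned, non-overlapping entries):

#include <petscvec.h>

/* Sketch of the copy-free approach mentioned above: wrap the locally
   owned entries of the global vector in a sequential Vec and call the
   usual VecNorm() on it. */
PetscErrorCode LocalPortionNorm(Vec vg,PetscReal *nrm)
{
  Vec            vloc;
  PetscInt       nlocal;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecGetLocalSize(vg,&nlocal);CHKERRQ(ierr);
  ierr = VecCreateSeq(PETSC_COMM_SELF,nlocal,&vloc);CHKERRQ(ierr);
  ierr = VecGetLocalVector(vg,vloc);CHKERRQ(ierr);       /* no copy for standard Vec types */
  ierr = VecNorm(vloc,NORM_2,nrm);CHKERRQ(ierr);         /* sequential norm of the owned part */
  ierr = VecRestoreLocalVector(vg,vloc);CHKERRQ(ierr);
  ierr = VecDestroy(&vloc);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}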
> > Anyways if you want it and it is the NON-overlapping portion then yes, you can write a little routine (basically just cut and paste VecNorm()) call it say VecNormLocal() and have it call the function pointer you indicated above. Note for the 2 norm the norm_local() returns the square of the norm so you need to take the square root. > > If you want the overlapping portion of the vector then you should just do the DMDAVecGetArray() as you already do. > > Barry > > > >> >> Thanks. >> >> Best, >> Xiangdong >> > From imilian.hartig at gmail.com Tue Aug 2 08:22:19 2016 From: imilian.hartig at gmail.com (Maximilian Hartig) Date: Tue, 2 Aug 2016 15:22:19 +0200 Subject: [petsc-users] TS and petscFE Message-ID: Hello all, I would like to run a transient problem with PetscFE. Example ex11.c seems relevant since it uses the PestcFV context to create boundary conditions and RHS Functions for the TS. Is there an easy way to do transient analysis with TS and petscFE or do I have to code my own time-stepping routine? Thanks, Max From knepley at gmail.com Tue Aug 2 08:49:36 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 2 Aug 2016 08:49:36 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1470122706553.28463@marin.nl> References: <1469695134232.97712@marin.nl> <1470038402343.32500@marin.nl> <1470059952301.68773@marin.nl> <1470062165758.74421@marin.nl> <1470122706553.28463@marin.nl> Message-ID: On Tue, Aug 2, 2016 at 2:25 AM, Klaij, Christiaan wrote: > Thanks for your help! Going from individual blocks to a whole > matrix makes perfect sense if the blocks are readily available or > needed as fully functional matrices. Don't change that! Maybe add > the opposite? > > I'm surprised it's broken though: on this mailing list several > petsc developers have stated on several occasions (and not just > to me) things like "you should never have a matnest", "you should > have a mat then change the type at runtime", "snes ex70 is not > the intended use" and so on. > > I fully appreciate the benefit of having a format-independent > assembly and switching mat type from aij to nest depending on the > preconditioner. And given the manual and the statements on this > list, I thought this would be standard practice and therefore > thoroughly tested. But now I get the impression it has never > worked... > Yes, that way has never worked. Nest is only a memory optimization, and with implicit problems I am never running at the limit of memory (or I use more procs). The people I know who needed it had explicitly coded it in rather than trying to use it from options. It should not take long to get this fixed. Thanks, Matt > Chris > > > > From: Matthew Knepley > > Sent: Tuesday, August 02, 2016 12:28 AM > > To: Klaij, Christiaan > > Cc: petsc-users at mcs.anl.gov; Jed Brown > > Subject: Re: [petsc-users] block matrix without MatCreateNest > > > > On Mon, Aug 1, 2016 at 9:36 AM, Klaij, Christiaan > wrote: > > > > Matt, > > > > > > 1) great! > > > > > > 2) ??? that's precisely why I paste the output of "cat mattry.F90" > in the emails, so you have a small example that produces the errors I > mention. Now I'm also attaching it to this email. > > > > Okay, I have gone through it. You are correct that it is completely > broken. > > > > The way that MatNest currently works is that it trys to use L2G mappings > from individual blocks > > and then builds a composite L2G map for the whole matrix. 
This is > obviously incompatible with > > the primary use case, and should be changed to break up the full L2G > into one for each block. > > > > Jed, can you fix this? I am not sure I know enough about how Nest works. > > > > Matt > > > > Thanks, > > > > Chris > > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | C.Klaij at marin.nl | www.marin.nl > > [image: LinkedIn] [image: > YouTube] [image: Twitter] > [image: Facebook] > > MARIN news: Ship design in EU project Holiship > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagec9421a.PNG Type: image/png Size: 331 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagece5c8d.PNG Type: image/png Size: 293 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image257f89.PNG Type: image/png Size: 333 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: imagec2e203.PNG Type: image/png Size: 253 bytes Desc: not available URL: From mirzadeh at gmail.com Tue Aug 2 12:40:16 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Tue, 2 Aug 2016 13:40:16 -0400 Subject: [petsc-users] false-positive leak report in log_view? Message-ID: I often use the memory usage information in log_view as a way to check on memory leaks and so far it has worked perfect. However, I had long noticed a false-positive report in memory leak for Viewers, i.e. destruction count is always one less than creation. Today, I noticed what seems to be a second one. If you use VecView to write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also report a memory leak for vectors, vecscatters, dm, etc. I am calling this a false-positive since the code is valgrind-clean. Is this known/expected? Here's the relevant bit from log_view: --- Event Stage 0: Main Stage Vector 8 7 250992 0. Vector Scatter 2 0 0 0. Distributed Mesh 2 0 0 0. Star Forest Bipartite Graph 4 0 0 0. Discrete System 2 0 0 0. Index Set 4 4 83136 0. IS L to G Mapping 2 0 0 0. Viewer 2 1 784 0. ======================================================================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 3 09:44:39 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Aug 2016 09:44:39 -0500 Subject: [petsc-users] TS and petscFE In-Reply-To: References: Message-ID: On Tue, Aug 2, 2016 at 8:22 AM, Maximilian Hartig wrote: > Hello all, > > I would like to run a transient problem with PetscFE. Example ex11.c seems > relevant since it uses the PestcFV context to create boundary conditions > and RHS Functions for the TS. > Is there an easy way to do transient analysis with TS and petscFE or do I > have to code my own time-stepping routine? 
> You can use ierr = DMTSSetBoundaryLocal(adaptedDM, DMPlexTSComputeBoundary, user);CHKERRQ(ierr); ierr = DMTSSetIFunctionLocal(adaptedDM, DMPlexTSComputeIFunctionFEM, user);CHKERRQ(ierr); ierr = DMTSSetIJacobianLocal(adaptedDM, DMPlexTSComputeIJacobianFEM, user);CHKERRQ(ierr); I have been meaning to write a heat equation example, but I have not finished yet, Thanks, Matt > Thanks, > Max -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Aug 3 09:50:11 2016 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 3 Aug 2016 07:50:11 -0700 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: References: Message-ID: Are you saying it works for a while but fails when the problem is large, or that it never works with fieldsplit_1? And how many processors are you using? On Mon, Aug 1, 2016 at 9:13 PM, Safin, Artur wrote: > Hello, > > > I am running some code that employs gamg preconditioning within a > fieldsplit, and for sufficiently large/refined meshes, I am getting the > following error: > > > ------------------------------------------------------------------------------------------------------------------------------------------------ > > Residual norms for fieldsplit_0_ solve. > 0 KSP unpreconditioned resid norm 1.019675281087e-08 true resid norm > 1.019675281087e-08 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 3.855246547147e-09 true resid norm > 3.855246547147e-09 ||r(i)||/||b|| 3.780857120550e-01 > 2 KSP unpreconditioned resid norm 1.438241386184e-09 true resid norm > 1.438241386184e-09 ||r(i)||/||b|| 1.410489606701e-01 > 3 KSP unpreconditioned resid norm 3.624902894294e-10 true resid norm > 3.624902894294e-10 ||r(i)||/||b|| 3.554958094531e-02 > 4 KSP unpreconditioned resid norm 1.267419175485e-10 true resid norm > 1.267419175485e-10 ||r(i)||/||b|| 1.242963518870e-02 > 5 KSP unpreconditioned resid norm 2.929693449291e-11 true resid norm > 2.929693449291e-11 ||r(i)||/||b|| 2.873163156576e-03 > 6 KSP unpreconditioned resid norm 9.520263854387e-12 true resid norm > 9.520263854423e-12 ||r(i)||/||b|| 9.336564326903e-04 > 7 KSP unpreconditioned resid norm 1.679490979841e-12 true resid norm > 1.679490979825e-12 ||r(i)||/||b|| 1.647084136466e-04 > 8 KSP unpreconditioned resid norm 3.608932906029e-13 true resid norm > 3.608932905928e-13 ||r(i)||/||b|| 3.539296257217e-05 > 9 KSP unpreconditioned resid norm 9.297426160279e-14 true resid norm > 9.297426159708e-14 ||r(i)||/||b|| 9.118026426799e-06 > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Petsc has generated inconsistent data > [0]PETSC ERROR: !(matA_1 && !matA_1->compressedrow.use) > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for > trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./main_3D on a x86_64 named artur-ubuntu by artur Mon Aug 1 > 22:08:45 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 > --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ > --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #1 smoothAggs() line 354 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 998 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: #4 PCSetUp() line 968 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #5 KSPSetUp() line 390 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 KSPSolve() line 599 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #7 PCApply_FieldSplit() line 1016 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: #8 PCApply() line 482 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #9 KSP_PCApply() line 244 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/include/petsc/private/kspimpl.h > [0]PETSC ERROR: #10 KSPInitialResidual() line 69 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itres.c > [0]PETSC ERROR: #11 KSPSolve_GMRES() line 239 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: #12 KSPSolve() line 656 in > /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #13 solve() line 765 in > /home/artur/Desktop/Preconditioned/MI_3D/Cpp/MorseI_PML.cpp > > terminate called after throwing an instance of 'std::runtime_error' > what(): Error detected in C PETSc > [artur-ubuntu:07250] *** Process received signal *** > [artur-ubuntu:07250] Signal: Aborted (6) > [artur-ubuntu:07250] Signal code: (-6) > [artur-ubuntu:07250] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) > [0x7fe413815cb0] > [artur-ubuntu:07250] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) > [0x7fe413815c37] > [artur-ubuntu:07250] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) > [0x7fe413819028] > [artur-ubuntu:07250] [ 3] > /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) > [0x7fe413f0a535] > [artur-ubuntu:07250] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e6d6) > [0x7fe413f086d6] > [artur-ubuntu:07250] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e703) > [0x7fe413f08703] > [artur-ubuntu:07250] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e922) > [0x7fe413f08922] > [artur-ubuntu:07250] [ 7] > /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(+0x18d9ec) > [0x7fe414c419ec] > [artur-ubuntu:07250] [ 8] > /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(PetscError+0x45b) > [0x7fe414c41e94] > [artur-ubuntu:07250] [ 9] ./main_3D(_ZN10MorseI_PMLILi3EE5solveEv+0x1cf0) > [0x430d00] > [artur-ubuntu:07250] [10] ./main_3D(_ZN10MorseI_PMLILi3EE3runEv+0xd9) > [0x435bd9] > [artur-ubuntu:07250] [11] ./main_3D(main+0x6c) [0x41a19c] > [artur-ubuntu:07250] [12] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe413800f45] > [artur-ubuntu:07250] [13] ./main_3D() [0x41a223] > 
[artur-ubuntu:07250] *** End of error message *** > ------------------------------------------------------------------------------------------------------------------------------------------------ > > > > The problem specifically appears when I attempt to precondition fieldsplit_1 > with gamg (no problems with gamg in fieldsplit_0 though for some reason). I > am curious if someone can explain what this error actually means; this comes > from line 354 in > http://www.mcs.anl.gov/petsc/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c.html > > > Thanks, > > > Artur From knepley at gmail.com Wed Aug 3 09:59:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Aug 2016 09:59:01 -0500 Subject: [petsc-users] false-positive leak report in log_view? In-Reply-To: References: Message-ID: On Tue, Aug 2, 2016 at 12:40 PM, Mohammad Mirzadeh wrote: > I often use the memory usage information in log_view as a way to check on > memory leaks and so far it has worked perfect. However, I had long noticed > a false-positive report in memory leak for Viewers, i.e. destruction count > is always one less than creation. > Yes, I believe that is the Viewer being used to print this information. > Today, I noticed what seems to be a second one. If you use VecView to > write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also report > a memory leak for vectors, vecscatters, dm, etc. I am calling this a > false-positive since the code is valgrind-clean. > > Is this known/expected? > The VTK viewers have to hold everything they output until they are destroyed since the format does not allow immediate writing. I think the VTK viewer is not destroyed at the time of this output. Can you make a small example that does this? I have switched to HDF5 and XDMF due to the limitations of VTK format. Thanks, Matt > Here's the relevant bit from log_view: > > --- Event Stage 0: Main Stage > > Vector 8 7 250992 0. > Vector Scatter 2 0 0 0. > Distributed Mesh 2 0 0 0. > Star Forest Bipartite Graph 4 0 0 0. > Discrete System 2 0 0 0. > Index Set 4 4 83136 0. > IS L to G Mapping 2 0 0 0. > Viewer 2 1 784 0. > > ======================================================================================================================== > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 3 10:13:37 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Aug 2016 10:13:37 -0500 Subject: [petsc-users] Implementing discontinuous Galerkin FEM? In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 1:31 PM, Andrew Ho wrote: > I am trying to implement a discontinuous Galerkin discretization using the > PETSc DM features to handle most of the topology/geometry specific > functions. However, I'm not really sure which direction to approach this > from since DG is kind of a middle ground between finite volume and > traditional continuous Galerkin finite element methods. > > It appears to me that if I want to implement a nodal DG method, then it > would be more practical to extend the PetscFE interface, but for a modal DG > method perhaps the PetscFV interface is better? > Yes, we have had roughly the same idea. > There are still a few questions that I don't know the answers to, though. > > Questions about implementing nodal DG: > > 1. 
Does PetscFE support sub/super parametric element types? If so, how do > I express the internal node structure for a nodal DG method (say, for > example located at the abscissa of a Gauss-Lobatto quadrature scheme)? > Yes, in the sense that ComputeCellGeometryFEM() is something separate. No, in the sense that I have not coded anything but isoparametric. > 2. How would I go about making the dataset stored discontinuous between > neighboring elements (specifically at shared nodes for a nodal DG method)? > Assign them to a cell in the Section. > 3. Similar to 2, how would I handle boundary conditions? Specifically, I > need a layer of data space of just the boundary nodes (not a complete > "ghost" element), and these are the actual constrained points. > I do not understand this question. Dirichlet conditions on the function space are handled the same as always. > Questions about implementing modal DG: > > A. What does specifying the quadrature object for a PetscFV object > actually do? Is it purely a surface flux integration quadrature? How does > the quadrature object handle simplex-type elements in 2D/3D? > Right now, nothing. We would probably specify a face quadrature by attaching it to a subobject for that piece of the reference cell. We handle that kind of quadrature by rotating to the place and then calling the 2D version. > B. How would I go about modifying the limiters to take into account these > multiple modes? > I don't know. I would have to understand exactly what algorithm you wanted to use. Thanks, Matt > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 3 12:08:39 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 3 Aug 2016 12:08:39 -0500 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: References: Message-ID: <7F93A49A-B5BC-4897-A7F4-A513463F6F82@mcs.anl.gov> This is ixed in the master branch of PETSc. Barry > On Aug 1, 2016, at 11:13 PM, Safin, Artur wrote: > > Hello, > > I am running some code that employs gamg preconditioning within a fieldsplit, and for sufficiently large/refined meshes, I am getting the following error: > > ------------------------------------------------------------------------------------------------------------------------------------------------ > Residual norms for fieldsplit_0_ solve. 
> 0 KSP unpreconditioned resid norm 1.019675281087e-08 true resid norm 1.019675281087e-08 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 3.855246547147e-09 true resid norm 3.855246547147e-09 ||r(i)||/||b|| 3.780857120550e-01 > 2 KSP unpreconditioned resid norm 1.438241386184e-09 true resid norm 1.438241386184e-09 ||r(i)||/||b|| 1.410489606701e-01 > 3 KSP unpreconditioned resid norm 3.624902894294e-10 true resid norm 3.624902894294e-10 ||r(i)||/||b|| 3.554958094531e-02 > 4 KSP unpreconditioned resid norm 1.267419175485e-10 true resid norm 1.267419175485e-10 ||r(i)||/||b|| 1.242963518870e-02 > 5 KSP unpreconditioned resid norm 2.929693449291e-11 true resid norm 2.929693449291e-11 ||r(i)||/||b|| 2.873163156576e-03 > 6 KSP unpreconditioned resid norm 9.520263854387e-12 true resid norm 9.520263854423e-12 ||r(i)||/||b|| 9.336564326903e-04 > 7 KSP unpreconditioned resid norm 1.679490979841e-12 true resid norm 1.679490979825e-12 ||r(i)||/||b|| 1.647084136466e-04 > 8 KSP unpreconditioned resid norm 3.608932906029e-13 true resid norm 3.608932905928e-13 ||r(i)||/||b|| 3.539296257217e-05 > 9 KSP unpreconditioned resid norm 9.297426160279e-14 true resid norm 9.297426159708e-14 ||r(i)||/||b|| 9.118026426799e-06 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Petsc has generated inconsistent data > [0]PETSC ERROR: !(matA_1 && !matA_1->compressedrow.use) > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./main_3D on a x86_64 named artur-ubuntu by artur Mon Aug 1 22:08:45 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #1 smoothAggs() line 354 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 998 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #7 PCApply_FieldSplit() line 1016 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: #8 PCApply() line 482 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #9 KSP_PCApply() line 244 in /home/artur/Rorsrach/Packages/petsc-3.7.3/include/petsc/private/kspimpl.h > [0]PETSC ERROR: #10 KSPInitialResidual() line 69 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itres.c > [0]PETSC ERROR: #11 KSPSolve_GMRES() line 239 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: #12 KSPSolve() line 656 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #13 solve() line 765 in /home/artur/Desktop/Preconditioned/MI_3D/Cpp/MorseI_PML.cpp > terminate called after throwing an instance of 'std::runtime_error' > 
what(): Error detected in C PETSc > [artur-ubuntu:07250] *** Process received signal *** > [artur-ubuntu:07250] Signal: Aborted (6) > [artur-ubuntu:07250] Signal code: (-6) > [artur-ubuntu:07250] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7fe413815cb0] > [artur-ubuntu:07250] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7fe413815c37] > [artur-ubuntu:07250] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7fe413819028] > [artur-ubuntu:07250] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) [0x7fe413f0a535] > [artur-ubuntu:07250] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e6d6) [0x7fe413f086d6] > [artur-ubuntu:07250] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e703) [0x7fe413f08703] > [artur-ubuntu:07250] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e922) [0x7fe413f08922] > [artur-ubuntu:07250] [ 7] /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(+0x18d9ec) [0x7fe414c419ec] > [artur-ubuntu:07250] [ 8] /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(PetscError+0x45b) [0x7fe414c41e94] > [artur-ubuntu:07250] [ 9] ./main_3D(_ZN10MorseI_PMLILi3EE5solveEv+0x1cf0) [0x430d00] > [artur-ubuntu:07250] [10] ./main_3D(_ZN10MorseI_PMLILi3EE3runEv+0xd9) [0x435bd9] > [artur-ubuntu:07250] [11] ./main_3D(main+0x6c) [0x41a19c] > [artur-ubuntu:07250] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe413800f45] > [artur-ubuntu:07250] [13] ./main_3D() [0x41a223] > [artur-ubuntu:07250] *** End of error message *** > ------------------------------------------------------------------------------------------------------------------------------------------------ > > > The problem specifically appears when I attempt to precondition fieldsplit_1 with gamg (no problems with gamg in fieldsplit_0 though for some reason). I am curious if someone can explain what this error actually means; this comes from line 354 in http://www.mcs.anl.gov/petsc/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c.html > > Thanks, > > Artur From bsmith at mcs.anl.gov Wed Aug 3 12:12:20 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 3 Aug 2016 12:12:20 -0500 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: References: Message-ID: <60E3CAD0-F220-437C-B270-425BD063C9AA@mcs.anl.gov> This should be fixed in the master branch > On Aug 1, 2016, at 11:13 PM, Safin, Artur wrote: > > Hello, > > I am running some code that employs gamg preconditioning within a fieldsplit, and for sufficiently large/refined meshes, I am getting the following error: > > ------------------------------------------------------------------------------------------------------------------------------------------------ > Residual norms for fieldsplit_0_ solve. 
> 0 KSP unpreconditioned resid norm 1.019675281087e-08 true resid norm 1.019675281087e-08 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 3.855246547147e-09 true resid norm 3.855246547147e-09 ||r(i)||/||b|| 3.780857120550e-01 > 2 KSP unpreconditioned resid norm 1.438241386184e-09 true resid norm 1.438241386184e-09 ||r(i)||/||b|| 1.410489606701e-01 > 3 KSP unpreconditioned resid norm 3.624902894294e-10 true resid norm 3.624902894294e-10 ||r(i)||/||b|| 3.554958094531e-02 > 4 KSP unpreconditioned resid norm 1.267419175485e-10 true resid norm 1.267419175485e-10 ||r(i)||/||b|| 1.242963518870e-02 > 5 KSP unpreconditioned resid norm 2.929693449291e-11 true resid norm 2.929693449291e-11 ||r(i)||/||b|| 2.873163156576e-03 > 6 KSP unpreconditioned resid norm 9.520263854387e-12 true resid norm 9.520263854423e-12 ||r(i)||/||b|| 9.336564326903e-04 > 7 KSP unpreconditioned resid norm 1.679490979841e-12 true resid norm 1.679490979825e-12 ||r(i)||/||b|| 1.647084136466e-04 > 8 KSP unpreconditioned resid norm 3.608932906029e-13 true resid norm 3.608932905928e-13 ||r(i)||/||b|| 3.539296257217e-05 > 9 KSP unpreconditioned resid norm 9.297426160279e-14 true resid norm 9.297426159708e-14 ||r(i)||/||b|| 9.118026426799e-06 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Petsc has generated inconsistent data > [0]PETSC ERROR: !(matA_1 && !matA_1->compressedrow.use) > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./main_3D on a x86_64 named artur-ubuntu by artur Mon Aug 1 22:08:45 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #1 smoothAggs() line 354 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 998 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #3 PCSetUp_GAMG() line 571 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #5 KSPSetUp() line 390 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 KSPSolve() line 599 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #7 PCApply_FieldSplit() line 1016 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: #8 PCApply() line 482 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #9 KSP_PCApply() line 244 in /home/artur/Rorsrach/Packages/petsc-3.7.3/include/petsc/private/kspimpl.h > [0]PETSC ERROR: #10 KSPInitialResidual() line 69 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itres.c > [0]PETSC ERROR: #11 KSPSolve_GMRES() line 239 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: #12 KSPSolve() line 656 in /home/artur/Rorsrach/Packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #13 solve() line 765 in /home/artur/Desktop/Preconditioned/MI_3D/Cpp/MorseI_PML.cpp > terminate called after throwing an instance of 'std::runtime_error' > 
what(): Error detected in C PETSc > [artur-ubuntu:07250] *** Process received signal *** > [artur-ubuntu:07250] Signal: Aborted (6) > [artur-ubuntu:07250] Signal code: (-6) > [artur-ubuntu:07250] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7fe413815cb0] > [artur-ubuntu:07250] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37) [0x7fe413815c37] > [artur-ubuntu:07250] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x148) [0x7fe413819028] > [artur-ubuntu:07250] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x155) [0x7fe413f0a535] > [artur-ubuntu:07250] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e6d6) [0x7fe413f086d6] > [artur-ubuntu:07250] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e703) [0x7fe413f08703] > [artur-ubuntu:07250] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x5e922) [0x7fe413f08922] > [artur-ubuntu:07250] [ 7] /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(+0x18d9ec) [0x7fe414c419ec] > [artur-ubuntu:07250] [ 8] /home/artur/Rorsrach/Packages/petsc-3.7.3/x86_64/lib/libpetsc.so.3.7(PetscError+0x45b) [0x7fe414c41e94] > [artur-ubuntu:07250] [ 9] ./main_3D(_ZN10MorseI_PMLILi3EE5solveEv+0x1cf0) [0x430d00] > [artur-ubuntu:07250] [10] ./main_3D(_ZN10MorseI_PMLILi3EE3runEv+0xd9) [0x435bd9] > [artur-ubuntu:07250] [11] ./main_3D(main+0x6c) [0x41a19c] > [artur-ubuntu:07250] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fe413800f45] > [artur-ubuntu:07250] [13] ./main_3D() [0x41a223] > [artur-ubuntu:07250] *** End of error message *** > ------------------------------------------------------------------------------------------------------------------------------------------------ > > > The problem specifically appears when I attempt to precondition fieldsplit_1 with gamg (no problems with gamg in fieldsplit_0 though for some reason). I am curious if someone can explain what this error actually means; this comes from line 354 in http://www.mcs.anl.gov/petsc/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c.html > > Thanks, > > Artur From mirzadeh at gmail.com Wed Aug 3 13:05:47 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Wed, 3 Aug 2016 14:05:47 -0400 Subject: [petsc-users] false-positive leak report in log_view? In-Reply-To: References: Message-ID: On Wed, Aug 3, 2016 at 10:59 AM, Matthew Knepley wrote: > On Tue, Aug 2, 2016 at 12:40 PM, Mohammad Mirzadeh > wrote: > >> I often use the memory usage information in log_view as a way to check on >> memory leaks and so far it has worked perfect. However, I had long noticed >> a false-positive report in memory leak for Viewers, i.e. destruction count >> is always one less than creation. >> > > Yes, I believe that is the Viewer being used to print this information. > That makes sense. > > >> Today, I noticed what seems to be a second one. If you use VecView to >> write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also report >> a memory leak for vectors, vecscatters, dm, etc. I am calling this a >> false-positive since the code is valgrind-clean. >> >> Is this known/expected? >> > > The VTK viewers have to hold everything they output until they are > destroyed since the format does not allow immediate writing. > I think the VTK viewer is not destroyed at the time of this output. Can > you make a small example that does this? 
> Here's a small example that illustrates the issues #include int main(int argc, char *argv[]) { PetscInitialize(&argc, &argv, NULL, NULL); DM dm; DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DMDA_STENCIL_BOX, -10, -10, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, &dm); // DMDASetUniformCoordinates(dm, -1, 1, -1, 1, 0, 0); Vec sol; DMCreateGlobalVector(dm, &sol); VecSet(sol, 0); PetscViewer vtk; PetscViewerVTKOpen(PETSC_COMM_WORLD, "test.vts", FILE_MODE_WRITE, &vtk); VecView(sol, vtk); // VecView(sol, vtk); PetscViewerDestroy(&vtk); DMDestroy(&dm); VecDestroy(&sol); PetscFinalize(); return 0; } If you uncomment the second VecView you get reports for leaks in VecScatter and dm. If you also uncomment the DMDASetUniformCoordinates, and use both VecViews, you also get a leak report for Vecs ... its quite bizarre ... > I have switched to HDF5 and XDMF due to the limitations of VTK format. > > I had used XDMF + raw binary in the past and was satisfied with the result. Do you write a single XDMF as a "post-processing" step when the simulation is finished? If I remember correctly preview could not open xmf files as time-series. > Thanks, > > Matt > > >> Here's the relevant bit from log_view: >> >> --- Event Stage 0: Main Stage >> >> Vector 8 7 250992 0. >> Vector Scatter 2 0 0 0. >> Distributed Mesh 2 0 0 0. >> Star Forest Bipartite Graph 4 0 0 0. >> Discrete System 2 0 0 0. >> Index Set 4 4 83136 0. >> IS L to G Mapping 2 0 0 0. >> Viewer 2 1 784 0. >> >> ======================================================================================================================== >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Wed Aug 3 13:18:49 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Wed, 3 Aug 2016 14:18:49 -0400 Subject: [petsc-users] false-positive leak report in log_view? In-Reply-To: References: Message-ID: OK so I just ran the example under valgrind, and if I use two VecViews, it complains about following leak: ==66838== 24,802 (544 direct, 24,258 indirect) bytes in 1 blocks are definitely lost in loss record 924 of 926 ==66838== at 0x100009EBB: malloc (in /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so) ==66838== by 0x10005E638: PetscMallocAlign (in /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) ==66838== by 0x100405F00: DMCreate_DA (in /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) ==66838== by 0x1003CFFA4: DMSetType (in /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) ==66838== by 0x100405B7F: DMDACreate (in /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) ==66838== by 0x1003F825F: DMDACreate2d (in /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) ==66838== by 0x100001D89: main (main_test.cpp:7) By I am destroying the dm ... also I dont get this when using a single VecView. As a bonus info, PETSC_VIEWER_STDOUT_WORLD is just fine, so this looks like it is definitely vtk related. On Wed, Aug 3, 2016 at 2:05 PM, Mohammad Mirzadeh wrote: > On Wed, Aug 3, 2016 at 10:59 AM, Matthew Knepley > wrote: > >> On Tue, Aug 2, 2016 at 12:40 PM, Mohammad Mirzadeh >> wrote: >> >>> I often use the memory usage information in log_view as a way to check >>> on memory leaks and so far it has worked perfect. 
However, I had long >>> noticed a false-positive report in memory leak for Viewers, i.e. >>> destruction count is always one less than creation. >>> >> >> Yes, I believe that is the Viewer being used to print this information. >> > > That makes sense. > >> >> >>> Today, I noticed what seems to be a second one. If you use VecView to >>> write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also report >>> a memory leak for vectors, vecscatters, dm, etc. I am calling this a >>> false-positive since the code is valgrind-clean. >>> >>> Is this known/expected? >>> >> >> The VTK viewers have to hold everything they output until they are >> destroyed since the format does not allow immediate writing. >> I think the VTK viewer is not destroyed at the time of this output. Can >> you make a small example that does this? >> > > Here's a small example that illustrates the issues > > #include > > > int main(int argc, char *argv[]) { > > PetscInitialize(&argc, &argv, NULL, NULL); > > > DM dm; > > DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DMDA_STENCIL_BOX, > > -10, -10, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, > > &dm); > > // DMDASetUniformCoordinates(dm, -1, 1, -1, 1, 0, 0); > > > Vec sol; > > DMCreateGlobalVector(dm, &sol); > > VecSet(sol, 0); > > > PetscViewer vtk; > > PetscViewerVTKOpen(PETSC_COMM_WORLD, "test.vts", FILE_MODE_WRITE, &vtk); > > VecView(sol, vtk); > > // VecView(sol, vtk); > > PetscViewerDestroy(&vtk); > > > DMDestroy(&dm); > > VecDestroy(&sol); > > > PetscFinalize(); > > return 0; > > } > > > If you uncomment the second VecView you get reports for leaks in > VecScatter and dm. If you also uncomment the DMDASetUniformCoordinates, and > use both VecViews, you also get a leak report for Vecs ... its quite > bizarre ... > > >> I have switched to HDF5 and XDMF due to the limitations of VTK format. >> >> > I had used XDMF + raw binary in the past and was satisfied with the > result. Do you write a single XDMF as a "post-processing" step when the > simulation is finished? If I remember correctly preview could not open xmf > files as time-series. > >> Thanks, >> >> Matt >> >> >>> Here's the relevant bit from log_view: >>> >>> --- Event Stage 0: Main Stage >>> >>> Vector 8 7 250992 0. >>> Vector Scatter 2 0 0 0. >>> Distributed Mesh 2 0 0 0. >>> Star Forest Bipartite Graph 4 0 0 0. >>> Discrete System 2 0 0 0. >>> Index Set 4 4 83136 0. >>> IS L to G Mapping 2 0 0 0. >>> Viewer 2 1 784 0. >>> >>> ======================================================================================================================== >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 3 15:34:47 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Aug 2016 15:34:47 -0500 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: On Sat, Jul 30, 2016 at 1:35 PM, Andrew Ho wrote: > Is there a reason the physical groups aren't sufficient for handling this? > As far as I can tell, this is the only way in GMsh to have any kind of > grouping of elements. 
> > The Gmsh file format can be found here (happens to be the ASCII version, > but binary version is below that): > http://gmsh.info/doc/texinfo/gmsh.html#MSH-ASCII-file-format > > All tags are attributed to elements; there may be multiple element types > (points, lines, triangles, etc.), but at the end of the day each element > just has a list of indices indicating which physical group(s) each element > belongs to. > It looks like Michael Lange already fixed this. In 'master', I run cd src/dm/impls/plex/examples/tests make ex1 ./ex1 -interpolate 1 -dm_view -filename periodic_square.msh and I get DM Object: Simplicial Mesh 1 MPI processes type: plex Simplicial Mesh in 2 dimensions: 0-cells: 8 1-cells: 15 2-cells: 8 Labels: Cell Sets: 2 strata of sizes (4, 4) depth: 3 strata of sizes (8, 15, 8) The "Cell Sets" label has the two sets of cells specified in the physical region section. It will not be periodic since periodic meshes in Plex are topologically periodic, meaning that there is no separate aliasing array. I could possibly read in the GMsh thing and do surgery on the mesh. Does this work for you? Matt > From the documentation for ASCII formatted mesh files: > > number-of-tags > > gives the number of integer tags that follow for the n-th element. By >> default, the first tag is the number of the physical entity to which the >> element belongs; the second is the number of the elementary geometrical >> entity to which the element belongs; the third is the number of mesh >> partitions to which the element belongs, followed by the partition ids >> (negative partition ids indicate ghost cells). A zero tag is equivalent to >> no tag. Gmsh and most codes using the MSH 2 format require at least the >> first two tags (physical and elementary tags). > > > My understanding is to support markers you only need to add a 4th stratum > level which has one node per physical group. It would be helpful (though > not necessary) if this subdomain marker stratum level had the physical tag > name labels properly associated with the corresponding nodes on the graph, > but this is not necessary since it's just as easy to refer to them by node > number as long as the node numbering matches or is a simple transform of > the numbering scheme in the original physical group id's. > > > On Sat, Jul 30, 2016 at 11:11 AM, Matthew Knepley > wrote: > >> On Sat, Jul 30, 2016 at 1:06 PM, Andrew Ho wrote: >> >>> 1) I don't use Physical Groups from GMsh since its unclear how this >>>> would be reflected in the discretization >>> >>> >>> If I'm not using physical groups in GMsh, how do I easily denote what >>> part of the domain should be handled with which physics? I would like to be >>> able to use the same code with similar but not identical meshes (for >>> example to do a convergence study), so manually iterating through a list of >>> vertices at the element height stratum in a chart doesn't provide any hints >>> on which subdomain an element is suppose to belong in. >>> >> >> I think the right way to handle all this is to just mark pieces of the >> mesh. Mesh formats should just have a generic marking >> ability which does not differentiate between vertices, edges, faces, and >> cells. Some formats come close (ExodusII) and some >> are just crazy (GMsh). If you can point me toward the documentation for >> the GMsh format, I will put in code to translate whatever >> part marks cells to a cell label, as we do for ExodusII. 
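A short sketch of how the "Cell Sets" label shown above could be queried to pick out the cells of one physical group, assuming the master-branch Gmsh reader, the DMPlexCreateFromFile()/DMGetStratumIS() calls, and a physical tag value of 1 (the value is whatever id the .geo file assigned, so treat it as a placeholder):

  DM dm;
  IS groupCells;

  ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD,"periodic_square.msh",PETSC_TRUE,&dm);CHKERRQ(ierr);
  ierr = DMGetStratumIS(dm,"Cell Sets",1,&groupCells);CHKERRQ(ierr); /* cells carrying physical group tag 1 */
  /* attach the physics for this subdomain to these cells, e.g. when laying out a PetscSection */
  ierr = ISDestroy(&groupCells);CHKERRQ(ierr);
  ierr = DMDestroy(&dm);CHKERRQ(ierr);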
>> >> Thanks, >> >> Matt >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From epscodes at gmail.com Wed Aug 3 16:39:56 2016 From: epscodes at gmail.com (Xiangdong) Date: Wed, 3 Aug 2016 17:39:56 -0400 Subject: [petsc-users] turn off the snes linesearch Message-ID: Hello everyone, If I want to turn off the line search and always use the full newton step, does the option "-snes_linesearch_type basic" guarantee this? In my case, I found that even with that option, it still try to do function evaluation at a point different from the exact newton step. Any suggestions on this? Thanks. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 3 16:54:35 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Aug 2016 16:54:35 -0500 Subject: [petsc-users] turn off the snes linesearch In-Reply-To: References: Message-ID: On Wed, Aug 3, 2016 at 4:39 PM, Xiangdong wrote: > Hello everyone, > > If I want to turn off the line search and always use the full newton step, > does the option "-snes_linesearch_type basic" guarantee this? In my case, I > found that even with that option, it still try to do function evaluation at > a point different from the exact newton step. Any suggestions on this? > The code is very short: https://bitbucket.org/petsc/petsc/src/bedc4528a1b7d05f6a15a361bf14aabd7c129542/src/snes/linesearch/impls/basic/linesearchbasic.c?at=master&fileviewer=file-view-default I see nothing like that in there. Matt > Thanks. > > Best, > Xiangdong > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Wed Aug 3 16:57:03 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Wed, 3 Aug 2016 14:57:03 -0700 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: > > It will not be periodic since periodic meshes in Plex are topologically > periodic, meaning that > there is no separate aliasing array. I could possibly read in the GMsh > thing and do surgery > on the mesh. > I'm not sure what you mean by this, but being able to handle periodic domains is important for me. However periodicity is handled there needs to be a way to get access to the correct coordinates for the edge vertices of elements on both sides of the periodic edge so cell volumes/Jacobians can be computed correctly. I don't know why an aliasing array is necessary, since I thought for a topologically periodic mesh *DMPlexConstructGhostCells* should handle this correctly? -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 3 17:03:17 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 3 Aug 2016 17:03:17 -0500 Subject: [petsc-users] Multi-physics meshes with PETSc DM? 
In-Reply-To: References: Message-ID: On Wed, Aug 3, 2016 at 4:57 PM, Andrew Ho wrote: > It will not be periodic since periodic meshes in Plex are topologically >> periodic, meaning that >> there is no separate aliasing array. I could possibly read in the GMsh >> thing and do surgery >> on the mesh. >> > > I'm not sure what you mean by this, but being able to handle periodic > domains is important for me. However periodicity is handled there needs to > be a way to get access to the correct coordinates for the edge vertices of > elements on both sides of the periodic edge so cell volumes/Jacobians can > be computed correctly. I don't know why an aliasing array is necessary, > since I thought for a topologically periodic mesh > *DMPlexConstructGhostCells* should handle this correctly? > I was saying the _GMsh_ uses an alias array. Plex does not need that. For example, see http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMPlexCreateHexBoxMesh.html which creates periodic meshes. So if two vertices are identified, then in a Plex they are actually the same vertex. This is handled correctly in the DMPlexComputeCellGeometry*() functions. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 3 18:53:51 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 3 Aug 2016 18:53:51 -0500 Subject: [petsc-users] Reference for multi-grid tuning Message-ID: Hi all, I want to play around with algebraic multi-grid parameters to optimize performance for my 3D problems but I don't have a good idea of where to start. Don't really see a detailed reference or tutorial for tuning these parameters in the PETSc manual. Can someone point me to a good paper/reference/book/online resource/etc that can give good guidelines on how to tune parameters for implementations similar to what's currently in GAMG/ML for instance. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Aug 4 02:31:20 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 4 Aug 2016 00:31:20 -0700 Subject: [petsc-users] Reference for multi-grid tuning In-Reply-To: References: Message-ID: Guidelines for tuning AMG are really hard to do. There is some discussion on this in the manual and I am writing a GAMG paper now (slowly, with others) that has another shot at this sort of thing. I can share that with you privately. Because this sort of one-way text documentation has been inadequate, I've tried to have examples use good sets of parameters, so that you, in theory, find a PETSc example that is like your problem and start from there. Is your 3D problem explicitly elliptic like a Laplacian or elasticity? On Wed, Aug 3, 2016 at 4:53 PM, Justin Chang wrote: > Hi all, > > I want to play around with algebraic multi-grid parameters to optimize > performance for my 3D problems but I don't have a good idea of where to > start. Don't really see a detailed reference or tutorial for tuning these > parameters in the PETSc manual. > > Can someone point me to a good paper/reference/book/online resource/etc that > can give good guidelines on how to tune parameters for implementations > similar to what's currently in GAMG/ML for instance. 
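Purely as an illustrative starting point for a symmetric, elliptic 3D problem (every value below is an assumption to be tuned per problem, not a recommendation from this thread), a petscrc-style option set for GAMG might look like:

  -ksp_type cg
  -pc_type gamg
  -pc_gamg_type agg
  -pc_gamg_agg_nsmooths 1
  -pc_gamg_threshold 0.01
  -pc_gamg_square_graph 1
  -mg_levels_ksp_type chebyshev
  -mg_levels_pc_type jacobi
  -mg_levels_ksp_max_it 2
  -ksp_view
  -ksp_monitor

Running with -ksp_view shows what the defaults resolved to, so parameters can then be changed one at a time.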
> > Thanks, > Justin From patrick.sanan at gmail.com Thu Aug 4 03:10:52 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Thu, 4 Aug 2016 10:10:52 +0200 Subject: [petsc-users] false-positive leak report in log_view? In-Reply-To: References: Message-ID: I have a patch that I got from Dave that he got from Jed which seems to be related to this. I'll make a PR. On Wed, Aug 3, 2016 at 8:18 PM, Mohammad Mirzadeh wrote: > OK so I just ran the example under valgrind, and if I use two VecViews, it > complains about following leak: > > ==66838== 24,802 (544 direct, 24,258 indirect) bytes in 1 blocks are > definitely lost in loss record 924 of 926 > ==66838== at 0x100009EBB: malloc (in > /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so) > ==66838== by 0x10005E638: PetscMallocAlign (in > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > ==66838== by 0x100405F00: DMCreate_DA (in > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > ==66838== by 0x1003CFFA4: DMSetType (in > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > ==66838== by 0x100405B7F: DMDACreate (in > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > ==66838== by 0x1003F825F: DMDACreate2d (in > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > ==66838== by 0x100001D89: main (main_test.cpp:7) > > By I am destroying the dm ... also I dont get this when using a single > VecView. As a bonus info, PETSC_VIEWER_STDOUT_WORLD is just fine, so this > looks like it is definitely vtk related. > > On Wed, Aug 3, 2016 at 2:05 PM, Mohammad Mirzadeh > wrote: >> >> On Wed, Aug 3, 2016 at 10:59 AM, Matthew Knepley >> wrote: >>> >>> On Tue, Aug 2, 2016 at 12:40 PM, Mohammad Mirzadeh >>> wrote: >>>> >>>> I often use the memory usage information in log_view as a way to check >>>> on memory leaks and so far it has worked perfect. However, I had long >>>> noticed a false-positive report in memory leak for Viewers, i.e. destruction >>>> count is always one less than creation. >>> >>> >>> Yes, I believe that is the Viewer being used to print this information. >> >> >> That makes sense. >>> >>> >>>> >>>> Today, I noticed what seems to be a second one. If you use VecView to >>>> write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also report a >>>> memory leak for vectors, vecscatters, dm, etc. I am calling this a >>>> false-positive since the code is valgrind-clean. >>>> >>>> Is this known/expected? >>> >>> >>> The VTK viewers have to hold everything they output until they are >>> destroyed since the format does not allow immediate writing. >>> I think the VTK viewer is not destroyed at the time of this output. Can >>> you make a small example that does this? 
>> >> >> Here's a small example that illustrates the issues >> >> #include >> >> >> int main(int argc, char *argv[]) { >> >> PetscInitialize(&argc, &argv, NULL, NULL); >> >> >> DM dm; >> >> DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, >> DMDA_STENCIL_BOX, >> >> -10, -10, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, >> >> &dm); >> >> // DMDASetUniformCoordinates(dm, -1, 1, -1, 1, 0, 0); >> >> >> Vec sol; >> >> DMCreateGlobalVector(dm, &sol); >> >> VecSet(sol, 0); >> >> >> PetscViewer vtk; >> >> PetscViewerVTKOpen(PETSC_COMM_WORLD, "test.vts", FILE_MODE_WRITE, &vtk); >> >> VecView(sol, vtk); >> >> // VecView(sol, vtk); >> >> PetscViewerDestroy(&vtk); >> >> >> DMDestroy(&dm); >> >> VecDestroy(&sol); >> >> >> PetscFinalize(); >> >> return 0; >> >> } >> >> >> If you uncomment the second VecView you get reports for leaks in >> VecScatter and dm. If you also uncomment the DMDASetUniformCoordinates, and >> use both VecViews, you also get a leak report for Vecs ... its quite bizarre >> ... >> >>> >>> I have switched to HDF5 and XDMF due to the limitations of VTK format. >>> >> >> I had used XDMF + raw binary in the past and was satisfied with the >> result. Do you write a single XDMF as a "post-processing" step when the >> simulation is finished? If I remember correctly preview could not open xmf >> files as time-series. >>> >>> Thanks, >>> >>> Matt >>> >>>> >>>> Here's the relevant bit from log_view: >>>> >>>> --- Event Stage 0: Main Stage >>>> >>>> Vector 8 7 250992 0. >>>> Vector Scatter 2 0 0 0. >>>> Distributed Mesh 2 0 0 0. >>>> Star Forest Bipartite Graph 4 0 0 0. >>>> Discrete System 2 0 0 0. >>>> Index Set 4 4 83136 0. >>>> IS L to G Mapping 2 0 0 0. >>>> Viewer 2 1 784 0. >>>> >>>> ======================================================================================================================== >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >> >> > From dave.mayhem23 at gmail.com Thu Aug 4 03:18:00 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 4 Aug 2016 10:18:00 +0200 Subject: [petsc-users] false-positive leak report in log_view? In-Reply-To: References: Message-ID: On 4 August 2016 at 10:10, Patrick Sanan wrote: > I have a patch that I got from Dave that he got from Jed which seems > to be related to this. I'll make a PR. > Jed wrote this variant of the VTK viewer so please mark him as a reviewer for my bug fix. 
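Until a fix is merged, a minimal per-step workaround sketch (the Vec sol and the integer step counter are assumptions taken from the surrounding code): open a fresh viewer for every output file so the viewer never has to hold more than one write before it is destroyed.

  char        fname[PETSC_MAX_PATH_LEN];
  PetscViewer vtk;

  ierr = PetscSNPrintf(fname,sizeof(fname),"sol_%03d.vts",(int)step);CHKERRQ(ierr);
  ierr = PetscViewerVTKOpen(PETSC_COMM_WORLD,fname,FILE_MODE_WRITE,&vtk);CHKERRQ(ierr);
  ierr = VecView(sol,vtk);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&vtk);CHKERRQ(ierr);  /* the .vts file is flushed here, nothing accumulates */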
> > > On Wed, Aug 3, 2016 at 8:18 PM, Mohammad Mirzadeh > wrote: > > OK so I just ran the example under valgrind, and if I use two VecViews, > it > > complains about following leak: > > > > ==66838== 24,802 (544 direct, 24,258 indirect) bytes in 1 blocks are > > definitely lost in loss record 924 of 926 > > ==66838== at 0x100009EBB: malloc (in > > > /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so) > > ==66838== by 0x10005E638: PetscMallocAlign (in > > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > > ==66838== by 0x100405F00: DMCreate_DA (in > > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > > ==66838== by 0x1003CFFA4: DMSetType (in > > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > > ==66838== by 0x100405B7F: DMDACreate (in > > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > > ==66838== by 0x1003F825F: DMDACreate2d (in > > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) > > ==66838== by 0x100001D89: main (main_test.cpp:7) > > > > By I am destroying the dm ... also I dont get this when using a single > > VecView. As a bonus info, PETSC_VIEWER_STDOUT_WORLD is just fine, so this > > looks like it is definitely vtk related. > Mohammad, I can confirm that this VTK functionality bleeds memory if you write more than 1 vector to disk. Cheers Dave > > > > On Wed, Aug 3, 2016 at 2:05 PM, Mohammad Mirzadeh > > wrote: > >> > >> On Wed, Aug 3, 2016 at 10:59 AM, Matthew Knepley > >> wrote: > >>> > >>> On Tue, Aug 2, 2016 at 12:40 PM, Mohammad Mirzadeh > > >>> wrote: > >>>> > >>>> I often use the memory usage information in log_view as a way to check > >>>> on memory leaks and so far it has worked perfect. However, I had long > >>>> noticed a false-positive report in memory leak for Viewers, i.e. > destruction > >>>> count is always one less than creation. > >>> > >>> > >>> Yes, I believe that is the Viewer being used to print this information. > >> > >> > >> That makes sense. > >>> > >>> > >>>> > >>>> Today, I noticed what seems to be a second one. If you use VecView to > >>>> write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also > report a > >>>> memory leak for vectors, vecscatters, dm, etc. I am calling this a > >>>> false-positive since the code is valgrind-clean. > >>>> > >>>> Is this known/expected? > >>> > >>> > >>> The VTK viewers have to hold everything they output until they are > >>> destroyed since the format does not allow immediate writing. > >>> I think the VTK viewer is not destroyed at the time of this output. Can > >>> you make a small example that does this? 
> >> > >> > >> Here's a small example that illustrates the issues > >> > >> #include > >> > >> > >> int main(int argc, char *argv[]) { > >> > >> PetscInitialize(&argc, &argv, NULL, NULL); > >> > >> > >> DM dm; > >> > >> DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, > >> DMDA_STENCIL_BOX, > >> > >> -10, -10, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, > >> > >> &dm); > >> > >> // DMDASetUniformCoordinates(dm, -1, 1, -1, 1, 0, 0); > >> > >> > >> Vec sol; > >> > >> DMCreateGlobalVector(dm, &sol); > >> > >> VecSet(sol, 0); > >> > >> > >> PetscViewer vtk; > >> > >> PetscViewerVTKOpen(PETSC_COMM_WORLD, "test.vts", FILE_MODE_WRITE, > &vtk); > >> > >> VecView(sol, vtk); > >> > >> // VecView(sol, vtk); > >> > >> PetscViewerDestroy(&vtk); > >> > >> > >> DMDestroy(&dm); > >> > >> VecDestroy(&sol); > >> > >> > >> PetscFinalize(); > >> > >> return 0; > >> > >> } > >> > >> > >> If you uncomment the second VecView you get reports for leaks in > >> VecScatter and dm. If you also uncomment the DMDASetUniformCoordinates, > and > >> use both VecViews, you also get a leak report for Vecs ... its quite > bizarre > >> ... > >> > >>> > >>> I have switched to HDF5 and XDMF due to the limitations of VTK format. > >>> > >> > >> I had used XDMF + raw binary in the past and was satisfied with the > >> result. Do you write a single XDMF as a "post-processing" step when the > >> simulation is finished? If I remember correctly preview could not open > xmf > >> files as time-series. > >>> > >>> Thanks, > >>> > >>> Matt > >>> > >>>> > >>>> Here's the relevant bit from log_view: > >>>> > >>>> --- Event Stage 0: Main Stage > >>>> > >>>> Vector 8 7 250992 0. > >>>> Vector Scatter 2 0 0 0. > >>>> Distributed Mesh 2 0 0 0. > >>>> Star Forest Bipartite Graph 4 0 0 0. > >>>> Discrete System 2 0 0 0. > >>>> Index Set 4 4 83136 0. > >>>> IS L to G Mapping 2 0 0 0. > >>>> Viewer 2 1 784 0. > >>>> > >>>> > ======================================================================================================================== > >>>> > >>> > >>> > >>> > >>> -- > >>> What most experimenters take for granted before they begin their > >>> experiments is infinitely more interesting than any results to which > their > >>> experiments lead. > >>> -- Norbert Wiener > >> > >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Thu Aug 4 03:23:18 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Thu, 4 Aug 2016 10:23:18 +0200 Subject: [petsc-users] false-positive leak report in log_view? In-Reply-To: References: Message-ID: On Thu, Aug 4, 2016 at 10:18 AM, Dave May wrote: > > > On 4 August 2016 at 10:10, Patrick Sanan wrote: >> >> I have a patch that I got from Dave that he got from Jed which seems >> to be related to this. I'll make a PR. > > > Jed wrote this variant of the VTK viewer so please mark him as a reviewer > for my bug fix. 
https://bitbucket.org/petsc/petsc/pull-requests/520/petscviewervtk-dereference-dm-to-avoid/diff > > >> >> >> On Wed, Aug 3, 2016 at 8:18 PM, Mohammad Mirzadeh >> wrote: >> > OK so I just ran the example under valgrind, and if I use two VecViews, >> > it >> > complains about following leak: >> > >> > ==66838== 24,802 (544 direct, 24,258 indirect) bytes in 1 blocks are >> > definitely lost in loss record 924 of 926 >> > ==66838== at 0x100009EBB: malloc (in >> > >> > /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so) >> > ==66838== by 0x10005E638: PetscMallocAlign (in >> > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) >> > ==66838== by 0x100405F00: DMCreate_DA (in >> > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) >> > ==66838== by 0x1003CFFA4: DMSetType (in >> > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) >> > ==66838== by 0x100405B7F: DMDACreate (in >> > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) >> > ==66838== by 0x1003F825F: DMDACreate2d (in >> > /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib) >> > ==66838== by 0x100001D89: main (main_test.cpp:7) >> > >> > By I am destroying the dm ... also I dont get this when using a single >> > VecView. As a bonus info, PETSC_VIEWER_STDOUT_WORLD is just fine, so >> > this >> > looks like it is definitely vtk related. > > > Mohammad, > > I can confirm that this VTK functionality bleeds memory if you write more > than 1 vector to disk. > > Cheers > Dave > > >> >> > >> > On Wed, Aug 3, 2016 at 2:05 PM, Mohammad Mirzadeh >> > wrote: >> >> >> >> On Wed, Aug 3, 2016 at 10:59 AM, Matthew Knepley >> >> wrote: >> >>> >> >>> On Tue, Aug 2, 2016 at 12:40 PM, Mohammad Mirzadeh >> >>> >> >>> wrote: >> >>>> >> >>>> I often use the memory usage information in log_view as a way to >> >>>> check >> >>>> on memory leaks and so far it has worked perfect. However, I had long >> >>>> noticed a false-positive report in memory leak for Viewers, i.e. >> >>>> destruction >> >>>> count is always one less than creation. >> >>> >> >>> >> >>> Yes, I believe that is the Viewer being used to print this >> >>> information. >> >> >> >> >> >> That makes sense. >> >>> >> >>> >> >>>> >> >>>> Today, I noticed what seems to be a second one. If you use VecView to >> >>>> write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also >> >>>> report a >> >>>> memory leak for vectors, vecscatters, dm, etc. I am calling this a >> >>>> false-positive since the code is valgrind-clean. >> >>>> >> >>>> Is this known/expected? >> >>> >> >>> >> >>> The VTK viewers have to hold everything they output until they are >> >>> destroyed since the format does not allow immediate writing. >> >>> I think the VTK viewer is not destroyed at the time of this output. >> >>> Can >> >>> you make a small example that does this? 
>> >> >> >> >> >> Here's a small example that illustrates the issues >> >> >> >> #include >> >> >> >> >> >> int main(int argc, char *argv[]) { >> >> >> >> PetscInitialize(&argc, &argv, NULL, NULL); >> >> >> >> >> >> DM dm; >> >> >> >> DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, >> >> DMDA_STENCIL_BOX, >> >> >> >> -10, -10, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, >> >> >> >> &dm); >> >> >> >> // DMDASetUniformCoordinates(dm, -1, 1, -1, 1, 0, 0); >> >> >> >> >> >> Vec sol; >> >> >> >> DMCreateGlobalVector(dm, &sol); >> >> >> >> VecSet(sol, 0); >> >> >> >> >> >> PetscViewer vtk; >> >> >> >> PetscViewerVTKOpen(PETSC_COMM_WORLD, "test.vts", FILE_MODE_WRITE, >> >> &vtk); >> >> >> >> VecView(sol, vtk); >> >> >> >> // VecView(sol, vtk); >> >> >> >> PetscViewerDestroy(&vtk); >> >> >> >> >> >> DMDestroy(&dm); >> >> >> >> VecDestroy(&sol); >> >> >> >> >> >> PetscFinalize(); >> >> >> >> return 0; >> >> >> >> } >> >> >> >> >> >> If you uncomment the second VecView you get reports for leaks in >> >> VecScatter and dm. If you also uncomment the DMDASetUniformCoordinates, >> >> and >> >> use both VecViews, you also get a leak report for Vecs ... its quite >> >> bizarre >> >> ... >> >> >> >>> >> >>> I have switched to HDF5 and XDMF due to the limitations of VTK format. >> >>> >> >> >> >> I had used XDMF + raw binary in the past and was satisfied with the >> >> result. Do you write a single XDMF as a "post-processing" step when the >> >> simulation is finished? If I remember correctly preview could not open >> >> xmf >> >> files as time-series. >> >>> >> >>> Thanks, >> >>> >> >>> Matt >> >>> >> >>>> >> >>>> Here's the relevant bit from log_view: >> >>>> >> >>>> --- Event Stage 0: Main Stage >> >>>> >> >>>> Vector 8 7 250992 0. >> >>>> Vector Scatter 2 0 0 0. >> >>>> Distributed Mesh 2 0 0 0. >> >>>> Star Forest Bipartite Graph 4 0 0 0. >> >>>> Discrete System 2 0 0 0. >> >>>> Index Set 4 4 83136 0. >> >>>> IS L to G Mapping 2 0 0 0. >> >>>> Viewer 2 1 784 0. >> >>>> >> >>>> >> >>>> ======================================================================================================================== >> >>>> >> >>> >> >>> >> >>> >> >>> -- >> >>> What most experimenters take for granted before they begin their >> >>> experiments is infinitely more interesting than any results to which >> >>> their >> >>> experiments lead. >> >>> -- Norbert Wiener >> >> >> >> >> > > > From C.Klaij at marin.nl Thu Aug 4 03:43:53 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 4 Aug 2016 08:43:53 +0000 Subject: [petsc-users] block matrix without MatCreateNest Message-ID: <1470300232940.43241@marin.nl> OK, looking forward to the fix! Related to this, the preallocation would need to depend on the type that is given at runtime, say if type=XXX, call MatXXXSetPreallocation() That would work for say seqaij and mpiaij, probably even without the if-statement, right? And since there's no MatNestSetPreallocation, should one get the submats and preallocate those if type=nest? Chris > Date: Tue, 2 Aug 2016 08:49:36 -0500 > From: Matthew Knepley > To: "Klaij, Christiaan" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] block matrix without MatCreateNest > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Tue, Aug 2, 2016 at 2:25 AM, Klaij, Christiaan wrote: > > > Thanks for your help! Going from individual blocks to a whole > > matrix makes perfect sense if the blocks are readily available or > > needed as fully functional matrices. Don't change that! 
Maybe add > > the opposite? > > > > I'm surprised it's broken though: on this mailing list several > > petsc developers have stated on several occasions (and not just > > to me) things like "you should never have a matnest", "you should > > have a mat then change the type at runtime", "snes ex70 is not > > the intended use" and so on. > > > > I fully appreciate the benefit of having a format-independent > > assembly and switching mat type from aij to nest depending on the > > preconditioner. And given the manual and the statements on this > > list, I thought this would be standard practice and therefore > > thoroughly tested. But now I get the impression it has never > > worked... > > > Yes, that way has never worked. Nest is only a memory optimization, and with > implicit problems I am never running at the limit of memory (or I use more > procs). > The people I know who needed it had explicitly coded it in rather than > trying to > use it from options. It should not take long to get this fixed. > > Thanks, > > Matt > > > Chris > > > > > > > From: Matthew Knepley > > > Sent: Tuesday, August 02, 2016 12:28 AM > > > To: Klaij, Christiaan > > > Cc: petsc-users at mcs.anl.gov; Jed Brown > > > Subject: Re: [petsc-users] block matrix without MatCreateNest > > > > > > On Mon, Aug 1, 2016 at 9:36 AM, Klaij, Christiaan > > wrote: > > > > > > Matt, > > > > > > > > > 1) great! > > > > > > > > > 2) ??? that's precisely why I paste the output of "cat mattry.F90" > > in the emails, so you have a small example that produces the errors I > > mention. Now I'm also attaching it to this email. > > > > > > Okay, I have gone through it. You are correct that it is completely > > broken. > > > > > > The way that MatNest currently works is that it trys to use L2G mappings > > from individual blocks > > > and then builds a composite L2G map for the whole matrix. This is > > obviously incompatible with > > > the primary use case, and should be changed to break up the full L2G > > into one for each block. > > > > > > Jed, can you fix this? I am not sure I know enough about how Nest works. > > > > > > Matt > > > > > > Thanks, > > > > > > Chris dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Vice-Admiral-De-Waard-makes-virtual-test-voyage-on-MARINs-FSSS.htm
From knepley at gmail.com Thu Aug 4 08:31:16 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 4 Aug 2016 08:31:16 -0500 Subject: Re: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1470300232940.43241@marin.nl> References: <1470300232940.43241@marin.nl> Message-ID: On Thu, Aug 4, 2016 at 3:43 AM, Klaij, Christiaan wrote: > OK, looking forward to the fix! > > Related to this, the preallocation would need to depend on the > type that is given at runtime, say > > if type=XXX, call MatXXXSetPreallocation() > > That would work for say seqaij and mpiaij, probably even without > the if-statement, right? And since there's no > MatNestSetPreallocation, should one get the submats and > preallocate those if type=nest? > This is really a situation where the DM interface is far superior. We could try to break down the AIJ preallocation into pieces, however since we have no column information, this problem is not determined. However, in the DM case we know everything and could properly allocate the submatrices automatically. Matt > Chris > > > Date: Tue, 2 Aug 2016 08:49:36 -0500 > > From: Matthew Knepley > > To: "Klaij, Christiaan" > > Cc: "petsc-users at mcs.anl.gov" > > Subject: Re: [petsc-users] block matrix without MatCreateNest > > Message-ID: > > gmail.com> > > Content-Type: text/plain; charset="utf-8" > > > > On Tue, Aug 2, 2016 at 2:25 AM, Klaij, Christiaan > wrote: > > > > > Thanks for your help! Going from individual blocks to a whole > > > matrix makes perfect sense if the blocks are readily available or > > > needed as fully functional matrices. Don't change that! Maybe add > > > the opposite? > > > > > > I'm surprised it's broken though: on this mailing list several > > > petsc developers have stated on several occasions (and not just > > > to me) things like "you should never have a matnest", "you should > > > have a mat then change the type at runtime", "snes ex70 is not > > > the intended use" and so on. > > > > > > I fully appreciate the benefit of having a format-independent > > > assembly and switching mat type from aij to nest depending on the > > > preconditioner. And given the manual and the statements on this > > > list, I thought this would be standard practice and therefore > > > thoroughly tested. But now I get the impression it has never > > > worked... > > > > > Yes, that way has never worked. Nest is only a memory optimization, and > with > > implicit problems I am never running at the limit of memory (or I use > more > > procs).
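On the preallocation question above: when the type is only fixed at runtime, a common idiom is simply to call the preallocation routine of every AIJ variant the matrix might become; PETSc ignores the calls that do not match the actual type, so no if-statement is needed. A minimal sketch in C (the Fortran calls are analogous), using a hypothetical helper and a uniform upper bound nz on the nonzeros per row:

#include <petscmat.h>

/* Hypothetical helper: preallocate A for whichever AIJ variant was chosen at
   runtime (e.g. via -mat_type); the call that does not match A's actual type
   is silently ignored. */
PetscErrorCode PreallocateAIJ(Mat A, PetscInt nz)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatSeqAIJSetPreallocation(A, nz, NULL);CHKERRQ(ierr);           /* used if A is seqaij */
  ierr = MatMPIAIJSetPreallocation(A, nz, NULL, nz, NULL);CHKERRQ(ierr); /* used if A is mpiaij */
  PetscFunctionReturn(0);
}

MatXAIJSetPreallocation() performs the same kind of type dispatch with per-row counts for the AIJ/BAIJ/SBAIJ families. Neither covers MATNEST, which has no preallocation interface; there the submatrices would have to be preallocated one by one, as the question above suggests.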
> > > > Thanks, > > > > Matt > > > > > Chris > > > > > > > > > > From: Matthew Knepley > > > > Sent: Tuesday, August 02, 2016 12:28 AM > > > > To: Klaij, Christiaan > > > > Cc: petsc-users at mcs.anl.gov; Jed Brown > > > > Subject: Re: [petsc-users] block matrix without MatCreateNest > > > > > > > > On Mon, Aug 1, 2016 at 9:36 AM, Klaij, Christiaan > > > wrote: > > > > > > > > Matt, > > > > > > > > > > > > 1) great! > > > > > > > > > > > > 2) ??? that's precisely why I paste the output of "cat > mattry.F90" > > > in the emails, so you have a small example that produces the errors I > > > mention. Now I'm also attaching it to this email. > > > > > > > > Okay, I have gone through it. You are correct that it is completely > > > broken. > > > > > > > > The way that MatNest currently works is that it trys to use L2G > mappings > > > from individual blocks > > > > and then builds a composite L2G map for the whole matrix. This is > > > obviously incompatible with > > > > the primary use case, and should be changed to break up the full L2G > > > into one for each block. > > > > > > > > Jed, can you fix this? I am not sure I know enough about how Nest > works. > > > > > > > > Matt > > > > > > > > Thanks, > > > > > > > > Chris > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: http://www.marin.nl/web/News/News-items/Vice-Admiral-De- > Waard-makes-virtual-test-voyage-on-MARINs-FSSS.htm > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Thu Aug 4 10:57:05 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 4 Aug 2016 11:57:05 -0400 Subject: [petsc-users] PCASMType Message-ID: With PETSc 3.7.2, changing the ASM type does not seem to have any effect for serial jobs. E.g. 
using ksp/examples/tutorials/ex2: $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type basic 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 9 KSP preconditioned resid norm 2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 11 KSP preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 15 KSP preconditioned resid norm 3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 16 KSP preconditioned resid norm 2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 22 KSP preconditioned resid norm 1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. 
left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - BASIC Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=160, cols=160 package used to perform factorization: petsc total: nonzeros=443, allocated nonzeros=443 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=160, cols=160 total: nonzeros=726, allocated nonzeros=726 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1024, cols=1024 total: nonzeros=4992, allocated nonzeros=5120 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 0.000445244 iterations 24 $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type restrict 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 9 KSP preconditioned resid norm 2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 11 KSP preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 15 KSP preconditioned resid norm 3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 16 KSP preconditioned resid norm 
2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 22 KSP preconditioned resid norm 1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=160, cols=160 package used to perform factorization: petsc total: nonzeros=443, allocated nonzeros=443 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=160, cols=160 total: nonzeros=726, allocated nonzeros=726 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1024, cols=1024 total: nonzeros=4992, allocated nonzeros=5120 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 0.000445244 iterations 24 $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type interpolate 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 9 KSP preconditioned resid norm 2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 11 KSP preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 15 KSP preconditioned resid norm 3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 16 KSP preconditioned resid norm 2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 22 KSP preconditioned resid norm 
1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - INTERPOLATE Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=160, cols=160 package used to perform factorization: petsc total: nonzeros=443, allocated nonzeros=443 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=160, cols=160 total: nonzeros=726, allocated nonzeros=726 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1024, cols=1024 total: nonzeros=4992, allocated nonzeros=5120 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 0.000445244 iterations 24 $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type none 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 9 KSP preconditioned resid norm 2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 11 KSP 
preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 15 KSP preconditioned resid norm 3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 16 KSP preconditioned resid norm 2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 22 KSP preconditioned resid norm 1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - NONE Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=160, cols=160 package used to perform factorization: petsc total: nonzeros=443, allocated nonzeros=443 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=160, cols=160 total: nonzeros=726, allocated nonzeros=726 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1024, cols=1024 total: nonzeros=4992, allocated nonzeros=5120 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 0.000445244 iterations 24 Is this setting meant only to affect how communication between processors is handled? Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 
1.121177948355e-04 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: asm Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - BASIC Additive Schwarz: local solve composition type - MULTIPLICATIVE Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (sub_) 1 MPI processes type: icc 0 levels of fill tolerance for zero pivot 2.22045e-14 using Manteuffel shift [POSITIVE_DEFINITE] matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqsbaij rows=160, cols=160 package used to perform factorization: petsc total: nonzeros=443, allocated nonzeros=443 total number of mallocs used during MatSetValues calls =0 block size is 1 linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=160, cols=160 total: nonzeros=726, allocated nonzeros=726 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1024, cols=1024 total: nonzeros=4992, allocated nonzeros=5120 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 0.000292304 iterations 24 Thanks, -- Boyce -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Aug 4 18:48:03 2016 From: jed at jedbrown.org (Jed Brown) Date: Thu, 04 Aug 2016 17:48:03 -0600 Subject: [petsc-users] false-positive leak report in log_view? In-Reply-To: References: Message-ID: <87invgayfg.fsf@jedbrown.org> Patrick Sanan writes: > On Thu, Aug 4, 2016 at 10:18 AM, Dave May wrote: >> >> >> On 4 August 2016 at 10:10, Patrick Sanan wrote: >>> >>> I have a patch that I got from Dave that he got from Jed which seems >>> to be related to this. I'll make a PR. >> >> >> Jed wrote this variant of the VTK viewer so please mark him as a reviewer >> for my bug fix. > https://bitbucket.org/petsc/petsc/pull-requests/520/petscviewervtk-dereference-dm-to-avoid/diff Thanks, all. I commented on this with an alternate fix. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL:
From bsmith at mcs.anl.gov Thu Aug 4 19:42:38 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Aug 2016 20:42:38 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: Message-ID:

   History,

   1) I originally implemented the ASM with one subdomain per process
   2) easily extended to support multiple subdomains per process
   3) added -pc_asm_type restrict etc., but it only worked for one subdomain per process because it took advantage of the fact that restrict etc. could be achieved by simply dropping the parallel communication in the vector scatters
   4) Matt didn't like the one-subdomain-per-process restriction, so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though locality has nothing to do with it), so that restrict etc. could be handled.

   Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc. it does not handle -pc_asm_type restrict, since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningful like is_nonoverlapping. This refactoring would also result in easier, cleaner code than is currently there.

   So basically, until PCASM is refactored properly to handle restrict etc., you are stuck with being able to use restrict etc. ONLY if you specifically supply the overlapping and non-overlapping domains yourself with PCASMSetLocalSubdomains() (see the sketch below) and curse at Matt every day like we all do.

   Barry

> On Aug 4, 2016, at 11:57 AM, Boyce Griffith wrote: > > With PETSc 3.7.2, changing the ASM type does not seem to have any effect for serial jobs. E.g.
using ksp/examples/tutorials/ex2: > > $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type basic > 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 > 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 > 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 > 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 > 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 > 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 > 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 > 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 > 9 KSP preconditioned resid norm 2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 > 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 > 11 KSP preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 > 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 > 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 > 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 > 15 KSP preconditioned resid norm 3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 > 16 KSP preconditioned resid norm 2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 > 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 > 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 > 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 > 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 > 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 > 22 KSP preconditioned resid norm 1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 > 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 > 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. 
> left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - BASIC > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: icc > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using Manteuffel shift [POSITIVE_DEFINITE] > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqsbaij > rows=160, cols=160 > package used to perform factorization: petsc > total: nonzeros=443, allocated nonzeros=443 > total number of mallocs used during MatSetValues calls =0 > block size is 1 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=160, cols=160 > total: nonzeros=726, allocated nonzeros=726 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1024, cols=1024 > total: nonzeros=4992, allocated nonzeros=5120 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 0.000445244 iterations 24 > $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type restrict > 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 > 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 > 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 > 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 > 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 > 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 > 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 > 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 > 9 KSP preconditioned resid norm 2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 > 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 > 11 KSP preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 > 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 > 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 > 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 > 15 KSP preconditioned resid norm 
3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 > 16 KSP preconditioned resid norm 2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 > 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 > 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 > 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 > 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 > 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 > 22 KSP preconditioned resid norm 1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 > 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 > 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: icc > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using Manteuffel shift [POSITIVE_DEFINITE] > matrix ordering: natural > factor fill ratio given 1., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqsbaij > rows=160, cols=160 > package used to perform factorization: petsc > total: nonzeros=443, allocated nonzeros=443 > total number of mallocs used during MatSetValues calls =0 > block size is 1 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=160, cols=160 > total: nonzeros=726, allocated nonzeros=726 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1024, cols=1024 > total: nonzeros=4992, allocated nonzeros=5120 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 0.000445244 iterations 24 > $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type interpolate > 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 > 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 > 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 > 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 > 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 > 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 > 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 > 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 > 9 KSP preconditioned resid norm 2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 > 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 > 11 KSP preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 > 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 > 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 > 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 > 15 KSP preconditioned resid norm 3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 > 16 KSP preconditioned resid norm 2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 > 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 > 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 > 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 > 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 > 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 
1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 > 22 KSP preconditioned resid norm 1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 > 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 > 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - INTERPOLATE > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: icc > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using Manteuffel shift [POSITIVE_DEFINITE] > matrix ordering: natural > factor fill ratio given 1., needed 1. > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqsbaij > rows=160, cols=160 > package used to perform factorization: petsc > total: nonzeros=443, allocated nonzeros=443 > total number of mallocs used during MatSetValues calls =0 > block size is 1 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=160, cols=160 > total: nonzeros=726, allocated nonzeros=726 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1024, cols=1024 > total: nonzeros=4992, allocated nonzeros=5120 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 0.000445244 iterations 24 > $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_type none > 0 KSP preconditioned resid norm 7.227368482718e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.680806990415e+00 true resid norm 3.440170689976e+00 ||r(i)||/||b|| 2.949922029898e-01 > 2 KSP preconditioned resid norm 1.537918148738e+00 true resid norm 1.676069837125e+00 ||r(i)||/||b|| 1.437218028335e-01 > 3 KSP preconditioned resid norm 1.025836188311e+00 true resid norm 1.127193636507e+00 ||r(i)||/||b|| 9.665605692131e-02 > 4 KSP preconditioned resid norm 7.465807140977e-01 true resid norm 7.063999312309e-01 ||r(i)||/||b|| 6.057329437543e-02 > 5 KSP preconditioned resid norm 5.806789214514e-01 true resid norm 5.248222587585e-01 ||r(i)||/||b|| 4.500313741419e-02 > 6 KSP preconditioned resid norm 4.709931353737e-01 true resid norm 4.142974151217e-01 ||r(i)||/||b|| 3.552571026079e-02 > 7 KSP preconditioned resid norm 4.123520053531e-01 true resid norm 3.428306005637e-01 ||r(i)||/||b|| 2.939748147011e-02 > 8 KSP preconditioned resid norm 3.623327779165e-01 true resid norm 3.188334125692e-01 ||r(i)||/||b|| 2.733973957589e-02 > 9 KSP preconditioned resid norm 
2.416851769562e-01 true resid norm 3.243055253185e-01 ||r(i)||/||b|| 2.780896937301e-02 > 10 KSP preconditioned resid norm 1.215247217085e-01 true resid norm 1.564954190637e-01 ||r(i)||/||b|| 1.341937147536e-02 > 11 KSP preconditioned resid norm 6.497470510985e-02 true resid norm 8.673119491739e-02 ||r(i)||/||b|| 7.437138608026e-03 > 12 KSP preconditioned resid norm 3.246036728276e-02 true resid norm 4.591618222897e-02 ||r(i)||/||b|| 3.937280143707e-03 > 13 KSP preconditioned resid norm 1.435458873790e-02 true resid norm 1.985166297182e-02 ||r(i)||/||b|| 1.702266056197e-03 > 14 KSP preconditioned resid norm 6.960292630481e-03 true resid norm 9.778813102206e-03 ||r(i)||/||b|| 8.385263057007e-04 > 15 KSP preconditioned resid norm 3.604789592169e-03 true resid norm 4.449537275931e-03 ||r(i)||/||b|| 3.815446736805e-04 > 16 KSP preconditioned resid norm 2.347335356674e-03 true resid norm 2.530581623538e-03 ||r(i)||/||b|| 2.169955840122e-04 > 17 KSP preconditioned resid norm 1.606443885426e-03 true resid norm 1.835719869720e-03 ||r(i)||/||b|| 1.574116801875e-04 > 18 KSP preconditioned resid norm 8.685134492648e-04 true resid norm 1.250934036178e-03 ||r(i)||/||b|| 1.072667086555e-04 > 19 KSP preconditioned resid norm 4.160062164697e-04 true resid norm 5.589755065266e-04 ||r(i)||/||b|| 4.793175424931e-05 > 20 KSP preconditioned resid norm 2.264387793673e-04 true resid norm 2.733953335794e-04 ||r(i)||/||b|| 2.344345644671e-05 > 21 KSP preconditioned resid norm 1.572062148327e-04 true resid norm 1.655081111255e-04 ||r(i)||/||b|| 1.419220344382e-05 > 22 KSP preconditioned resid norm 1.046603808940e-04 true resid norm 1.170344120764e-04 ||r(i)||/||b|| 1.003561804204e-05 > 23 KSP preconditioned resid norm 7.323396102053e-05 true resid norm 9.226622743289e-05 ||r(i)||/||b|| 7.911763730589e-06 > 24 KSP preconditioned resid norm 3.961819569402e-05 true resid norm 5.397694170644e-05 ||r(i)||/||b|| 4.628484566487e-06 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - NONE > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: icc > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using Manteuffel shift [POSITIVE_DEFINITE] > matrix ordering: natural > factor fill ratio given 1., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqsbaij > rows=160, cols=160 > package used to perform factorization: petsc > total: nonzeros=443, allocated nonzeros=443 > total number of mallocs used during MatSetValues calls =0 > block size is 1 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=160, cols=160 > total: nonzeros=726, allocated nonzeros=726 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1024, cols=1024 > total: nonzeros=4992, allocated nonzeros=5120 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 0.000445244 iterations 24 > > Is this setting meant only to affect how communication between processors is handled? > > Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: > > $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE > 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 > 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 > 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 > 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 > 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 > 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 > 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 > 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 > 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 > 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 > 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 > 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 > 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 > 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 > 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 > 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 > 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 > 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 > 19 KSP 
preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 > 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 > 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 > 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 > 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 > 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - BASIC > Additive Schwarz: local solve composition type - MULTIPLICATIVE > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (sub_) 1 MPI processes > type: icc > 0 levels of fill > tolerance for zero pivot 2.22045e-14 > using Manteuffel shift [POSITIVE_DEFINITE] > matrix ordering: natural > factor fill ratio given 1., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqsbaij > rows=160, cols=160 > package used to perform factorization: petsc > total: nonzeros=443, allocated nonzeros=443 > total number of mallocs used during MatSetValues calls =0 > block size is 1 > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=160, cols=160 > total: nonzeros=726, allocated nonzeros=726 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1024, cols=1024 > total: nonzeros=4992, allocated nonzeros=5120 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 0.000292304 iterations 24 > > Thanks, > > -- Boyce > From griffith at cims.nyu.edu Thu Aug 4 19:51:48 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 4 Aug 2016 20:51:48 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: Message-ID: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> > On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: > > > History, > > 1) I originally implemented the ASM with one subdomain per process > 2) easily extended to support multiple domain per process > 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that > restrict etc could be achieved by simply dropping the parallel communication in the vector scatters > 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. > > Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. > > So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. 
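For concreteness, a minimal sketch (in C) of supplying both families of index sets and selecting the ASM variant programmatically; the construction of the IS arrays is application-specific and omitted here, and names such as ConfigureASM, n_sub, is_overlap, and is_nonoverlap are placeholders rather than code from the actual smoother:

#include <petscksp.h>

/* Sketch: hand user-built overlapping (is_overlap) and non-overlapping
   (is_nonoverlap) subdomain index sets to PCASM and pick the ASM type.
   The caller is assumed to have already created the n_sub IS pairs. */
PetscErrorCode ConfigureASM(PC pc,PetscInt n_sub,IS is_overlap[],IS is_nonoverlap[])
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);
  /* supply both index-set families so RESTRICT/INTERPOLATE/NONE can differ from BASIC */
  ierr = PCASMSetLocalSubdomains(pc,n_sub,is_overlap,is_nonoverlap);CHKERRQ(ierr);
  /* programmatic equivalent of -pc_asm_type restrict */
  ierr = PCASMSetType(pc,PC_ASM_RESTRICT);CHKERRQ(ierr);
  /* programmatic equivalent of -pc_asm_local_type multiplicative */
  ierr = PCASMSetLocalType(pc,PC_COMPOSITE_MULTIPLICATIVE);CHKERRQ(ierr);
  /* still allow command-line overrides such as -<prefix>pc_asm_type basic */
  ierr = PCSetFromOptions(pc);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Setting the type explicitly this way (or checking it afterwards with PCASMGetType or -ksp_view) also makes it easy to confirm whether a -pc_asm_type option given on the command line actually reached the PC.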
BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >> >> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >> KSP Object: 1 MPI processes >> 
type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - BASIC >> Additive Schwarz: local solve composition type - MULTIPLICATIVE >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object: (sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using NONE norm type for convergence test >> PC Object: (sub_) 1 MPI processes >> type: icc >> 0 levels of fill >> tolerance for zero pivot 2.22045e-14 >> using Manteuffel shift [POSITIVE_DEFINITE] >> matrix ordering: natural >> factor fill ratio given 1., needed 1. >> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqsbaij >> rows=160, cols=160 >> package used to perform factorization: petsc >> total: nonzeros=443, allocated nonzeros=443 >> total number of mallocs used during MatSetValues calls =0 >> block size is 1 >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=160, cols=160 >> total: nonzeros=726, allocated nonzeros=726 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=1024, cols=1024 >> total: nonzeros=4992, allocated nonzeros=5120 >> total number of mallocs used during MatSetValues calls =0 >> not using I-node routines >> Norm of error 0.000292304 iterations 24 >> >> Thanks, >> >> -- Boyce >> From bsmith at mcs.anl.gov Thu Aug 4 20:01:40 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Aug 2016 21:01:40 -0400 Subject: [petsc-users] PCASMType In-Reply-To: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> Message-ID: <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> > On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: > >> >> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >> >> >> History, >> >> 1) I originally implemented the ASM with one subdomain per process >> 2) easily extended to support multiple domain per process >> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >> >> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. 
The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >> >> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. > > OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. > > BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): Yeah, better one question per email or we will miss them. There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. Barry > >>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>> >>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 
2.117656127577e-03 >>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>> KSP Object: 1 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>> Additive Schwarz: restriction/interpolation type - BASIC >>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>> Local solve is same for all blocks, in the following KSP and PC objects: >>> KSP Object: (sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>> left preconditioning >>> using NONE norm type for convergence test >>> PC Object: (sub_) 1 MPI processes >>> type: icc >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> using Manteuffel shift [POSITIVE_DEFINITE] >>> matrix ordering: natural >>> factor fill ratio given 1., needed 1. 
>>> Factored matrix follows: >>> Mat Object: 1 MPI processes >>> type: seqsbaij >>> rows=160, cols=160 >>> package used to perform factorization: petsc >>> total: nonzeros=443, allocated nonzeros=443 >>> total number of mallocs used during MatSetValues calls =0 >>> block size is 1 >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=160, cols=160 >>> total: nonzeros=726, allocated nonzeros=726 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node routines >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI processes >>> type: seqaij >>> rows=1024, cols=1024 >>> total: nonzeros=4992, allocated nonzeros=5120 >>> total number of mallocs used during MatSetValues calls =0 >>> not using I-node routines >>> Norm of error 0.000292304 iterations 24 >>> >>> Thanks, >>> >>> -- Boyce From griffith at cims.nyu.edu Thu Aug 4 20:26:25 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 4 Aug 2016 21:26:25 -0400 Subject: [petsc-users] PCASMType In-Reply-To: <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> Message-ID: > On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: > >> >> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >> >>> >>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>> >>> >>> History, >>> >>> 1) I originally implemented the ASM with one subdomain per process >>> 2) easily extended to support multiple domain per process >>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>> >>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>> >>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >> >> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. > > But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? 
If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. Yes we are computing overlapping and non-overlapping IS?es. I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) >> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): > > Yeah, better one question per email or we will miss them. > > There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. Thanks, ? Boyce > > Barry > >> >>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>> >>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>> 15 KSP preconditioned resid norm 1.320657155648e-02 
true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>> KSP Object: 1 MPI processes >>>> type: gmres >>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>> GMRES: happy breakdown tolerance 1e-30 >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using PRECONDITIONED norm type for convergence test >>>> PC Object: 1 MPI processes >>>> type: asm >>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>> Additive Schwarz: restriction/interpolation type - BASIC >>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>> KSP Object: (sub_) 1 MPI processes >>>> type: preonly >>>> maximum iterations=10000, initial guess is zero >>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>> left preconditioning >>>> using NONE norm type for convergence test >>>> PC Object: (sub_) 1 MPI processes >>>> type: icc >>>> 0 levels of fill >>>> tolerance for zero pivot 2.22045e-14 >>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>> matrix ordering: natural >>>> factor fill ratio given 1., needed 1. >>>> Factored matrix follows: >>>> Mat Object: 1 MPI processes >>>> type: seqsbaij >>>> rows=160, cols=160 >>>> package used to perform factorization: petsc >>>> total: nonzeros=443, allocated nonzeros=443 >>>> total number of mallocs used during MatSetValues calls =0 >>>> block size is 1 >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=160, cols=160 >>>> total: nonzeros=726, allocated nonzeros=726 >>>> total number of mallocs used during MatSetValues calls =0 >>>> not using I-node routines >>>> linear system matrix = precond matrix: >>>> Mat Object: 1 MPI processes >>>> type: seqaij >>>> rows=1024, cols=1024 >>>> total: nonzeros=4992, allocated nonzeros=5120 >>>> total number of mallocs used during MatSetValues calls =0 >>>> not using I-node routines >>>> Norm of error 0.000292304 iterations 24 >>>> >>>> Thanks, >>>> >>>> -- Boyce -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu Aug 4 20:40:50 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Aug 2016 21:40:50 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> Message-ID: > On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: > >> >> On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: >> >>> >>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >>> >>>> >>>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>>> >>>> >>>> History, >>>> >>>> 1) I originally implemented the ASM with one subdomain per process >>>> 2) easily extended to support multiple domain per process >>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>> >>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>> >>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>> >>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >> >> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. > > Yes we are computing overlapping and non-overlapping IS?es. > > I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) > >>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >> >> Yeah, better one question per email or we will miss them. >> >> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. 
> > OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. > > However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. Yes, you should definitely check this. One would hope to see similar convergence with Matt's code, perhaps the ordering of the subdomains matter? Barry > > Thanks, > > ? Boyce > >> >> Barry >> >>> >>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>> >>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>> 21 KSP preconditioned resid 
norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>> KSP Object: 1 MPI processes >>>>> type: gmres >>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> GMRES: happy breakdown tolerance 1e-30 >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI processes >>>>> type: asm >>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>> KSP Object: (sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (sub_) 1 MPI processes >>>>> type: icc >>>>> 0 levels of fill >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>> matrix ordering: natural >>>>> factor fill ratio given 1., needed 1. >>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqsbaij >>>>> rows=160, cols=160 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=443, allocated nonzeros=443 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> block size is 1 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=160, cols=160 >>>>> total: nonzeros=726, allocated nonzeros=726 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> not using I-node routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=1024, cols=1024 >>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> not using I-node routines >>>>> Norm of error 0.000292304 iterations 24 >>>>> >>>>> Thanks, >>>>> >>>>> -- Boyce From griffith at cims.nyu.edu Thu Aug 4 20:41:59 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 4 Aug 2016 21:41:59 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> Message-ID: > On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: > >> >> On Aug 4, 2016, at 9:01 PM, Barry Smith > wrote: >> >>> >>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith > wrote: >>> >>>> >>>> On Aug 4, 2016, at 8:42 PM, Barry Smith > wrote: >>>> >>>> >>>> History, >>>> >>>> 1) I originally implemented the ASM with one subdomain per process >>>> 2) easily extended to support multiple domain per process >>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>> restrict etc could be achieved by simply 
dropping the parallel communication in the vector scatters >>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>> >>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>> >>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>> >>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >> >> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. > > Yes we are computing overlapping and non-overlapping IS?es. > > I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) OK, so here is what appears to be happening. These solvers are named things like ?stokes_pc_level_0_?, ?stokes_pc_level_1_?, ? . If I use the command-line argument -stokes_ib_pc_level_0_pc_asm_type basic then the ASM settings are used, but if I do: -stokes_ib_pc_level_pc_asm_type basic they are ignored. Any ideas? :-) Thanks, ? Boyce >>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >> >> Yeah, better one question per email or we will miss them. >> >> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. > > OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. > > However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. > > Thanks, > > ? 
Boyce > >> >> Barry >> >>> >>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>> >>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>> KSP Object: 1 MPI 
processes >>>>> type: gmres >>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>> GMRES: happy breakdown tolerance 1e-30 >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using PRECONDITIONED norm type for convergence test >>>>> PC Object: 1 MPI processes >>>>> type: asm >>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>> KSP Object: (sub_) 1 MPI processes >>>>> type: preonly >>>>> maximum iterations=10000, initial guess is zero >>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>> left preconditioning >>>>> using NONE norm type for convergence test >>>>> PC Object: (sub_) 1 MPI processes >>>>> type: icc >>>>> 0 levels of fill >>>>> tolerance for zero pivot 2.22045e-14 >>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>> matrix ordering: natural >>>>> factor fill ratio given 1., needed 1. >>>>> Factored matrix follows: >>>>> Mat Object: 1 MPI processes >>>>> type: seqsbaij >>>>> rows=160, cols=160 >>>>> package used to perform factorization: petsc >>>>> total: nonzeros=443, allocated nonzeros=443 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> block size is 1 >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=160, cols=160 >>>>> total: nonzeros=726, allocated nonzeros=726 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> not using I-node routines >>>>> linear system matrix = precond matrix: >>>>> Mat Object: 1 MPI processes >>>>> type: seqaij >>>>> rows=1024, cols=1024 >>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>> total number of mallocs used during MatSetValues calls =0 >>>>> not using I-node routines >>>>> Norm of error 0.000292304 iterations 24 >>>>> >>>>> Thanks, >>>>> >>>>> -- Boyce -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From griffith at cims.nyu.edu Thu Aug 4 20:46:42 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 4 Aug 2016 21:46:42 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> Message-ID: > On Aug 4, 2016, at 9:41 PM, Boyce Griffith wrote: > > >> On Aug 4, 2016, at 9:26 PM, Boyce Griffith > wrote: >> >>> >>> On Aug 4, 2016, at 9:01 PM, Barry Smith > wrote: >>> >>>> >>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith > wrote: >>>> >>>>> >>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith > wrote: >>>>> >>>>> >>>>> History, >>>>> >>>>> 1) I originally implemented the ASM with one subdomain per process >>>>> 2) easily extended to support multiple domain per process >>>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>>> >>>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>>> >>>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>>> >>>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >>> >>> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. >> >> Yes we are computing overlapping and non-overlapping IS?es. >> >> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) > > OK, so here is what appears to be happening. These solvers are named things like ?stokes_pc_level_0_?, ?stokes_pc_level_1_?, ? . 
If I use the command-line argument > > -stokes_ib_pc_level_0_pc_asm_type basic > > then the ASM settings are used, but if I do: > > -stokes_ib_pc_level_pc_asm_type basic > > they are ignored. Any ideas? :-) I should have said: we are playing around with a lot of different command line options that are being collectively applied to all of the level solvers, and these options for ASM are the only ones I?ve encountered so far that have to include the level number to have an effect. Thanks, ? Boyce > > Thanks, > > ? Boyce > >>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >>> >>> Yeah, better one question per email or we will miss them. >>> >>> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. >> >> OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. >> >> However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. >> >> Thanks, >> >> ? Boyce >> >>> >>> Barry >>> >>>> >>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>>> >>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 
||r(i)||/||b|| 1.146256327657e-03 >>>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>>> KSP Object: 1 MPI processes >>>>>> type: gmres >>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>> GMRES: happy breakdown tolerance 1e-30 >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI processes >>>>>> type: asm >>>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>>> KSP Object: (sub_) 1 MPI processes >>>>>> type: preonly >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: (sub_) 1 MPI processes >>>>>> type: icc >>>>>> 0 levels of fill >>>>>> tolerance for zero pivot 2.22045e-14 >>>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>>> matrix ordering: natural >>>>>> factor fill ratio given 1., needed 1. >>>>>> Factored matrix follows: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqsbaij >>>>>> rows=160, cols=160 >>>>>> package used to perform factorization: petsc >>>>>> total: nonzeros=443, allocated nonzeros=443 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> block size is 1 >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqaij >>>>>> rows=160, cols=160 >>>>>> total: nonzeros=726, allocated nonzeros=726 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> not using I-node routines >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqaij >>>>>> rows=1024, cols=1024 >>>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> not using I-node routines >>>>>> Norm of error 0.000292304 iterations 24 >>>>>> >>>>>> Thanks, >>>>>> >>>>>> -- Boyce > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu Aug 4 20:52:18 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 4 Aug 2016 21:52:18 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> Message-ID: <300DF95D-C74E-4F5C-AC18-559276433E6F@mcs.anl.gov> The magic handling of _1_ etc is all done in PetscOptionsFindPair_Private() so you need to put a break point in that routine and see why the requested value is not located. Barry > On Aug 4, 2016, at 9:46 PM, Boyce Griffith wrote: > > >> On Aug 4, 2016, at 9:41 PM, Boyce Griffith wrote: >> >> >>> On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: >>> >>>> >>>> On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: >>>> >>>>> >>>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >>>>> >>>>>> >>>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>>>>> >>>>>> >>>>>> History, >>>>>> >>>>>> 1) I originally implemented the ASM with one subdomain per process >>>>>> 2) easily extended to support multiple domain per process >>>>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>>>> >>>>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>>>> >>>>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>>>> >>>>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >>>> >>>> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. >>> >>> Yes we are computing overlapping and non-overlapping IS?es. >>> >>> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. 
(I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) >> >> OK, so here is what appears to be happening. These solvers are named things like ?stokes_pc_level_0_?, ?stokes_pc_level_1_?, ? . If I use the command-line argument >> >> -stokes_ib_pc_level_0_pc_asm_type basic >> >> then the ASM settings are used, but if I do: >> >> -stokes_ib_pc_level_pc_asm_type basic >> >> they are ignored. Any ideas? :-) > > I should have said: we are playing around with a lot of different command line options that are being collectively applied to all of the level solvers, and these options for ASM are the only ones I?ve encountered so far that have to include the level number to have an effect. > > Thanks, > > ? Boyce > >> >> Thanks, >> >> ? Boyce >> >>>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >>>> >>>> Yeah, better one question per email or we will miss them. >>>> >>>> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. >>> >>> OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. >>> >>> However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. >>> >>> Thanks, >>> >>> ? Boyce >>> >>>> >>>> Barry >>>> >>>>> >>>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>>>> >>>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 
>>>>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>>>> KSP Object: 1 MPI processes >>>>>>> type: gmres >>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>> GMRES: happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI processes >>>>>>> type: asm >>>>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>>>> KSP Object: (sub_) 1 MPI processes >>>>>>> type: preonly >>>>>>> maximum iterations=10000, initial guess is zero >>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>> left preconditioning >>>>>>> using NONE norm type for convergence test >>>>>>> PC Object: (sub_) 1 MPI processes >>>>>>> type: icc >>>>>>> 0 levels of fill >>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>>>> matrix ordering: natural >>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>> Factored matrix follows: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqsbaij >>>>>>> rows=160, cols=160 >>>>>>> package used to perform factorization: petsc >>>>>>> total: nonzeros=443, allocated nonzeros=443 >>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>> block size is 1 >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=160, cols=160 >>>>>>> total: nonzeros=726, allocated nonzeros=726 >>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>> not using I-node routines >>>>>>> linear system matrix = precond matrix: >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: seqaij >>>>>>> rows=1024, cols=1024 >>>>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>> not using I-node routines >>>>>>> Norm of error 0.000292304 iterations 24 >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -- Boyce >> > From griffith at cims.nyu.edu Thu Aug 4 20:58:02 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Thu, 4 Aug 2016 21:58:02 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> Message-ID: <94027DFF-48FE-4A7C-8EBF-4649FD111F12@cims.nyu.edu> > On Aug 4, 2016, at 9:40 PM, Barry Smith wrote: > >> >> On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: >> >>> >>> On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: >>> >>>> >>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >>>> >>>>> >>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>>>> >>>>> >>>>> History, >>>>> >>>>> 1) I originally implemented the ASM with one subdomain per process >>>>> 2) easily extended to support multiple domain per process >>>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>>> >>>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>>> >>>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>>> >>>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. 
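In code, the pattern described above (supplying both the overlapping and the non-overlapping index sets yourself through PCASMSetLocalSubdomains) looks roughly like the following minimal sequential sketch. The 1-D Laplacian, the problem size, the two-subdomain split with one point of overlap, and all variable names are invented for illustration; only the PETSc calls themselves (PCASMSetLocalSubdomains, PCASMSetOverlap, and so on) are the real API of that era. On one MPI process the -pc_asm_type variants coincide anyway, as explained further down the thread, so the point here is only where the two families of index sets enter.

static char help[] = "Sketch: PCASM with user-supplied overlapping and non-overlapping subdomains.\n";

#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PC             pc;
  IS             is[2],is_local[2];
  PetscInt       i,n = 16,col[3];
  PetscScalar    v[3];
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,help);CHKERRQ(ierr);

  /* sequential 1-D Laplacian, purely as a stand-in operator */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,3,NULL,&A);CHKERRQ(ierr);
  for (i=0; i<n; i++) {
    col[0] = i-1;  col[1] = i;  col[2] = i+1;
    v[0]   = -1.0; v[1] = 2.0;  v[2]   = -1.0;
    if (i == 0)        {ierr = MatSetValues(A,1,&i,2,col+1,v+1,INSERT_VALUES);CHKERRQ(ierr);}
    else if (i == n-1) {ierr = MatSetValues(A,1,&i,2,col,v,INSERT_VALUES);CHKERRQ(ierr);}
    else               {ierr = MatSetValues(A,1,&i,3,col,v,INSERT_VALUES);CHKERRQ(ierr);}
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* non-overlapping halves [0,8) and [8,16); the overlapping versions
     extend each half by one interior point */
  ierr = ISCreateStride(PETSC_COMM_SELF,n/2,0,1,&is_local[0]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF,n/2,n/2,1,&is_local[1]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF,n/2+1,0,1,&is[0]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF,n/2+1,n/2-1,1,&is[1]);CHKERRQ(ierr);

  ierr = VecCreateSeq(PETSC_COMM_SELF,n,&b);CHKERRQ(ierr);
  ierr = VecDuplicate(b,&x);CHKERRQ(ierr);
  ierr = VecSet(b,1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_SELF,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);
  /* hand over BOTH index sets: is[] carries the overlap, is_local[] does not */
  ierr = PCASMSetLocalSubdomains(pc,2,is,is_local);CHKERRQ(ierr);
  /* keep exactly the overlap encoded in is[]; do not let PETSc grow it further */
  ierr = PCASMSetOverlap(pc,0);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* e.g. -pc_asm_type restrict */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  for (i=0; i<2; i++) {
    ierr = ISDestroy(&is[i]);CHKERRQ(ierr);
    ierr = ISDestroy(&is_local[i]);CHKERRQ(ierr);
  }
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}
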
>>> >>> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. >> >> Yes we are computing overlapping and non-overlapping IS?es. >> >> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) >> >>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >>> >>> Yeah, better one question per email or we will miss them. >>> >>> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. >> >> OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. >> >> However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. > > Yes, you should definitely check this. One would hope to see similar convergence with Matt's code, perhaps the ordering of the subdomains matter? I verified that the command line MULTIPLICATIVE setting was being ignored. If I list MULTIPLICATIVE separately for each level solver, those settings get used, and the MULTPLICATIVE variant of PCASM gives slower convergence than PCASM + ADDITIVE. I guess that it could be the ordering. I think we are listing the subdomains in the same for ASM in the order that we process in our custom MSM. I?m not sure how to check this. ? Boyce > Barry > >> >> Thanks, >> >> ? 
Boyce >> >>> >>> Barry >>> >>>> >>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>>> >>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 
5.658323427037e-06 >>>>>> KSP Object: 1 MPI processes >>>>>> type: gmres >>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>> GMRES: happy breakdown tolerance 1e-30 >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using PRECONDITIONED norm type for convergence test >>>>>> PC Object: 1 MPI processes >>>>>> type: asm >>>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>>> KSP Object: (sub_) 1 MPI processes >>>>>> type: preonly >>>>>> maximum iterations=10000, initial guess is zero >>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>> left preconditioning >>>>>> using NONE norm type for convergence test >>>>>> PC Object: (sub_) 1 MPI processes >>>>>> type: icc >>>>>> 0 levels of fill >>>>>> tolerance for zero pivot 2.22045e-14 >>>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>>> matrix ordering: natural >>>>>> factor fill ratio given 1., needed 1. >>>>>> Factored matrix follows: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqsbaij >>>>>> rows=160, cols=160 >>>>>> package used to perform factorization: petsc >>>>>> total: nonzeros=443, allocated nonzeros=443 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> block size is 1 >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqaij >>>>>> rows=160, cols=160 >>>>>> total: nonzeros=726, allocated nonzeros=726 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> not using I-node routines >>>>>> linear system matrix = precond matrix: >>>>>> Mat Object: 1 MPI processes >>>>>> type: seqaij >>>>>> rows=1024, cols=1024 >>>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>> not using I-node routines >>>>>> Norm of error 0.000292304 iterations 24 >>>>>> >>>>>> Thanks, >>>>>> >>>>>> -- Boyce -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Fri Aug 5 00:26:48 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Fri, 5 Aug 2016 01:26:48 -0400 Subject: [petsc-users] PCASMType In-Reply-To: <300DF95D-C74E-4F5C-AC18-559276433E6F@mcs.anl.gov> References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> <300DF95D-C74E-4F5C-AC18-559276433E6F@mcs.anl.gov> Message-ID: > On Aug 4, 2016, at 9:52 PM, Barry Smith wrote: > > > The magic handling of _1_ etc is all done in PetscOptionsFindPair_Private() so you need to put a break point in that routine and see why the requested value is not located. I haven?t tracked down the source of the problem with using _1_ etc, but I have checked to see what happens if I switch between basic/restrict/interpolate/none ?manually? on each level, and I still see the same results for all choices. I?ve checked the IS?es and am reasonably confident that they are being generated correctly the the ?overlap? and ?non-overlap? regions. It is definitely the case that the overlap region contains the non-overlap regions, and the overlap region is bigger (by the proper amount) from the non-overlap region. 
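The containment check described just above needs nothing beyond ISGetLocalSize() and ISGetIndices(). A small self-contained sketch follows; the helper name ISContains, the stride index sets, and the sizes are all made up and only stand in for whatever the real code builds for its overlap and non-overlap regions.

#include <petscis.h>

/* Return PETSC_TRUE iff every index of "inner" also appears in "outer".
   A quadratic scan is fine for sanity-checking small subdomains. */
static PetscErrorCode ISContains(IS outer,IS inner,PetscBool *flg)
{
  const PetscInt *oidx,*iidx;
  PetscInt        no,ni,i,j;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = ISGetLocalSize(outer,&no);CHKERRQ(ierr);
  ierr = ISGetLocalSize(inner,&ni);CHKERRQ(ierr);
  ierr = ISGetIndices(outer,&oidx);CHKERRQ(ierr);
  ierr = ISGetIndices(inner,&iidx);CHKERRQ(ierr);
  *flg = PETSC_TRUE;
  for (i=0; i<ni && *flg; i++) {
    PetscBool found = PETSC_FALSE;
    for (j=0; j<no; j++) if (iidx[i] == oidx[j]) {found = PETSC_TRUE; break;}
    if (!found) *flg = PETSC_FALSE;
  }
  ierr = ISRestoreIndices(outer,&oidx);CHKERRQ(ierr);
  ierr = ISRestoreIndices(inner,&iidx);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

int main(int argc,char **argv)
{
  IS             is_overlap,is_nonoverlap;
  PetscBool      ok;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
  /* stand-ins for the real subdomains: [7,16) should contain [8,16) */
  ierr = ISCreateStride(PETSC_COMM_SELF,9,7,1,&is_overlap);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF,8,8,1,&is_nonoverlap);CHKERRQ(ierr);
  ierr = ISContains(is_overlap,is_nonoverlap,&ok);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF,"non-overlapping contained in overlapping: %s\n",
                     ok ? "yes" : "no");CHKERRQ(ierr);
  ierr = ISDestroy(&is_overlap);CHKERRQ(ierr);
  ierr = ISDestroy(&is_nonoverlap);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

At this size a double loop is the simplest thing that can work; for large subdomains one would sort both index sets once and walk them together instead.
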
It looks like ksp/ksp/examples/tutorials/ex8.c uses PCASMSetLocalSubdomains to set up the subdomains for ASM. If I run this example using, e.g., ./ex8 -m 100 -n 100 -Mdomains 8 -Ndomains 8 -user_set_subdomains -ksp_rtol 1.0e-3 -ksp_monitor -pc_asm_type XXXX I get the same exact results for all different ASM types. I checked (using -ksp_view) that the ASM type settings were being honored. Are these subdomains not being setup to include overlaps (in which case I guess all ASM versions would yield the same results)? Thanks, ? Boyce > > Barry > > >> On Aug 4, 2016, at 9:46 PM, Boyce Griffith wrote: >> >> >>> On Aug 4, 2016, at 9:41 PM, Boyce Griffith wrote: >>> >>> >>>> On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: >>>> >>>>> >>>>> On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: >>>>> >>>>>> >>>>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >>>>>> >>>>>>> >>>>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>>>>>> >>>>>>> >>>>>>> History, >>>>>>> >>>>>>> 1) I originally implemented the ASM with one subdomain per process >>>>>>> 2) easily extended to support multiple domain per process >>>>>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>>>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>>>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>>>>> >>>>>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>>>>> >>>>>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>>>>> >>>>>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >>>>> >>>>> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. >>>> >>>> Yes we are computing overlapping and non-overlapping IS?es. >>>> >>>> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. 
(I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) >>> >>> OK, so here is what appears to be happening. These solvers are named things like ?stokes_pc_level_0_?, ?stokes_pc_level_1_?, ? . If I use the command-line argument >>> >>> -stokes_ib_pc_level_0_pc_asm_type basic >>> >>> then the ASM settings are used, but if I do: >>> >>> -stokes_ib_pc_level_pc_asm_type basic >>> >>> they are ignored. Any ideas? :-) >> >> I should have said: we are playing around with a lot of different command line options that are being collectively applied to all of the level solvers, and these options for ASM are the only ones I?ve encountered so far that have to include the level number to have an effect. >> >> Thanks, >> >> ? Boyce >> >>> >>> Thanks, >>> >>> ? Boyce >>> >>>>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >>>>> >>>>> Yeah, better one question per email or we will miss them. >>>>> >>>>> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. >>>> >>>> OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. >>>> >>>> However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. >>>> >>>> Thanks, >>>> >>>> ? Boyce >>>> >>>>> >>>>> Barry >>>>> >>>>>> >>>>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>>>>> >>>>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 
4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>>>>> KSP Object: 1 MPI processes >>>>>>>> type: gmres >>>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>> GMRES: happy breakdown tolerance 1e-30 >>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>> PC Object: 1 MPI processes >>>>>>>> type: asm >>>>>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>>>>> KSP Object: (sub_) 1 MPI processes >>>>>>>> type: preonly >>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>> left preconditioning >>>>>>>> using NONE norm type for convergence test >>>>>>>> PC Object: (sub_) 1 MPI processes >>>>>>>> type: icc >>>>>>>> 0 levels of fill >>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>>>>> matrix ordering: natural >>>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>>> Factored matrix follows: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: seqsbaij >>>>>>>> rows=160, cols=160 >>>>>>>> package used to perform factorization: petsc >>>>>>>> total: nonzeros=443, allocated nonzeros=443 >>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>> block size is 1 >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: seqaij >>>>>>>> rows=160, cols=160 >>>>>>>> total: nonzeros=726, allocated nonzeros=726 >>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>> not using I-node routines >>>>>>>> linear system matrix = precond matrix: >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: seqaij >>>>>>>> rows=1024, cols=1024 >>>>>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>> not using I-node routines >>>>>>>> Norm of error 0.000292304 iterations 24 >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> -- Boyce >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 5 15:27:10 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 5 Aug 2016 16:27:10 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> <300DF95D-C74E-4F5C-AC18-559276433E6F@mcs.anl.gov> Message-ID: I looked at the code (and read the manual page better) PC_ASM_BASIC - full interpolation and restriction PC_ASM_RESTRICT - full restriction, local processor interpolation PC_ASM_INTERPOLATE - full interpolation, local processor restriction PC_ASM_NONE - local processor restriction and interpolation It is not doing what you and I assumed it is doing. The restrict and interpolate are only short circuited (skipped) across processes any restriction and interpolation within an MPI process is always done. Thus in sequential runs the different variants will make no difference. I don't think I would have written it this way. Sorry I wasted your time, but it doesn't look like there is anything useful for you with PCASM; it needs to be completely refactored. Barry > On Aug 5, 2016, at 1:26 AM, Boyce Griffith wrote: > > >> On Aug 4, 2016, at 9:52 PM, Barry Smith wrote: >> >> >> The magic handling of _1_ etc is all done in PetscOptionsFindPair_Private() so you need to put a break point in that routine and see why the requested value is not located. > > I haven?t tracked down the source of the problem with using _1_ etc, but I have checked to see what happens if I switch between basic/restrict/interpolate/none ?manually? on each level, and I still see the same results for all choices. > > I?ve checked the IS?es and am reasonably confident that they are being generated correctly the the ?overlap? and ?non-overlap? regions. It is definitely the case that the overlap region contains the non-overlap regions, and the overlap region is bigger (by the proper amount) from the non-overlap region. > > It looks like ksp/ksp/examples/tutorials/ex8.c uses PCASMSetLocalSubdomains to set up the subdomains for ASM. If I run this example using, e.g., > > ./ex8 -m 100 -n 100 -Mdomains 8 -Ndomains 8 -user_set_subdomains -ksp_rtol 1.0e-3 -ksp_monitor -pc_asm_type XXXX > > I get the same exact results for all different ASM types. I checked (using -ksp_view) that the ASM type settings were being honored. 
Are these subdomains not being setup to include overlaps (in which case I guess all ASM versions would yield the same results)? > > Thanks, > > ? Boyce > >> >> Barry >> >> >>> On Aug 4, 2016, at 9:46 PM, Boyce Griffith wrote: >>> >>> >>>> On Aug 4, 2016, at 9:41 PM, Boyce Griffith wrote: >>>> >>>> >>>>> On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: >>>>> >>>>>> >>>>>> On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >>>>>>> >>>>>>>> >>>>>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>>>>>>> >>>>>>>> >>>>>>>> History, >>>>>>>> >>>>>>>> 1) I originally implemented the ASM with one subdomain per process >>>>>>>> 2) easily extended to support multiple domain per process >>>>>>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>>>>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>>>>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>>>>>> >>>>>>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>>>>>> >>>>>>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>>>>>> >>>>>>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >>>>>> >>>>>> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. >>>>> >>>>> Yes we are computing overlapping and non-overlapping IS?es. >>>>> >>>>> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) >>>> >>>> OK, so here is what appears to be happening. These solvers are named things like ?stokes_pc_level_0_?, ?stokes_pc_level_1_?, ? . 
If I use the command-line argument >>>> >>>> -stokes_ib_pc_level_0_pc_asm_type basic >>>> >>>> then the ASM settings are used, but if I do: >>>> >>>> -stokes_ib_pc_level_pc_asm_type basic >>>> >>>> they are ignored. Any ideas? :-) >>> >>> I should have said: we are playing around with a lot of different command line options that are being collectively applied to all of the level solvers, and these options for ASM are the only ones I?ve encountered so far that have to include the level number to have an effect. >>> >>> Thanks, >>> >>> ? Boyce >>> >>>> >>>> Thanks, >>>> >>>> ? Boyce >>>> >>>>>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >>>>>> >>>>>> Yeah, better one question per email or we will miss them. >>>>>> >>>>>> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. >>>>> >>>>> OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. >>>>> >>>>> However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. >>>>> >>>>> Thanks, >>>>> >>>>> ? Boyce >>>>> >>>>>> >>>>>> Barry >>>>>> >>>>>>> >>>>>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>>>>>> >>>>>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>>>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>>>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>>>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>>>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>>>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>>>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>>>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>>>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>>>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>>>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>>>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>>>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>>>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>>>>>> 14 KSP preconditioned resid norm 
1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>>>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>>>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>>>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>>>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>>>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>>>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>>>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>>>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>>>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>>>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>>>>>> KSP Object: 1 MPI processes >>>>>>>>> type: gmres >>>>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>> GMRES: happy breakdown tolerance 1e-30 >>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>>>>>> left preconditioning >>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>> PC Object: 1 MPI processes >>>>>>>>> type: asm >>>>>>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>>>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>>>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>>>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>>>>>> KSP Object: (sub_) 1 MPI processes >>>>>>>>> type: preonly >>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>>> left preconditioning >>>>>>>>> using NONE norm type for convergence test >>>>>>>>> PC Object: (sub_) 1 MPI processes >>>>>>>>> type: icc >>>>>>>>> 0 levels of fill >>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>>>>>> matrix ordering: natural >>>>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>>>> Factored matrix follows: >>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>> type: seqsbaij >>>>>>>>> rows=160, cols=160 >>>>>>>>> package used to perform factorization: petsc >>>>>>>>> total: nonzeros=443, allocated nonzeros=443 >>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>> block size is 1 >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>> type: seqaij >>>>>>>>> rows=160, cols=160 >>>>>>>>> total: nonzeros=726, allocated nonzeros=726 >>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>> not using I-node routines >>>>>>>>> linear system matrix = precond matrix: >>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>> type: seqaij >>>>>>>>> rows=1024, cols=1024 >>>>>>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>> not using I-node routines >>>>>>>>> Norm of error 0.000292304 iterations 24 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> -- Boyce >>>> >>> > From griffith at cims.nyu.edu Fri Aug 5 15:37:01 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Fri, 5 Aug 2016 16:37:01 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> <300DF95D-C74E-4F5C-AC18-559276433E6F@mcs.anl.gov> Message-ID: > On Aug 5, 2016, at 4:27 PM, Barry Smith wrote: > > > I looked at the code (and read the manual page better) > > PC_ASM_BASIC - full interpolation and restriction > PC_ASM_RESTRICT - full restriction, local processor interpolation > PC_ASM_INTERPOLATE - full interpolation, local processor restriction > PC_ASM_NONE - local processor restriction and interpolation > > > It is not doing what you and I assumed it is doing. The restrict and interpolate are only short circuited (skipped) across processes any restriction and interpolation within an MPI process is always done. Thus in sequential runs the different variants will make no difference. I don't think I would have written it this way. Thanks, Barry --- I think that this explains some weird results we've been getting when trying to use PCASM with small subdomains as a smoother (e.g. performance degrades with larger overlaps). At least for convergence benchmarking, we can get away with using a simple implementation of RASM. Also, could this explain why the locally multiplicative version of PCASM seems to perform the same (or worse) than the locally additive version? > Sorry I wasted your time, but it doesn't look like there is anything useful for you with PCASM; it needs to be completely refactored. > > Barry > > > > > >> On Aug 5, 2016, at 1:26 AM, Boyce Griffith wrote: >> >> >>> On Aug 4, 2016, at 9:52 PM, Barry Smith wrote: >>> >>> >>> The magic handling of _1_ etc is all done in PetscOptionsFindPair_Private() so you need to put a break point in that routine and see why the requested value is not located. >> >> I haven?t tracked down the source of the problem with using _1_ etc, but I have checked to see what happens if I switch between basic/restrict/interpolate/none ?manually? on each level, and I still see the same results for all choices. >> >> I?ve checked the IS?es and am reasonably confident that they are being generated correctly the the ?overlap? and ?non-overlap? regions. It is definitely the case that the overlap region contains the non-overlap regions, and the overlap region is bigger (by the proper amount) from the non-overlap region. 
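A code-level counterpart of the type switch, for the parallel case where it actually matters: if the level smoothers were owned by a standard PCMG object, the ASM variant could be forced on every level in one place instead of through per-level command-line prefixes. The hierarchy in this thread is custom, so treat this purely as a sketch of the idea; PCMGGetLevels(), PCMGGetSmoother(), KSPGetPC() and PCASMSetType() are the real API, while the function name and the PCMG assumption are invented.

#include <petscksp.h>

/* Sketch: force one PCASMType on every level smoother of a (hypothetical) PCMG
   preconditioner.  PC_ASM_BASIC keeps full restriction and interpolation;
   PC_ASM_RESTRICT and PC_ASM_INTERPOLATE drop only the off-process part of one
   of the two; PC_ASM_NONE drops both.  As noted above, the intra-process part
   is always done, so on a single MPI process all four variants behave identically. */
static PetscErrorCode SetASMTypeOnAllLevels(PC pcmg,PCASMType type)
{
  PetscInt       nlevels,l;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PCMGGetLevels(pcmg,&nlevels);CHKERRQ(ierr);
  for (l=0; l<nlevels; l++) {
    KSP       smoother;
    PC        subpc;
    PetscBool isasm;

    ierr = PCMGGetSmoother(pcmg,l,&smoother);CHKERRQ(ierr);
    ierr = KSPGetPC(smoother,&subpc);CHKERRQ(ierr);
    ierr = PetscObjectTypeCompare((PetscObject)subpc,PCASM,&isasm);CHKERRQ(ierr);
    if (isasm) {ierr = PCASMSetType(subpc,type);CHKERRQ(ierr);}
  }
  PetscFunctionReturn(0);
}

Called as SetASMTypeOnAllLevels(pc,PC_ASM_RESTRICT) after the hierarchy is built, this sidesteps the prefix-matching question entirely, at the cost of hard-coding the choice.
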
>> >> It looks like ksp/ksp/examples/tutorials/ex8.c uses PCASMSetLocalSubdomains to set up the subdomains for ASM. If I run this example using, e.g., >> >> ./ex8 -m 100 -n 100 -Mdomains 8 -Ndomains 8 -user_set_subdomains -ksp_rtol 1.0e-3 -ksp_monitor -pc_asm_type XXXX >> >> I get the same exact results for all different ASM types. I checked (using -ksp_view) that the ASM type settings were being honored. Are these subdomains not being setup to include overlaps (in which case I guess all ASM versions would yield the same results)? >> >> Thanks, >> >> ? Boyce >> >>> >>> Barry >>> >>> >>>> On Aug 4, 2016, at 9:46 PM, Boyce Griffith wrote: >>>> >>>> >>>>> On Aug 4, 2016, at 9:41 PM, Boyce Griffith wrote: >>>>> >>>>> >>>>>> On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: >>>>>> >>>>>>> >>>>>>> On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: >>>>>>> >>>>>>>> >>>>>>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> History, >>>>>>>>> >>>>>>>>> 1) I originally implemented the ASM with one subdomain per process >>>>>>>>> 2) easily extended to support multiple domain per process >>>>>>>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>>>>>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>>>>>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>>>>>>> >>>>>>>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>>>>>>> >>>>>>>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>>>>>>> >>>>>>>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >>>>>>> >>>>>>> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. >>>>>> >>>>>> Yes we are computing overlapping and non-overlapping IS?es. >>>>>> >>>>>> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? 
sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) >>>>> >>>>> OK, so here is what appears to be happening. These solvers are named things like ?stokes_pc_level_0_?, ?stokes_pc_level_1_?, ? . If I use the command-line argument >>>>> >>>>> -stokes_ib_pc_level_0_pc_asm_type basic >>>>> >>>>> then the ASM settings are used, but if I do: >>>>> >>>>> -stokes_ib_pc_level_pc_asm_type basic >>>>> >>>>> they are ignored. Any ideas? :-) >>>> >>>> I should have said: we are playing around with a lot of different command line options that are being collectively applied to all of the level solvers, and these options for ASM are the only ones I?ve encountered so far that have to include the level number to have an effect. >>>> >>>> Thanks, >>>> >>>> ? Boyce >>>> >>>>> >>>>> Thanks, >>>>> >>>>> ? Boyce >>>>> >>>>>>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >>>>>>> >>>>>>> Yeah, better one question per email or we will miss them. >>>>>>> >>>>>>> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. >>>>>> >>>>>> OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. >>>>>> >>>>>> However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> ? Boyce >>>>>> >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>> >>>>>>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>>>>>>> >>>>>>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>>>>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>>>>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>>>>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>>>>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>>>>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>>>>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>>>>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>>>>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>>>>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>>>>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>>>>>>> 11 KSP 
preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>>>>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>>>>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>>>>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>>>>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>>>>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>>>>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>>>>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>>>>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>>>>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>>>>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>>>>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>>>>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06 >>>>>>>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>>>>>>> KSP Object: 1 MPI processes >>>>>>>>>> type: gmres >>>>>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>>> GMRES: happy breakdown tolerance 1e-30 >>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>>>>>>> left preconditioning >>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>> PC Object: 1 MPI processes >>>>>>>>>> type: asm >>>>>>>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>>>>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>>>>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>>>>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>>>>>>> KSP Object: (sub_) 1 MPI processes >>>>>>>>>> type: preonly >>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>>>> left preconditioning >>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>> PC Object: (sub_) 1 MPI processes >>>>>>>>>> type: icc >>>>>>>>>> 0 levels of fill >>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>>>>>>> matrix ordering: natural >>>>>>>>>> factor fill ratio given 1., needed 1. 
>>>>>>>>>> Factored matrix follows: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: seqsbaij >>>>>>>>>> rows=160, cols=160 >>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>> total: nonzeros=443, allocated nonzeros=443 >>>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>>> block size is 1 >>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: seqaij >>>>>>>>>> rows=160, cols=160 >>>>>>>>>> total: nonzeros=726, allocated nonzeros=726 >>>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>>> not using I-node routines >>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>> type: seqaij >>>>>>>>>> rows=1024, cols=1024 >>>>>>>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>>> not using I-node routines >>>>>>>>>> Norm of error 0.000292304 iterations 24 >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> -- Boyce >>>>> >>>> >> From bsmith at mcs.anl.gov Fri Aug 5 15:45:22 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 5 Aug 2016 16:45:22 -0400 Subject: [petsc-users] PCASMType In-Reply-To: References: <5E0F636E-FE0B-4924-B865-17CD44B8AE02@cims.nyu.edu> <78DD31F8-44B7-434C-AABE-A850544EBA66@mcs.anl.gov> <300DF95D-C74E-4F5C-AC18-559276433E6F@mcs.anl.gov> Message-ID: <4633012E-6092-4E05-BBB1-F37F74529A4E@mcs.anl.gov> > On Aug 5, 2016, at 4:37 PM, Boyce Griffith wrote: > >> >> On Aug 5, 2016, at 4:27 PM, Barry Smith wrote: >> >> >> I looked at the code (and read the manual page better) >> >> PC_ASM_BASIC - full interpolation and restriction >> PC_ASM_RESTRICT - full restriction, local processor interpolation >> PC_ASM_INTERPOLATE - full interpolation, local processor restriction >> PC_ASM_NONE - local processor restriction and interpolation >> >> >> It is not doing what you and I assumed it is doing. The restrict and interpolate are only short circuited (skipped) across processes any restriction and interpolation within an MPI process is always done. Thus in sequential runs the different variants will make no difference. I don't think I would have written it this way. > > Thanks, Barry --- I think that this explains some weird results we've been getting when trying to use PCASM with small subdomains as a smoother (e.g. performance degrades with larger overlaps). At least for convergence benchmarking, we can get away with using a simple implementation of RASM. > > Also, could this explain why the locally multiplicative version of PCASM seems to perform the same (or worse) than the locally additive version? I won't speculate on that because I am too frightened to look at the "multiplicative version" > >> Sorry I wasted your time, but it doesn't look like there is anything useful for you with PCASM; it needs to be completely refactored. >> >> Barry >> >> >> >> >> >>> On Aug 5, 2016, at 1:26 AM, Boyce Griffith wrote: >>> >>> >>>> On Aug 4, 2016, at 9:52 PM, Barry Smith wrote: >>>> >>>> >>>> The magic handling of _1_ etc is all done in PetscOptionsFindPair_Private() so you need to put a break point in that routine and see why the requested value is not located. >>> >>> I haven?t tracked down the source of the problem with using _1_ etc, but I have checked to see what happens if I switch between basic/restrict/interpolate/none ?manually? on each level, and I still see the same results for all choices. 
>>> >>> I?ve checked the IS?es and am reasonably confident that they are being generated correctly the the ?overlap? and ?non-overlap? regions. It is definitely the case that the overlap region contains the non-overlap regions, and the overlap region is bigger (by the proper amount) from the non-overlap region. >>> >>> It looks like ksp/ksp/examples/tutorials/ex8.c uses PCASMSetLocalSubdomains to set up the subdomains for ASM. If I run this example using, e.g., >>> >>> ./ex8 -m 100 -n 100 -Mdomains 8 -Ndomains 8 -user_set_subdomains -ksp_rtol 1.0e-3 -ksp_monitor -pc_asm_type XXXX >>> >>> I get the same exact results for all different ASM types. I checked (using -ksp_view) that the ASM type settings were being honored. Are these subdomains not being setup to include overlaps (in which case I guess all ASM versions would yield the same results)? >>> >>> Thanks, >>> >>> ? Boyce >>> >>>> >>>> Barry >>>> >>>> >>>>> On Aug 4, 2016, at 9:46 PM, Boyce Griffith wrote: >>>>> >>>>> >>>>>> On Aug 4, 2016, at 9:41 PM, Boyce Griffith wrote: >>>>>> >>>>>> >>>>>>> On Aug 4, 2016, at 9:26 PM, Boyce Griffith wrote: >>>>>>> >>>>>>>> >>>>>>>> On Aug 4, 2016, at 9:01 PM, Barry Smith wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> History, >>>>>>>>>> >>>>>>>>>> 1) I originally implemented the ASM with one subdomain per process >>>>>>>>>> 2) easily extended to support multiple domain per process >>>>>>>>>> 3) added -pc_asm_type restrict etc but it only worked for one subdomain per process because it took advantage of the fact that >>>>>>>>>> restrict etc could be achieved by simply dropping the parallel communication in the vector scatters >>>>>>>>>> 4) Matt didn't like the restriction to one process per subdomain so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local even though local has nothing to do with), so that the restrict etc could be handled. >>>>>>>>>> >>>>>>>>>> Unfortunately IMHO Matt made a mess of things because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 etc it does not handle the -pc_asm_type restrict since it cannot track the is vs is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs is_local index sets properly when the -pc_asm_type is set. Also the name is_local needs to be changed to something meaningfully like is_nonoverlapping This refactoring would also result in easier cleaner code then is currently there. >>>>>>>>>> >>>>>>>>>> So basically until the PCASM is refactored properly to handle restrict etc you are stuck with being able to use the restrict etc ONLY if you specifically supply the overlapping and non overlapping domains yourself with PCASMSetLocalSubdomains and curse at Matt everyday like we all do. >>>>>>>>> >>>>>>>>> OK, got it. The reason I?m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains. >>>>>>>> >>>>>>>> But are you setting different is and is_local (stupid name) and not have PETSc computing the overlap in your custom code? 
If you are setting them differently and not having PETSc compute overlap but getting identical convergence then something is wrong and you likely have to run in the debugger to insure that restrict etc is properly being set and used. >>>>>>> >>>>>>> Yes we are computing overlapping and non-overlapping IS?es. >>>>>>> >>>>>>> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration ? sorry, I should have checked this more carefully before emailing the list. (I thought that the command line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.) >>>>>> >>>>>> OK, so here is what appears to be happening. These solvers are named things like ?stokes_pc_level_0_?, ?stokes_pc_level_1_?, ? . If I use the command-line argument >>>>>> >>>>>> -stokes_ib_pc_level_0_pc_asm_type basic >>>>>> >>>>>> then the ASM settings are used, but if I do: >>>>>> >>>>>> -stokes_ib_pc_level_pc_asm_type basic >>>>>> >>>>>> they are ignored. Any ideas? :-) >>>>> >>>>> I should have said: we are playing around with a lot of different command line options that are being collectively applied to all of the level solvers, and these options for ASM are the only ones I?ve encountered so far that have to include the level number to have an effect. >>>>> >>>>> Thanks, >>>>> >>>>> ? Boyce >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> ? Boyce >>>>>> >>>>>>>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories): >>>>>>>> >>>>>>>> Yeah, better one question per email or we will miss them. >>>>>>>> >>>>>>>> There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to. >>>>>>> >>>>>>> OK, so similar story as above: we have a custom MSM that, when used as a MG smoother, gives convergence rates that are about 2x PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn?t seem to help. >>>>>>> >>>>>>> However, now I am questioning whether the settings are getting propagated into PCASM? I?ll need to take another look. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> ? 
Boyce >>>>>>> >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>>> >>>>>>>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly: >>>>>>>>>>> >>>>>>>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE >>>>>>>>>>> 0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00 >>>>>>>>>>> 1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01 >>>>>>>>>>> 2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01 >>>>>>>>>>> 3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01 >>>>>>>>>>> 4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02 >>>>>>>>>>> 5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02 >>>>>>>>>>> 6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02 >>>>>>>>>>> 7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02 >>>>>>>>>>> 8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02 >>>>>>>>>>> 9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02 >>>>>>>>>>> 10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02 >>>>>>>>>>> 11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03 >>>>>>>>>>> 12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03 >>>>>>>>>>> 13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03 >>>>>>>>>>> 14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03 >>>>>>>>>>> 15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03 >>>>>>>>>>> 16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04 >>>>>>>>>>> 17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04 >>>>>>>>>>> 18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04 >>>>>>>>>>> 19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04 >>>>>>>>>>> 20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05 >>>>>>>>>>> 21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05 >>>>>>>>>>> 22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05 >>>>>>>>>>> 23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 
||r(i)||/||b|| 8.956731080268e-06 >>>>>>>>>>> 24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06 >>>>>>>>>>> KSP Object: 1 MPI processes >>>>>>>>>>> type: gmres >>>>>>>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>>>>>>>>>> GMRES: happy breakdown tolerance 1e-30 >>>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>>> tolerances: relative=9.18274e-06, absolute=1e-50, divergence=10000. >>>>>>>>>>> left preconditioning >>>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: 1 MPI processes >>>>>>>>>>> type: asm >>>>>>>>>>> Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1 >>>>>>>>>>> Additive Schwarz: restriction/interpolation type - BASIC >>>>>>>>>>> Additive Schwarz: local solve composition type - MULTIPLICATIVE >>>>>>>>>>> Local solve is same for all blocks, in the following KSP and PC objects: >>>>>>>>>>> KSP Object: (sub_) 1 MPI processes >>>>>>>>>>> type: preonly >>>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >>>>>>>>>>> left preconditioning >>>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>>> PC Object: (sub_) 1 MPI processes >>>>>>>>>>> type: icc >>>>>>>>>>> 0 levels of fill >>>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>>> using Manteuffel shift [POSITIVE_DEFINITE] >>>>>>>>>>> matrix ordering: natural >>>>>>>>>>> factor fill ratio given 1., needed 1. >>>>>>>>>>> Factored matrix follows: >>>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>>> type: seqsbaij >>>>>>>>>>> rows=160, cols=160 >>>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>>> total: nonzeros=443, allocated nonzeros=443 >>>>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>>>> block size is 1 >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>>> type: seqaij >>>>>>>>>>> rows=160, cols=160 >>>>>>>>>>> total: nonzeros=726, allocated nonzeros=726 >>>>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>>>> not using I-node routines >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>>>> type: seqaij >>>>>>>>>>> rows=1024, cols=1024 >>>>>>>>>>> total: nonzeros=4992, allocated nonzeros=5120 >>>>>>>>>>> total number of mallocs used during MatSetValues calls =0 >>>>>>>>>>> not using I-node routines >>>>>>>>>>> Norm of error 0.000292304 iterations 24 >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> >>>>>>>>>>> -- Boyce From bsmith at mcs.anl.gov Fri Aug 5 15:54:07 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 5 Aug 2016 16:54:07 -0400 Subject: [petsc-users] Segmentation faults: Derived types In-Reply-To: References: Message-ID: > On Aug 1, 2016, at 4:41 PM, Santiago Ospina De Los Rios wrote: > > Hello there, > > I'm having problems defining some variables into derived types in Fortran. Before, I had a similar problems with an allocatable array "PetsInt" but I solved it just doing a non-collective Petsc Vec. Today I'm having troubles with "PetscBool" or "Logical": > > In a module which define the variables, I have the following: > > MODULE ANISOFLOW_Types > > IMPLICIT NONE > > #include > #include > > ... > > TYPE ConductivityField > PetscBool :: DefinedByCvtZones=.FALSE. ! It produces the segmentation fault. > PetscBool :: DefinedByPptZones=.FALSE. ! 
It produces the segmentation fault. > PetscBool :: DefinedByCell=.FALSE. > ! Conductivity defined by zones (Local): > Vec :: ZoneID > TYPE(Tensor),ALLOCATABLE :: Zone(:) > ! Conductivity defined on every cell (Local): > Vec :: Cell > END TYPE ConductivityField > > > TYPE SpecificStorageField > PetscBool :: DefinedByStoZones=.FALSE. ! It produces the segmentation fault. > PetscBool :: DefinedByPptZones=.FALSE. ! It produces the segmentation fault. > PetscBool :: DefinedByCell=.FALSE. > ! Specific Storage defined by zones (Local): > Vec :: ZoneID > Vec :: Zone > ! Specific Storage defined on every cell (Global).: > Vec :: Cell > END TYPE SpecificStorageField > > TYPE PropertiesField > TYPE(ConductivityField) :: Cvt > TYPE(SpecificStorageField) :: Sto > ! Property defined by zones (Local): > PetscBool :: DefinedByPptZones=.FALSE. > Vec :: ZoneID > END TYPE PropertiesField > > ... > > CONTAINS > > ... > > END MODULE ANISOFLOW_Types > > > Later I use it in the main program, with something like this > > PROGRAM ANISOFLOW > > USE ANISOFLOW_Types, ONLY : ... ,PropertiesField, ... > ... > > IMPLICIT NONE > > #include > > ... > TYPE(PropertiesField) :: PptFld > ... > > CALL PetscInitialize(PETSC_COMM_WORLD,ierr) > ... > CALL PetscFinalize(ierr) > > END PROGRAM > > > When I run the program appears a Segmentation Fault, which disappears when I comment the booleans marked in the code. Because I need them, I used Valgrind to figure out what is happening but it is yet a mistery to me. > > Valgrind message: > ==5160== > ==5160== Invalid read of size 1 It is curious that it says "of size 1" when we declare PetscBool to be a logical*4 I don't see anything obviously wrong. Please send a simple code we can compile and run that reproduces the problem. Barry > ==5160== at 0x4FB2156: petscinitialize_ (zstart.c:433) > ==5160== by 0x4030EA: MAIN__ (ANISOFLOW.F90:29) # line of petsc inizalitation > ==5160== by 0x404380: main (ANISOFLOW.F90:3) # line of "USE ANISOFLOW_Types, ONLY : ... ,PropertiesField, ..." > ==5160== Address 0xc54fff is not stack'd, malloc'd or (recently) free'd > ==5160== > > Program received signal SIGSEGV: Segmentation fault - invalid memory reference. > > Backtrace for this error: > #0 0x699E777 > #1 0x699ED7E > #2 0x6F0BCAF > #3 0x4FB2156 > #4 0x4030EA in anisoflow at ANISOFLOW.F90:29 > > I think it is maybe related with petsc because the error popped out just in its initialization, so if you know what's going on, I would appreciate to tell me. > > Santiago Ospina > -- > > -- > Att: > > Santiago Ospina De Los R?os > National University of Colombia From jshen25 at jhu.edu Fri Aug 5 17:58:51 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Fri, 5 Aug 2016 18:58:51 -0400 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: Message-ID: ?Hi, Thanks for your answers. I just figured out the issues which are mainly due to the ill-conditioning of my matrix. I found the conditional number blows up when the beam is discretized into large number of elements. Now, I am using the 1D bar model to solve the same problem. The good news is the solution is always accurate and stable even I discretized into 10 million elements. 
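As an aside on diagnosing the ill-conditioning mentioned above: a rough estimate of the extreme singular values of the preconditioned operator (and hence of its condition number) can be obtained as sketched below in C. The function and variable names are illustrative; the estimate is only as good as the Krylov space built during the solve, and the same information is printed at run time by -ksp_monitor_singular_value.

  #include <petscksp.h>

  /* Sketch: ksp, b and x are assumed to be created and configured elsewhere. */
  static PetscErrorCode SolveAndReportConditionEstimate(KSP ksp,Vec b,Vec x)
  {
    PetscReal      emax,emin;
    PetscErrorCode ierr;

    ierr = KSPSetComputeSingularValues(ksp,PETSC_TRUE);CHKERRQ(ierr); /* must precede KSPSolve() */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = KSPComputeExtremeSingularValues(ksp,&emax,&emin);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,"estimated condition of preconditioned operator: %g\n",(double)(emax/emin));CHKERRQ(ierr);
    return 0;
  }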
When I run the model with both iterative solver(CG+BJACOBI/ASM) and direct solver(SUPER_LU) in parallelization, I got the following results: Mesh size: 1 million unknowns Processes 1 2 4 6 8 10 12 16 20 CG+BJ 0.36 0.22 0.15 0.12 0.11 0.1 0.096 0.097 0.099 CG+ASM 0.47 0.46 0.267 0.2 0.17 0.15 0.145 0.16 0.15 SUPER_LU_DIST 4.73 5.4 4.69 4.58 4.38 4.2 4.27 4.28 4.38 It seems the CG+BJ works correctly, i.e. time decreases fast with a few more processes and reach stable with many more cores. However, I have some concerns about CG+ASM and SUPER_LU_DIST. The time of both two methods goes up when I use two processes compared with uniprocess. The tendency is more obvious when I use larger mesh size. I especially doubt the results of SUPER_LU_DIST in parallelism since the overall expedition is very small which is not expected. The runtime option I use for ASM pc and SUPER_LU_DIST solver is shown as below: ASM preconditioner: -pc_type asm -pc_asm_type basic SUPER_LU_DIST solver: -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist I use same mpiexec -n np ./xxxx for all solvers. Am I using them correctly? If so, is there anyway to speed up the computation further, especially for SUPER_LU_DIST? Thank you very much! Bests, Jinlei On Mon, Aug 1, 2016 at 2:10 PM, Matthew Knepley wrote: > On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: > >> Hi Barry, >> >> Thanks for your reply. >> >> Firstly, as you suggested, I checked my program under valgrind. The >> results for both sequential and parallel cases showed there are no memory >> errors detected. >> >> Second, I coded a sequential program without using PETSC to generate the >> global matrix of small mesh for the same problem. I then checked the matrix >> both from petsc(sequential and parallel) and serial code, and they are same. >> The way I assembled the global matrix in parallel is first distributing >> the nodes and elements into processes, then I loop with elements on the >> calling process to put the element stiffness into the global. Since the >> nodes and elements in cantilever beam are numbered successively, the >> connectivity is simple. I didn't use any partition tools to optimize mesh. >> It's also easy to determine the preallocation d_nnz and o_nnz since each >> node only connects the left and right nodes except for beginning and end, >> the maximum nonzeros in each row is 6. The MatSetValue process is shown as >> follows: >> do iEL = idElStart, idElEnd >> g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) >> call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) >> end do >> where idElStart and idElEnd are the global number of first element and >> end element that the process owns, g_EL is the global index for DOF in >> element iEL, SE is the element stiffness which is same for all elements. >> From above assembling, most of the elements are assembled within own >> process while there are few elements crossing two processes. >> >> The BC for my problem(cantilever under end point load) is to fix the >> first two DOF, so I called the MatZeroRowsColumns to set the first two >> rows and columns into zero with diagonal equal to one, without changing the >> RHS. >> >> Now some new issues show up : >> >> I run with -ksp_monitor_true_residual and -ksp_converged_reason, the >> monitor showed two different residues, one is the residue I can >> set(preconditioned, unpreconditioned, natural), the other is called true >> residue. >> ?? >> I initially thought the true residue is same as unpreconditioned based on >> definition. 
But it seems not true. Is it the norm of the residue (b-Ax) >> between computed RHS and true RHS? But, how to understand >> unprecondition residue since its definition is b-Ax as well? >> > > It is the unpreconditioned residual. You must be misinterpreting. And we > could determine exactly if you sent the output with the suggested options. > > >> Can I set the true residue as my converging criteria? >> > > Use right preconditioning. > > >> I found the accuracy of large mesh in my problem didn't necessary depend >> on the tolerance I set, either preconditioned or unpreconditioned, >> sometimes, it showed converged while the solution is not correct. But the >> true residue looks reflecting the true convergence very well, if the true >> residue is diverging, no matter what the first residue says, the results >> are bad! >> > > Yes, your preconditioner looks singular. Note that BJACOBI has an inner > solver, and by default the is GMRES/ILU(0). I think > ILU(0) is really ill-conditioned for your problem. > > >> For the preconditioner concerns, actually, I used BJACOBI before I sent >> the first email, since the JACOBI or PBJACOBI didn't even converge when the >> size was large. >> But BJACOBI also didn't perform well in the paralleliztion for large mesh >> as posed in my last email, while it's fine for small size (below 10k >> elements) >> >> Yesterday, I tried the ASM with CG using the runtime option: -pc_type >> asm -pc_asm_type basic -sub_pc_type lu (default is ilu). >> For 15k elements mesh, I am now able to get the correct answer with 1-3, >> 6 and more processes, using either -sub_pc_type lu or ilu. >> > > Yes, LU works for your subdomain solver. > > >> Based on all the results I have got, it shows the results varies a lot >> with different PC and seems ASM is better for large problem. >> > > Its not ASM so much as an LU subsolver that is better. > > >> But what is the major factor to produce such difference between different >> PCs, since it's not just the issue of computational efficiency, but also >> the accuracy. >> Also, I noticed for large mesh, the solution is unstable with small >> number of processes, for the 15k case, the solution is not correct with 4 >> and 5 processes, however, the solution becomes always correct with more >> than 6 processes. For the 50k mesh case, more processes are required to >> show the stability. >> > > Yes, partitioning is very important here. Since you do not have a good > partition, you can get these wild variations. > > Thanks, > > Matt > > >> What do you think about this? Anything wrong? >> Since the iterative solver in parallel is first computed locally(if this >> is correct), can it be possible that there are 'good' and 'bad' locals when >> dividing the global matrix, and the result from 'bad' local will >> contaminate the global results. But with more processes, such risk is >> reduced. >> >> It is highly appreciated if you could give me some instruction for above >> questions. >> >> Thank you very much. >> >> Bests, >> Jinlei >> >> >> On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith wrote: >> >>> >>> First run under valgrind all the cases to make sure there is not some >>> use of uninitialized data or overwriting of data. Go to >>> http://www.mcs.anl.gov/petsc follow the link to FAQ and search for >>> valgrind (the web server seems to be broken at the moment). >>> >>> Second it is possible that your code the assembles the matrices and >>> vectors is not correctly assembling it for either the sequential or >>> parallel case. 
Hence a different number of processes could be generating a >>> different linear system hence inconsistent results. How are you handling >>> the parallelism? How do you know the matrix generated in parallel is >>> identically to that sequentially? >>> >>> Simple preconditioners such as pbjacobi will converge slower and slower >>> with more elements. >>> >>> Note that you should run with -ksp_monitor_true_residual and >>> -ksp_converged_reason to make sure that the iterative solver is even >>> converging. By default PETSc KSP solvers do not stop with a big error >>> message if they do not converge so you need make sure they are always >>> converging. >>> >>> Barry >>> >>> >>> >>> > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: >>> > >>> > Dear PETSC developers, >>> > >>> > Thank you for developing such a powerful tool for scientific >>> computations. >>> > >>> > I'm currently trying to run a simple cantilever beam FEM to test the >>> scalability of PETSC on multi-processors. I also want to verify whether >>> iterative solver or direct solver is more efficient for parallel large FEM >>> problem. >>> > >>> > Problem description, An Euler elementary cantilever beam with point >>> load at the end along -y direction. Each node has 2 DOF (deflection and >>> rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based >>> on the connectivity. Loop with elements in each processor to assemble the >>> global matrix with same element stiffness matrix. The boundary condition is >>> set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_ >>> NULL_OBJECT,ierr); >>> > >>> > Based on what I have done, I find the computations work well, i.e the >>> results are correct compared with theoretical solution, for small mesh size >>> (small than 5000 elements) using both solvers with different numbers of >>> processes. >>> > >>> > However, there are several confusing issues when I increase the mesh >>> size to 10000 and more elements with iterative solve(CG + PCBJACOBI) >>> > >>> > 1. For 10k elements, I can get accurate solution using iterative >>> solver with uni-processor(i.e. only one process). However, when I use 2-8 >>> processes, it tells the linear solver converged with different iterations, >>> but, the results are all different for different processes and erroneous. >>> The wired thing is when I use >9 processes, the results are correct again. >>> I am really confused by this. Could you explain me why? If my >>> parallelization is not correct, why it works for small cases? And I check >>> the global matrix and RHS vector and didn't see any mallocs during the >>> process. >>> > >>> > 2. For 30k elements, if I use one process, it says: Linear solve did >>> not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for >>> large sparse matrix? If so, is there any stable solver or pc for large >>> problem? >>> > >>> > >>> > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can >>> only get accuracy when the number of elements are below 5000. There must be >>> something wrong. The way I use the superlu_dist solver is first convert >>> MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to >>> PCLU. Do I miss anything else to run SUPER_LU correctly? >>> > >>> > >>> > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the >>> sequential version of the same problem. The results shows that iterative >>> solver works well for <50k elements, while SUPER_LU only gets right >>> solution below 5k elements. 
Can I say iterative solver is better than >>> SUPER_LU for large problem? How can I improve the solver to copy with very >>> large problem, such as million by million? Another thing is it's still >>> doubtable of performance of SUPER_LU. >>> > >>> > For the inaccuracy issue, do you think it may be due to the memory? >>> However, there is no memory error showing during the execution. >>> > >>> > I really appreciate someone could resolve those puzzles above for me. >>> My goal is to replace the current SUPER_LU solver in my parallel CPFEM >>> main program with the iterative solver using PETSC. >>> > >>> > >>> > Please let me if you would like to see my code in detail. >>> > >>> > Thank you very much. >>> > >>> > Bests, >>> > Jinlei >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 5 21:09:23 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 5 Aug 2016 21:09:23 -0500 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: Message-ID: <51E07B0F-810B-492A-8330-3469E04D14F1@mcs.anl.gov> > On Aug 5, 2016, at 5:58 PM, Jinlei Shen wrote: > > ?Hi, > > Thanks for your answers. > > I just figured out the issues which are mainly due to the ill-conditioning of my matrix. I found the conditional number blows up when the beam is discretized into large number of elements. > > Now, I am using the 1D bar model to solve the same problem. The good news is the solution is always accurate and stable even I discretized into 10 million elements. > > When I run the model with both iterative solver(CG+BJACOBI/ASM) and direct solver(SUPER_LU) in parallelization, I got the following results: > > Mesh size: 1 million unknowns > Processes 1 2 4 6 8 10 12 16 20 > CG+BJ 0.36 0.22 0.15 0.12 0.11 0.1 0.096 0.097 0.099 > CG+ASM 0.47 0.46 0.267 0.2 0.17 0.15 0.145 0.16 0.15 > SUPER_LU_DIST 4.73 5.4 4.69 4.58 4.38 4.2 4.27 4.28 4.38 > > It seems the CG+BJ works correctly, i.e. time decreases fast with a few more processes and reach stable with many more cores. > > However, I have some concerns about CG+ASM and SUPER_LU_DIST. The time of both two methods goes up when I use two processes compared with uniprocess. This is actually not surprising at all but since the mantra is "parallelism will always make things faster" it can confuse people. When run with one process the ASM and SuperLU_DIST utilize essentially sequential algorithms, when run with two processes they "switch" to parallel algorithms which simply are not as good as the essentially sequential algorithm that is obtained with one process hence they run slower. This is just life, there really isn't something one can do about it except to perhaps use a poorer quality algorithm on one process so that two processes look better but the goal of PETSc is not to make parallelism to look good but to provide efficient solvers (as best we can) for one and multiple processes. Barry > The tendency is more obvious when I use larger mesh size. > I especially doubt the results of SUPER_LU_DIST in parallelism since the overall expedition is very small which is not expected. 
> The runtime option I use for ASM pc and SUPER_LU_DIST solver is shown as below: > ASM preconditioner: -pc_type asm -pc_asm_type basic > SUPER_LU_DIST solver: -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist > > I use same mpiexec -n np ./xxxx for all solvers. > > Am I using them correctly? If so, is there anyway to speed up the computation further, especially for SUPER_LU_DIST? > > Thank you very much! > > Bests, > Jinlei > > On Mon, Aug 1, 2016 at 2:10 PM, Matthew Knepley wrote: > On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: > Hi Barry, > > Thanks for your reply. > > Firstly, as you suggested, I checked my program under valgrind. The results for both sequential and parallel cases showed there are no memory errors detected. > > Second, I coded a sequential program without using PETSC to generate the global matrix of small mesh for the same problem. I then checked the matrix both from petsc(sequential and parallel) and serial code, and they are same. > The way I assembled the global matrix in parallel is first distributing the nodes and elements into processes, then I loop with elements on the calling process to put the element stiffness into the global. Since the nodes and elements in cantilever beam are numbered successively, the connectivity is simple. I didn't use any partition tools to optimize mesh. It's also easy to determine the preallocation d_nnz and o_nnz since each node only connects the left and right nodes except for beginning and end, the maximum nonzeros in each row is 6. The MatSetValue process is shown as follows: > do iEL = idElStart, idElEnd > g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) > call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) > end do > where idElStart and idElEnd are the global number of first element and end element that the process owns, g_EL is the global index for DOF in element iEL, SE is the element stiffness which is same for all elements. > From above assembling, most of the elements are assembled within own process while there are few elements crossing two processes. > > The BC for my problem(cantilever under end point load) is to fix the first two DOF, so I called the MatZeroRowsColumns to set the first two rows and columns into zero with diagonal equal to one, without changing the RHS. > > Now some new issues show up : > > I run with -ksp_monitor_true_residual and -ksp_converged_reason, the monitor showed two different residues, one is the residue I can set(preconditioned, unpreconditioned, natural), the other is called true residue. > ?? > I initially thought the true residue is same as unpreconditioned based on definition. But it seems not true. Is it the norm of the residue (b-Ax) between computed RHS and true RHS? But, how to understand unprecondition residue since its definition is b-Ax as well? > > It is the unpreconditioned residual. You must be misinterpreting. And we could determine exactly if you sent the output with the suggested options. > > Can I set the true residue as my converging criteria? > > Use right preconditioning. > > I found the accuracy of large mesh in my problem didn't necessary depend on the tolerance I set, either preconditioned or unpreconditioned, sometimes, it showed converged while the solution is not correct. But the true residue looks reflecting the true convergence very well, if the true residue is diverging, no matter what the first residue says, the results are bad! > > Yes, your preconditioner looks singular. 
Note that BJACOBI has an inner solver, and by default the is GMRES/ILU(0). I think > ILU(0) is really ill-conditioned for your problem. > > For the preconditioner concerns, actually, I used BJACOBI before I sent the first email, since the JACOBI or PBJACOBI didn't even converge when the size was large. > But BJACOBI also didn't perform well in the paralleliztion for large mesh as posed in my last email, while it's fine for small size (below 10k elements) > > Yesterday, I tried the ASM with CG using the runtime option: -pc_type asm -pc_asm_type basic -sub_pc_type lu (default is ilu). > For 15k elements mesh, I am now able to get the correct answer with 1-3, 6 and more processes, using either -sub_pc_type lu or ilu. > > Yes, LU works for your subdomain solver. > > Based on all the results I have got, it shows the results varies a lot with different PC and seems ASM is better for large problem. > > Its not ASM so much as an LU subsolver that is better. > > But what is the major factor to produce such difference between different PCs, since it's not just the issue of computational efficiency, but also the accuracy. > Also, I noticed for large mesh, the solution is unstable with small number of processes, for the 15k case, the solution is not correct with 4 and 5 processes, however, the solution becomes always correct with more than 6 processes. For the 50k mesh case, more processes are required to show the stability. > > Yes, partitioning is very important here. Since you do not have a good partition, you can get these wild variations. > > Thanks, > > Matt > > What do you think about this? Anything wrong? > Since the iterative solver in parallel is first computed locally(if this is correct), can it be possible that there are 'good' and 'bad' locals when dividing the global matrix, and the result from 'bad' local will contaminate the global results. But with more processes, such risk is reduced. > > It is highly appreciated if you could give me some instruction for above questions. > > Thank you very much. > > Bests, > Jinlei > > > On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith wrote: > > First run under valgrind all the cases to make sure there is not some use of uninitialized data or overwriting of data. Go to http://www.mcs.anl.gov/petsc follow the link to FAQ and search for valgrind (the web server seems to be broken at the moment). > > Second it is possible that your code the assembles the matrices and vectors is not correctly assembling it for either the sequential or parallel case. Hence a different number of processes could be generating a different linear system hence inconsistent results. How are you handling the parallelism? How do you know the matrix generated in parallel is identically to that sequentially? > > Simple preconditioners such as pbjacobi will converge slower and slower with more elements. > > Note that you should run with -ksp_monitor_true_residual and -ksp_converged_reason to make sure that the iterative solver is even converging. By default PETSc KSP solvers do not stop with a big error message if they do not converge so you need make sure they are always converging. > > Barry > > > > > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: > > > > Dear PETSC developers, > > > > Thank you for developing such a powerful tool for scientific computations. > > > > I'm currently trying to run a simple cantilever beam FEM to test the scalability of PETSC on multi-processors. 
I also want to verify whether iterative solver or direct solver is more efficient for parallel large FEM problem. > > > > Problem description, An Euler elementary cantilever beam with point load at the end along -y direction. Each node has 2 DOF (deflection and rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based on the connectivity. Loop with elements in each processor to assemble the global matrix with same element stiffness matrix. The boundary condition is set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); > > > > Based on what I have done, I find the computations work well, i.e the results are correct compared with theoretical solution, for small mesh size (small than 5000 elements) using both solvers with different numbers of processes. > > > > However, there are several confusing issues when I increase the mesh size to 10000 and more elements with iterative solve(CG + PCBJACOBI) > > > > 1. For 10k elements, I can get accurate solution using iterative solver with uni-processor(i.e. only one process). However, when I use 2-8 processes, it tells the linear solver converged with different iterations, but, the results are all different for different processes and erroneous. The wired thing is when I use >9 processes, the results are correct again. I am really confused by this. Could you explain me why? If my parallelization is not correct, why it works for small cases? And I check the global matrix and RHS vector and didn't see any mallocs during the process. > > > > 2. For 30k elements, if I use one process, it says: Linear solve did not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for large sparse matrix? If so, is there any stable solver or pc for large problem? > > > > > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can only get accuracy when the number of elements are below 5000. There must be something wrong. The way I use the superlu_dist solver is first convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? > > > > > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the sequential version of the same problem. The results shows that iterative solver works well for <50k elements, while SUPER_LU only gets right solution below 5k elements. Can I say iterative solver is better than SUPER_LU for large problem? How can I improve the solver to copy with very large problem, such as million by million? Another thing is it's still doubtable of performance of SUPER_LU. > > > > For the inaccuracy issue, do you think it may be due to the memory? However, there is no memory error showing during the execution. > > > > I really appreciate someone could resolve those puzzles above for me. My goal is to replace the current SUPER_LU solver in my parallel CPFEM main program with the iterative solver using PETSC. > > > > > > Please let me if you would like to see my code in detail. > > > > Thank you very much. > > > > Bests, > > Jinlei > > > > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > From sospinar at unal.edu.co Fri Aug 5 23:34:49 2016 From: sospinar at unal.edu.co (Santiago Ospina De Los Rios) Date: Fri, 5 Aug 2016 23:34:49 -0500 Subject: [petsc-users] Segmentation faults: Derived types In-Reply-To: References: Message-ID: Dear Barry, I tried to build a simple code with the same things I mentioned to you on last e-mail but it worked, which is more strange to me. So I built two branches on my git code to show you the problem: git code: https://github.com/SoilRos/ANISOFLOWPACK Branch: PETSc_debug_boolean_0 The first one is a simple code which is working for what was designed. Forget sample problems, just compile and run the ANISOFLOW executable on src folder, if there are some verbose messages then the program is working. Branch: PETSc_debug_boolean_1 The second one is just a modification of the first one adding the two booleans mentioned above in 01_Types.F90 file. I tried it in mac El Capitan and Ubuntu (with Valgrind) and PETSc 3.7.3 and 3.7.2 respectively, both with the same segmentation fault. PD: Although I already fixed it compressing the three booleans into one integer, I think is better if we try to figure out why there is a segmentation fault because I had similar problems before. PD2: Please obviate the variable description because are pretty out of date. I'm trying to change it, so it can be confusing. Best wishes, Santiago Ospina 2016-08-05 15:54 GMT-05:00 Barry Smith : > > > On Aug 1, 2016, at 4:41 PM, Santiago Ospina De Los Rios < > sospinar at unal.edu.co> wrote: > > > > Hello there, > > > > I'm having problems defining some variables into derived types in > Fortran. Before, I had a similar problems with an allocatable array > "PetsInt" but I solved it just doing a non-collective Petsc Vec. Today I'm > having troubles with "PetscBool" or "Logical": > > > > In a module which define the variables, I have the following: > > > > MODULE ANISOFLOW_Types > > > > IMPLICIT NONE > > > > #include > > #include > > > > ... > > > > TYPE ConductivityField > > PetscBool :: > DefinedByCvtZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: > DefinedByPptZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: DefinedByCell=.FALSE. > > ! Conductivity defined by zones (Local): > > Vec :: ZoneID > > TYPE(Tensor),ALLOCATABLE :: Zone(:) > > ! Conductivity defined on every cell (Local): > > Vec :: Cell > > END TYPE ConductivityField > > > > > > TYPE SpecificStorageField > > PetscBool :: > DefinedByStoZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: > DefinedByPptZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: DefinedByCell=.FALSE. > > ! Specific Storage defined by zones (Local): > > Vec :: ZoneID > > Vec :: Zone > > ! Specific Storage defined on every cell (Global).: > > Vec :: Cell > > END TYPE SpecificStorageField > > > > TYPE PropertiesField > > TYPE(ConductivityField) :: Cvt > > TYPE(SpecificStorageField) :: Sto > > ! Property defined by zones (Local): > > PetscBool :: DefinedByPptZones=.FALSE. > > Vec :: ZoneID > > END TYPE PropertiesField > > > > ... > > > > CONTAINS > > > > ... > > > > END MODULE ANISOFLOW_Types > > > > > > Later I use it in the main program, with something like this > > > > PROGRAM ANISOFLOW > > > > USE ANISOFLOW_Types, ONLY : ... ,PropertiesField, ... > > ... > > > > IMPLICIT NONE > > > > #include > > > > ... > > TYPE(PropertiesField) :: PptFld > > ... > > > > CALL PetscInitialize(PETSC_COMM_WORLD,ierr) > > ... 
> > CALL PetscFinalize(ierr) > > > > END PROGRAM > > > > > > When I run the program appears a Segmentation Fault, which disappears > when I comment the booleans marked in the code. Because I need them, I used > Valgrind to figure out what is happening but it is yet a mistery to me. > > > > Valgrind message: > > ==5160== > > ==5160== Invalid read of size 1 > > It is curious that it says "of size 1" when we declare PetscBool to be > a logical*4 I don't see anything obviously wrong. > > Please send a simple code we can compile and run that reproduces the > problem. > > Barry > > ==5160== at 0x4FB2156: petscinitialize_ (zstart.c:433) > > ==5160== by 0x4030EA: MAIN__ (ANISOFLOW.F90:29) # line of petsc > inizalitation > > ==5160== by 0x404380: main (ANISOFLOW.F90:3) # line of "USE > ANISOFLOW_Types, ONLY : ... ,PropertiesField, ..." > > ==5160== Address 0xc54fff is not stack'd, malloc'd or (recently) free'd > > ==5160== > > > > Program received signal SIGSEGV: Segmentation fault - invalid memory > reference. > > > > Backtrace for this error: > > #0 0x699E777 > > #1 0x699ED7E > > #2 0x6F0BCAF > > #3 0x4FB2156 > > #4 0x4030EA in anisoflow at ANISOFLOW.F90:29 > > > > I think it is maybe related with petsc because the error popped out just > in its initialization, so if you know what's going on, I would appreciate > to tell me. > > > > Santiago Ospina > > -- > > > > -- > > Att: > > > > Santiago Ospina De Los R?os > > National University of Colombia > > -- -- Att: Santiago Ospina De Los R?os National University of Colombia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ztdepyahoo at 163.com Sun Aug 7 02:59:49 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Sun, 7 Aug 2016 15:59:49 +0800 (CST) Subject: [petsc-users] KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN) does not work any more Message-ID: <3a8140ab.2108.1566403d95c.Coremail.ztdepyahoo@163.com> Dear friends: I just update to 3.6.3, but the command " KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN)" does not work any more. could you please give me some suggestions Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Sun Aug 7 03:06:26 2016 From: jychang48 at gmail.com (Justin Chang) Date: Sun, 7 Aug 2016 03:06:26 -0500 Subject: [petsc-users] KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN) does not work any more In-Reply-To: <3a8140ab.2108.1566403d95c.Coremail.ztdepyahoo@163.com> References: <3a8140ab.2108.1566403d95c.Coremail.ztdepyahoo@163.com> Message-ID: PETSc 3.6.3 and on automatically determines that last field, so you just need "KSPSetOperators(ksp,A,A)" On Sun, Aug 7, 2016 at 2:59 AM, ??? wrote: > Dear friends: > I just update to 3.6.3, but the command " > KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN)" does not work any more. > could you please give me some suggestions > Regards > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
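For anyone updating an older code that still passes the MatStructure flag, the change is mechanical; below is a minimal C sketch in which ksp and A are assumed to be created elsewhere (the Fortran call changes the same way, i.e. the flag argument is simply dropped).

  #include <petscksp.h>

  static PetscErrorCode SetOperatorsCurrentStyle(KSP ksp,Mat A)
  {
    PetscErrorCode ierr;

    /* pre-3.5 form, no longer available:
         KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);  */
    ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);

    /* If the old call used SAME_PRECONDITIONER, the closest current
       equivalent is to keep the existing preconditioner explicitly:
         KSPSetReusePreconditioner(ksp,PETSC_TRUE);       */
    return 0;
  }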
URL: From domenico_lahaye at yahoo.com Sun Aug 7 13:17:00 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Sun, 7 Aug 2016 18:17:00 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <8CB9F29A-77CA-46D2-9C3C-4E7CD494D2D0@mcs.anl.gov> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> <8CB9F29A-77CA-46D2-9C3C-4 E7CD494D2D0@mcs.anl.gov> Message-ID: <1312104176.10413956.1470593820466.JavaMail.yahoo@mail.yahoo.com> My most sincere apologies for my ignorance in the matter that follows.? I am new to git. I performed a checkout of the branch barry/extend-pcmg-galerkinand made sure to be in the new branch. I however fail to see the changes that Barry?implemented. The git -log output I obtain after the checkout of the branch?barry/extend-pcmg-galerkin does not show Barry's commit with ID? 2134b1e4b80e7da7410bde701042d857987f373d? ? ? I am obviously missing something. Can you please help?? Domenico.? From: Barry Smith To: domenico lahaye Cc: "petsc-users at mcs.anl.gov" Sent: Sunday, July 24, 2016 2:52 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations ? Took a little more time than I expected but the branch barry/extend-pcmg-galerkin now supports PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool This allows computing either mat, or pmat or both via the Galerkin process so you should be able to provide A and M with KSPSetOperators() and then run with -pc_mg_galerkin both to get both generated on the coarse meshes via the Galekin process.? Note that if you use the additional option -pc_use_amat false it will use only the M for both mat and pmat in the multigrid process (while A is only used for the outer Krylov solver definition of the operator.) For some problems this is actually a better approach. ? Please let me know if you have any difficulties with it. Barry > On Jul 22, 2016, at 3:42 AM, domenico lahaye wrote: > > Dear Barry, >? >? Thank you for your suggestion. >? >? I will be happy to test drive the new code when available. > >? Kind wishes, Domenico. > > > > From: Barry Smith > To: Lawrence Mitchell > Cc: domenico lahaye ; PETSc Users List > Sent: Friday, July 22, 2016 1:41 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > >? I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult.? I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where > > typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE > } PCMGGalerkinType; > > Barry > > > > > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > > > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: > >> > >> Apologies for being not sufficient clear in my previous message. 
> >> > >> I would like to be able to Galerkin coarsen A^h to obtain A^H > >> and to separately Galerkin coarsen M^h to obtain M^H. > >> > >> So, yes, the way in which I currently (partially) understand your > >> description of the new DMCreateMatrices would do the job. > > > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels.? Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > > > Cheers, > > > > Lawrence > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Aug 7 14:19:59 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 7 Aug 2016 14:19:59 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1312104176.10413956.1470593820466.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> <8CB9F29A-77CA-46D2-9C3C-4 E7CD494D2D0@mcs.anl.gov> <1312104176.10413956.1470593820466.JavaMail.yahoo@mail.yahoo.com> Message-ID: <3CE0C7C1-F148-4C2D-BDA7-ECDBD082A6AA@mcs.anl.gov> $ git checkout barry/extend-pcmg-galerkin Switched to branch 'barry/extend-pcmg-galerkin' Your branch is up-to-date with 'origin/barry/extend-pcmg-galerkin'. ~/Src/petsc (barry/extend-pcmg-galerkin=) arch-basic $ git log commit 69aca0b8a5a19c0ddd266601f4ca39e70e8b369f Author: Barry Smith Date: Sun Jul 24 10:37:46 2016 -0500 fixed use of integers for mg->galerkin Reported-by: nightly tests commit 2134b1e4b80e7da7410bde701042d857987f373d Author: Barry Smith Date: Sat Jul 23 19:44:19 2016 -0500 PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool This allows computing either mat, or pmat or both via the Galerkin process Time: 16 hours Reported-by: domenico lahaye Thanks-to: Lawrence Mitchell BTW: the fixes are now in the master branch so you should use that instead anyways. Barry > On Aug 7, 2016, at 1:17 PM, domenico lahaye wrote: > > My most sincere apologies for my ignorance in the matter that follows. > > I am new to git. I performed a checkout of the branch barry/extend-pcmg-galerkin > and made sure to be in the new branch. I however fail to see the changes that Barry > implemented. The git -log output I obtain after the checkout of the branch > barry/extend-pcmg-galerkin does not show Barry's commit with ID > > 2134b1e4b80e7da7410bde701042d857987f373d > > I am obviously missing something. Can you please help? > > Domenico. 
> > From: Barry Smith > To: domenico lahaye > Cc: "petsc-users at mcs.anl.gov" > Sent: Sunday, July 24, 2016 2:52 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > > Took a little more time than I expected but the branch barry/extend-pcmg-galerkin now supports > > PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool > This allows computing either mat, or pmat or both via the Galerkin process > > so you should be able to provide A and M with KSPSetOperators() and then run with -pc_mg_galerkin both to get both generated on the coarse meshes via the Galekin process. Note that if you use the additional option -pc_use_amat false it will use only the M for both mat and pmat in the multigrid process (while A is only used for the outer Krylov solver definition of the operator.) For some problems this is actually a better approach. > > > Please let me know if you have any difficulties with it. > > Barry > > > On Jul 22, 2016, at 3:42 AM, domenico lahaye wrote: > > > > Dear Barry, > > > > Thank you for your suggestion. > > > > I will be happy to test drive the new code when available. > > > > Kind wishes, Domenico. > > > > > > > > From: Barry Smith > > To: Lawrence Mitchell > > Cc: domenico lahaye ; PETSc Users List > > Sent: Friday, July 22, 2016 1:41 AM > > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > > > > > I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult. I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where > > > > typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE > > } PCMGGalerkinType; > > > > Barry > > > > > > > > > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > > > > > > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: > > >> > > >> Apologies for being not sufficient clear in my previous message. > > >> > > >> I would like to be able to Galerkin coarsen A^h to obtain A^H > > >> and to separately Galerkin coarsen M^h to obtain M^H. > > >> > > >> So, yes, the way in which I currently (partially) understand your > > >> description of the new DMCreateMatrices would do the job. > > > > > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. 
> > > > > > Cheers, > > > > > > Lawrence > > > > > > > > > From domenico_lahaye at yahoo.com Sun Aug 7 14:57:36 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Sun, 7 Aug 2016 19:57:36 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <3CE0C7C1-F148-4C2D-BDA7-ECDBD082A6AA@mcs.anl.gov> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> <8CB9F29A-77CA-46D2-9C3C-4 E7CD494D2D0@mcs.anl.gov> <1312104176.10413956.1470593820466.JavaMail.yahoo@mail.yahoo.com> <3CE0C7C1-F148-4C2D-BDA7-ECDBD082A6AA@mcs.anl.gov> Message-ID: <251710790.10910410.1470599856090.JavaMail.yahoo@mail.yahoo.com> Thanks. I will look into it. Domenico. From: Barry Smith To: domenico lahaye Cc: PETSc Users List Sent: Sunday, August 7, 2016 9:19 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations $ git checkout barry/extend-pcmg-galerkin Switched to branch 'barry/extend-pcmg-galerkin' Your branch is up-to-date with 'origin/barry/extend-pcmg-galerkin'. ~/Src/petsc (barry/extend-pcmg-galerkin=) arch-basic $ git log commit 69aca0b8a5a19c0ddd266601f4ca39e70e8b369f Author: Barry Smith Date:? Sun Jul 24 10:37:46 2016 -0500 ? ? fixed use of integers for mg->galerkin ? ? ? ? Reported-by: nightly tests commit 2134b1e4b80e7da7410bde701042d857987f373d Author: Barry Smith Date:? Sat Jul 23 19:44:19 2016 -0500 ? ? PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool ? ? ? ? This allows computing either mat, or pmat or both via the Galerkin process ? ? ? ? Time: 16 hours ? ? Reported-by: domenico lahaye ? ? Thanks-to: Lawrence Mitchell ? BTW: the fixes are now in the master branch so you should use that instead anyways. ? Barry > On Aug 7, 2016, at 1:17 PM, domenico lahaye wrote: > > My most sincere apologies for my ignorance in the matter that follows. > > I am new to git. I performed a checkout of the branch barry/extend-pcmg-galerkin > and made sure to be in the new branch. I however fail to see the changes that Barry > implemented. The git -log output I obtain after the checkout of the branch > barry/extend-pcmg-galerkin does not show Barry's commit with ID > > 2134b1e4b80e7da7410bde701042d857987f373d? ? > > I am obviously missing something. Can you please help? > > Domenico. > > From: Barry Smith > To: domenico lahaye > Cc: "petsc-users at mcs.anl.gov" > Sent: Sunday, July 24, 2016 2:52 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > >? 
Took a little more time than I expected but the branch barry/extend-pcmg-galerkin now supports > > PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool > This allows computing either mat, or pmat or both via the Galerkin process > > so you should be able to provide A and M with KSPSetOperators() and then run with -pc_mg_galerkin both to get both generated on the coarse meshes via the Galekin process.? Note that if you use the additional option -pc_use_amat false it will use only the M for both mat and pmat in the multigrid process (while A is only used for the outer Krylov solver definition of the operator.) For some problems this is actually a better approach. > > >? Please let me know if you have any difficulties with it. > > Barry > > > On Jul 22, 2016, at 3:42 AM, domenico lahaye wrote: > > > > Dear Barry, > >? > >? Thank you for your suggestion. > >? > >? I will be happy to test drive the new code when available. > > > >? Kind wishes, Domenico. > > > > > > > > From: Barry Smith > > To: Lawrence Mitchell > > Cc: domenico lahaye ; PETSc Users List > > Sent: Friday, July 22, 2016 1:41 AM > > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > > > > >? I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult.? I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where > > > > typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE > > } PCMGGalerkinType; > > > > Barry > > > > > > > > > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > > > > > > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: > > >> > > >> Apologies for being not sufficient clear in my previous message. > > >> > > >> I would like to be able to Galerkin coarsen A^h to obtain A^H > > >> and to separately Galerkin coarsen M^h to obtain M^H. > > >> > > >> So, yes, the way in which I currently (partially) understand your > > >> description of the new DMCreateMatrices would do the job. > > > > > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels.? Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > > > > > Cheers, > > > > > > Lawrence > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Sun Aug 7 15:58:20 2016 From: hzhang at mcs.anl.gov (Hong) Date: Sun, 7 Aug 2016 15:58:20 -0500 Subject: [petsc-users] KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN) does not work any more In-Reply-To: <3a8140ab.2108.1566403d95c.Coremail.ztdepyahoo@163.com> References: <3a8140ab.2108.1566403d95c.Coremail.ztdepyahoo@163.com> Message-ID: - KSPSetOperators() no longer has the MatStructure argument. The Mat objects now track that information themselves. Use KPS/PCSetReusePreconditioner() to prevent the recomputation of the preconditioner if the operator changed in the way that SAME_PRECONDITIONER did with KSPSetOperators() see http://www.mcs.anl.gov/petsc/documentation/changes/35.html Hong On Sun, Aug 7, 2016 at 2:59 AM, ??? 
wrote: > Dear friends: > I just update to 3.6.3, but the command " > KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN)" does not work any more. > could you please give me some suggestions > Regards > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Mon Aug 8 07:31:59 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Mon, 8 Aug 2016 21:31:59 +0900 Subject: [petsc-users] Question about SNESSetFunction - FormFunction part Message-ID: Hi, my name is Michael Choi, currently working on applying SNES solver to my compressible Euler equation solver with Fortran90 language. I'm having an error like below [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. I think my problem occurs at the position below. SNESSetFunction(SNES snes, Vec r, SNESFunction, ctx) and SNESFunction has the form like this. --> SNESFunction(SNES snes, Vec x, Vec f, ctx) >From this definition, the user-defined function context (ctx; in my case, type variable) argument can be passed by only one argument. But I need three different data types for calculating residual in that subroutine SNESFunction. I tried using type pointer like below type(t_Collect) :: Collect Collect%pGrid => Grid Collect%pConfig => Config Collect%pMixture => Mixture I tried but pointers can not pass the SNESFunction argument. In this kind of situation, what should I have to do?? -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Aug 8 09:32:41 2016 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 8 Aug 2016 09:32:41 -0500 Subject: [petsc-users] Question about SNESSetFunction - FormFunction part In-Reply-To: References: Message-ID: You may take a look at our Fortran examples, e.g., petsc/src/snes/examples/tutorials/ex5f90t.F Hong On Mon, Aug 8, 2016 at 7:31 AM, ??? wrote: > Hi, my name is Michael Choi, currently working on applying SNES solver to > my compressible Euler equation solver with Fortran90 language. 
> > I'm having an error like below > > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > > > I think my problem occurs at the position below. > > SNESSetFunction(SNES snes, Vec r, SNESFunction, ctx) > > and SNESFunction has the form like this. > --> SNESFunction(SNES snes, Vec x, Vec f, ctx) > > From this definition, the user-defined function context (ctx; in my case, > type variable) argument can be passed by only one argument. But I need > three different data types for calculating residual in that subroutine > SNESFunction. > > I tried using type pointer like below > type(t_Collect) :: Collect > Collect%pGrid => Grid > Collect%pConfig => Config > Collect%pMixture => Mixture > > I tried but pointers can not pass the SNESFunction argument. > > > In this kind of situation, what should I have to do?? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From imilian.hartig at gmail.com Mon Aug 8 10:15:53 2016 From: imilian.hartig at gmail.com (Maximilian Hartig) Date: Mon, 8 Aug 2016 17:15:53 +0200 Subject: [petsc-users] TS and petscFE -unable to access u_t In-Reply-To: References: Message-ID: <17212535-E739-4A58-A2E1-D06199309B0A@gmail.com> Thank you, I used these routines to setup a CN type TS. The problem I face now is that I seem to be unable to access the temporal derivatives u_t[..] for the definition of the residuals. I get a segmentation violation inside the PetscFEIntegrateResidual routine whenever I try to. I have attached the error message below. Also, I am wondering whether it is possible to update the neumann boundary condition of the FE object for each timestep. This would be useful for coupling purposes. Thank you, Max [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] PetscFEIntegrateResidual_Basic line 3503 /Users/maxhartig/PETSc/src/dm/dt/interface/dtfe.c [0]PETSC ERROR: [0] PetscFEIntegrateResidual line 5753 /Users/maxhartig/PETSc/src/dm/dt/interface/dtfe.c [0]PETSC ERROR: [0] DMPlexComputeResidual_Internal line 1706 /Users/maxhartig/PETSc/src/snes/utils/dmplexsnes.c [0]PETSC ERROR: [0] DMPlexSNESComputeResidualFEM line 2152 /Users/maxhartig/PETSc/src/snes/utils/dmplexsnes.c [0]PETSC ERROR: [0] SNESComputeFunction_DMLocal line 65 /Users/maxhartig/PETSc/src/snes/utils/dmlocalsnes.c [0]PETSC ERROR: [0] SNES user function line 2144 /Users/maxhartig/PETSc/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeFunction line 2129 /Users/maxhartig/PETSc/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_NEWTONLS line 150 /Users/maxhartig/PETSc/src/snes/impls/ls/ls.c [0]PETSC ERROR: [0] SNESSolve line 3961 /Users/maxhartig/PETSc/src/snes/interface/snes.c [0]PETSC ERROR: [0] TS_SNESSolve line 188 /Users/maxhartig/PETSc/src/ts/impls/implicit/theta/theta.c [0]PETSC ERROR: [0] TSStep_Theta line 206 /Users/maxhartig/PETSc/src/ts/impls/implicit/theta/theta.c [0]PETSC ERROR: [0] TSStep line 3700 /Users/maxhartig/PETSc/src/ts/interface/ts.c [0]PETSC ERROR: [0] TSSolve line 3921 /Users/maxhartig/PETSc/src/ts/interface/ts.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown [0]PETSC ERROR: Configure options --download-triangle [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=59 > On 03 Aug 2016, at 16:44, Matthew Knepley wrote: > > On Tue, Aug 2, 2016 at 8:22 AM, Maximilian Hartig > wrote: > Hello all, > > I would like to run a transient problem with PetscFE. Example ex11.c seems relevant since it uses the PestcFV context to create boundary conditions and RHS Functions for the TS. > Is there an easy way to do transient analysis with TS and petscFE or do I have to code my own time-stepping routine? > > You can use > > ierr = DMTSSetBoundaryLocal(adaptedDM, DMPlexTSComputeBoundary, user);CHKERRQ(ierr); > ierr = DMTSSetIFunctionLocal(adaptedDM, DMPlexTSComputeIFunctionFEM, user);CHKERRQ(ierr); > ierr = DMTSSetIJacobianLocal(adaptedDM, DMPlexTSComputeIJacobianFEM, user);CHKERRQ(ierr); > > I have been meaning to write a heat equation example, but I have not finished yet, > > Thanks, > > Matt > > Thanks, > Max > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 8 10:23:40 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 8 Aug 2016 10:23:40 -0500 Subject: [petsc-users] TS and petscFE -unable to access u_t In-Reply-To: <17212535-E739-4A58-A2E1-D06199309B0A@gmail.com> References: <17212535-E739-4A58-A2E1-D06199309B0A@gmail.com> Message-ID: On Mon, Aug 8, 2016 at 10:15 AM, Maximilian Hartig wrote: > Thank you, > > I used these routines to setup a CN type TS. The problem I face now is > that I seem to be unable to access the temporal derivatives u_t[..] for the > definition of the residuals. 
I get a segmentation violation inside the > PetscFEIntegrateResidual routine whenever I try to. I have attached the > error message below. > Also, I am wondering whether it is possible to update the neumann boundary > condition of the FE object for each timestep. This would be useful for > coupling purposes. > Could you send you example so I can run it myself? Thanks, Matt > Thank you, > > Max > > > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] PetscFEIntegrateResidual_Basic line 3503 > /Users/maxhartig/PETSc/src/dm/dt/interface/dtfe.c > [0]PETSC ERROR: [0] PetscFEIntegrateResidual line 5753 > /Users/maxhartig/PETSc/src/dm/dt/interface/dtfe.c > [0]PETSC ERROR: [0] DMPlexComputeResidual_Internal line 1706 > /Users/maxhartig/PETSc/src/snes/utils/dmplexsnes.c > [0]PETSC ERROR: [0] DMPlexSNESComputeResidualFEM line 2152 > /Users/maxhartig/PETSc/src/snes/utils/dmplexsnes.c > [0]PETSC ERROR: [0] SNESComputeFunction_DMLocal line 65 > /Users/maxhartig/PETSc/src/snes/utils/dmlocalsnes.c > [0]PETSC ERROR: [0] SNES user function line 2144 > /Users/maxhartig/PETSc/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeFunction line 2129 > /Users/maxhartig/PETSc/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_NEWTONLS line 150 /Users/maxhartig/PETSc/src/ > snes/impls/ls/ls.c > [0]PETSC ERROR: [0] SNESSolve line 3961 /Users/maxhartig/PETSc/src/ > snes/interface/snes.c > [0]PETSC ERROR: [0] TS_SNESSolve line 188 /Users/maxhartig/PETSc/src/ts/ > impls/implicit/theta/theta.c > [0]PETSC ERROR: [0] TSStep_Theta line 206 /Users/maxhartig/PETSc/src/ts/ > impls/implicit/theta/theta.c > [0]PETSC ERROR: [0] TSStep line 3700 /Users/maxhartig/PETSc/src/ts/ > interface/ts.c > [0]PETSC ERROR: [0] TSSolve line 3921 /Users/maxhartig/PETSc/src/ts/ > interface/ts.c > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > [0]PETSC ERROR: Configure options --download-triangle > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=59 > > On 03 Aug 2016, at 16:44, Matthew Knepley wrote: > > On Tue, Aug 2, 2016 at 8:22 AM, Maximilian Hartig < > imilian.hartig at gmail.com> wrote: > >> Hello all, >> >> I would like to run a transient problem with PetscFE. Example ex11.c >> seems relevant since it uses the PestcFV context to create boundary >> conditions and RHS Functions for the TS. 
>> Is there an easy way to do transient analysis with TS and petscFE or do I >> have to code my own time-stepping routine? >> > > You can use > > ierr = DMTSSetBoundaryLocal(adaptedDM, DMPlexTSComputeBoundary, > user);CHKERRQ(ierr); > ierr = DMTSSetIFunctionLocal(adaptedDM, DMPlexTSComputeIFunctionFEM, > user);CHKERRQ(ierr); > ierr = DMTSSetIJacobianLocal(adaptedDM, DMPlexTSComputeIJacobianFEM, > user);CHKERRQ(ierr); > > I have been meaning to write a heat equation example, but I have not > finished yet, > > Thanks, > > Matt > > >> Thanks, >> Max > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From imilian.hartig at gmail.com Mon Aug 8 10:42:55 2016 From: imilian.hartig at gmail.com (Maximilian Hartig) Date: Mon, 8 Aug 2016 17:42:55 +0200 Subject: [petsc-users] TS and petscFE -unable to access u_t In-Reply-To: References: <17212535-E739-4A58-A2E1-D06199309B0A@gmail.com> Message-ID: Gladly. I also send the simple gmsh I am using. Thank you, Max > On 08 Aug 2016, at 17:23, Matthew Knepley wrote: > > On Mon, Aug 8, 2016 at 10:15 AM, Maximilian Hartig > wrote: > Thank you, > > I used these routines to setup a CN type TS. The problem I face now is that I seem to be unable to access the temporal derivatives u_t[..] for the definition of the residuals. I get a segmentation violation inside the PetscFEIntegrateResidual routine whenever I try to. I have attached the error message below. > Also, I am wondering whether it is possible to update the neumann boundary condition of the FE object for each timestep. This would be useful for coupling purposes. > > Could you send you example so I can run it myself? > > Thanks, > > Matt > > Thank you, > > Max > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. 
> [0]PETSC ERROR: [0] PetscFEIntegrateResidual_Basic line 3503 /Users/maxhartig/PETSc/src/dm/dt/interface/dtfe.c > [0]PETSC ERROR: [0] PetscFEIntegrateResidual line 5753 /Users/maxhartig/PETSc/src/dm/dt/interface/dtfe.c > [0]PETSC ERROR: [0] DMPlexComputeResidual_Internal line 1706 /Users/maxhartig/PETSc/src/snes/utils/dmplexsnes.c > [0]PETSC ERROR: [0] DMPlexSNESComputeResidualFEM line 2152 /Users/maxhartig/PETSc/src/snes/utils/dmplexsnes.c > [0]PETSC ERROR: [0] SNESComputeFunction_DMLocal line 65 /Users/maxhartig/PETSc/src/snes/utils/dmlocalsnes.c > [0]PETSC ERROR: [0] SNES user function line 2144 /Users/maxhartig/PETSc/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeFunction line 2129 /Users/maxhartig/PETSc/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_NEWTONLS line 150 /Users/maxhartig/PETSc/src/snes/impls/ls/ls.c > [0]PETSC ERROR: [0] SNESSolve line 3961 /Users/maxhartig/PETSc/src/snes/interface/snes.c > [0]PETSC ERROR: [0] TS_SNESSolve line 188 /Users/maxhartig/PETSc/src/ts/impls/implicit/theta/theta.c > [0]PETSC ERROR: [0] TSStep_Theta line 206 /Users/maxhartig/PETSc/src/ts/impls/implicit/theta/theta.c > [0]PETSC ERROR: [0] TSStep line 3700 /Users/maxhartig/PETSc/src/ts/interface/ts.c > [0]PETSC ERROR: [0] TSSolve line 3921 /Users/maxhartig/PETSc/src/ts/interface/ts.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > [0]PETSC ERROR: Configure options --download-triangle > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=59 > >> On 03 Aug 2016, at 16:44, Matthew Knepley > wrote: >> >> On Tue, Aug 2, 2016 at 8:22 AM, Maximilian Hartig > wrote: >> Hello all, >> >> I would like to run a transient problem with PetscFE. Example ex11.c seems relevant since it uses the PestcFV context to create boundary conditions and RHS Functions for the TS. >> Is there an easy way to do transient analysis with TS and petscFE or do I have to code my own time-stepping routine? >> >> You can use >> >> ierr = DMTSSetBoundaryLocal(adaptedDM, DMPlexTSComputeBoundary, user);CHKERRQ(ierr); >> ierr = DMTSSetIFunctionLocal(adaptedDM, DMPlexTSComputeIFunctionFEM, user);CHKERRQ(ierr); >> ierr = DMTSSetIJacobianLocal(adaptedDM, DMPlexTSComputeIJacobianFEM, user);CHKERRQ(ierr); >> >> I have been meaning to write a heat equation example, but I have not finished yet, >> >> Thanks, >> >> Matt >> >> Thanks, >> Max >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cimply.c Type: application/octet-stream Size: 25960 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: testmesh_2D_box_quad.msh Type: application/octet-stream Size: 32272 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.neiferd at wright.edu Mon Aug 8 15:12:58 2016 From: david.neiferd at wright.edu (Neiferd, David John) Date: Mon, 8 Aug 2016 20:12:58 +0000 Subject: [petsc-users] How to solve nonlinear F(x) = b(x)? Message-ID: Hello all, I've been searching through the PETSc documentation to try to find how to solve a nonlinear system where the right hand side (b) varies as a function of the state variables (x). According to the PETSc documentation, SNES solves the equations F(x) = b where b is a constant vector. What would I do to solve F(x) = b(x)? An example of this would be a nonlinear thermoelastic structure where as the structure deforms the direction of the loads generated by the thermal expansion changes as well. Any insight into how to implement this is appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Mon Aug 8 15:19:24 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 8 Aug 2016 22:19:24 +0200 Subject: [petsc-users] How to solve nonlinear F(x) = b(x)? In-Reply-To: References: Message-ID: On Monday, 8 August 2016, Neiferd, David John > wrote: > Hello all, > > I've been searching through the PETSc documentation to try to find how to > solve a nonlinear system where the right hand side (b) varies as a function > of the state variables (x). According to the PETSc documentation, SNES > solves the equations F(x) = b where b is a constant vector. What would I > do to solve F(x) = b(x)? An example of this would be a nonlinear > thermoelastic structure where as the structure deforms the direction of the > loads generated by the thermal expansion changes as well. Any insight into > how to implement this is appreciated. > All you need to do is define the non-linear residual F (a vector) such that it includes b(x) Eg, suppose I have some discrete non-linear system of the form, Ax = b(x), then I would define F(x) as F(x) = Ax -b(x) Thanks, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From oxberry1 at llnl.gov Mon Aug 8 15:20:27 2016 From: oxberry1 at llnl.gov (Oxberry, Geoffrey Malcolm) Date: Mon, 8 Aug 2016 20:20:27 +0000 Subject: [petsc-users] How to solve nonlinear F(x) = b(x)? In-Reply-To: References: Message-ID: <78A7070D-DD93-4F22-95FC-99D180E52B81@llnl.gov> David, What about solving G(x) = F(x) - b(x) = 0? Geoff On Aug 8, 2016, at 1:12 PM, Neiferd, David John > wrote: Hello all, I've been searching through the PETSc documentation to try to find how to solve a nonlinear system where the right hand side (b) varies as a function of the state variables (x). According to the PETSc documentation, SNES solves the equations F(x) = b where b is a constant vector. What would I do to solve F(x) = b(x)? An example of this would be a nonlinear thermoelastic structure where as the structure deforms the direction of the loads generated by the thermal expansion changes as well. Any insight into how to implement this is appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.neiferd at wright.edu Mon Aug 8 16:23:15 2016 From: david.neiferd at wright.edu (Neiferd, David John) Date: Mon, 8 Aug 2016 21:23:15 +0000 Subject: [petsc-users] How to solve nonlinear F(x) = b(x)? 
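A minimal sketch of what that looks like as a SNES residual callback for the Ax = b(x) case Dave describes (AppCtx, EvaluateRHS and the member names are made up for illustration, and the load evaluation itself is problem specific):

    #include <petscsnes.h>

    typedef struct {
      Mat A;   /* stiffness operator                                 */
      Vec b;   /* work vector holding the state-dependent load b(x)  */
    } AppCtx;

    /* Fill b with the load b(x) for the current state x; problem specific. */
    static PetscErrorCode EvaluateRHS(Vec x, Vec b, AppCtx *user)
    {
      PetscFunctionBeginUser;
      /* ... evaluate the thermal/follower loads for the current state x ... */
      PetscFunctionReturn(0);
    }

    /* Residual G(x) = A x - b(x), registered with
       SNESSetFunction(snes, r, FormFunction, &user);                */
    static PetscErrorCode FormFunction(SNES snes, Vec x, Vec G, void *ctx)
    {
      AppCtx         *user = (AppCtx*)ctx;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = EvaluateRHS(x, user->b, user);CHKERRQ(ierr);  /* b(x)        */
      ierr = MatMult(user->A, x, G);CHKERRQ(ierr);         /* G  = A x    */
      ierr = VecAXPY(G, -1.0, user->b);CHKERRQ(ierr);      /* G -= b(x)   */
      PetscFunctionReturn(0);
    }

The Jacobian supplied through SNESSetJacobian() should correspondingly be dG/dx = dF/dx - db/dx rather than dF/dx alone.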
In-Reply-To: <78A7070D-DD93-4F22-95FC-99D180E52B81@llnl.gov> References: , <78A7070D-DD93-4F22-95FC-99D180E52B81@llnl.gov> Message-ID: Thanks for the suggestions Geoff and Dave. Using G(x) = F(x) - b(x) = 0, will required redefinition of the Jacobian correct? If I understand correctly, the Jacobian is the derivative of F(x) with respect to x. Since we are redefining F(x) to G(x), it would be necessary to change the Jacobian from dF(x)/dx to dF(x)/dx - db(x)/dx, correct? Also, I noticed when I implemented G(x) = F(x) - b = 0 (where b is constant) the method seems less robust when using newton's method with a line search, at least for one particular problem, the line search (using default settings) diverges (converged reason = -6), but using a trust region newton method or a quasi-newton method it converges to the answer. ________________________________ From: Oxberry, Geoffrey Malcolm Sent: Monday, August 8, 2016 4:20:27 PM To: Neiferd, David John Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to solve nonlinear F(x) = b(x)? David, What about solving G(x) = F(x) - b(x) = 0? Geoff On Aug 8, 2016, at 1:12 PM, Neiferd, David John > wrote: Hello all, I've been searching through the PETSc documentation to try to find how to solve a nonlinear system where the right hand side (b) varies as a function of the state variables (x). According to the PETSc documentation, SNES solves the equations F(x) = b where b is a constant vector. What would I do to solve F(x) = b(x)? An example of this would be a nonlinear thermoelastic structure where as the structure deforms the direction of the loads generated by the thermal expansion changes as well. Any insight into how to implement this is appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oxberry1 at llnl.gov Mon Aug 8 16:33:56 2016 From: oxberry1 at llnl.gov (Oxberry, Geoffrey Malcolm) Date: Mon, 8 Aug 2016 21:33:56 +0000 Subject: [petsc-users] How to solve nonlinear F(x) = b(x)? In-Reply-To: References: <78A7070D-DD93-4F22-95FC-99D180E52B81@llnl.gov> Message-ID: On Aug 8, 2016, at 2:23 PM, Neiferd, David John > wrote: Thanks for the suggestions Geoff and Dave. Using G(x) = F(x) - b(x) = 0, will required redefinition of the Jacobian correct? If I understand correctly, the Jacobian is the derivative of F(x) with respect to x. Since we are redefining F(x) to G(x), it would be necessary to change the Jacobian from dF(x)/dx to dF(x)/dx - db(x)/dx, correct? Yes. Also, I noticed when I implemented G(x) = F(x) - b = 0 (where b is constant) the method seems less robust when using newton's method with a line search, at least for one particular problem, the line search (using default settings) diverges (converged reason = -6), but using a trust region newton method or a quasi-newton method it converges to the answer. I would start with the suggestions in http://www.mcs.anl.gov/petsc/documentation/faq.html#newton first before doing any more tuning. In optimization, trust region solvers have a reputation of being more robust, but slower, than comparable line search methods; I?m not sure if this statement is true for general equation solving. Geoff ________________________________ From: Oxberry, Geoffrey Malcolm > Sent: Monday, August 8, 2016 4:20:27 PM To: Neiferd, David John Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to solve nonlinear F(x) = b(x)? David, What about solving G(x) = F(x) - b(x) = 0? 
Geoff On Aug 8, 2016, at 1:12 PM, Neiferd, David John > wrote: Hello all, I've been searching through the PETSc documentation to try to find how to solve a nonlinear system where the right hand side (b) varies as a function of the state variables (x). According to the PETSc documentation, SNES solves the equations F(x) = b where b is a constant vector. What would I do to solve F(x) = b(x)? An example of this would be a nonlinear thermoelastic structure where as the structure deforms the direction of the loads generated by the thermal expansion changes as well. Any insight into how to implement this is appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From athena.paz1 at gmail.com Tue Aug 9 02:24:45 2016 From: athena.paz1 at gmail.com (Athena Paz) Date: Tue, 9 Aug 2016 16:24:45 +0900 Subject: [petsc-users] Meaning of Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end Message-ID: Hi all, I'm very new to PETSC. I'm trying to solve a diffusion problem in 3D. I tried running a 500 x 500 x 500 grid using 20 processors but I encounter the following error: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html# valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html# valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] MatSetValues_SeqAIJ line 441 /home/paz/petsc-3.7.3/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: [0] MatSetValues line 1157 /home/paz/petsc-3.7.3/src/mat/interface/matrix.c [0]PETSC ERROR: [0] MatSetValuesLocal line 2019 /home/paz/petsc-3.7.3/src/mat/interface/matrix.c [0]PETSC ERROR: [0] DMCreateMatrix_DA_3d_MPIAIJ line 1036 /home/paz/petsc-3.7.3/src/dm/impls/da/fdda.c [0]PETSC ERROR: [0] DMCreateMatrix_DA line 625 /home/paz/petsc-3.7.3/src/dm/impls/da/fdda.c [0]PETSC ERROR: [0] DMCreateMatrix line 1171 /home/paz/petsc-3.7.3/src/dm/interface/dm.c [0]PETSC ERROR: [0] SNESSetUpMatrices line 579 /home/paz/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSetUp_NEWTONLS line 303 /home/paz/petsc-3.7.3/src/snes/impls/ls/ls.c [0]PETSC ERROR: [0] SNESSetUp line 2661 /home/paz/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/paz/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./ex7 on a arch-linux2-c-debug named akagi by paz Tue Aug 9 16:01:17 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich --with-debugging [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 What does this mean? I am able to run the code successfully with a 300x300x300 grid size. I also tried using -malloc_debug and valgrind as suggested in the Debugging FAQ for a small grid size and the code comes out clean. Any help is much appreciated! Thank you all for your time! Have a great day! Athena -------------- next part -------------- An HTML attachment was scrubbed... URL: From rupp at iue.tuwien.ac.at Tue Aug 9 03:06:18 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Tue, 9 Aug 2016 10:06:18 +0200 Subject: [petsc-users] Meaning of Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end In-Reply-To: References: Message-ID: <57A98EFA.5060007@iue.tuwien.ac.at> Hi, are you running the code on a cluster? If so, you need to submit your jobs through the batch system, as larger jobs get killed by the batch system. Best regards, Karli On 08/09/2016 09:24 AM, Athena Paz wrote: > Hi all, > > I'm very new to PETSC. I'm trying to solve a diffusion problem in 3D. 
I > tried running a 500 x 500 x 500 grid using 20 processors but I encounter > the following error: > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > [0]PETSC ERROR: is given. > > [0]PETSC ERROR: [0] MatSetValues_SeqAIJ line 441 > /home/paz/petsc-3.7.3/src/mat/impls/aij/seq/aij.c > > [0]PETSC ERROR: [0] MatSetValues line 1157 > /home/paz/petsc-3.7.3/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] MatSetValuesLocal line 2019 > /home/paz/petsc-3.7.3/src/mat/interface/matrix.c > > [0]PETSC ERROR: [0] DMCreateMatrix_DA_3d_MPIAIJ line 1036 > /home/paz/petsc-3.7.3/src/dm/impls/da/fdda.c > > [0]PETSC ERROR: [0] DMCreateMatrix_DA line 625 > /home/paz/petsc-3.7.3/src/dm/impls/da/fdda.c > > [0]PETSC ERROR: [0] DMCreateMatrix line 1171 > /home/paz/petsc-3.7.3/src/dm/interface/dm.c > > [0]PETSC ERROR: [0] SNESSetUpMatrices line 579 > /home/paz/petsc-3.7.3/src/snes/interface/snes.c > > [0]PETSC ERROR: [0] SNESSetUp_NEWTONLS line 303 > /home/paz/petsc-3.7.3/src/snes/impls/ls/ls.c > > [0]PETSC ERROR: [0] SNESSetUp line 2661 > /home/paz/petsc-3.7.3/src/snes/interface/snes.c > > [0]PETSC ERROR: [0] SNESSolve line 3958 > /home/paz/petsc-3.7.3/src/snes/interface/snes.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > > [0]PETSC ERROR: ./ex7 on a arch-linux2-c-debug named akagi by paz Tue > Aug 9 16:01:17 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack --download-mpich --with-debugging > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > What does this mean? I am able to run the code successfully with a > 300x300x300 grid size. I also tried using -malloc_debug and valgrind as > suggested in the Debugging FAQ for a small grid size and the code comes > out clean. Any help is much appreciated! > > > Thank you all for your time! Have a great day! 
> > > Athena From jshen25 at jhu.edu Tue Aug 9 09:52:02 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Tue, 9 Aug 2016 10:52:02 -0400 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: <51E07B0F-810B-492A-8330-3469E04D14F1@mcs.anl.gov> References: <51E07B0F-810B-492A-8330-3469E04D14F1@mcs.anl.gov> Message-ID: Hi Barry, Thanks for your answer. But logically for large problem, we are always expecting to see paralleled program perform better with regard to both speed and memory since each of the multi-processes independently deal with its own submatrix, especially for iterative solver, which is revealed by CG+BJ. I just don't understand, in the computing with CG+ASM and SUPER_LU, why the two-process is most inefficient among these cases. If this is due to the communication cost compared with uni-process, why the speed goes down for triple and more processes. I'm new to parallelism, could you speculate any possible reason for such situation? Great thanks On Fri, Aug 5, 2016 at 10:09 PM, Barry Smith wrote: > > > On Aug 5, 2016, at 5:58 PM, Jinlei Shen wrote: > > > > ?Hi, > > > > Thanks for your answers. > > > > I just figured out the issues which are mainly due to the > ill-conditioning of my matrix. I found the conditional number blows up when > the beam is discretized into large number of elements. > > > > Now, I am using the 1D bar model to solve the same problem. The good > news is the solution is always accurate and stable even I discretized into > 10 million elements. > > > > When I run the model with both iterative solver(CG+BJACOBI/ASM) and > direct solver(SUPER_LU) in parallelization, I got the following results: > > > > Mesh size: 1 million unknowns > > Processes 1 2 4 6 8 10 12 > 16 20 > > CG+BJ 0.36 0.22 0.15 0.12 0.11 0.1 0.096 0.097 > 0.099 > > CG+ASM 0.47 0.46 0.267 0.2 0.17 0.15 0.145 > 0.16 0.15 > > SUPER_LU_DIST 4.73 5.4 4.69 4.58 4.38 4.2 4.27 > 4.28 4.38 > > > > It seems the CG+BJ works correctly, i.e. time decreases fast with a few > more processes and reach stable with many more cores. > > > > However, I have some concerns about CG+ASM and SUPER_LU_DIST. The time > of both two methods goes up when I use two processes compared with > uniprocess. > > This is actually not surprising at all but since the mantra is > "parallelism will always make things faster" it can confuse people. When > run with one process the ASM and SuperLU_DIST utilize essentially > sequential algorithms, when run with two processes they "switch" to > parallel algorithms which simply are not as good as the essentially > sequential algorithm that is obtained with one process hence they run > slower. This is just life, there really isn't something one can do about it > except to perhaps use a poorer quality algorithm on one process so that two > processes look better but the goal of PETSc is not to make parallelism to > look good but to provide efficient solvers (as best we can) for one and > multiple processes. > > Barry > > > > > > The tendency is more obvious when I use larger mesh size. > > I especially doubt the results of SUPER_LU_DIST in parallelism since the > overall expedition is very small which is not expected. > > The runtime option I use for ASM pc and SUPER_LU_DIST solver is shown as > below: > > ASM preconditioner: -pc_type asm -pc_asm_type basic > > SUPER_LU_DIST solver: -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist > > > > I use same mpiexec -n np ./xxxx for all solvers. 
> > > > Am I using them correctly? If so, is there anyway to speed up the > computation further, especially for SUPER_LU_DIST? > > > > Thank you very much! > > > > Bests, > > Jinlei > > > > On Mon, Aug 1, 2016 at 2:10 PM, Matthew Knepley > wrote: > > On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: > > Hi Barry, > > > > Thanks for your reply. > > > > Firstly, as you suggested, I checked my program under valgrind. The > results for both sequential and parallel cases showed there are no memory > errors detected. > > > > Second, I coded a sequential program without using PETSC to generate the > global matrix of small mesh for the same problem. I then checked the matrix > both from petsc(sequential and parallel) and serial code, and they are same. > > The way I assembled the global matrix in parallel is first distributing > the nodes and elements into processes, then I loop with elements on the > calling process to put the element stiffness into the global. Since the > nodes and elements in cantilever beam are numbered successively, the > connectivity is simple. I didn't use any partition tools to optimize mesh. > It's also easy to determine the preallocation d_nnz and o_nnz since each > node only connects the left and right nodes except for beginning and end, > the maximum nonzeros in each row is 6. The MatSetValue process is shown as > follows: > > do iEL = idElStart, idElEnd > > g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) > > call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) > > end do > > where idElStart and idElEnd are the global number of first element and > end element that the process owns, g_EL is the global index for DOF in > element iEL, SE is the element stiffness which is same for all elements. > > From above assembling, most of the elements are assembled within own > process while there are few elements crossing two processes. > > > > The BC for my problem(cantilever under end point load) is to fix the > first two DOF, so I called the MatZeroRowsColumns to set the first two rows > and columns into zero with diagonal equal to one, without changing the RHS. > > > > Now some new issues show up : > > > > I run with -ksp_monitor_true_residual and -ksp_converged_reason, the > monitor showed two different residues, one is the residue I can > set(preconditioned, unpreconditioned, natural), the other is called true > residue. > > ?? > > I initially thought the true residue is same as unpreconditioned based > on definition. But it seems not true. Is it the norm of the residue (b-Ax) > between computed RHS and true RHS? But, how to understand unprecondition > residue since its definition is b-Ax as well? > > > > It is the unpreconditioned residual. You must be misinterpreting. And we > could determine exactly if you sent the output with the suggested options. > > > > Can I set the true residue as my converging criteria? > > > > Use right preconditioning. > > > > I found the accuracy of large mesh in my problem didn't necessary depend > on the tolerance I set, either preconditioned or unpreconditioned, > sometimes, it showed converged while the solution is not correct. But the > true residue looks reflecting the true convergence very well, if the true > residue is diverging, no matter what the first residue says, the results > are bad! > > > > Yes, your preconditioner looks singular. Note that BJACOBI has an inner > solver, and by default the is GMRES/ILU(0). I think > > ILU(0) is really ill-conditioned for your problem. 
> > > > For the preconditioner concerns, actually, I used BJACOBI before I sent > the first email, since the JACOBI or PBJACOBI didn't even converge when the > size was large. > > But BJACOBI also didn't perform well in the paralleliztion for large > mesh as posed in my last email, while it's fine for small size (below 10k > elements) > > > > Yesterday, I tried the ASM with CG using the runtime option: -pc_type > asm -pc_asm_type basic -sub_pc_type lu (default is ilu). > > For 15k elements mesh, I am now able to get the correct answer with 1-3, > 6 and more processes, using either -sub_pc_type lu or ilu. > > > > Yes, LU works for your subdomain solver. > > > > Based on all the results I have got, it shows the results varies a lot > with different PC and seems ASM is better for large problem. > > > > Its not ASM so much as an LU subsolver that is better. > > > > But what is the major factor to produce such difference between > different PCs, since it's not just the issue of computational efficiency, > but also the accuracy. > > Also, I noticed for large mesh, the solution is unstable with small > number of processes, for the 15k case, the solution is not correct with 4 > and 5 processes, however, the solution becomes always correct with more > than 6 processes. For the 50k mesh case, more processes are required to > show the stability. > > > > Yes, partitioning is very important here. Since you do not have a good > partition, you can get these wild variations. > > > > Thanks, > > > > Matt > > > > What do you think about this? Anything wrong? > > Since the iterative solver in parallel is first computed locally(if this > is correct), can it be possible that there are 'good' and 'bad' locals when > dividing the global matrix, and the result from 'bad' local will > contaminate the global results. But with more processes, such risk is > reduced. > > > > It is highly appreciated if you could give me some instruction for above > questions. > > > > Thank you very much. > > > > Bests, > > Jinlei > > > > > > On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith wrote: > > > > First run under valgrind all the cases to make sure there is not some > use of uninitialized data or overwriting of data. Go to > http://www.mcs.anl.gov/petsc follow the link to FAQ and search for > valgrind (the web server seems to be broken at the moment). > > > > Second it is possible that your code the assembles the matrices and > vectors is not correctly assembling it for either the sequential or > parallel case. Hence a different number of processes could be generating a > different linear system hence inconsistent results. How are you handling > the parallelism? How do you know the matrix generated in parallel is > identically to that sequentially? > > > > Simple preconditioners such as pbjacobi will converge slower and slower > with more elements. > > > > Note that you should run with -ksp_monitor_true_residual and > -ksp_converged_reason to make sure that the iterative solver is even > converging. By default PETSc KSP solvers do not stop with a big error > message if they do not converge so you need make sure they are always > converging. > > > > Barry > > > > > > > > > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: > > > > > > Dear PETSC developers, > > > > > > Thank you for developing such a powerful tool for scientific > computations. > > > > > > I'm currently trying to run a simple cantilever beam FEM to test the > scalability of PETSC on multi-processors. 
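
On Barry's question above about whether the matrix generated in parallel is identical to the sequential one: one way to check is to dump the assembled matrix to a PETSc binary file in each run and compare the files afterwards in a small serial program. A sketch under those assumptions (file names and variable names are made up; the PETSc binary format does not depend on the number of processes that wrote it):

  /* in each run, after assembly, writing a run-specific file name */
  PetscViewer viewer;
  PetscViewerBinaryOpen(PETSC_COMM_WORLD,"stiffness_np2.bin",FILE_MODE_WRITE,&viewer);
  MatView(SG,viewer);
  PetscViewerDestroy(&viewer);

  /* later, in a small serial check program */
  Mat       A1,A2;
  PetscBool eq;
  MatCreate(PETSC_COMM_SELF,&A1); MatSetType(A1,MATSEQAIJ);
  MatCreate(PETSC_COMM_SELF,&A2); MatSetType(A2,MATSEQAIJ);
  PetscViewerBinaryOpen(PETSC_COMM_SELF,"stiffness_np1.bin",FILE_MODE_READ,&viewer);
  MatLoad(A1,viewer); PetscViewerDestroy(&viewer);
  PetscViewerBinaryOpen(PETSC_COMM_SELF,"stiffness_np2.bin",FILE_MODE_READ,&viewer);
  MatLoad(A2,viewer); PetscViewerDestroy(&viewer);
  MatEqual(A1,A2,&eq);   /* PETSC_TRUE if the two matrices have identical entries */
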
I also want to verify whether > iterative solver or direct solver is more efficient for parallel large FEM > problem. > > > > > > Problem description, An Euler elementary cantilever beam with point > load at the end along -y direction. Each node has 2 DOF (deflection and > rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based > on the connectivity. Loop with elements in each processor to assemble the > global matrix with same element stiffness matrix. The boundary condition is > set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_ > NULL_OBJECT,ierr); > > > > > > Based on what I have done, I find the computations work well, i.e the > results are correct compared with theoretical solution, for small mesh size > (small than 5000 elements) using both solvers with different numbers of > processes. > > > > > > However, there are several confusing issues when I increase the mesh > size to 10000 and more elements with iterative solve(CG + PCBJACOBI) > > > > > > 1. For 10k elements, I can get accurate solution using iterative > solver with uni-processor(i.e. only one process). However, when I use 2-8 > processes, it tells the linear solver converged with different iterations, > but, the results are all different for different processes and erroneous. > The wired thing is when I use >9 processes, the results are correct again. > I am really confused by this. Could you explain me why? If my > parallelization is not correct, why it works for small cases? And I check > the global matrix and RHS vector and didn't see any mallocs during the > process. > > > > > > 2. For 30k elements, if I use one process, it says: Linear solve did > not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for > large sparse matrix? If so, is there any stable solver or pc for large > problem? > > > > > > > > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can > only get accuracy when the number of elements are below 5000. There must be > something wrong. The way I use the superlu_dist solver is first convert > MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to > PCLU. Do I miss anything else to run SUPER_LU correctly? > > > > > > > > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the > sequential version of the same problem. The results shows that iterative > solver works well for <50k elements, while SUPER_LU only gets right > solution below 5k elements. Can I say iterative solver is better than > SUPER_LU for large problem? How can I improve the solver to copy with very > large problem, such as million by million? Another thing is it's still > doubtable of performance of SUPER_LU. > > > > > > For the inaccuracy issue, do you think it may be due to the memory? > However, there is no memory error showing during the execution. > > > > > > I really appreciate someone could resolve those puzzles above for me. > My goal is to replace the current SUPER_LU solver in my parallel CPFEM > main program with the iterative solver using PETSC. > > > > > > > > > Please let me if you would like to see my code in detail. > > > > > > Thank you very much. > > > > > > Bests, > > > Jinlei > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
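
For the MatZeroRowsColumns boundary-condition call quoted above, a short C sketch of the same idea. Passing NULL for the last two arguments modifies only the matrix, which is fine for the homogeneous constraints used here; passing vectors lets PETSc also adjust the right-hand side for the eliminated columns, which matters once the fixed values are nonzero (variable names are illustrative):

  PetscInt    rows[2] = {0,1};    /* the two constrained DOFs */
  PetscScalar diag    = 1.0;
  /* NULL, NULL: only the matrix is changed, the RHS is left alone */
  MatZeroRowsColumns(SG,2,rows,diag,NULL,NULL);
  /* MatZeroRowsColumns(SG,2,rows,diag,x,b); would in addition update b
     using the prescribed values stored in the corresponding entries of x */
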
URL: From lailaizhu00 at gmail.com Tue Aug 9 10:27:43 2016 From: lailaizhu00 at gmail.com (Lailai Zhu) Date: Tue, 9 Aug 2016 17:27:43 +0200 Subject: [petsc-users] matrix-free method with preconditioning for a linear system Message-ID: <8953e07b-2712-6e9d-c30d-d0853665d7de@gmail.com> Hi, dear developers, I am currently facing such a problem. I would like to use a matrix-free method to solve a linear system. So I use 'MatCreateShell' with user-defined subroutine to evaluate matrix-vector product. This works well, but is much slower than the traditional matrix-assembling approach where different preconditioning can be applied. Now I would like to build a user-defined preconditioning for this matrix-free approach, then I used to 'PCSHELL'. However, petsc told me that '[0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Mat type shell'. I am using petsc 3.7.2. It seems that this user-defined PC is not supported with the 'MatCreateShell'. Is this indeed the case? Is there any way to circumvent this, for example by using 'MatCreateMFFD' or a SNSE artificially for a linear problem? Thanks in advance, best, lailai From hzhang at mcs.anl.gov Tue Aug 9 10:42:31 2016 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 9 Aug 2016 10:42:31 -0500 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: <51E07B0F-810B-492A-8330-3469E04D14F1@mcs.anl.gov> Message-ID: Jinlei: See http://www.mcs.anl.gov/petsc/documentation/faq.html#computers Hong Hi Barry, > > Thanks for your answer. > > But logically for large problem, we are always expecting to see paralleled > program perform better with regard to both speed and memory since each of > the multi-processes independently deal with its own submatrix, especially > for iterative solver, which is revealed by CG+BJ. > I just don't understand, in the computing with CG+ASM and SUPER_LU, why > the two-process is most inefficient among these cases. If this is due to > the communication cost compared with uni-process, why the speed goes down > for triple and more processes. I'm new to parallelism, could you speculate > any possible reason for such situation? > > Great thanks > > > > On Fri, Aug 5, 2016 at 10:09 PM, Barry Smith wrote: > >> >> > On Aug 5, 2016, at 5:58 PM, Jinlei Shen wrote: >> > >> > ?Hi, >> > >> > Thanks for your answers. >> > >> > I just figured out the issues which are mainly due to the >> ill-conditioning of my matrix. I found the conditional number blows up when >> the beam is discretized into large number of elements. >> > >> > Now, I am using the 1D bar model to solve the same problem. The good >> news is the solution is always accurate and stable even I discretized into >> 10 million elements. >> > >> > When I run the model with both iterative solver(CG+BJACOBI/ASM) and >> direct solver(SUPER_LU) in parallelization, I got the following results: >> > >> > Mesh size: 1 million unknowns >> > Processes 1 2 4 6 8 10 12 >> 16 20 >> > CG+BJ 0.36 0.22 0.15 0.12 0.11 0.1 0.096 0.097 >> 0.099 >> > CG+ASM 0.47 0.46 0.267 0.2 0.17 0.15 0.145 >> 0.16 0.15 >> > SUPER_LU_DIST 4.73 5.4 4.69 4.58 4.38 4.2 4.27 >> 4.28 4.38 >> > >> > It seems the CG+BJ works correctly, i.e. time decreases fast with a few >> more processes and reach stable with many more cores. >> > >> > However, I have some concerns about CG+ASM and SUPER_LU_DIST. The time >> of both two methods goes up when I use two processes compared with >> uniprocess. 
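
On the matrix-free question above (further down the thread the error turns out to come from the preconditioner calling MatGetDiagonal, which a shell matrix does not provide by default): a MATSHELL supports only the operations the user registers, so anything the preconditioner needs, such as the diagonal, has to be supplied explicitly. A rough sketch, with the user routines, sizes and context (MyMult, MyGetDiagonal, MyPCApply, nlocal, N, userctx, ksp) all hypothetical:

  extern PetscErrorCode MyMult(Mat,Vec,Vec);      /* user matrix-vector product    */
  extern PetscErrorCode MyGetDiagonal(Mat,Vec);   /* user diagonal, if a PC needs it */
  extern PetscErrorCode MyPCApply(PC,Vec,Vec);    /* user preconditioner application */

  Mat A;
  PC  pc;
  MatCreateShell(PETSC_COMM_WORLD,nlocal,nlocal,N,N,userctx,&A);
  MatShellSetOperation(A,MATOP_MULT,(void (*)(void))MyMult);
  MatShellSetOperation(A,MATOP_GET_DIAGONAL,(void (*)(void))MyGetDiagonal);
  KSPSetOperators(ksp,A,A);
  KSPGetPC(ksp,&pc);
  PCSetType(pc,PCSHELL);
  PCShellSetApply(pc,MyPCApply);
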
>> >> This is actually not surprising at all but since the mantra is >> "parallelism will always make things faster" it can confuse people. When >> run with one process the ASM and SuperLU_DIST utilize essentially >> sequential algorithms, when run with two processes they "switch" to >> parallel algorithms which simply are not as good as the essentially >> sequential algorithm that is obtained with one process hence they run >> slower. This is just life, there really isn't something one can do about it >> except to perhaps use a poorer quality algorithm on one process so that two >> processes look better but the goal of PETSc is not to make parallelism to >> look good but to provide efficient solvers (as best we can) for one and >> multiple processes. >> >> Barry >> >> >> >> >> > The tendency is more obvious when I use larger mesh size. >> > I especially doubt the results of SUPER_LU_DIST in parallelism since >> the overall expedition is very small which is not expected. >> > The runtime option I use for ASM pc and SUPER_LU_DIST solver is shown >> as below: >> > ASM preconditioner: -pc_type asm -pc_asm_type basic >> > SUPER_LU_DIST solver: -ksp_type preonly -pc_type lu >> -pc_factor_mat_solver_package superlu_dist >> > >> > I use same mpiexec -n np ./xxxx for all solvers. >> > >> > Am I using them correctly? If so, is there anyway to speed up the >> computation further, especially for SUPER_LU_DIST? >> > >> > Thank you very much! >> > >> > Bests, >> > Jinlei >> > >> > On Mon, Aug 1, 2016 at 2:10 PM, Matthew Knepley >> wrote: >> > On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: >> > Hi Barry, >> > >> > Thanks for your reply. >> > >> > Firstly, as you suggested, I checked my program under valgrind. The >> results for both sequential and parallel cases showed there are no memory >> errors detected. >> > >> > Second, I coded a sequential program without using PETSC to generate >> the global matrix of small mesh for the same problem. I then checked the >> matrix both from petsc(sequential and parallel) and serial code, and they >> are same. >> > The way I assembled the global matrix in parallel is first distributing >> the nodes and elements into processes, then I loop with elements on the >> calling process to put the element stiffness into the global. Since the >> nodes and elements in cantilever beam are numbered successively, the >> connectivity is simple. I didn't use any partition tools to optimize mesh. >> It's also easy to determine the preallocation d_nnz and o_nnz since each >> node only connects the left and right nodes except for beginning and end, >> the maximum nonzeros in each row is 6. The MatSetValue process is shown as >> follows: >> > do iEL = idElStart, idElEnd >> > g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) >> > call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) >> > end do >> > where idElStart and idElEnd are the global number of first element and >> end element that the process owns, g_EL is the global index for DOF in >> element iEL, SE is the element stiffness which is same for all elements. >> > From above assembling, most of the elements are assembled within own >> process while there are few elements crossing two processes. >> > >> > The BC for my problem(cantilever under end point load) is to fix the >> first two DOF, so I called the MatZeroRowsColumns to set the first two rows >> and columns into zero with diagonal equal to one, without changing the RHS. 
>> > >> > Now some new issues show up : >> > >> > I run with -ksp_monitor_true_residual and -ksp_converged_reason, the >> monitor showed two different residues, one is the residue I can >> set(preconditioned, unpreconditioned, natural), the other is called true >> residue. >> > ?? >> > I initially thought the true residue is same as unpreconditioned based >> on definition. But it seems not true. Is it the norm of the residue (b-Ax) >> between computed RHS and true RHS? But, how to understand unprecondition >> residue since its definition is b-Ax as well? >> > >> > It is the unpreconditioned residual. You must be misinterpreting. And >> we could determine exactly if you sent the output with the suggested >> options. >> > >> > Can I set the true residue as my converging criteria? >> > >> > Use right preconditioning. >> > >> > I found the accuracy of large mesh in my problem didn't necessary >> depend on the tolerance I set, either preconditioned or unpreconditioned, >> sometimes, it showed converged while the solution is not correct. But the >> true residue looks reflecting the true convergence very well, if the true >> residue is diverging, no matter what the first residue says, the results >> are bad! >> > >> > Yes, your preconditioner looks singular. Note that BJACOBI has an inner >> solver, and by default the is GMRES/ILU(0). I think >> > ILU(0) is really ill-conditioned for your problem. >> > >> > For the preconditioner concerns, actually, I used BJACOBI before I sent >> the first email, since the JACOBI or PBJACOBI didn't even converge when the >> size was large. >> > But BJACOBI also didn't perform well in the paralleliztion for large >> mesh as posed in my last email, while it's fine for small size (below 10k >> elements) >> > >> > Yesterday, I tried the ASM with CG using the runtime option: -pc_type >> asm -pc_asm_type basic -sub_pc_type lu (default is ilu). >> > For 15k elements mesh, I am now able to get the correct answer with >> 1-3, 6 and more processes, using either -sub_pc_type lu or ilu. >> > >> > Yes, LU works for your subdomain solver. >> > >> > Based on all the results I have got, it shows the results varies a lot >> with different PC and seems ASM is better for large problem. >> > >> > Its not ASM so much as an LU subsolver that is better. >> > >> > But what is the major factor to produce such difference between >> different PCs, since it's not just the issue of computational efficiency, >> but also the accuracy. >> > Also, I noticed for large mesh, the solution is unstable with small >> number of processes, for the 15k case, the solution is not correct with 4 >> and 5 processes, however, the solution becomes always correct with more >> than 6 processes. For the 50k mesh case, more processes are required to >> show the stability. >> > >> > Yes, partitioning is very important here. Since you do not have a good >> partition, you can get these wild variations. >> > >> > Thanks, >> > >> > Matt >> > >> > What do you think about this? Anything wrong? >> > Since the iterative solver in parallel is first computed locally(if >> this is correct), can it be possible that there are 'good' and 'bad' locals >> when dividing the global matrix, and the result from 'bad' local will >> contaminate the global results. But with more processes, such risk is >> reduced. >> > >> > It is highly appreciated if you could give me some instruction for >> above questions. >> > >> > Thank you very much. 
>> > >> > Bests, >> > Jinlei >> > >> > >> > On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith >> wrote: >> > >> > First run under valgrind all the cases to make sure there is not >> some use of uninitialized data or overwriting of data. Go to >> http://www.mcs.anl.gov/petsc follow the link to FAQ and search for >> valgrind (the web server seems to be broken at the moment). >> > >> > Second it is possible that your code the assembles the matrices and >> vectors is not correctly assembling it for either the sequential or >> parallel case. Hence a different number of processes could be generating a >> different linear system hence inconsistent results. How are you handling >> the parallelism? How do you know the matrix generated in parallel is >> identically to that sequentially? >> > >> > Simple preconditioners such as pbjacobi will converge slower and slower >> with more elements. >> > >> > Note that you should run with -ksp_monitor_true_residual and >> -ksp_converged_reason to make sure that the iterative solver is even >> converging. By default PETSc KSP solvers do not stop with a big error >> message if they do not converge so you need make sure they are always >> converging. >> > >> > Barry >> > >> > >> > >> > > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: >> > > >> > > Dear PETSC developers, >> > > >> > > Thank you for developing such a powerful tool for scientific >> computations. >> > > >> > > I'm currently trying to run a simple cantilever beam FEM to test the >> scalability of PETSC on multi-processors. I also want to verify whether >> iterative solver or direct solver is more efficient for parallel large FEM >> problem. >> > > >> > > Problem description, An Euler elementary cantilever beam with point >> load at the end along -y direction. Each node has 2 DOF (deflection and >> rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based >> on the connectivity. Loop with elements in each processor to assemble the >> global matrix with same element stiffness matrix. The boundary condition is >> set using call MatZeroRowsColumns(SG,2,g_BC,o >> ne,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); >> > > >> > > Based on what I have done, I find the computations work well, i.e the >> results are correct compared with theoretical solution, for small mesh size >> (small than 5000 elements) using both solvers with different numbers of >> processes. >> > > >> > > However, there are several confusing issues when I increase the mesh >> size to 10000 and more elements with iterative solve(CG + PCBJACOBI) >> > > >> > > 1. For 10k elements, I can get accurate solution using iterative >> solver with uni-processor(i.e. only one process). However, when I use 2-8 >> processes, it tells the linear solver converged with different iterations, >> but, the results are all different for different processes and erroneous. >> The wired thing is when I use >9 processes, the results are correct again. >> I am really confused by this. Could you explain me why? If my >> parallelization is not correct, why it works for small cases? And I check >> the global matrix and RHS vector and didn't see any mallocs during the >> process. >> > > >> > > 2. For 30k elements, if I use one process, it says: Linear solve did >> not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for >> large sparse matrix? If so, is there any stable solver or pc for large >> problem? 
>> > > >> > > >> > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I >> can only get accuracy when the number of elements are below 5000. There >> must be something wrong. The way I use the superlu_dist solver is first >> convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change >> the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? >> > > >> > > >> > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the >> sequential version of the same problem. The results shows that iterative >> solver works well for <50k elements, while SUPER_LU only gets right >> solution below 5k elements. Can I say iterative solver is better than >> SUPER_LU for large problem? How can I improve the solver to copy with very >> large problem, such as million by million? Another thing is it's still >> doubtable of performance of SUPER_LU. >> > > >> > > For the inaccuracy issue, do you think it may be due to the memory? >> However, there is no memory error showing during the execution. >> > > >> > > I really appreciate someone could resolve those puzzles above for me. >> My goal is to replace the current SUPER_LU solver in my parallel CPFEM >> main program with the iterative solver using PETSC. >> > > >> > > >> > > Please let me if you would like to see my code in detail. >> > > >> > > Thank you very much. >> > > >> > > Bests, >> > > Jinlei >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 9 10:48:02 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 Aug 2016 10:48:02 -0500 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: <51E07B0F-810B-492A-8330-3469E04D14F1@mcs.anl.gov> Message-ID: On Tue, Aug 9, 2016 at 9:52 AM, Jinlei Shen wrote: > Hi Barry, > > Thanks for your answer. > > But logically for large problem, we are always expecting to see paralleled > program perform better with regard to both speed and memory since each of > the multi-processes independently deal with its own submatrix, especially > for iterative solver, which is revealed by CG+BJ. > I just don't understand, in the computing with CG+ASM and SUPER_LU, why > the two-process is most inefficient among these cases. If this is due to > the communication cost compared with uni-process, why the speed goes down > for triple and more processes. I'm new to parallelism, could you speculate > any possible reason for such situation? > As Barry noted, there are two reasons that you get slowdown: 1) Worse performance of existing algorithm This comes from things like communication. This is usually a small contributor to slowdown. 2) Different parallel algorithm This is what is causing your slowdown most likely. Matt > Great thanks > > > > On Fri, Aug 5, 2016 at 10:09 PM, Barry Smith wrote: > >> >> > On Aug 5, 2016, at 5:58 PM, Jinlei Shen wrote: >> > >> > ?Hi, >> > >> > Thanks for your answers. >> > >> > I just figured out the issues which are mainly due to the >> ill-conditioning of my matrix. I found the conditional number blows up when >> the beam is discretized into large number of elements. >> > >> > Now, I am using the 1D bar model to solve the same problem. 
The good >> news is the solution is always accurate and stable even I discretized into >> 10 million elements. >> > >> > When I run the model with both iterative solver(CG+BJACOBI/ASM) and >> direct solver(SUPER_LU) in parallelization, I got the following results: >> > >> > Mesh size: 1 million unknowns >> > Processes 1 2 4 6 8 10 12 >> 16 20 >> > CG+BJ 0.36 0.22 0.15 0.12 0.11 0.1 0.096 0.097 >> 0.099 >> > CG+ASM 0.47 0.46 0.267 0.2 0.17 0.15 0.145 >> 0.16 0.15 >> > SUPER_LU_DIST 4.73 5.4 4.69 4.58 4.38 4.2 4.27 >> 4.28 4.38 >> > >> > It seems the CG+BJ works correctly, i.e. time decreases fast with a few >> more processes and reach stable with many more cores. >> > >> > However, I have some concerns about CG+ASM and SUPER_LU_DIST. The time >> of both two methods goes up when I use two processes compared with >> uniprocess. >> >> This is actually not surprising at all but since the mantra is >> "parallelism will always make things faster" it can confuse people. When >> run with one process the ASM and SuperLU_DIST utilize essentially >> sequential algorithms, when run with two processes they "switch" to >> parallel algorithms which simply are not as good as the essentially >> sequential algorithm that is obtained with one process hence they run >> slower. This is just life, there really isn't something one can do about it >> except to perhaps use a poorer quality algorithm on one process so that two >> processes look better but the goal of PETSc is not to make parallelism to >> look good but to provide efficient solvers (as best we can) for one and >> multiple processes. >> >> Barry >> >> >> >> >> > The tendency is more obvious when I use larger mesh size. >> > I especially doubt the results of SUPER_LU_DIST in parallelism since >> the overall expedition is very small which is not expected. >> > The runtime option I use for ASM pc and SUPER_LU_DIST solver is shown >> as below: >> > ASM preconditioner: -pc_type asm -pc_asm_type basic >> > SUPER_LU_DIST solver: -ksp_type preonly -pc_type lu >> -pc_factor_mat_solver_package superlu_dist >> > >> > I use same mpiexec -n np ./xxxx for all solvers. >> > >> > Am I using them correctly? If so, is there anyway to speed up the >> computation further, especially for SUPER_LU_DIST? >> > >> > Thank you very much! >> > >> > Bests, >> > Jinlei >> > >> > On Mon, Aug 1, 2016 at 2:10 PM, Matthew Knepley >> wrote: >> > On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: >> > Hi Barry, >> > >> > Thanks for your reply. >> > >> > Firstly, as you suggested, I checked my program under valgrind. The >> results for both sequential and parallel cases showed there are no memory >> errors detected. >> > >> > Second, I coded a sequential program without using PETSC to generate >> the global matrix of small mesh for the same problem. I then checked the >> matrix both from petsc(sequential and parallel) and serial code, and they >> are same. >> > The way I assembled the global matrix in parallel is first distributing >> the nodes and elements into processes, then I loop with elements on the >> calling process to put the element stiffness into the global. Since the >> nodes and elements in cantilever beam are numbered successively, the >> connectivity is simple. I didn't use any partition tools to optimize mesh. >> It's also easy to determine the preallocation d_nnz and o_nnz since each >> node only connects the left and right nodes except for beginning and end, >> the maximum nonzeros in each row is 6. 
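
As a possible illustration of how that nonzero estimate is handed to PETSc before assembly, sketched for the MPIBAIJ matrix with block size 2 used here: with at most 6 nonzeros per scalar row, each block row touches at most 3 block columns (itself and its two neighbours). The exact d_nnz and o_nnz arrays computed by the author would be tighter than these scalar upper bounds (names illustrative):

  MatCreate(PETSC_COMM_WORLD,&SG);
  MatSetSizes(SG,nlocal,nlocal,PETSC_DECIDE,PETSC_DECIDE);   /* nlocal: local rows, multiple of bs */
  MatSetType(SG,MATMPIBAIJ);
  /* bs=2; at most 3 block columns in the diagonal part, 1 in the off-diagonal part */
  MatMPIBAIJSetPreallocation(SG,2,3,NULL,1,NULL);
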
The MatSetValue process is shown as >> follows: >> > do iEL = idElStart, idElEnd >> > g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) >> > call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) >> > end do >> > where idElStart and idElEnd are the global number of first element and >> end element that the process owns, g_EL is the global index for DOF in >> element iEL, SE is the element stiffness which is same for all elements. >> > From above assembling, most of the elements are assembled within own >> process while there are few elements crossing two processes. >> > >> > The BC for my problem(cantilever under end point load) is to fix the >> first two DOF, so I called the MatZeroRowsColumns to set the first two rows >> and columns into zero with diagonal equal to one, without changing the RHS. >> > >> > Now some new issues show up : >> > >> > I run with -ksp_monitor_true_residual and -ksp_converged_reason, the >> monitor showed two different residues, one is the residue I can >> set(preconditioned, unpreconditioned, natural), the other is called true >> residue. >> > ?? >> > I initially thought the true residue is same as unpreconditioned based >> on definition. But it seems not true. Is it the norm of the residue (b-Ax) >> between computed RHS and true RHS? But, how to understand unprecondition >> residue since its definition is b-Ax as well? >> > >> > It is the unpreconditioned residual. You must be misinterpreting. And >> we could determine exactly if you sent the output with the suggested >> options. >> > >> > Can I set the true residue as my converging criteria? >> > >> > Use right preconditioning. >> > >> > I found the accuracy of large mesh in my problem didn't necessary >> depend on the tolerance I set, either preconditioned or unpreconditioned, >> sometimes, it showed converged while the solution is not correct. But the >> true residue looks reflecting the true convergence very well, if the true >> residue is diverging, no matter what the first residue says, the results >> are bad! >> > >> > Yes, your preconditioner looks singular. Note that BJACOBI has an inner >> solver, and by default the is GMRES/ILU(0). I think >> > ILU(0) is really ill-conditioned for your problem. >> > >> > For the preconditioner concerns, actually, I used BJACOBI before I sent >> the first email, since the JACOBI or PBJACOBI didn't even converge when the >> size was large. >> > But BJACOBI also didn't perform well in the paralleliztion for large >> mesh as posed in my last email, while it's fine for small size (below 10k >> elements) >> > >> > Yesterday, I tried the ASM with CG using the runtime option: -pc_type >> asm -pc_asm_type basic -sub_pc_type lu (default is ilu). >> > For 15k elements mesh, I am now able to get the correct answer with >> 1-3, 6 and more processes, using either -sub_pc_type lu or ilu. >> > >> > Yes, LU works for your subdomain solver. >> > >> > Based on all the results I have got, it shows the results varies a lot >> with different PC and seems ASM is better for large problem. >> > >> > Its not ASM so much as an LU subsolver that is better. >> > >> > But what is the major factor to produce such difference between >> different PCs, since it's not just the issue of computational efficiency, >> but also the accuracy. >> > Also, I noticed for large mesh, the solution is unstable with small >> number of processes, for the 15k case, the solution is not correct with 4 >> and 5 processes, however, the solution becomes always correct with more >> than 6 processes. 
For the 50k mesh case, more processes are required to >> show the stability. >> > >> > Yes, partitioning is very important here. Since you do not have a good >> partition, you can get these wild variations. >> > >> > Thanks, >> > >> > Matt >> > >> > What do you think about this? Anything wrong? >> > Since the iterative solver in parallel is first computed locally(if >> this is correct), can it be possible that there are 'good' and 'bad' locals >> when dividing the global matrix, and the result from 'bad' local will >> contaminate the global results. But with more processes, such risk is >> reduced. >> > >> > It is highly appreciated if you could give me some instruction for >> above questions. >> > >> > Thank you very much. >> > >> > Bests, >> > Jinlei >> > >> > >> > On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith >> wrote: >> > >> > First run under valgrind all the cases to make sure there is not >> some use of uninitialized data or overwriting of data. Go to >> http://www.mcs.anl.gov/petsc follow the link to FAQ and search for >> valgrind (the web server seems to be broken at the moment). >> > >> > Second it is possible that your code the assembles the matrices and >> vectors is not correctly assembling it for either the sequential or >> parallel case. Hence a different number of processes could be generating a >> different linear system hence inconsistent results. How are you handling >> the parallelism? How do you know the matrix generated in parallel is >> identically to that sequentially? >> > >> > Simple preconditioners such as pbjacobi will converge slower and slower >> with more elements. >> > >> > Note that you should run with -ksp_monitor_true_residual and >> -ksp_converged_reason to make sure that the iterative solver is even >> converging. By default PETSc KSP solvers do not stop with a big error >> message if they do not converge so you need make sure they are always >> converging. >> > >> > Barry >> > >> > >> > >> > > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: >> > > >> > > Dear PETSC developers, >> > > >> > > Thank you for developing such a powerful tool for scientific >> computations. >> > > >> > > I'm currently trying to run a simple cantilever beam FEM to test the >> scalability of PETSC on multi-processors. I also want to verify whether >> iterative solver or direct solver is more efficient for parallel large FEM >> problem. >> > > >> > > Problem description, An Euler elementary cantilever beam with point >> load at the end along -y direction. Each node has 2 DOF (deflection and >> rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based >> on the connectivity. Loop with elements in each processor to assemble the >> global matrix with same element stiffness matrix. The boundary condition is >> set using call MatZeroRowsColumns(SG,2,g_BC,o >> ne,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); >> > > >> > > Based on what I have done, I find the computations work well, i.e the >> results are correct compared with theoretical solution, for small mesh size >> (small than 5000 elements) using both solvers with different numbers of >> processes. >> > > >> > > However, there are several confusing issues when I increase the mesh >> size to 10000 and more elements with iterative solve(CG + PCBJACOBI) >> > > >> > > 1. For 10k elements, I can get accurate solution using iterative >> solver with uni-processor(i.e. only one process). 
However, when I use 2-8 >> processes, it tells the linear solver converged with different iterations, >> but, the results are all different for different processes and erroneous. >> The wired thing is when I use >9 processes, the results are correct again. >> I am really confused by this. Could you explain me why? If my >> parallelization is not correct, why it works for small cases? And I check >> the global matrix and RHS vector and didn't see any mallocs during the >> process. >> > > >> > > 2. For 30k elements, if I use one process, it says: Linear solve did >> not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for >> large sparse matrix? If so, is there any stable solver or pc for large >> problem? >> > > >> > > >> > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I >> can only get accuracy when the number of elements are below 5000. There >> must be something wrong. The way I use the superlu_dist solver is first >> convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change >> the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? >> > > >> > > >> > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the >> sequential version of the same problem. The results shows that iterative >> solver works well for <50k elements, while SUPER_LU only gets right >> solution below 5k elements. Can I say iterative solver is better than >> SUPER_LU for large problem? How can I improve the solver to copy with very >> large problem, such as million by million? Another thing is it's still >> doubtable of performance of SUPER_LU. >> > > >> > > For the inaccuracy issue, do you think it may be due to the memory? >> However, there is no memory error showing during the execution. >> > > >> > > I really appreciate someone could resolve those puzzles above for me. >> My goal is to replace the current SUPER_LU solver in my parallel CPFEM >> main program with the iterative solver using PETSC. >> > > >> > > >> > > Please let me if you would like to see my code in detail. >> > > >> > > Thank you very much. >> > > >> > > Bests, >> > > Jinlei >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 9 10:57:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 Aug 2016 10:57:01 -0500 Subject: [petsc-users] matrix-free method with preconditioning for a linear system In-Reply-To: <8953e07b-2712-6e9d-c30d-d0853665d7de@gmail.com> References: <8953e07b-2712-6e9d-c30d-d0853665d7de@gmail.com> Message-ID: On Tue, Aug 9, 2016 at 10:27 AM, Lailai Zhu wrote: > Hi, dear developers, > > I am currently facing such a problem. I would like to use a matrix-free > method to solve a linear system. So I use 'MatCreateShell' with > user-defined > subroutine to evaluate matrix-vector product. This works well, but is much > slower > than the traditional matrix-assembling approach where different > preconditioning > can be applied. 
> > Now I would like to build a user-defined preconditioning for this > matrix-free approach, > then I used to 'PCSHELL'. However, petsc told me that > '[0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Mat type shell'. > Send the ENTIRE error message. You should get a stack trace, which tells us where this is happening. Make sure you are checking all return values using CHKERRQ(). Thanks, Matt > I am using petsc 3.7.2. It seems that this user-defined PC is not > supported with the 'MatCreateShell'. Is this > indeed the case? Is there any way to circumvent this, for example by using > 'MatCreateMFFD' > or a SNSE artificially for a linear problem? Thanks in advance, > > best, > lailai > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lailaizhu00 at gmail.com Tue Aug 9 11:11:46 2016 From: lailaizhu00 at gmail.com (Lailai Zhu) Date: Tue, 9 Aug 2016 18:11:46 +0200 Subject: [petsc-users] matrix-free method with preconditioning for a linear system In-Reply-To: References: <8953e07b-2712-6e9d-c30d-d0853665d7de@gmail.com> Message-ID: <2ebff100-654e-53bd-68da-6f8123c17bfa@gmail.com> hi, Dear Matt, thanks very much. I think I found the problem, in the pcshell part, i was using 'MatGetDiagonal' which is not supported for the shell-type matrix. This caused the problem. best, lailai On 2016/8/9 17:57, Matthew Knepley wrote: > On Tue, Aug 9, 2016 at 10:27 AM, Lailai Zhu > wrote: > > Hi, dear developers, > > I am currently facing such a problem. I would like to use a > matrix-free > method to solve a linear system. So I use 'MatCreateShell' with > user-defined > subroutine to evaluate matrix-vector product. This works well, but > is much slower > than the traditional matrix-assembling approach where different > preconditioning > can be applied. > > Now I would like to build a user-defined preconditioning for this > matrix-free approach, > then I used to 'PCSHELL'. However, petsc told me that > '[0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Mat type shell'. > > > Send the ENTIRE error message. You should get a stack trace, which > tells us where this > is happening. Make sure you are checking all return values using > CHKERRQ(). > > Thanks, > > Matt > > I am using petsc 3.7.2. It seems that this user-defined PC is not > supported with the 'MatCreateShell'. Is this > indeed the case? Is there any way to circumvent this, for example > by using 'MatCreateMFFD' > or a SNSE artificially for a linear problem? Thanks in advance, > > best, > lailai > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 9 12:55:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 9 Aug 2016 12:55:14 -0500 Subject: [petsc-users] Meaning of Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end In-Reply-To: References: Message-ID: Almost for sure the process has swallowed up all the system memory and so the batch system or os has killed the job. Likely you need to run on more nodes to solve this large a problem. 
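
A back-of-the-envelope estimate, not from the thread, of why this problem size can exhaust a node's memory: for a 500x500x500 DMDA with one degree of freedom per point and a 7-point (star) stencil, the assembled AIJ matrix alone needs roughly

  rows:         500^3 = 1.25e8 unknowns
  nonzeros:     about 7 per row, so 8.75e8
  AIJ storage:  8.75e8 x (8-byte value + 4-byte column index), about 10.5 GB
  plus row offsets (about 0.5 GB), DMDA vectors (about 1 GB each) and SNES/KSP work vectors

so a single node, or a small allocation, can easily run out of memory during DMCreateMatrix, which is consistent with the batch system killing the job; distributing the problem over more nodes, as suggested above, is the natural fix.
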
Barry > On Aug 9, 2016, at 2:24 AM, Athena Paz wrote: > > Hi all, > > I'm very new to PETSC. I'm trying to solve a diffusion problem in 3D. I tried running a 500 x 500 x 500 grid using 20 processors but I encounter the following error: > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] MatSetValues_SeqAIJ line 441 /home/paz/petsc-3.7.3/src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: [0] MatSetValues line 1157 /home/paz/petsc-3.7.3/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] MatSetValuesLocal line 2019 /home/paz/petsc-3.7.3/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] DMCreateMatrix_DA_3d_MPIAIJ line 1036 /home/paz/petsc-3.7.3/src/dm/impls/da/fdda.c > [0]PETSC ERROR: [0] DMCreateMatrix_DA line 625 /home/paz/petsc-3.7.3/src/dm/impls/da/fdda.c > [0]PETSC ERROR: [0] DMCreateMatrix line 1171 /home/paz/petsc-3.7.3/src/dm/interface/dm.c > [0]PETSC ERROR: [0] SNESSetUpMatrices line 579 /home/paz/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSetUp_NEWTONLS line 303 /home/paz/petsc-3.7.3/src/snes/impls/ls/ls.c > [0]PETSC ERROR: [0] SNESSetUp line 2661 /home/paz/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve line 3958 /home/paz/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./ex7 on a arch-linux2-c-debug named akagi by paz Tue Aug 9 16:01:17 2016 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich --with-debugging > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > What does this mean? I am able to run the code successfully with a 300x300x300 grid size. I also tried using -malloc_debug and valgrind as suggested in the Debugging FAQ for a small grid size and the code comes out clean. Any help is much appreciated! > > > Thank you all for your time! 
Have a great day! > > > Athena From bsmith at mcs.anl.gov Tue Aug 9 13:16:53 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 9 Aug 2016 13:16:53 -0500 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: <51E07B0F-810B-492A-8330-3469E04D14F1@mcs.anl.gov> Message-ID: <96BD4481-CE14-47A1-909E-037417B182D2@mcs.anl.gov> > On Aug 9, 2016, at 10:48 AM, Matthew Knepley wrote: > > On Tue, Aug 9, 2016 at 9:52 AM, Jinlei Shen wrote: > Hi Barry, > > Thanks for your answer. > > But logically for large problem, we are always expecting to see paralleled program perform better with regard to both speed and memory since each of the multi-processes independently deal with its own submatrix, especially for iterative solver, which is revealed by CG+BJ. > I just don't understand, in the computing with CG+ASM and SUPER_LU, why the two-process is most inefficient among these cases. If this is due to the communication cost compared with uni-process, why the speed goes down for triple and more processes. I'm new to parallelism, could you speculate any possible reason for such situation? > > As Barry noted, there are two reasons that you get slowdown: > > 1) Worse performance of existing algorithm > > This comes from things like communication. This is usually a small contributor to slowdown. > > 2) Different parallel algorithm > > This is what is causing your slowdown most likely. Yes, this is the one that matters. If you could look at the total amount of work needed in the computation on 1 and 2 processes (measuring work even by the simple minded count of total floating point operations; note we don't actually have a good way of measuring the total amount of work) you would see that the amount of work on 2 processes is a great deal more than the amount of work on 1 process. Hence it is completely natural using 2 processes will not take half the time of using 1 process. Barry > > Matt > > Great thanks > > > > On Fri, Aug 5, 2016 at 10:09 PM, Barry Smith wrote: > > > On Aug 5, 2016, at 5:58 PM, Jinlei Shen wrote: > > > > ?Hi, > > > > Thanks for your answers. > > > > I just figured out the issues which are mainly due to the ill-conditioning of my matrix. I found the conditional number blows up when the beam is discretized into large number of elements. > > > > Now, I am using the 1D bar model to solve the same problem. The good news is the solution is always accurate and stable even I discretized into 10 million elements. > > > > When I run the model with both iterative solver(CG+BJACOBI/ASM) and direct solver(SUPER_LU) in parallelization, I got the following results: > > > > Mesh size: 1 million unknowns > > Processes 1 2 4 6 8 10 12 16 20 > > CG+BJ 0.36 0.22 0.15 0.12 0.11 0.1 0.096 0.097 0.099 > > CG+ASM 0.47 0.46 0.267 0.2 0.17 0.15 0.145 0.16 0.15 > > SUPER_LU_DIST 4.73 5.4 4.69 4.58 4.38 4.2 4.27 4.28 4.38 > > > > It seems the CG+BJ works correctly, i.e. time decreases fast with a few more processes and reach stable with many more cores. > > > > However, I have some concerns about CG+ASM and SUPER_LU_DIST. The time of both two methods goes up when I use two processes compared with uniprocess. > > This is actually not surprising at all but since the mantra is "parallelism will always make things faster" it can confuse people. 
When run with one process the ASM and SuperLU_DIST utilize essentially sequential algorithms, when run with two processes they "switch" to parallel algorithms which simply are not as good as the essentially sequential algorithm that is obtained with one process hence they run slower. This is just life, there really isn't something one can do about it except to perhaps use a poorer quality algorithm on one process so that two processes look better but the goal of PETSc is not to make parallelism to look good but to provide efficient solvers (as best we can) for one and multiple processes. > > Barry > > > > > > The tendency is more obvious when I use larger mesh size. > > I especially doubt the results of SUPER_LU_DIST in parallelism since the overall expedition is very small which is not expected. > > The runtime option I use for ASM pc and SUPER_LU_DIST solver is shown as below: > > ASM preconditioner: -pc_type asm -pc_asm_type basic > > SUPER_LU_DIST solver: -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist > > > > I use same mpiexec -n np ./xxxx for all solvers. > > > > Am I using them correctly? If so, is there anyway to speed up the computation further, especially for SUPER_LU_DIST? > > > > Thank you very much! > > > > Bests, > > Jinlei > > > > On Mon, Aug 1, 2016 at 2:10 PM, Matthew Knepley wrote: > > On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: > > Hi Barry, > > > > Thanks for your reply. > > > > Firstly, as you suggested, I checked my program under valgrind. The results for both sequential and parallel cases showed there are no memory errors detected. > > > > Second, I coded a sequential program without using PETSC to generate the global matrix of small mesh for the same problem. I then checked the matrix both from petsc(sequential and parallel) and serial code, and they are same. > > The way I assembled the global matrix in parallel is first distributing the nodes and elements into processes, then I loop with elements on the calling process to put the element stiffness into the global. Since the nodes and elements in cantilever beam are numbered successively, the connectivity is simple. I didn't use any partition tools to optimize mesh. It's also easy to determine the preallocation d_nnz and o_nnz since each node only connects the left and right nodes except for beginning and end, the maximum nonzeros in each row is 6. The MatSetValue process is shown as follows: > > do iEL = idElStart, idElEnd > > g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) > > call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) > > end do > > where idElStart and idElEnd are the global number of first element and end element that the process owns, g_EL is the global index for DOF in element iEL, SE is the element stiffness which is same for all elements. > > From above assembling, most of the elements are assembled within own process while there are few elements crossing two processes. > > > > The BC for my problem(cantilever under end point load) is to fix the first two DOF, so I called the MatZeroRowsColumns to set the first two rows and columns into zero with diagonal equal to one, without changing the RHS. > > > > Now some new issues show up : > > > > I run with -ksp_monitor_true_residual and -ksp_converged_reason, the monitor showed two different residues, one is the residue I can set(preconditioned, unpreconditioned, natural), the other is called true residue. > > ?? > > I initially thought the true residue is same as unpreconditioned based on definition. 
But it seems not true. Is it the norm of the residue (b-Ax) between computed RHS and true RHS? But, how to understand unprecondition residue since its definition is b-Ax as well? > > > > It is the unpreconditioned residual. You must be misinterpreting. And we could determine exactly if you sent the output with the suggested options. > > > > Can I set the true residue as my converging criteria? > > > > Use right preconditioning. > > > > I found the accuracy of large mesh in my problem didn't necessary depend on the tolerance I set, either preconditioned or unpreconditioned, sometimes, it showed converged while the solution is not correct. But the true residue looks reflecting the true convergence very well, if the true residue is diverging, no matter what the first residue says, the results are bad! > > > > Yes, your preconditioner looks singular. Note that BJACOBI has an inner solver, and by default the is GMRES/ILU(0). I think > > ILU(0) is really ill-conditioned for your problem. > > > > For the preconditioner concerns, actually, I used BJACOBI before I sent the first email, since the JACOBI or PBJACOBI didn't even converge when the size was large. > > But BJACOBI also didn't perform well in the paralleliztion for large mesh as posed in my last email, while it's fine for small size (below 10k elements) > > > > Yesterday, I tried the ASM with CG using the runtime option: -pc_type asm -pc_asm_type basic -sub_pc_type lu (default is ilu). > > For 15k elements mesh, I am now able to get the correct answer with 1-3, 6 and more processes, using either -sub_pc_type lu or ilu. > > > > Yes, LU works for your subdomain solver. > > > > Based on all the results I have got, it shows the results varies a lot with different PC and seems ASM is better for large problem. > > > > Its not ASM so much as an LU subsolver that is better. > > > > But what is the major factor to produce such difference between different PCs, since it's not just the issue of computational efficiency, but also the accuracy. > > Also, I noticed for large mesh, the solution is unstable with small number of processes, for the 15k case, the solution is not correct with 4 and 5 processes, however, the solution becomes always correct with more than 6 processes. For the 50k mesh case, more processes are required to show the stability. > > > > Yes, partitioning is very important here. Since you do not have a good partition, you can get these wild variations. > > > > Thanks, > > > > Matt > > > > What do you think about this? Anything wrong? > > Since the iterative solver in parallel is first computed locally(if this is correct), can it be possible that there are 'good' and 'bad' locals when dividing the global matrix, and the result from 'bad' local will contaminate the global results. But with more processes, such risk is reduced. > > > > It is highly appreciated if you could give me some instruction for above questions. > > > > Thank you very much. > > > > Bests, > > Jinlei > > > > > > On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith wrote: > > > > First run under valgrind all the cases to make sure there is not some use of uninitialized data or overwriting of data. Go to http://www.mcs.anl.gov/petsc follow the link to FAQ and search for valgrind (the web server seems to be broken at the moment). > > > > Second it is possible that your code the assembles the matrices and vectors is not correctly assembling it for either the sequential or parallel case. 
Hence a different number of processes could be generating a different linear system hence inconsistent results. How are you handling the parallelism? How do you know the matrix generated in parallel is identically to that sequentially? > > > > Simple preconditioners such as pbjacobi will converge slower and slower with more elements. > > > > Note that you should run with -ksp_monitor_true_residual and -ksp_converged_reason to make sure that the iterative solver is even converging. By default PETSc KSP solvers do not stop with a big error message if they do not converge so you need make sure they are always converging. > > > > Barry > > > > > > > > > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: > > > > > > Dear PETSC developers, > > > > > > Thank you for developing such a powerful tool for scientific computations. > > > > > > I'm currently trying to run a simple cantilever beam FEM to test the scalability of PETSC on multi-processors. I also want to verify whether iterative solver or direct solver is more efficient for parallel large FEM problem. > > > > > > Problem description, An Euler elementary cantilever beam with point load at the end along -y direction. Each node has 2 DOF (deflection and rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based on the connectivity. Loop with elements in each processor to assemble the global matrix with same element stiffness matrix. The boundary condition is set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); > > > > > > Based on what I have done, I find the computations work well, i.e the results are correct compared with theoretical solution, for small mesh size (small than 5000 elements) using both solvers with different numbers of processes. > > > > > > However, there are several confusing issues when I increase the mesh size to 10000 and more elements with iterative solve(CG + PCBJACOBI) > > > > > > 1. For 10k elements, I can get accurate solution using iterative solver with uni-processor(i.e. only one process). However, when I use 2-8 processes, it tells the linear solver converged with different iterations, but, the results are all different for different processes and erroneous. The wired thing is when I use >9 processes, the results are correct again. I am really confused by this. Could you explain me why? If my parallelization is not correct, why it works for small cases? And I check the global matrix and RHS vector and didn't see any mallocs during the process. > > > > > > 2. For 30k elements, if I use one process, it says: Linear solve did not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for large sparse matrix? If so, is there any stable solver or pc for large problem? > > > > > > > > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can only get accuracy when the number of elements are below 5000. There must be something wrong. The way I use the superlu_dist solver is first convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? > > > > > > > > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the sequential version of the same problem. The results shows that iterative solver works well for <50k elements, while SUPER_LU only gets right solution below 5k elements. Can I say iterative solver is better than SUPER_LU for large problem? How can I improve the solver to copy with very large problem, such as million by million? 
Another thing is it's still doubtable of performance of SUPER_LU. > > > > > > For the inaccuracy issue, do you think it may be due to the memory? However, there is no memory error showing during the execution. > > > > > > I really appreciate someone could resolve those puzzles above for me. My goal is to replace the current SUPER_LU solver in my parallel CPFEM main program with the iterative solver using PETSC. > > > > > > > > > Please let me if you would like to see my code in detail. > > > > > > Thank you very much. > > > > > > Bests, > > > Jinlei > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From doss0032 at umn.edu Tue Aug 9 14:08:02 2016 From: doss0032 at umn.edu (Scott Dossa) Date: Tue, 9 Aug 2016 14:08:02 -0500 Subject: [petsc-users] function set by TSSetRHSJacobian() not being called Message-ID: Hi all, I adapted ex13.c from .../ts/examples/tutorials/ex13.c to solve a 1D TD PDE (swift-hohenberg). *Long story short*: the function which sets the Jacobian is never being called despite being assigned by TSSetRHSJacobian(). I create the TS environment with the following commands (omitting some code before and after -- Full code attached): --------------------------------------------------------------------------------------------------------------------------------------------------------------------- ierr = DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 15, 1, 1, NULL, &da); CHKERRQ(ierr); /*================================================================== Let DMDA create vector ==================================================================*/ ierr = DMCreateGlobalVector(da, &x);CHKERRQ(ierr); /*================================================================== Create timestepper context which will use Euler Method ==================================================================*/ ierr = TSCreate(PETSC_COMM_WORLD,&ts);CHKERRQ(ierr); ierr = TSSetDM(ts, da);CHKERRQ(ierr); ierr = TSSetType(ts, TSEULER);CHKERRQ(ierr); ierr = TSSetRHSFunction(ts, x, RHSFunction, NULL);CHKERRQ(ierr); /*================================================================= Create Jacobian for Euler Method =================================================================*/ ierr = DMSetMatType(da, MATAIJ);CHKERRQ(ierr); ierr = DMCreateMatrix(da,&J);CHKERRQ(ierr); ierr = TSSetRHSJacobian(ts,J,J,RHSJacobian,NULL); CHKERRQ(ierr) --------------------------------------------------------------------------------------------------------------------------------------------------------------------- I define the function RHSJacobian later, PetscPrintf(PETSC_COMM_WORLD, "Jacobian function called") is used to see if the function is ever called. Neither is the message ever called, nor is the Jacobian matrix ever non-zero. The J matrix is checked with MatView( ). No error messages are produced either at compiling nor runtime. Has anyone else run into such an issue or have any advice of how best to probe the problem? Thanks so much! -Scott Dossa -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: swift.cpp Type: text/x-c++src Size: 8999 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Aug 9 14:13:43 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 9 Aug 2016 14:13:43 -0500 Subject: [petsc-users] function set by TSSetRHSJacobian() not being called In-Reply-To: References: Message-ID: The Euler method is an explicit method that does not require the Jacobian. Hence the Jacobian you provide is never used. If you use any implement method, for example backward Euler beuler then you will see the Jacobian being used. Barry > On Aug 9, 2016, at 2:08 PM, Scott Dossa wrote: > > Hi all, > > I adapted ex13.c from .../ts/examples/tutorials/ex13.c to solve a 1D TD PDE (swift-hohenberg). > Long story short: the function which sets the Jacobian is never being called despite being assigned by TSSetRHSJacobian(). > > I create the TS environment with the following commands (omitting some code before and after -- Full code attached): > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > ierr = DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 15, 1, 1, NULL, &da); CHKERRQ(ierr); > > /*================================================================== > Let DMDA create vector > ==================================================================*/ > > ierr = DMCreateGlobalVector(da, &x);CHKERRQ(ierr); > > /*================================================================== > Create timestepper context which will use Euler Method > ==================================================================*/ > > ierr = TSCreate(PETSC_COMM_WORLD,&ts);CHKERRQ(ierr); > ierr = TSSetDM(ts, da);CHKERRQ(ierr); > ierr = TSSetType(ts, TSEULER);CHKERRQ(ierr); > > ierr = TSSetRHSFunction(ts, x, RHSFunction, NULL);CHKERRQ(ierr); > > /*================================================================= > Create Jacobian for Euler Method > =================================================================*/ > ierr = DMSetMatType(da, MATAIJ);CHKERRQ(ierr); > ierr = DMCreateMatrix(da,&J);CHKERRQ(ierr); > ierr = TSSetRHSJacobian(ts,J,J,RHSJacobian,NULL); CHKERRQ(ierr) > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > I define the function RHSJacobian later, PetscPrintf(PETSC_COMM_WORLD, "Jacobian function called") is used to see if the function is ever called. > > Neither is the message ever called, nor is the Jacobian matrix ever non-zero. The J matrix is checked with MatView( ). No error messages are produced either at compiling nor runtime. > > Has anyone else run into such an issue or have any advice of how best to probe the problem? > Thanks so much! > -Scott Dossa > > > From hongzhang at anl.gov Tue Aug 9 14:34:52 2016 From: hongzhang at anl.gov (Hong Zhang) Date: Tue, 9 Aug 2016 14:34:52 -0500 Subject: [petsc-users] function set by TSSetRHSJacobian() not being called In-Reply-To: References: Message-ID: I guess Barry meant ?implicit method?. To use backward Euler, change TSEULER to TSBEULER in the function call of TSSetType(). Hong > On Aug 9, 2016, at 2:13 PM, Barry Smith wrote: > > > The Euler method is an explicit method that does not require the Jacobian. Hence the Jacobian you provide is never used. > > If you use any implement method, for example backward Euler beuler then you will see the Jacobian being used. 
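To make that concrete, here is a minimal, self-contained sketch (not the attached swift.cpp; the RHSFunction/RHSJacobian bodies below are placeholders with F(U) = U, for illustration only) of the change Hong suggests: keep the TSSetRHSJacobian() call, but select the implicit TSBEULER integrator, after which the Jacobian routine is actually invoked.

----------------------------------------------------------------------
static char help[] = "Sketch: backward Euler so the RHS Jacobian is used.\n";

#include <petscts.h>

/* Placeholder right-hand side F(U) = U, for illustration only */
static PetscErrorCode RHSFunction(TS ts, PetscReal t, Vec U, Vec F, void *ctx)
{
  PetscErrorCode ierr;
  PetscFunctionBeginUser;
  ierr = VecCopy(U, F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* dF/dU = I for the placeholder above; the signature matches TSRHSJacobian */
static PetscErrorCode RHSJacobian(TS ts, PetscReal t, Vec U, Mat J, Mat Jpre, void *ctx)
{
  PetscErrorCode ierr;
  PetscFunctionBeginUser;
  ierr = PetscPrintf(PETSC_COMM_WORLD, "Jacobian function called\n");CHKERRQ(ierr);
  ierr = MatZeroEntries(Jpre);CHKERRQ(ierr);
  ierr = MatShift(Jpre, 1.0);CHKERRQ(ierr);           /* set the diagonal to 1 */
  ierr = MatAssemblyBegin(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  if (J != Jpre) {
    ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  TS             ts;
  DM             da;
  Vec            x;
  Mat            J;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, help);if (ierr) return ierr;
  ierr = DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 15, 1, 1, NULL, &da);CHKERRQ(ierr);
  ierr = DMCreateGlobalVector(da, &x);CHKERRQ(ierr);
  ierr = VecSet(x, 1.0);CHKERRQ(ierr);

  ierr = TSCreate(PETSC_COMM_WORLD, &ts);CHKERRQ(ierr);
  ierr = TSSetDM(ts, da);CHKERRQ(ierr);
  ierr = TSSetType(ts, TSBEULER);CHKERRQ(ierr);       /* implicit: the Jacobian is now needed */
  ierr = TSSetRHSFunction(ts, NULL, RHSFunction, NULL);CHKERRQ(ierr);

  ierr = DMSetMatType(da, MATAIJ);CHKERRQ(ierr);
  ierr = DMCreateMatrix(da, &J);CHKERRQ(ierr);
  ierr = TSSetRHSJacobian(ts, J, J, RHSJacobian, NULL);CHKERRQ(ierr);

  ierr = TSSetDuration(ts, 10, 1.0);CHKERRQ(ierr);
  ierr = TSSetExactFinalTime(ts, TS_EXACTFINALTIME_STEPOVER);CHKERRQ(ierr);
  ierr = TSSetFromOptions(ts);CHKERRQ(ierr);
  ierr = TSSolve(ts, x);CHKERRQ(ierr);

  ierr = MatDestroy(&J);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  ierr = TSDestroy(&ts);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
----------------------------------------------------------------------

With TSEULER in place of TSBEULER the same code runs but never prints the message, which is exactly the behaviour reported above.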
> > Barry > >> On Aug 9, 2016, at 2:08 PM, Scott Dossa wrote: >> >> Hi all, >> >> I adapted ex13.c from .../ts/examples/tutorials/ex13.c to solve a 1D TD PDE (swift-hohenberg). >> Long story short: the function which sets the Jacobian is never being called despite being assigned by TSSetRHSJacobian(). >> >> I create the TS environment with the following commands (omitting some code before and after -- Full code attached): >> >> --------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> ierr = DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 15, 1, 1, NULL, &da); CHKERRQ(ierr); >> >> /*================================================================== >> Let DMDA create vector >> ==================================================================*/ >> >> ierr = DMCreateGlobalVector(da, &x);CHKERRQ(ierr); >> >> /*================================================================== >> Create timestepper context which will use Euler Method >> ==================================================================*/ >> >> ierr = TSCreate(PETSC_COMM_WORLD,&ts);CHKERRQ(ierr); >> ierr = TSSetDM(ts, da);CHKERRQ(ierr); >> ierr = TSSetType(ts, TSEULER);CHKERRQ(ierr); >> >> ierr = TSSetRHSFunction(ts, x, RHSFunction, NULL);CHKERRQ(ierr); >> >> /*================================================================= >> Create Jacobian for Euler Method >> =================================================================*/ >> ierr = DMSetMatType(da, MATAIJ);CHKERRQ(ierr); >> ierr = DMCreateMatrix(da,&J);CHKERRQ(ierr); >> ierr = TSSetRHSJacobian(ts,J,J,RHSJacobian,NULL); CHKERRQ(ierr) >> --------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> I define the function RHSJacobian later, PetscPrintf(PETSC_COMM_WORLD, "Jacobian function called") is used to see if the function is ever called. >> >> Neither is the message ever called, nor is the Jacobian matrix ever non-zero. The J matrix is checked with MatView( ). No error messages are produced either at compiling nor runtime. >> >> Has anyone else run into such an issue or have any advice of how best to probe the problem? >> Thanks so much! 
>> -Scott Dossa >> >> >> > From kyungjun.choi92 at gmail.com Tue Aug 9 15:29:28 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Wed, 10 Aug 2016 05:29:28 +0900 Subject: [petsc-users] Question about SNESSetFunction - FormFunction part Message-ID: Hi, I'm currently working on FormFunction routine my subroutine goes like this --> *subroutine FormFunction(snes, x, f, userctx, ierr)* Inside the above subroutine, the problem occurs when I try to use *VecGetArrayF90(x, xx_v, ierr)* The error pops up with this kind of message " *Vec is locked read only* " So I used *VecGetArrayReadF90*, but then I got these below [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] F90Array1dCreate line 50 /home/ckj/Repository/petsc-3.7.3/src/sys/f90-src/f90_cwrap.c [0]PETSC ERROR: [0] oursnesfunction line 84 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/ftn-custom/zsnesf.c [0]PETSC ERROR: [0] SNES user function line 2144 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESComputeFunction line 2129 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: [0] SNESSolve_NEWTONTR line 98 /home/ckj/Repository/petsc-3.7.3/src/snes/impls/tr/tr.c [0]PETSC ERROR: [0] SNESSolve line 3958 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c Please give me some help. Best regards. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 9 15:41:13 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 9 Aug 2016 15:41:13 -0500 Subject: [petsc-users] Question about SNESSetFunction - FormFunction part In-Reply-To: References: Message-ID: On Tue, Aug 9, 2016 at 3:29 PM, ??? wrote: > Hi, I'm currently working on FormFunction routine > > my subroutine goes like this > > --> *subroutine FormFunction(snes, x, f, userctx, ierr)* > > Inside the above subroutine, the problem occurs when I try to use *VecGetArrayF90(x, > xx_v, ierr)* > > The error pops up with this kind of message " *Vec is locked read only* " > Does SNES ex5f90 run for you? If so, then there must be a problem in your code. I would start with the example and change it slowly until you get what you want. 
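For reference, the read-only access pattern that the "Vec is locked read only" message is pointing at looks like this in C (a sketch only; the thread itself is Fortran, where the analogous calls are VecGetArrayReadF90()/VecRestoreArrayReadF90() and the array argument must be declared as a deferred-shape pointer):

----------------------------------------------------------------------
#include <petscsnes.h>

/* Sketch of a SNES residual callback: the state x may only be read,
   so it is accessed with VecGetArrayRead(); only f is written. */
static PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ctx)
{
  const PetscScalar *xx;     /* read-only view of the state */
  PetscScalar       *ff;     /* writable view of the residual */
  PetscInt           i, n;
  PetscErrorCode     ierr;

  PetscFunctionBeginUser;
  ierr = VecGetLocalSize(x, &n);CHKERRQ(ierr);
  ierr = VecGetArrayRead(x, &xx);CHKERRQ(ierr);
  ierr = VecGetArray(f, &ff);CHKERRQ(ierr);
  for (i = 0; i < n; i++) ff[i] = xx[i]*xx[i] - 2.0;  /* placeholder residual */
  ierr = VecRestoreArray(f, &ff);CHKERRQ(ierr);
  ierr = VecRestoreArrayRead(x, &xx);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
----------------------------------------------------------------------

Whether the segmentation fault reported above comes from the Fortran pointer declaration or from something else in the calling code is not settled in this thread; comparing against SNES ex5f90, as suggested, is the quickest way to find out.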
Thanks, Matt > So I used *VecGetArrayReadF90*, but then I got these below > > [0]PETSC ERROR: ------------------------------ > ------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/ > documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] F90Array1dCreate line 50 /home/ckj/Repository/petsc-3. > 7.3/src/sys/f90-src/f90_cwrap.c > [0]PETSC ERROR: [0] oursnesfunction line 84 /home/ckj/Repository/petsc-3. > 7.3/src/snes/interface/ftn-custom/zsnesf.c > [0]PETSC ERROR: [0] SNES user function line 2144 > /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeFunction line 2129 > /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_NEWTONTR line 98 > /home/ckj/Repository/petsc-3.7.3/src/snes/impls/tr/tr.c > [0]PETSC ERROR: [0] SNESSolve line 3958 /home/ckj/Repository/petsc-3. > 7.3/src/snes/interface/snes.c > > > Please give me some help. > > Best regards. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From doss0032 at umn.edu Tue Aug 9 15:44:48 2016 From: doss0032 at umn.edu (Scott Dossa) Date: Tue, 9 Aug 2016 15:44:48 -0500 Subject: [petsc-users] function set by TSSetRHSJacobian() not being called In-Reply-To: References: Message-ID: I agree that he meant implicit method. The code works as expected now. I must have erronously used Forward euler instead of backwards euler. Thanks again, Scott Dossa On Tue, Aug 9, 2016 at 2:34 PM, Hong Zhang wrote: > I guess Barry meant ?implicit method?. > > To use backward Euler, change TSEULER to TSBEULER in the function call of > TSSetType(). > > Hong > > > On Aug 9, 2016, at 2:13 PM, Barry Smith wrote: > > > > > > The Euler method is an explicit method that does not require the > Jacobian. Hence the Jacobian you provide is never used. > > > > If you use any implement method, for example backward Euler beuler > then you will see the Jacobian being used. > > > > Barry > > > >> On Aug 9, 2016, at 2:08 PM, Scott Dossa wrote: > >> > >> Hi all, > >> > >> I adapted ex13.c from .../ts/examples/tutorials/ex13.c to solve a 1D > TD PDE (swift-hohenberg). > >> Long story short: the function which sets the Jacobian is never being > called despite being assigned by TSSetRHSJacobian(). 
> >> > >> I create the TS environment with the following commands (omitting some > code before and after -- Full code attached): > >> > >> ------------------------------------------------------------ > ------------------------------------------------------------ > --------------------------------------------- > >> ierr = DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 15, 1, 1, > NULL, &da); CHKERRQ(ierr); > >> > >> /*================================================================== > >> Let DMDA create vector > >> ==================================================================*/ > >> > >> ierr = DMCreateGlobalVector(da, &x);CHKERRQ(ierr); > >> > >> /*================================================================== > >> Create timestepper context which will use Euler Method > >> ==================================================================*/ > >> > >> ierr = TSCreate(PETSC_COMM_WORLD,&ts);CHKERRQ(ierr); > >> ierr = TSSetDM(ts, da);CHKERRQ(ierr); > >> ierr = TSSetType(ts, TSEULER);CHKERRQ(ierr); > >> > >> ierr = TSSetRHSFunction(ts, x, RHSFunction, NULL);CHKERRQ(ierr); > >> > >> /*================================================================= > >> Create Jacobian for Euler Method > >> =================================================================*/ > >> ierr = DMSetMatType(da, MATAIJ);CHKERRQ(ierr); > >> ierr = DMCreateMatrix(da,&J);CHKERRQ(ierr); > >> ierr = TSSetRHSJacobian(ts,J,J,RHSJacobian,NULL); CHKERRQ(ierr) > >> ------------------------------------------------------------ > ------------------------------------------------------------ > --------------------------------------------- > >> > >> I define the function RHSJacobian later, PetscPrintf(PETSC_COMM_WORLD, > "Jacobian function called") is used to see if the function is ever called. > >> > >> Neither is the message ever called, nor is the Jacobian matrix ever > non-zero. The J matrix is checked with MatView( ). No error messages are > produced either at compiling nor runtime. > >> > >> Has anyone else run into such an issue or have any advice of how best > to probe the problem? > >> Thanks so much! > >> -Scott Dossa > >> > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 9 16:47:11 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 9 Aug 2016 16:47:11 -0500 Subject: [petsc-users] Question about SNESSetFunction - FormFunction part In-Reply-To: References: Message-ID: You cannot call VecGetArrayF90() on x since x is a read only variable in this function; that is it makes no sense for you to be changing values in the x vector. You should be calling VecGetArrayReadF90() on the array since you can only read the values in the array, not change them. Barry > On Aug 9, 2016, at 3:29 PM, ??? 
wrote: > > Hi, I'm currently working on FormFunction routine > > my subroutine goes like this > > --> subroutine FormFunction(snes, x, f, userctx, ierr) > > Inside the above subroutine, the problem occurs when I try to use VecGetArrayF90(x, xx_v, ierr) > > The error pops up with this kind of message " Vec is locked read only " > > > So I used VecGetArrayReadF90, but then I got these below > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] F90Array1dCreate line 50 /home/ckj/Repository/petsc-3.7.3/src/sys/f90-src/f90_cwrap.c > [0]PETSC ERROR: [0] oursnesfunction line 84 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/ftn-custom/zsnesf.c > [0]PETSC ERROR: [0] SNES user function line 2144 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESComputeFunction line 2129 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: [0] SNESSolve_NEWTONTR line 98 /home/ckj/Repository/petsc-3.7.3/src/snes/impls/tr/tr.c > [0]PETSC ERROR: [0] SNESSolve line 3958 /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c > > > Please give me some help. > > Best regards. > From jshen25 at jhu.edu Tue Aug 9 23:17:10 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Wed, 10 Aug 2016 00:17:10 -0400 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: <51E07B0F-810B-492A-8330-3469E04D14F1@mcs.anl.gov> Message-ID: Thanks for all the replies. Really appreciate. Bests, Jinlei On Tue, Aug 9, 2016 at 11:48 AM, Matthew Knepley wrote: > On Tue, Aug 9, 2016 at 9:52 AM, Jinlei Shen wrote: > >> Hi Barry, >> >> Thanks for your answer. >> >> But logically for large problem, we are always expecting to see >> paralleled program perform better with regard to both speed and memory >> since each of the multi-processes independently deal with its own >> submatrix, especially for iterative solver, which is revealed by CG+BJ. >> I just don't understand, in the computing with CG+ASM and SUPER_LU, why >> the two-process is most inefficient among these cases. If this is due to >> the communication cost compared with uni-process, why the speed goes down >> for triple and more processes. I'm new to parallelism, could you speculate >> any possible reason for such situation? >> > > As Barry noted, there are two reasons that you get slowdown: > > 1) Worse performance of existing algorithm > > This comes from things like communication. This is usually a small > contributor to slowdown. > > 2) Different parallel algorithm > > This is what is causing your slowdown most likely. 
> > Matt > > >> Great thanks >> >> >> >> On Fri, Aug 5, 2016 at 10:09 PM, Barry Smith wrote: >> >>> >>> > On Aug 5, 2016, at 5:58 PM, Jinlei Shen wrote: >>> > >>> > ?Hi, >>> > >>> > Thanks for your answers. >>> > >>> > I just figured out the issues which are mainly due to the >>> ill-conditioning of my matrix. I found the conditional number blows up when >>> the beam is discretized into large number of elements. >>> > >>> > Now, I am using the 1D bar model to solve the same problem. The good >>> news is the solution is always accurate and stable even I discretized into >>> 10 million elements. >>> > >>> > When I run the model with both iterative solver(CG+BJACOBI/ASM) and >>> direct solver(SUPER_LU) in parallelization, I got the following results: >>> > >>> > Mesh size: 1 million unknowns >>> > Processes 1 2 4 6 8 10 12 >>> 16 20 >>> > CG+BJ 0.36 0.22 0.15 0.12 0.11 0.1 0.096 0.097 >>> 0.099 >>> > CG+ASM 0.47 0.46 0.267 0.2 0.17 0.15 0.145 >>> 0.16 0.15 >>> > SUPER_LU_DIST 4.73 5.4 4.69 4.58 4.38 4.2 4.27 >>> 4.28 4.38 >>> > >>> > It seems the CG+BJ works correctly, i.e. time decreases fast with a >>> few more processes and reach stable with many more cores. >>> > >>> > However, I have some concerns about CG+ASM and SUPER_LU_DIST. The time >>> of both two methods goes up when I use two processes compared with >>> uniprocess. >>> >>> This is actually not surprising at all but since the mantra is >>> "parallelism will always make things faster" it can confuse people. When >>> run with one process the ASM and SuperLU_DIST utilize essentially >>> sequential algorithms, when run with two processes they "switch" to >>> parallel algorithms which simply are not as good as the essentially >>> sequential algorithm that is obtained with one process hence they run >>> slower. This is just life, there really isn't something one can do about it >>> except to perhaps use a poorer quality algorithm on one process so that two >>> processes look better but the goal of PETSc is not to make parallelism to >>> look good but to provide efficient solvers (as best we can) for one and >>> multiple processes. >>> >>> Barry >>> >>> >>> >>> >>> > The tendency is more obvious when I use larger mesh size. >>> > I especially doubt the results of SUPER_LU_DIST in parallelism since >>> the overall expedition is very small which is not expected. >>> > The runtime option I use for ASM pc and SUPER_LU_DIST solver is shown >>> as below: >>> > ASM preconditioner: -pc_type asm -pc_asm_type basic >>> > SUPER_LU_DIST solver: -ksp_type preonly -pc_type lu >>> -pc_factor_mat_solver_package superlu_dist >>> > >>> > I use same mpiexec -n np ./xxxx for all solvers. >>> > >>> > Am I using them correctly? If so, is there anyway to speed up the >>> computation further, especially for SUPER_LU_DIST? >>> > >>> > Thank you very much! >>> > >>> > Bests, >>> > Jinlei >>> > >>> > On Mon, Aug 1, 2016 at 2:10 PM, Matthew Knepley >>> wrote: >>> > On Mon, Aug 1, 2016 at 12:52 PM, Jinlei Shen wrote: >>> > Hi Barry, >>> > >>> > Thanks for your reply. >>> > >>> > Firstly, as you suggested, I checked my program under valgrind. The >>> results for both sequential and parallel cases showed there are no memory >>> errors detected. >>> > >>> > Second, I coded a sequential program without using PETSC to generate >>> the global matrix of small mesh for the same problem. I then checked the >>> matrix both from petsc(sequential and parallel) and serial code, and they >>> are same. 
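One concrete way to back up that check (a sketch; the file and function names here are made up, not from the thread) is to dump the assembled matrix from both the sequential and the parallel run into PETSc binary files and compare them in a small serial utility:

----------------------------------------------------------------------
#include <petscmat.h>

/* Call in each run after MatAssemblyEnd(), e.g. DumpMatrix(SG, "A_par.bin") */
PetscErrorCode DumpMatrix(Mat A, const char *fname)
{
  PetscViewer    viewer;
  PetscErrorCode ierr;
  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(PetscObjectComm((PetscObject)A), fname,
                               FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Serial utility: load two dumps and print the norm of their difference */
PetscErrorCode CompareDumps(const char *f1, const char *f2)
{
  Mat            A, B;
  PetscViewer    v;
  PetscReal      nrm;
  PetscErrorCode ierr;
  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(PETSC_COMM_SELF, f1, FILE_MODE_READ, &v);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_SELF, &A);CHKERRQ(ierr);
  ierr = MatSetType(A, MATAIJ);CHKERRQ(ierr);
  ierr = MatLoad(A, v);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&v);CHKERRQ(ierr);
  ierr = PetscViewerBinaryOpen(PETSC_COMM_SELF, f2, FILE_MODE_READ, &v);CHKERRQ(ierr);
  ierr = MatCreate(PETSC_COMM_SELF, &B);CHKERRQ(ierr);
  ierr = MatSetType(B, MATAIJ);CHKERRQ(ierr);
  ierr = MatLoad(B, v);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&v);CHKERRQ(ierr);
  ierr = MatAXPY(B, -1.0, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);  /* B <- B - A */
  ierr = MatNorm(B, NORM_FROBENIUS, &nrm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF, "||A1 - A2||_F = %g\n", (double)nrm);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = MatDestroy(&B);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
----------------------------------------------------------------------

A nonzero difference would point at the assembly or at the MatZeroRowsColumns() boundary handling rather than at the solver.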
>>> > The way I assembled the global matrix in parallel is first >>> distributing the nodes and elements into processes, then I loop with >>> elements on the calling process to put the element stiffness into the >>> global. Since the nodes and elements in cantilever beam are numbered >>> successively, the connectivity is simple. I didn't use any partition tools >>> to optimize mesh. It's also easy to determine the preallocation d_nnz and >>> o_nnz since each node only connects the left and right nodes except for >>> beginning and end, the maximum nonzeros in each row is 6. The MatSetValue >>> process is shown as follows: >>> > do iEL = idElStart, idElEnd >>> > g_EL = (/2*iEL-1-1,2*iEL-1,2*iEL+1-1,2*iEL+2-1/) >>> > call MatSetValues(SG,4,g_EL,4,g_El,SE,ADD_VALUES,ierr) >>> > end do >>> > where idElStart and idElEnd are the global number of first element and >>> end element that the process owns, g_EL is the global index for DOF in >>> element iEL, SE is the element stiffness which is same for all elements. >>> > From above assembling, most of the elements are assembled within own >>> process while there are few elements crossing two processes. >>> > >>> > The BC for my problem(cantilever under end point load) is to fix the >>> first two DOF, so I called the MatZeroRowsColumns to set the first two rows >>> and columns into zero with diagonal equal to one, without changing the RHS. >>> > >>> > Now some new issues show up : >>> > >>> > I run with -ksp_monitor_true_residual and -ksp_converged_reason, the >>> monitor showed two different residues, one is the residue I can >>> set(preconditioned, unpreconditioned, natural), the other is called true >>> residue. >>> > ?? >>> > I initially thought the true residue is same as unpreconditioned based >>> on definition. But it seems not true. Is it the norm of the residue (b-Ax) >>> between computed RHS and true RHS? But, how to understand unprecondition >>> residue since its definition is b-Ax as well? >>> > >>> > It is the unpreconditioned residual. You must be misinterpreting. And >>> we could determine exactly if you sent the output with the suggested >>> options. >>> > >>> > Can I set the true residue as my converging criteria? >>> > >>> > Use right preconditioning. >>> > >>> > I found the accuracy of large mesh in my problem didn't necessary >>> depend on the tolerance I set, either preconditioned or unpreconditioned, >>> sometimes, it showed converged while the solution is not correct. But the >>> true residue looks reflecting the true convergence very well, if the true >>> residue is diverging, no matter what the first residue says, the results >>> are bad! >>> > >>> > Yes, your preconditioner looks singular. Note that BJACOBI has an >>> inner solver, and by default the is GMRES/ILU(0). I think >>> > ILU(0) is really ill-conditioned for your problem. >>> > >>> > For the preconditioner concerns, actually, I used BJACOBI before I >>> sent the first email, since the JACOBI or PBJACOBI didn't even converge >>> when the size was large. >>> > But BJACOBI also didn't perform well in the paralleliztion for large >>> mesh as posed in my last email, while it's fine for small size (below 10k >>> elements) >>> > >>> > Yesterday, I tried the ASM with CG using the runtime option: -pc_type >>> asm -pc_asm_type basic -sub_pc_type lu (default is ilu). >>> > For 15k elements mesh, I am now able to get the correct answer with >>> 1-3, 6 and more processes, using either -sub_pc_type lu or ilu. >>> > >>> > Yes, LU works for your subdomain solver. 
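For completeness, the same combination can be hard-wired in code instead of being passed on the command line; a rough sketch (assuming KSPSetOperators() has already been called on ksp) equivalent to -ksp_type cg -pc_type asm -pc_asm_type basic -sub_pc_type lu is:

----------------------------------------------------------------------
#include <petscksp.h>

/* Sketch: CG with basic ASM and an LU direct solve on every subdomain */
PetscErrorCode ConfigureCGASMLU(KSP ksp)
{
  PC             pc, subpc;
  KSP           *subksp;
  PetscInt       nlocal, first, i;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);
  ierr = KSPSetNormType(ksp, KSP_NORM_UNPRECONDITIONED);CHKERRQ(ierr); /* converge on ||b - Ax|| */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCASM);CHKERRQ(ierr);
  ierr = PCASMSetType(pc, PC_ASM_BASIC);CHKERRQ(ierr);
  ierr = KSPSetUp(ksp);CHKERRQ(ierr);                /* subdomain KSPs exist only after setup */
  ierr = PCASMGetSubKSP(pc, &nlocal, &first, &subksp);CHKERRQ(ierr);
  for (i = 0; i < nlocal; i++) {
    ierr = KSPSetType(subksp[i], KSPPREONLY);CHKERRQ(ierr);
    ierr = KSPGetPC(subksp[i], &subpc);CHKERRQ(ierr);
    ierr = PCSetType(subpc, PCLU);CHKERRQ(ierr);     /* direct solve on each subdomain */
  }
  PetscFunctionReturn(0);
}
----------------------------------------------------------------------

The KSP_NORM_UNPRECONDITIONED line is optional; it makes the convergence test use the true residual discussed earlier in the thread, which CG supports with left preconditioning.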
>>> > >>> > Based on all the results I have got, it shows the results varies a lot >>> with different PC and seems ASM is better for large problem. >>> > >>> > Its not ASM so much as an LU subsolver that is better. >>> > >>> > But what is the major factor to produce such difference between >>> different PCs, since it's not just the issue of computational efficiency, >>> but also the accuracy. >>> > Also, I noticed for large mesh, the solution is unstable with small >>> number of processes, for the 15k case, the solution is not correct with 4 >>> and 5 processes, however, the solution becomes always correct with more >>> than 6 processes. For the 50k mesh case, more processes are required to >>> show the stability. >>> > >>> > Yes, partitioning is very important here. Since you do not have a good >>> partition, you can get these wild variations. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > What do you think about this? Anything wrong? >>> > Since the iterative solver in parallel is first computed locally(if >>> this is correct), can it be possible that there are 'good' and 'bad' locals >>> when dividing the global matrix, and the result from 'bad' local will >>> contaminate the global results. But with more processes, such risk is >>> reduced. >>> > >>> > It is highly appreciated if you could give me some instruction for >>> above questions. >>> > >>> > Thank you very much. >>> > >>> > Bests, >>> > Jinlei >>> > >>> > >>> > On Fri, Jul 29, 2016 at 2:09 PM, Barry Smith >>> wrote: >>> > >>> > First run under valgrind all the cases to make sure there is not >>> some use of uninitialized data or overwriting of data. Go to >>> http://www.mcs.anl.gov/petsc follow the link to FAQ and search for >>> valgrind (the web server seems to be broken at the moment). >>> > >>> > Second it is possible that your code the assembles the matrices and >>> vectors is not correctly assembling it for either the sequential or >>> parallel case. Hence a different number of processes could be generating a >>> different linear system hence inconsistent results. How are you handling >>> the parallelism? How do you know the matrix generated in parallel is >>> identically to that sequentially? >>> > >>> > Simple preconditioners such as pbjacobi will converge slower and >>> slower with more elements. >>> > >>> > Note that you should run with -ksp_monitor_true_residual and >>> -ksp_converged_reason to make sure that the iterative solver is even >>> converging. By default PETSc KSP solvers do not stop with a big error >>> message if they do not converge so you need make sure they are always >>> converging. >>> > >>> > Barry >>> > >>> > >>> > >>> > > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: >>> > > >>> > > Dear PETSC developers, >>> > > >>> > > Thank you for developing such a powerful tool for scientific >>> computations. >>> > > >>> > > I'm currently trying to run a simple cantilever beam FEM to test the >>> scalability of PETSC on multi-processors. I also want to verify whether >>> iterative solver or direct solver is more efficient for parallel large FEM >>> problem. >>> > > >>> > > Problem description, An Euler elementary cantilever beam with point >>> load at the end along -y direction. Each node has 2 DOF (deflection and >>> rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based >>> on the connectivity. Loop with elements in each processor to assemble the >>> global matrix with same element stiffness matrix. 
The boundary condition is >>> set using call MatZeroRowsColumns(SG,2,g_BC,o >>> ne,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); >>> > > >>> > > Based on what I have done, I find the computations work well, i.e >>> the results are correct compared with theoretical solution, for small mesh >>> size (small than 5000 elements) using both solvers with different numbers >>> of processes. >>> > > >>> > > However, there are several confusing issues when I increase the mesh >>> size to 10000 and more elements with iterative solve(CG + PCBJACOBI) >>> > > >>> > > 1. For 10k elements, I can get accurate solution using iterative >>> solver with uni-processor(i.e. only one process). However, when I use 2-8 >>> processes, it tells the linear solver converged with different iterations, >>> but, the results are all different for different processes and erroneous. >>> The wired thing is when I use >9 processes, the results are correct again. >>> I am really confused by this. Could you explain me why? If my >>> parallelization is not correct, why it works for small cases? And I check >>> the global matrix and RHS vector and didn't see any mallocs during the >>> process. >>> > > >>> > > 2. For 30k elements, if I use one process, it says: Linear solve did >>> not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for >>> large sparse matrix? If so, is there any stable solver or pc for large >>> problem? >>> > > >>> > > >>> > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I >>> can only get accuracy when the number of elements are below 5000. There >>> must be something wrong. The way I use the superlu_dist solver is first >>> convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change >>> the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? >>> > > >>> > > >>> > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the >>> sequential version of the same problem. The results shows that iterative >>> solver works well for <50k elements, while SUPER_LU only gets right >>> solution below 5k elements. Can I say iterative solver is better than >>> SUPER_LU for large problem? How can I improve the solver to copy with very >>> large problem, such as million by million? Another thing is it's still >>> doubtable of performance of SUPER_LU. >>> > > >>> > > For the inaccuracy issue, do you think it may be due to the memory? >>> However, there is no memory error showing during the execution. >>> > > >>> > > I really appreciate someone could resolve those puzzles above for >>> me. My goal is to replace the current SUPER_LU solver in my parallel CPFEM >>> main program with the iterative solver using PETSC. >>> > > >>> > > >>> > > Please let me if you would like to see my code in detail. >>> > > >>> > > Thank you very much. >>> > > >>> > > Bests, >>> > > Jinlei >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From harshadranadive at gmail.com Wed Aug 10 21:54:40 2016 From: harshadranadive at gmail.com (Harshad Ranadive) Date: Thu, 11 Aug 2016 12:54:40 +1000 Subject: [petsc-users] Code performance for solving multiple RHS Message-ID: Hi All, I have currently added the PETSc library with our CFD solver. In this I need to use KSPSolve(...) multiple time for the same matrix A. I have read that PETSc does not support passing multiple RHS vectors in the form of a matrix and the only solution to this is calling KSPSolve multiple times as in example given here: http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex16.c.html I have followed this technique, but I find that the performance of the code is very slow now. I basically have a mesh size of 8-10 Million and I need to solve the matrix A very large number of times. I have checked that the statement KSPSolve(..) is taking close to 90% of my computation time. I am setting up the matrix A, KSPCreate, KSPSetup etc just once at the start. Only the following statements are executed in a repeated loop *Loop begin: (say million times !!)* * loop over vector length* * VecSetValues( ....)* * end* * VecAssemblyBegin( ... )* * VecAssemblyEnd (...)* * KSPSolve (...)* * VecGetValues* *Loop end.* Is there an efficient way of doing this rather than using KSPSolve multiple times? Please note my matrix A never changes during the time steps or across the mesh ... So essentially if I can get the inverse once would it be good enough? It has been recommended in the FAQ that matrix inverse should be avoided but would it be okay to use in my case? Also could someone please provide an example of how to use MatLUFactor and MatCholeskyFactor() to find the matrix inverse... the arguments below were not clear to me. *IS row * *IS col* *const MatFactorInfo *info* Apologies for a long email and thanks to anyone for help. Regards Harshad -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 10 22:02:17 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 10 Aug 2016 22:02:17 -0500 Subject: [petsc-users] Code performance for solving multiple RHS In-Reply-To: References: Message-ID: On Wed, Aug 10, 2016 at 9:54 PM, Harshad Ranadive wrote: > Hi All, > > I have currently added the PETSc library with our CFD solver. > > In this I need to use KSPSolve(...) multiple time for the same matrix A. I > have read that PETSc does not support passing multiple RHS vectors in the > form of a matrix and the only solution to this is calling KSPSolve multiple > times as in example given here: > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/ > examples/tutorials/ex16.c.html > > I have followed this technique, but I find that the performance of the > code is very slow now. I basically have a mesh size of 8-10 Million and I > need to solve the matrix A very large number of times. I have checked that > the statement KSPSolve(..) is taking close to 90% of my computation time. > > I am setting up the matrix A, KSPCreate, KSPSetup etc just once at the > start. Only the following statements are executed in a repeated loop > > *Loop begin: (say million times !!)* > > * loop over vector length* > * VecSetValues( ....)* > * end* > > * VecAssemblyBegin( ... )* > * VecAssemblyEnd (...)* > > * KSPSolve (...)* > > * VecGetValues* > > *Loop end.* > > Is there an efficient way of doing this rather than using KSPSolve > multiple times? 
> > Please note my matrix A never changes during the time steps or across the > mesh ... So essentially if I can get the inverse once would it be good > enough? It has been recommended in the FAQ that matrix inverse should be > avoided but would it be okay to use in my case? > > Also could someone please provide an example of how to use MatLUFactor > and MatCholeskyFactor() to find the matrix inverse... the arguments below > were not clear to me. > *IS row * > *IS col* > *const MatFactorInfo *info* > > Apologies for a long email and thanks to anyone for help. > 1) For any questions, we NEED to see the output of -log_view for all cases you are asking about 2) You could try factoring your matrix -pc_type lu -pc_factor_mat_solver_package and it would be reused across solves, but for large matrices you can run out of memory. 3) You can factor subdomains with BJacobi or ASM and they would be reused, but this might be a crap preconditioner. 4) For any questions about convergence, we need to see the output of -ksp_view -ksp_monitor_true_residual -ksp_converged_reason Matt > Regards > Harshad > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 10 22:07:05 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 10 Aug 2016 22:07:05 -0500 Subject: [petsc-users] Code performance for solving multiple RHS In-Reply-To: References: Message-ID: <6699AE6B-520C-4220-94D0-0389FD5941E6@mcs.anl.gov> Effectively utilizing multiple right hand sides with the same system can result in roughly 2 or at absolute most 3 times improvement in solve time. A great improvement but when you have a million right hand sides not a giant improvement. The first step is to get the best (most efficient) preconditioner for you problem. Since you have many right hand sides it obviously pays to spend more time building the preconditioner so that each solve is faster. If you provide more information on your linear system we might have suggestions. CFD so is your linear system a Poisson problem? Are you using geometric or algebraic multigrid with PETSc? It not a Poisson problem how can you describe the linear system? Barry > On Aug 10, 2016, at 9:54 PM, Harshad Ranadive wrote: > > Hi All, > > I have currently added the PETSc library with our CFD solver. > > In this I need to use KSPSolve(...) multiple time for the same matrix A. I have read that PETSc does not support passing multiple RHS vectors in the form of a matrix and the only solution to this is calling KSPSolve multiple times as in example given here: > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex16.c.html > > I have followed this technique, but I find that the performance of the code is very slow now. I basically have a mesh size of 8-10 Million and I need to solve the matrix A very large number of times. I have checked that the statement KSPSolve(..) is taking close to 90% of my computation time. > > I am setting up the matrix A, KSPCreate, KSPSetup etc just once at the start. Only the following statements are executed in a repeated loop > > Loop begin: (say million times !!) > > loop over vector length > VecSetValues( ....) > end > > VecAssemblyBegin( ... ) > VecAssemblyEnd (...) > > KSPSolve (...) > > VecGetValues > > Loop end. 
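Since the question above also asks how the IS row, IS col and MatFactorInfo arguments of the factorization routines are meant to be used, here is a rough sketch of the factor-once, solve-in-batches approach that Barry spells out further down in this thread (MatLUFactorSymbolic(), MatLUFactorNumeric(), MatMatSolve()); the names SolveManyRHS and nbatch are made up for illustration:

----------------------------------------------------------------------
#include <petscmat.h>

/* Sketch: factor A once, then solve nbatch right-hand sides at a time.
   A is assumed to be a sequential AIJ matrix of size n x n. */
PetscErrorCode SolveManyRHS(Mat A, PetscInt n, PetscInt nbatch)
{
  Mat            F, B, X;        /* factor, RHS block, solution block */
  IS             rowperm, colperm;
  MatFactorInfo  info;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A, MATORDERINGND, &rowperm, &colperm);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, rowperm, colperm, &info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);

  /* Dense matrices whose columns hold nbatch right-hand sides / solutions */
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, nbatch, NULL, &B);CHKERRQ(ierr);
  ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, nbatch, NULL, &X);CHKERRQ(ierr);

  /* ... fill the columns of B with right-hand sides, assemble it, then ... */
  ierr = MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatMatSolve(F, B, X);CHKERRQ(ierr);          /* X(:,j) = A^{-1} B(:,j) */
  /* ... read the solutions out of X, refill B, and repeat as needed ... */

  ierr = MatDestroy(&B);CHKERRQ(ierr);
  ierr = MatDestroy(&X);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
----------------------------------------------------------------------

The index sets returned by MatGetOrdering() are what the IS row / IS col arguments expect, and no explicit inverse is ever formed; whether this beats a well-preconditioned KSPSolve() for a nearly tridiagonal system is exactly the trade-off discussed in the replies.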
> > Is there an efficient way of doing this rather than using KSPSolve multiple times? > > Please note my matrix A never changes during the time steps or across the mesh ... So essentially if I can get the inverse once would it be good enough? It has been recommended in the FAQ that matrix inverse should be avoided but would it be okay to use in my case? > > Also could someone please provide an example of how to use MatLUFactor and MatCholeskyFactor() to find the matrix inverse... the arguments below were not clear to me. > IS row > IS col > const MatFactorInfo *info > > Apologies for a long email and thanks to anyone for help. > > Regards > Harshad > > > > > From mirzadeh at gmail.com Wed Aug 10 22:59:02 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Wed, 10 Aug 2016 23:59:02 -0400 Subject: [petsc-users] Docker performance Message-ID: I have been recently hearing about Docker [1] and its potential as a "platform-independent" environment for application development. At first it sounded just like another VM to me until I came across this IBM Research paper [2] and this SO post [3] (to be fair I still don't get the detailed differences ... ) Whats exciting about paper, is that they report quite impressive performance metrics for docker, e.g. for LINPACK and stream tests. So I just tried installing PETSc in a ubuntu image for myself tonight and I got relatively close performance for a toy example (2D poisson) compared to my native development setup (OS X) (~%5-10 loss - although I was not using the same petsc, mpi, etc). So I thought to share my excitement with other users who might find this useful. Personally I don't see myself using docker a whole lot; not now anyway, mostly because after years of suffering on OS X, I think homebrew has reached a mature point :). I would love to hear if anyone has more interesting stories to share! Also, perhaps having a standard petsc container could be beneficial, specially to new users? Anyway, food for thought! Cheers, Mohammad [1]: https://www.docker.com/what-docker [2]: http://domino.research.ibm.com/library/cyberdig.nsf/papers/0929052195DD819C85257D2300681E7B/$File/rc25482.pdf [3]: http://stackoverflow.com/questions/16047306/how-is-docker-different-from-a-normal-virtual-machine -------------- next part -------------- An HTML attachment was scrubbed... URL: From support at pagepress.org Thu Aug 11 01:19:47 2016 From: support at pagepress.org (Acque Sotterranee - Italian Journal of Groundwater) Date: Thu, 11 Aug 2016 08:19:47 +0200 Subject: [petsc-users] Launching! Message-ID: An HTML attachment was scrubbed... URL: From leejearl at 126.com Thu Aug 11 01:48:14 2016 From: leejearl at 126.com (leejearl) Date: Thu, 11 Aug 2016 14:48:14 +0800 Subject: [petsc-users] A question about DMPlexDistribute Message-ID: Hi, all: I want to use PETSc to build my FVM code. Now, I have a question about the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF *sf, DM *dmOverlap) . In the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When I set the overlap as 0 or 1, it works well. But, if I set the overlap as 2, it suffers a problem. I am confused about the value of overlap. Can it be set as 2? What is the meaning of the parameter overlap? Any helps are appreciated! 
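For reference, a minimal sketch of the distribution call with two layers of overlap cells (the mesh file name is a placeholder); whether the rest of ex11, which assumes a nearest-neighbour stencil, can digest the extra ghost layer is a separate question taken up below:

----------------------------------------------------------------------
#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM             dm, dmDist;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.exo", PETSC_TRUE, &dm);CHKERRQ(ierr);
  ierr = DMPlexDistribute(dm, 2, NULL, &dmDist);CHKERRQ(ierr);   /* overlap = 2 cell layers */
  if (dmDist) {            /* NULL when run on a single process: nothing to distribute */
    ierr = DMDestroy(&dm);CHKERRQ(ierr);
    dm   = dmDist;
  }
  ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr);
  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}
----------------------------------------------------------------------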
leejearl From juan at tf.uni-kiel.de Thu Aug 11 01:57:47 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Thu, 11 Aug 2016 08:57:47 +0200 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: Message-ID: Hi, take a look at slide 10 of [1], there is visually explained what the overlap between partitions is. [1] https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf On Thu, Aug 11, 2016 at 8:48 AM, leejearl wrote: > Hi, all: > I want to use PETSc to build my FVM code. Now, I have a question about > the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF *sf, DM > *dmOverlap) . > > In the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When > I set the overlap > as 0 or 1, it works well. But, if I set the overlap as 2, it suffers a > problem. > I am confused about the value of overlap. Can it be set as 2? What is > the meaning of > the parameter overlap? > Any helps are appreciated! > > leejearl > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From francesco.caimmi at polimi.it Thu Aug 11 02:36:39 2016 From: francesco.caimmi at polimi.it (Francesco Caimmi) Date: Thu, 11 Aug 2016 09:36:39 +0200 Subject: [petsc-users] [petsc4py] a problem with computeRHSFunctionLinear interface? Message-ID: <4270161.snsPm0L6UZ@wotan> Dear all, I was trying to reproduce /ts/examples/tutorials/ex4.c in python to learn how to use TS solvers; the example uses the function TSComputeRHSFunctionLinear. However I get an error when running my code (attached in case you want to look at it), when I call ts.solve. Here is the trace: [fcaimmi at Wotan 2645] > ./ts_ex4.py Solving a linear TS problem, number of processors = 1 Timestep 0 : time = 0.0 2-norm error = 1.14956855594e-08 max norm error = 0 Traceback (most recent call last): File "./ts_ex4.py", line 473, in main(m = m, debug = debug) File "./ts_ex4.py", line 340, in main ts.solve(u) File "PETSc/TS.pyx", line 568, in petsc4py.PETSc.TS.solve (src/petsc4py.PETSc.c:188927) File "PETSc/petscts.pxi", line 221, in petsc4py.PETSc.TS_RHSFunction (src/petsc4py.PETSc.c:35490) File "PETSc/TS.pyx", line 189, in petsc4py.PETSc.TS.computeRHSFunctionLinear (src/petsc4py.PETSc.c:181611) TypeError: computeRHSFunctionLinear() takes exactly 3 positional arguments (5 given) I cannot understand if there is a problem with my code or if the problem is in computeRHSFunctionLinear interface. I checked https://bitbucket.org/petsc/petsc4py/ and the interface to computeRHSFunctionLinear has three arguments, however I am not that much into petsc4py to tell how it gets called. I am on Petsc Release Version 3.7.3 Thank you for your time. Best, -- Francesco Caimmi Laboratorio di Ingegneria dei Polimeri http://www.chem.polimi.it/polyenglab/ Politecnico di Milano - Dipartimento di Chimica, Materiali e Ingegneria Chimica ?Giulio Natta? P.zza Leonardo da Vinci, 32 I-20133 Milano Tel. +39.02.2399.4711 Fax +39.02.7063.8173 francesco.caimmi at polimi.it Skype: fmglcaimmi (please arrange meetings by e-mail) -------------- next part -------------- A non-text attachment was scrubbed... Name: ts_ex4.py Type: text/x-python Size: 15136 bytes Desc: not available URL: From norihiro.w at gmail.com Thu Aug 11 03:12:02 2016 From: norihiro.w at gmail.com (Norihiro Watanabe) Date: Thu, 11 Aug 2016 10:12:02 +0200 Subject: [petsc-users] mat option producing error for stash Message-ID: Hi, I would like to check if my program assembles a matrix without generating stash. 
To help checking it, I wonder if there is a mat option producing errors if entries destined for other processors are added/set. I mean something like MAT_NEW_NONZERO_LOCATION_ERR for stashing. Best, Nori From leejearl at 126.com Thu Aug 11 03:14:48 2016 From: leejearl at 126.com (leejearl) Date: Thu, 11 Aug 2016 16:14:48 +0800 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: Message-ID: Hi, Thank you for your reply. It help me very much. But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I set the overlap to 2 levels, the command is "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 -physics sw", it suffers a error. It seems to me that setting overlap to 2 is very common. Are there issues that I have not take into consideration? Any help are appreciated. leejearl On 2016?08?11? 14:57, Julian Andrej wrote: > Hi, > > take a look at slide 10 of [1], there is visually explained what the > overlap between partitions is. > > [1] > https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf > > On Thu, Aug 11, 2016 at 8:48 AM, leejearl > wrote: > > Hi, all: > I want to use PETSc to build my FVM code. Now, I have a > question about > the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF > *sf, DM *dmOverlap) . > > In the example > "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When I set the > overlap > as 0 or 1, it works well. But, if I set the overlap as 2, it > suffers a problem. > I am confused about the value of overlap. Can it be set as 2? > What is the meaning of > the parameter overlap? > Any helps are appreciated! > > leejearl > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Thu Aug 11 08:16:08 2016 From: leejearl at 126.com (leejearl) Date: Thu, 11 Aug 2016 21:16:08 +0800 Subject: [petsc-users] High order finite volume method in unstructured grid using PETSc Message-ID: Hi, all: I want to build a high order finite volume method in unstructured grid using PETSc. The first issue is to partition the grid. I use the DMPlex to manage the data structure. The procedure is as follow: 1> DMPlexCreateFromFile(), to load a grid into DMPlex; 2> DMPlexDistribute(dm, overlap, NULL, &dmdist) I use DMPlexDistribute to distribute gird. For a high order method, I want to set the value of overlap to 2, and I think 0 or 1 is not enough. So, I test the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". The example is a 2ND order FVM code, and it use "DMPlexDistribute" too. But when I use the command "mpirun -n 3 ./ex11 -ufv_mesh_overlap 2" to set the value of overlap to 2, it suffers an error. Is there anyone can help me to point out: 1> Can I set the value of overlap to 2? 2> If I want to give the value of overlap as 2, is there any additional codes which must be added? Any helps are appreciated. leejearl From balay at mcs.anl.gov Thu Aug 11 09:35:46 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 11 Aug 2016 09:35:46 -0500 Subject: [petsc-users] mat option producing error for stash In-Reply-To: References: Message-ID: On Thu, 11 Aug 2016, Norihiro Watanabe wrote: > Hi, > > I would like to check if my program assembles a matrix without > generating stash. To help checking it, I wonder if there is a mat > option producing errors if entries destined for other processors are > added/set. I mean something like MAT_NEW_NONZERO_LOCATION_ERR for > stashing. 
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetOption.html MAT_NO_OFF_PROC_ENTRIES - you know each process will only set values for its own rows, will generate an error if any process sets values for another process. This avoids all reductions in the MatAssembly routines and thus improves performance for very large process counts. You can also run with -info - and it should print the size of stack thats used.. Satish From norihiro.w at gmail.com Thu Aug 11 09:47:59 2016 From: norihiro.w at gmail.com (Norihiro Watanabe) Date: Thu, 11 Aug 2016 16:47:59 +0200 Subject: [petsc-users] mat option producing error for stash In-Reply-To: References: Message-ID: thanks! On Thu, Aug 11, 2016 at 4:35 PM, Satish Balay wrote: > On Thu, 11 Aug 2016, Norihiro Watanabe wrote: > >> Hi, >> >> I would like to check if my program assembles a matrix without >> generating stash. To help checking it, I wonder if there is a mat >> option producing errors if entries destined for other processors are >> added/set. I mean something like MAT_NEW_NONZERO_LOCATION_ERR for >> stashing. > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatSetOption.html > > MAT_NO_OFF_PROC_ENTRIES - you know each process will only set values for its own rows, will generate an error if any process sets values for another process. This avoids all reductions in the MatAssembly routines and thus improves performance for very large process counts. > > You can also run with -info - and it should print the size of stack thats used.. > > Satish -- Norihiro Watanabe From knepley at gmail.com Thu Aug 11 10:29:42 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Aug 2016 10:29:42 -0500 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: Message-ID: On Thu, Aug 11, 2016 at 3:14 AM, leejearl wrote: > Hi, > Thank you for your reply. It help me very much. > But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I set > the overlap to 2 levels, the command is > "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 -physics sw", it > suffers a error. > It seems to me that setting overlap to 2 is very common. Are there > issues that I have not take into consideration? > Any help are appreciated. > I will check this out. I have not tested an overlap of 2 here since I generally use nearest neighbor FV methods for unstructured stuff. I have test examples that run fine for overlap > 1. Can you send the entire error message? If the error is not in the distribution, but rather in the analytics, that is understandable because this example is only intended to be run using a nearest neighbor FV method, and thus might be confused if we give it two layers of ghost cells. Matt > > leejearl > > On 2016?08?11? 14:57, Julian Andrej wrote: > > Hi, > > take a look at slide 10 of [1], there is visually explained what the > overlap between partitions is. > > [1] https://www.archer.ac.uk/training/virtual/files/2015/ > 06-PETSc/slides.pdf > > On Thu, Aug 11, 2016 at 8:48 AM, leejearl wrote: > >> Hi, all: >> I want to use PETSc to build my FVM code. Now, I have a question about >> the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF *sf, DM >> *dmOverlap) . >> >> In the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When >> I set the overlap >> as 0 or 1, it works well. But, if I set the overlap as 2, it suffers a >> problem. >> I am confused about the value of overlap. Can it be set as 2? What is >> the meaning of >> the parameter overlap? 
>> Any helps are appreciated! >> >> leejearl >> >> >> >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 11 10:32:00 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 11 Aug 2016 10:32:00 -0500 Subject: [petsc-users] High order finite volume method in unstructured grid using PETSc In-Reply-To: References: Message-ID: On Thu, Aug 11, 2016 at 8:16 AM, leejearl wrote: > Hi, all: > > I want to build a high order finite volume method in unstructured grid > using PETSc. > > The first issue is to partition the grid. I use the DMPlex to manage the > data structure. > > The procedure is as follow: > > 1> DMPlexCreateFromFile(), to load a grid into DMPlex; > > 2> DMPlexDistribute(dm, overlap, NULL, &dmdist) > > I use DMPlexDistribute to distribute gird. For a high order method, I want > to set the > > value of overlap to 2, and I think 0 or 1 is not enough. So, I test the > example > > "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". The example is a 2ND > order FVM > > code, and it use "DMPlexDistribute" too. But when I use the command > > "mpirun -n 3 ./ex11 -ufv_mesh_overlap 2" to set the value of overlap to 2, > it suffers > > an error. > > Is there anyone can help me to point out: > > 1> Can I set the value of overlap to 2? > > 2> If I want to give the value of overlap as 2, is there any additional > codes which must be > > added? > > Any helps are appreciated. I replied to this in the other mail. Matt > > leejearl > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 11 11:09:44 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 11 Aug 2016 11:09:44 -0500 Subject: [petsc-users] Code performance for solving multiple RHS In-Reply-To: References: <6699AE6B-520C-4220-94D0-0389FD5941E6@mcs.anl.gov> Message-ID: <4F22DE27-9E50-47D9-A2EA-8EDA762B6D02@mcs.anl.gov> If it is sequential, which it probably should be, then you can you MatLUFactorSymbolic(), MatLUFactorNumeric() and MatMatSolve() where you put a bunch of your right hand side vectors into a dense array; not all million of them but maybe 10 to 100 at a time. Barry > On Aug 10, 2016, at 10:18 PM, Harshad Ranadive wrote: > > Hi Barry > > The matrix A is mostly tridiagonal > > 1 ? 0 ......... 0 > > ? 1 ? 0 .......0 > > > 0 ? 1 ? 0 ....0 > > > .................... > 0..............? 1 > > In some cases (periodic boundaries) there would be an '?' in right-top-corner and left-bottom corner. > > I am not using multigrid approach. I just implemented an implicit filtering approach (instead of an explicit existing one) which requires the solution of the above system. > > Thanks > Harshad > > On Thu, Aug 11, 2016 at 1:07 PM, Barry Smith wrote: > > Effectively utilizing multiple right hand sides with the same system can result in roughly 2 or at absolute most 3 times improvement in solve time. A great improvement but when you have a million right hand sides not a giant improvement. > > The first step is to get the best (most efficient) preconditioner for you problem. 
Since you have many right hand sides it obviously pays to spend more time building the preconditioner so that each solve is faster. If you provide more information on your linear system we might have suggestions. CFD so is your linear system a Poisson problem? Are you using geometric or algebraic multigrid with PETSc? It not a Poisson problem how can you describe the linear system? > > Barry > > > > > On Aug 10, 2016, at 9:54 PM, Harshad Ranadive wrote: > > > > Hi All, > > > > I have currently added the PETSc library with our CFD solver. > > > > In this I need to use KSPSolve(...) multiple time for the same matrix A. I have read that PETSc does not support passing multiple RHS vectors in the form of a matrix and the only solution to this is calling KSPSolve multiple times as in example given here: > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex16.c.html > > > > I have followed this technique, but I find that the performance of the code is very slow now. I basically have a mesh size of 8-10 Million and I need to solve the matrix A very large number of times. I have checked that the statement KSPSolve(..) is taking close to 90% of my computation time. > > > > I am setting up the matrix A, KSPCreate, KSPSetup etc just once at the start. Only the following statements are executed in a repeated loop > > > > Loop begin: (say million times !!) > > > > loop over vector length > > VecSetValues( ....) > > end > > > > VecAssemblyBegin( ... ) > > VecAssemblyEnd (...) > > > > KSPSolve (...) > > > > VecGetValues > > > > Loop end. > > > > Is there an efficient way of doing this rather than using KSPSolve multiple times? > > > > Please note my matrix A never changes during the time steps or across the mesh ... So essentially if I can get the inverse once would it be good enough? It has been recommended in the FAQ that matrix inverse should be avoided but would it be okay to use in my case? > > > > Also could someone please provide an example of how to use MatLUFactor and MatCholeskyFactor() to find the matrix inverse... the arguments below were not clear to me. > > IS row > > IS col > > const MatFactorInfo *info > > > > Apologies for a long email and thanks to anyone for help. > > > > Regards > > Harshad > > > > > > > > > > > > From hgbk2008 at gmail.com Thu Aug 11 11:32:45 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Thu, 11 Aug 2016 18:32:45 +0200 Subject: [petsc-users] different convergence behaviour In-Reply-To: <7E31CA9A-7717-4E0D-9E58-3BE243A05AB4@mcs.anl.gov> References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> <7E31CA9A-7717-4E0D-9E58-3BE243A05AB4@mcs.anl.gov> Message-ID: Hi all I'm a bit embarrassed that after careful investigation I found that I made a wrong configuration in my problem settings. This makes the problem also not converged with MUMPS at the first place. After fixing that problem, the NR iterator converges normally with both MUMPS and Hypre. Although ill-posed, the iterative solver achieves comparable tolerance comparing with MUMPS solution. Thanks again for all your helpful suggestions. Giang On Fri, Jul 15, 2016 at 6:32 PM, Barry Smith wrote: > > Use -ksp_type gmres to get it to print the residuals. With preonly it > doesn't compute or print them. > > > > On Jul 15, 2016, at 11:28 AM, Hoang Giang Bui > wrote: > > > > I used > > > > -ksp_monitor_true_residual > > -ksp_monitor_true_solution > > -ksp_converged_reason > > > > with MUMPS but it does not compute the true residual. Should I compute > that myself? 
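Forming the true residual by hand is straightforward if it is ever needed; a minimal sketch, assuming A, x and b stand for the assembled system matrix, the computed solution and the right hand side (names illustrative):

    Vec       r;
    PetscReal rnorm;

    ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
    ierr = MatMult(A, x, r);CHKERRQ(ierr);      /* r = A*x       */
    ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);   /* r = b - A*x   */
    ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "true residual %g\n", (double)rnorm);CHKERRQ(ierr);
    ierr = VecDestroy(&r);CHKERRQ(ierr);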
> > > > Below is a sample for a full log of MUMPS > > https://www.dropbox.com/s/fy5uknooxw77r19/log13Jun16_mumps?dl=0 > > > > > > Giang > > > > On Fri, Jul 15, 2016 at 2:52 AM, Mark Adams wrote: > > > > > > On Thu, Jul 14, 2016 at 7:29 PM, Matthew Knepley > wrote: > > On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: > > > > > > Notice that there are 7 orders of magnitude between the apparent > residual (using the preconditioner), and the actual residual, Ax - b. > > You are using Hypre, and this generally means the Hypre coarse grid > operator is crap. Please > > > > > > Huh?, this data looks fine, both the true and preconditioned residual > stay separated by about 9 orders of magnitude. This just tells you that the > norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? > > > > This is why Barry and I asked for a comparsion with MUMPS. If you are > right, and its just the condition number, > > > > I said norm not condition number. I trust I'm missing something in this > thread. > > > > the LU > > will not be any more accurate. > > > > Matt > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doss0032 at umn.edu Thu Aug 11 19:41:12 2016 From: doss0032 at umn.edu (Scott Dossa) Date: Thu, 11 Aug 2016 19:41:12 -0500 Subject: [petsc-users] Changing DM domain from default [0,1] Message-ID: Hi All, Basic Question: When one creates a DMDA to handle objects, it sets the domain to [0,1] by default. Is there a call/function to change this? All the examples seem to be over the default domain. Thank you for the help! Best, Scott Dossa -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Thu Aug 11 19:44:59 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Thu, 11 Aug 2016 20:44:59 -0400 Subject: [petsc-users] Changing DM domain from default [0,1] In-Reply-To: References: Message-ID: Have you tried DMDASetUniformCoordinates? http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/DMDASetUniformCoordinates.html On Thu, Aug 11, 2016 at 8:41 PM, Scott Dossa wrote: > Hi All, > > Basic Question: > > When one creates a DMDA to handle objects, it sets the domain to [0,1] by > default. Is there a call/function to change this? > > All the examples seem to be over the default domain. > > Thank you for the help! > Best, > Scott Dossa > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doss0032 at umn.edu Thu Aug 11 19:47:06 2016 From: doss0032 at umn.edu (Scott Dossa) Date: Thu, 11 Aug 2016 19:47:06 -0500 Subject: [petsc-users] Changing DM domain from default [0,1] In-Reply-To: References: Message-ID: Thanks Mohammad. That is exactly what I was searching for. -Scott Dossa On Thu, Aug 11, 2016 at 7:44 PM, Mohammad Mirzadeh wrote: > Have you tried DMDASetUniformCoordinates? > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DM/ > DMDASetUniformCoordinates.html > > On Thu, Aug 11, 2016 at 8:41 PM, Scott Dossa wrote: > >> Hi All, >> >> Basic Question: >> >> When one creates a DMDA to handle objects, it sets the domain to [0,1] by >> default. Is there a call/function to change this? >> >> All the examples seem to be over the default domain. >> >> Thank you for the help! 
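A minimal sketch of that call, with made-up bounds and da standing for the DMDA created earlier:

    /* Replace the default [0,1]^d domain after the DMDA has been created */
    ierr = DMDASetUniformCoordinates(da, -1.0, 1.0, 0.0, 2.0, 0.0, 0.0);CHKERRQ(ierr);
    /* arguments: xmin, xmax, ymin, ymax, zmin, zmax; the z pair is ignored for a 2d DMDA */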
>> Best, >> Scott Dossa >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Thu Aug 11 20:00:17 2016 From: leejearl at 126.com (leejearl) Date: Fri, 12 Aug 2016 09:00:17 +0800 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: Message-ID: Thank you for your reply. I have attached the code, grid and the error message. cavity.c is the code file, cavity.exo is the grid, and error.dat is the error message. The command is "mpirun -n 2 ./cavity" On 2016?08?11? 23:29, Matthew Knepley wrote: > On Thu, Aug 11, 2016 at 3:14 AM, leejearl > wrote: > > Hi, > Thank you for your reply. It help me very much. > But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when > I set the overlap to 2 levels, the command is > "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 -physics > sw", it suffers a error. > It seems to me that setting overlap to 2 is very common. Are > there issues that I have not take into consideration? > Any help are appreciated. > > I will check this out. I have not tested an overlap of 2 here since I > generally use nearest neighbor FV methods for > unstructured stuff. I have test examples that run fine for overlap > > 1. Can you send the entire error message? > > If the error is not in the distribution, but rather in the analytics, > that is understandable because this example is only > intended to be run using a nearest neighbor FV method, and thus might > be confused if we give it two layers of ghost > cells. > > Matt > > > leejearl > > > On 2016?08?11? 14:57, Julian Andrej wrote: >> Hi, >> >> take a look at slide 10 of [1], there is visually explained what >> the overlap between partitions is. >> >> [1] >> https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf >> >> >> On Thu, Aug 11, 2016 at 8:48 AM, leejearl > > wrote: >> >> Hi, all: >> I want to use PETSc to build my FVM code. Now, I have a >> question about >> the function DMPlexDistribute(DM dm, PetscInt overlap, >> PetscSF *sf, DM *dmOverlap) . >> >> In the example >> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When I set >> the overlap >> as 0 or 1, it works well. But, if I set the overlap as 2, it >> suffers a problem. >> I am confused about the value of overlap. Can it be set >> as 2? What is the meaning of >> the parameter overlap? >> Any helps are appreciated! >> >> leejearl >> >> >> >> > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cavity.c Type: text/x-csrc Size: 1933 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cavity.exo Type: application/octet-stream Size: 344931 bytes Desc: not available URL: -------------- next part -------------- [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: key 1633 is greater than largest key allowed 2 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [1]PETSC ERROR: ./cavity on a arch-linux2-c-debug named leejearl by leejearl Fri Aug 12 08:51:51 2016 [1]PETSC ERROR: Configure options --prefix=/home/leejearl/Install/petsc-openmpi --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 --download-exodusii=yes --download-netcdf --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 --download-metis=yes [1]PETSC ERROR: #1 PetscTableAdd() line 46 in /home/leejearl/Software/petsc/petsc-3.7.2/include/petscctable.h [1]PETSC ERROR: #2 PetscSFSetGraph() line 347 in /home/leejearl/Software/petsc/petsc-3.7.2/src/vec/is/sf/interface/sf.c [1]PETSC ERROR: #3 DMLabelGather() line 1092 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/label/dmlabel.c [1]PETSC ERROR: #4 DMPlexPartitionLabelPropagate() line 1633 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexpartition.c [1]PETSC ERROR: #5 DMPlexCreateOverlap() line 615 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [1]PETSC ERROR: #6 DMPlexDistributeOverlap() line 1729 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [1]PETSC ERROR: #7 DMPlexDistribute() line 1635 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [1]PETSC ERROR: #8 main() line 54 in /home/leejearl/Desktop/PETSc/gks_cavity/cavity.c [1]PETSC ERROR: No PETSc Option Table entries [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 63. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. 
[0]PETSC ERROR: [0] PetscCommDuplicate line 136 /home/leejearl/Software/petsc/petsc-3.7.2/src/sys/objects/tagm.c [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 36 /home/leejearl/Software/petsc/petsc-3.7.2/src/sys/objects/inherit.c [0]PETSC ERROR: [0] PetscSectionCreate line 42 /home/leejearl/Software/petsc/petsc-3.7.2/src/vec/is/utils/vsectionis.c [0]PETSC ERROR: [0] DMLabelDistribute_Internal line 898 /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/label/dmlabel.c [0]PETSC ERROR: [0] DMLabelGather line 1063 /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/label/dmlabel.c [0]PETSC ERROR: [0] DMPlexPartitionLabelPropagate line 1628 /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexpartition.c [0]PETSC ERROR: [0] DMPlexCreateOverlap line 561 /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [0]PETSC ERROR: [0] DMPlexDistributeOverlap line 1715 /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [0]PETSC ERROR: [0] DMPlexDistribute line 1555 /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./cavity on a arch-linux2-c-debug named leejearl by leejearl Fri Aug 12 08:51:51 2016 [0]PETSC ERROR: Configure options --prefix=/home/leejearl/Install/petsc-openmpi --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 --download-exodusii=yes --download-netcdf --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 --download-metis=yes [0]PETSC ERROR: #1 User provided function() line 0 in unknown file [leejearl:22730] 1 more process has sent help message help-mpi-api.txt / mpi-abort [leejearl:22730] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages -------------- next part -------------- ALL: cavity CFLAGS = FFLAGS = CPPFLAGS = FPPFLAGS = CLEANFILES = cavity include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules cavity: cavity.o chkopts ${CLINKER}$ -o cavity cavity.o ${PETSC_LIB}$ ${RM} cavity.o From harshadranadive at gmail.com Thu Aug 11 22:14:29 2016 From: harshadranadive at gmail.com (Harshad Ranadive) Date: Fri, 12 Aug 2016 13:14:29 +1000 Subject: [petsc-users] Code performance for solving multiple RHS In-Reply-To: <4F22DE27-9E50-47D9-A2EA-8EDA762B6D02@mcs.anl.gov> References: <6699AE6B-520C-4220-94D0-0389FD5941E6@mcs.anl.gov> <4F22DE27-9E50-47D9-A2EA-8EDA762B6D02@mcs.anl.gov> Message-ID: Hi Barry, Thanks for this recommendation. As you mention, the matrix factorization should be on a single processor. If the factored matrix A is available on all processors can I then use MatMatSolve(A,B,X) in parallel? That is could the RHS block matrix 'B' and solution matrix 'X' be distributed in different processors as is done while using MatCreateDense(...) ? Thanks, Harshad On Fri, Aug 12, 2016 at 2:09 AM, Barry Smith wrote: > > If it is sequential, which it probably should be, then you can you > MatLUFactorSymbolic(), MatLUFactorNumeric() and MatMatSolve() where you put > a bunch of your right hand side vectors into a dense array; not all million > of them but maybe 10 to 100 at a time. 
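A rough sketch of that sequence for a sequential AIJ matrix follows; the ordering type, the block width nrhs, the matrix size n and all names are illustrative. The factor F is built once and then reused for every block of right hand sides:

    Mat           F, B, X;      /* factor, dense RHS block, dense solution block */
    IS            rowperm, colperm;
    MatFactorInfo info;
    PetscInt      nrhs = 50;    /* e.g. 50 right hand sides per block */

    ierr = MatGetOrdering(A, MATORDERINGND, &rowperm, &colperm);CHKERRQ(ierr);
    ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
    ierr = MatGetFactor(A, MATSOLVERPETSC, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
    ierr = MatLUFactorSymbolic(F, A, rowperm, colperm, &info);CHKERRQ(ierr);
    ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);     /* factor once */

    ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, nrhs, NULL, &B);CHKERRQ(ierr);
    ierr = MatCreateSeqDense(PETSC_COMM_SELF, n, nrhs, NULL, &X);CHKERRQ(ierr);
    /* fill the columns of B with right hand sides, assemble B, then */
    ierr = MatMatSolve(F, B, X);CHKERRQ(ierr);                /* reuse F per block */

The same factor F can also be applied to a single vector with MatSolve().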
> > Barry > > > On Aug 10, 2016, at 10:18 PM, Harshad Ranadive < > harshadranadive at gmail.com> wrote: > > > > Hi Barry > > > > The matrix A is mostly tridiagonal > > > > 1 ? 0 ......... 0 > > > > ? 1 ? 0 .......0 > > > > > > 0 ? 1 ? 0 ....0 > > > > > > .................... > > 0..............? 1 > > > > In some cases (periodic boundaries) there would be an '?' in > right-top-corner and left-bottom corner. > > > > I am not using multigrid approach. I just implemented an implicit > filtering approach (instead of an explicit existing one) which requires the > solution of the above system. > > > > Thanks > > Harshad > > > > On Thu, Aug 11, 2016 at 1:07 PM, Barry Smith wrote: > > > > Effectively utilizing multiple right hand sides with the same system > can result in roughly 2 or at absolute most 3 times improvement in solve > time. A great improvement but when you have a million right hand sides not > a giant improvement. > > > > The first step is to get the best (most efficient) preconditioner for > you problem. Since you have many right hand sides it obviously pays to > spend more time building the preconditioner so that each solve is faster. > If you provide more information on your linear system we might have > suggestions. CFD so is your linear system a Poisson problem? Are you using > geometric or algebraic multigrid with PETSc? It not a Poisson problem how > can you describe the linear system? > > > > Barry > > > > > > > > > On Aug 10, 2016, at 9:54 PM, Harshad Ranadive < > harshadranadive at gmail.com> wrote: > > > > > > Hi All, > > > > > > I have currently added the PETSc library with our CFD solver. > > > > > > In this I need to use KSPSolve(...) multiple time for the same matrix > A. I have read that PETSc does not support passing multiple RHS vectors in > the form of a matrix and the only solution to this is calling KSPSolve > multiple times as in example given here: > > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/ > examples/tutorials/ex16.c.html > > > > > > I have followed this technique, but I find that the performance of the > code is very slow now. I basically have a mesh size of 8-10 Million and I > need to solve the matrix A very large number of times. I have checked that > the statement KSPSolve(..) is taking close to 90% of my computation time. > > > > > > I am setting up the matrix A, KSPCreate, KSPSetup etc just once at the > start. Only the following statements are executed in a repeated loop > > > > > > Loop begin: (say million times !!) > > > > > > loop over vector length > > > VecSetValues( ....) > > > end > > > > > > VecAssemblyBegin( ... ) > > > VecAssemblyEnd (...) > > > > > > KSPSolve (...) > > > > > > VecGetValues > > > > > > Loop end. > > > > > > Is there an efficient way of doing this rather than using KSPSolve > multiple times? > > > > > > Please note my matrix A never changes during the time steps or across > the mesh ... So essentially if I can get the inverse once would it be good > enough? It has been recommended in the FAQ that matrix inverse should be > avoided but would it be okay to use in my case? > > > > > > Also could someone please provide an example of how to use MatLUFactor > and MatCholeskyFactor() to find the matrix inverse... the arguments below > were not clear to me. > > > IS row > > > IS col > > > const MatFactorInfo *info > > > > > > Apologies for a long email and thanks to anyone for help. 
> > > > > > Regards > > > Harshad > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 11 22:27:13 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 11 Aug 2016 22:27:13 -0500 Subject: [petsc-users] Code performance for solving multiple RHS In-Reply-To: References: <6699AE6B-520C-4220-94D0-0389FD5941E6@mcs.anl.gov> <4F22DE27-9E50-47D9-A2EA-8EDA762B6D02@mcs.anl.gov> Message-ID: > On Aug 11, 2016, at 10:14 PM, Harshad Ranadive wrote: > > Hi Barry, > > Thanks for this recommendation. > > As you mention, the matrix factorization should be on a single processor. > If the factored matrix A is available on all processors can I then use MatMatSolve(A,B,X) in parallel? That is could the RHS block matrix 'B' and solution matrix 'X' be distributed in different processors as is done while using MatCreateDense(...) ? Note sure what you mean. You can have different processes handle different right hand sides. So give the full linear system matrix to each process; each process factors it and then each process solves a different set of right hand sides. Embarrassingly parallel except for any communication you need to do to get the matrix and right hand sides to the right processes. If the linear system involves say millions of unknowns this is the way to go. If the linear system is over say 1 billion unknowns then it might be worth each linear system in parallel. Barry > > Thanks, > Harshad > > > > On Fri, Aug 12, 2016 at 2:09 AM, Barry Smith wrote: > > If it is sequential, which it probably should be, then you can you MatLUFactorSymbolic(), MatLUFactorNumeric() and MatMatSolve() where you put a bunch of your right hand side vectors into a dense array; not all million of them but maybe 10 to 100 at a time. > > Barry > > > On Aug 10, 2016, at 10:18 PM, Harshad Ranadive wrote: > > > > Hi Barry > > > > The matrix A is mostly tridiagonal > > > > 1 ? 0 ......... 0 > > > > ? 1 ? 0 .......0 > > > > > > 0 ? 1 ? 0 ....0 > > > > > > .................... > > 0..............? 1 > > > > In some cases (periodic boundaries) there would be an '?' in right-top-corner and left-bottom corner. > > > > I am not using multigrid approach. I just implemented an implicit filtering approach (instead of an explicit existing one) which requires the solution of the above system. > > > > Thanks > > Harshad > > > > On Thu, Aug 11, 2016 at 1:07 PM, Barry Smith wrote: > > > > Effectively utilizing multiple right hand sides with the same system can result in roughly 2 or at absolute most 3 times improvement in solve time. A great improvement but when you have a million right hand sides not a giant improvement. > > > > The first step is to get the best (most efficient) preconditioner for you problem. Since you have many right hand sides it obviously pays to spend more time building the preconditioner so that each solve is faster. If you provide more information on your linear system we might have suggestions. CFD so is your linear system a Poisson problem? Are you using geometric or algebraic multigrid with PETSc? It not a Poisson problem how can you describe the linear system? > > > > Barry > > > > > > > > > On Aug 10, 2016, at 9:54 PM, Harshad Ranadive wrote: > > > > > > Hi All, > > > > > > I have currently added the PETSc library with our CFD solver. > > > > > > In this I need to use KSPSolve(...) multiple time for the same matrix A. 
I have read that PETSc does not support passing multiple RHS vectors in the form of a matrix and the only solution to this is calling KSPSolve multiple times as in example given here: > > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex16.c.html > > > > > > I have followed this technique, but I find that the performance of the code is very slow now. I basically have a mesh size of 8-10 Million and I need to solve the matrix A very large number of times. I have checked that the statement KSPSolve(..) is taking close to 90% of my computation time. > > > > > > I am setting up the matrix A, KSPCreate, KSPSetup etc just once at the start. Only the following statements are executed in a repeated loop > > > > > > Loop begin: (say million times !!) > > > > > > loop over vector length > > > VecSetValues( ....) > > > end > > > > > > VecAssemblyBegin( ... ) > > > VecAssemblyEnd (...) > > > > > > KSPSolve (...) > > > > > > VecGetValues > > > > > > Loop end. > > > > > > Is there an efficient way of doing this rather than using KSPSolve multiple times? > > > > > > Please note my matrix A never changes during the time steps or across the mesh ... So essentially if I can get the inverse once would it be good enough? It has been recommended in the FAQ that matrix inverse should be avoided but would it be okay to use in my case? > > > > > > Also could someone please provide an example of how to use MatLUFactor and MatCholeskyFactor() to find the matrix inverse... the arguments below were not clear to me. > > > IS row > > > IS col > > > const MatFactorInfo *info > > > > > > Apologies for a long email and thanks to anyone for help. > > > > > > Regards > > > Harshad > > > > > > > > > > > > > > > > > > > > > From gaetank at gmail.com Thu Aug 11 19:16:55 2016 From: gaetank at gmail.com (Gaetan Kenway) Date: Thu, 11 Aug 2016 20:16:55 -0400 Subject: [petsc-users] GAMG with PETSc Message-ID: Hi I'm attempting to try out using GAMG for a preconditioner for my compressible CFD problem. However, I'm getting segfaults when trying to run the code. The code is based on ksp ex23.c which is attached. It just reads in two precomputed matrices (the actual jacobian and an approximate jacobian used to build the PC) and solves with a RHS of ones. My normal approach to solving the system is with ASM+ILU. With the following options, everything works fine. mpirun -np 1 ./ex23 -ksp_monitor -ksp_type gmres -ksp_view -ksp_max_it 200 -mat_type mpibaij -ksp_gmres_restart 100 -pc_type asm \ -sub_ksp_type richardson -sub_ksp_max_it 2 -sub_pc_factor_levels 2 --ksp_pc_side right -sub_pc_factor_mat_ordering_type rcm \ -sub_ksp_pc_side right -info -log_summary > ASM.out Now when I try to replace the PC with gamg and run the following: mpirun -np 1 ./ex23 -ksp_monitor -ksp_type fgmres -ksp_view -ksp_max_it 200 -mat_type mpibaij -ksp_gmres_restart 100 -pc_type gamg -info -log_summary > GAMG.out I just get a segfault. The backtrace from gdb is [0] PCSetUp(): Setting up PC for first time [0] PCSetUp_GAMG(): level 0) N=483840, n data rows=5, n data cols=5, nnz/row (ave)=34, np=1 Program received signal SIGSEGV, Segmentation fault. 
MatSetValues_MPIBAIJ (mat=0xc7e9e0, m=1, im=0x7fffffffccd0, n=1, in=0x7fffffffccd4, v=0x7fffffffcd18, addv=ADD_VALUES) at /home/gaetan/packages/petsc-3.7.3/src/mat/impls/baij/mpi/mpibaij.c:193 193 Mat_SeqBAIJ *a = (Mat_SeqBAIJ*)(A)->data; (gdb) bt #0 MatSetValues_MPIBAIJ (mat=0xc7e9e0, m=1, im=0x7fffffffccd0, n=1, in=0x7fffffffccd4, v=0x7fffffffcd18, addv=ADD_VALUES) at /home/gaetan/packages/petsc-3.7.3/src/mat/impls/baij/mpi/mpibaij.c:193 #1 0x00007ffff6cfccae in MatSetValues (mat=0xc7e9e0, m=m at entry=1, idxm=idxm at entry=0x7fffffffccd0, n=n at entry=1, idxn=idxn at entry=0x7fffffffccd4, v=v at entry=0x7fffffffcd18, addv=addv at entry=ADD_VALUES) at /home/gaetan/packages/petsc-3.7.3/src/mat/interface/matrix.c:1190 #2 0x00007ffff723ce17 in PCGAMGCreateGraph (Amat=Amat at entry=0x8f6940, a_Gmat=a_Gmat at entry=0x7fffffffcd70) at /home/gaetan/packages/petsc-3.7.3/src/ksp/pc/impls/gamg/util.c:206 #3 0x00007ffff7231898 in PCGAMGGraph_AGG (pc=, Amat=0x8f6940, a_Gmat=0x7fffffffce60) at /home/gaetan/packages/petsc-3.7.3/src/ksp/pc/impls/gamg/agg.c:904 #4 0x00007ffff7228e87 in PCSetUp_GAMG (pc=0xc4f800) at /home/gaetan/packages/petsc-3.7.3/src/ksp/pc/impls/gamg/gamg.c:570 #5 0x00007ffff7172d65 in PCSetUp (pc=0xc4f800) at /home/gaetan/packages/petsc-3.7.3/src/ksp/pc/interface/precon.c:968 #6 0x00007ffff7293c26 in KSPSetUp (ksp=ksp at entry=0xc4e2b0) at /home/gaetan/packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c:390 #7 0x00007ffff7294282 in KSPSolve (ksp=0xc4e2b0, b=0xc4cda0, x=0x8ec9d0) at /home/gaetan/packages/petsc-3.7.3/src/ksp/ksp/interface/itfunc.c:599 #8 0x0000000000401209 in main (argc=15, args=0x7fffffffd548) at /home/gaetan/packages/petsc-3.7.3/src/ksp/ksp/examples/tutorials/ex23.c:62 (gdb) Any thoughts on what might be going wrong? Thanks, Gaetan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 7883882 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex23.c Type: text/x-csrc Size: 2544 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ASM.out Type: application/octet-stream Size: 45582 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: GAMG.out Type: application/octet-stream Size: 5727 bytes Desc: not available URL: From lawrence.mitchell at imperial.ac.uk Fri Aug 12 06:02:09 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Fri, 12 Aug 2016 12:02:09 +0100 Subject: [petsc-users] GAMG with PETSc In-Reply-To: References: Message-ID: <915BA6A8-DBAB-49B3-B27C-D7D4009974EB@imperial.ac.uk> [Added petsc-maint to cc, since I think this is an actual bug] > On 12 Aug 2016, at 01:16, Gaetan Kenway wrote: > > Hi > > I'm attempting to try out using GAMG for a preconditioner for my compressible CFD problem. However, I'm getting segfaults when trying to run the code. The code is based on ksp ex23.c which is attached. It just reads in two precomputed matrices (the actual jacobian and an approximate jacobian used to build the PC) and solves with a RHS of ones. > > My normal approach to solving the system is with ASM+ILU. With the following options, everything works fine. This appears to be a problem that GAMG doesn't work with BAIJ matrices. But there is no checking of the input matrix type anywhere. 
For example, with a debug PETSc: cd src/ksp/ksp/examples/tutorials make ex23 ./ex23 -pc_type gamg -mat_type baij [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "mat" before MatSetValues() [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1123-g21143a8 GIT Date: 2016-08-08 17:24:17 -0700 [0]PETSC ERROR: ./ex23 on a arch-darwin-c-dbg named yam-laptop.local by lmitche1 Fri Aug 12 11:51:17 2016 [0]PETSC ERROR: Configure options --download-chaco=1 --download-ctetgen=1 --download-exodusii=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 --download-parmetis=1 --download-ptscotch=1 --download-scalapack=1 --download-triangle=1 --with-c2html=0 --with-debugging=1 --with-shared-libraries=1 PETSC_ARCH=arch-darwin-c-dbg [0]PETSC ERROR: #1 MatSetValues() line 1195 in /Users/lmitche1/Documents/work/src/deps/petsc/src/mat/interface/matrix.c [0]PETSC ERROR: #2 PCGAMGFilterGraph() line 342 in /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/impls/gamg/util.c [0]PETSC ERROR: #3 PCGAMGGraph_AGG() line 908 in /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/impls/gamg/agg.c [0]PETSC ERROR: #4 PCSetUp_GAMG() line 525 in /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/impls/gamg/gamg.c [0]PETSC ERROR: #5 PCSetUp() line 968 in /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #6 KSPSetUp() line 393 in /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #7 KSPSolve() line 602 in /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #8 main() line 158 in /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/ksp/examples/tutorials/ex23.c [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -mat_type baij [0]PETSC ERROR: -pc_type gamg [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Looking in PCGAMGFilterGraph I see: ierr = MatGetType(Gmat,&mtype);CHKERRQ(ierr); ierr = MatCreate(comm, &tGmat);CHKERRQ(ierr); ierr = MatSetSizes(tGmat,nloc,nloc,MM,MM);CHKERRQ(ierr); ierr = MatSetBlockSizes(tGmat, 1, 1);CHKERRQ(ierr); ierr = MatSetType(tGmat, mtype);CHKERRQ(ierr); ierr = MatSeqAIJSetPreallocation(tGmat,0,d_nnz);CHKERRQ(ierr); ierr = MatMPIAIJSetPreallocation(tGmat,0,d_nnz,0,o_nnz);CHKERRQ(ierr); ... ierr = MatSetValues(tGmat,1,&Ii,1,&idx[jj],&sv,ADD_VALUES);CHKERRQ(ierr); So if Gmat is neither SEQAIJ nor MPIAIJ, then no preallocation has happened (and MatSetUp is not called). Fixing the few instances here by just changing the type of these matrices to AIJ. One runs into to the problem that creating the coarse grid operators doesn't work, since MatMatMult and friends do not exist for BAIJ matrices. I guess GAMG could MatConvert from BAIJ to AIJ (but this uses extra memory). But it should probably barf with a comprehensible error message. Thoughts? Lawrence -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From mfadams at lbl.gov Fri Aug 12 10:03:30 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 12 Aug 2016 11:03:30 -0400 Subject: [petsc-users] [petsc-maint] GAMG with PETSc In-Reply-To: <915BA6A8-DBAB-49B3-B27C-D7D4009974EB@imperial.ac.uk> References: <915BA6A8-DBAB-49B3-B27C-D7D4009974EB@imperial.ac.uk> Message-ID: Yes, we should check the type. #define MATAIJ "aij" #define MATSEQAIJ "seqaij" #define MATMPIAIJ "mpiaij" I assume that MATAIJ is mapped to SEQ or MPI, and so just need to check for the latter two. I'll do that now. Thanks, Mark On Fri, Aug 12, 2016 at 7:02 AM, Lawrence Mitchell < lawrence.mitchell at imperial.ac.uk> wrote: > [Added petsc-maint to cc, since I think this is an actual bug] > > > On 12 Aug 2016, at 01:16, Gaetan Kenway wrote: > > > > Hi > > > > I'm attempting to try out using GAMG for a preconditioner for my > compressible CFD problem. However, I'm getting segfaults when trying to run > the code. The code is based on ksp ex23.c which is attached. It just reads > in two precomputed matrices (the actual jacobian and an approximate > jacobian used to build the PC) and solves with a RHS of ones. > > > > My normal approach to solving the system is with ASM+ILU. With the > following options, everything works fine. > > This appears to be a problem that GAMG doesn't work with BAIJ matrices. > > But there is no checking of the input matrix type anywhere. > > For example, with a debug PETSc: > > cd src/ksp/ksp/examples/tutorials > make ex23 > ./ex23 -pc_type gamg -mat_type baij > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on > argument 1 "mat" before MatSetValues() > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1123-g21143a8 GIT > Date: 2016-08-08 17:24:17 -0700 > [0]PETSC ERROR: ./ex23 on a arch-darwin-c-dbg named yam-laptop.local by > lmitche1 Fri Aug 12 11:51:17 2016 > [0]PETSC ERROR: Configure options --download-chaco=1 --download-ctetgen=1 > --download-exodusii=1 --download-hdf5=1 --download-hypre=1 > --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 > --download-parmetis=1 --download-ptscotch=1 --download-scalapack=1 > --download-triangle=1 --with-c2html=0 --with-debugging=1 > --with-shared-libraries=1 PETSC_ARCH=arch-darwin-c-dbg > [0]PETSC ERROR: #1 MatSetValues() line 1195 in /Users/lmitche1/Documents/ > work/src/deps/petsc/src/mat/interface/matrix.c > [0]PETSC ERROR: #2 PCGAMGFilterGraph() line 342 in > /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/impls/gamg/util.c > [0]PETSC ERROR: #3 PCGAMGGraph_AGG() line 908 in /Users/lmitche1/Documents/ > work/src/deps/petsc/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #4 PCSetUp_GAMG() line 525 in /Users/lmitche1/Documents/ > work/src/deps/petsc/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: #5 PCSetUp() line 968 in /Users/lmitche1/Documents/ > work/src/deps/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #6 KSPSetUp() line 393 in /Users/lmitche1/Documents/ > work/src/deps/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #7 KSPSolve() line 602 in /Users/lmitche1/Documents/ > work/src/deps/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #8 main() line 158 in /Users/lmitche1/Documents/ > work/src/deps/petsc/src/ksp/ksp/examples/tutorials/ex23.c > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -mat_type baij > [0]PETSC ERROR: -pc_type gamg > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > Looking in PCGAMGFilterGraph I see: > > ierr = MatGetType(Gmat,&mtype);CHKERRQ(ierr); > ierr = MatCreate(comm, &tGmat);CHKERRQ(ierr); > ierr = MatSetSizes(tGmat,nloc,nloc,MM,MM);CHKERRQ(ierr); > ierr = MatSetBlockSizes(tGmat, 1, 1);CHKERRQ(ierr); > ierr = MatSetType(tGmat, mtype);CHKERRQ(ierr); > ierr = MatSeqAIJSetPreallocation(tGmat,0,d_nnz);CHKERRQ(ierr); > ierr = MatMPIAIJSetPreallocation(tGmat,0,d_nnz,0,o_nnz);CHKERRQ(ierr); > > > ... > ierr = MatSetValues(tGmat,1,&Ii,1,&idx[jj],&sv,ADD_VALUES); > CHKERRQ(ierr); > > > So if Gmat is neither SEQAIJ nor MPIAIJ, then no preallocation has > happened (and MatSetUp is not called). > > Fixing the few instances here by just changing the type of these matrices > to AIJ. One runs into to the problem that creating the coarse grid > operators doesn't work, since MatMatMult and friends do not exist for BAIJ > matrices. > > I guess GAMG could MatConvert from BAIJ to AIJ (but this uses extra > memory). > > But it should probably barf with a comprehensible error message. > > Thoughts? > > Lawrence > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaetank at gmail.com Fri Aug 12 10:08:13 2016 From: gaetank at gmail.com (Gaetan Kenway) Date: Fri, 12 Aug 2016 11:08:13 -0400 Subject: [petsc-users] [petsc-maint] GAMG with PETSc In-Reply-To: References: <915BA6A8-DBAB-49B3-B27C-D7D4009974EB@imperial.ac.uk> Message-ID: Thanks. When I put the type back to mpiaij, my code runs fine without segfaults with GAMG. It doesn't work very well yet, but that is a separate issue. Gaetan On Fri, Aug 12, 2016 at 11:03 AM, Mark Adams wrote: > Yes, we should check the type. 
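For illustration only (this is not the actual change made in the branch), such a guard could look roughly like the following, with Amat standing for the matrix handed to GAMG; the commented alternative at the end trades extra memory for accepting BAIJ input by converting it:

    PetscBool isaij;

    ierr = PetscObjectTypeCompareAny((PetscObject)Amat, &isaij,
                                     MATSEQAIJ, MATMPIAIJ, "");CHKERRQ(ierr);
    if (!isaij) SETERRQ(PetscObjectComm((PetscObject)Amat), PETSC_ERR_ARG_WRONG,
                        "PCGAMG requires an AIJ matrix");

    /* or, instead of erroring, convert up front (uses extra memory): */
    /* ierr = MatConvert(Amat, MATAIJ, MAT_INPLACE_MATRIX, &Amat);CHKERRQ(ierr); */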
> > #define MATAIJ "aij" > #define MATSEQAIJ "seqaij" > #define MATMPIAIJ "mpiaij" > > I assume that MATAIJ is mapped to SEQ or MPI, and so just need to check > for the latter two. I'll do that now. > > Thanks, > Mark > > On Fri, Aug 12, 2016 at 7:02 AM, Lawrence Mitchell < > lawrence.mitchell at imperial.ac.uk> wrote: > >> [Added petsc-maint to cc, since I think this is an actual bug] >> >> > On 12 Aug 2016, at 01:16, Gaetan Kenway wrote: >> > >> > Hi >> > >> > I'm attempting to try out using GAMG for a preconditioner for my >> compressible CFD problem. However, I'm getting segfaults when trying to run >> the code. The code is based on ksp ex23.c which is attached. It just reads >> in two precomputed matrices (the actual jacobian and an approximate >> jacobian used to build the PC) and solves with a RHS of ones. >> > >> > My normal approach to solving the system is with ASM+ILU. With the >> following options, everything works fine. >> >> This appears to be a problem that GAMG doesn't work with BAIJ matrices. >> >> But there is no checking of the input matrix type anywhere. >> >> For example, with a debug PETSc: >> >> cd src/ksp/ksp/examples/tutorials >> make ex23 >> ./ex23 -pc_type gamg -mat_type baij >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [0]PETSC ERROR: Object is in wrong state >> [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on >> argument 1 "mat" before MatSetValues() >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1123-g21143a8 GIT >> Date: 2016-08-08 17:24:17 -0700 >> [0]PETSC ERROR: ./ex23 on a arch-darwin-c-dbg named yam-laptop.local by >> lmitche1 Fri Aug 12 11:51:17 2016 >> [0]PETSC ERROR: Configure options --download-chaco=1 --download-ctetgen=1 >> --download-exodusii=1 --download-hdf5=1 --download-hypre=1 >> --download-metis=1 --download-ml=1 --download-mumps=1 --download-netcdf=1 >> --download-parmetis=1 --download-ptscotch=1 --download-scalapack=1 >> --download-triangle=1 --with-c2html=0 --with-debugging=1 >> --with-shared-libraries=1 PETSC_ARCH=arch-darwin-c-dbg >> [0]PETSC ERROR: #1 MatSetValues() line 1195 in >> /Users/lmitche1/Documents/work/src/deps/petsc/src/mat/interface/matrix.c >> [0]PETSC ERROR: #2 PCGAMGFilterGraph() line 342 in >> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/ >> impls/gamg/util.c >> [0]PETSC ERROR: #3 PCGAMGGraph_AGG() line 908 in >> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/impls/gamg/agg.c >> [0]PETSC ERROR: #4 PCSetUp_GAMG() line 525 in >> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/ >> impls/gamg/gamg.c >> [0]PETSC ERROR: #5 PCSetUp() line 968 in /Users/lmitche1/Documents/work >> /src/deps/petsc/src/ksp/pc/interface/precon.c >> [0]PETSC ERROR: #6 KSPSetUp() line 393 in /Users/lmitche1/Documents/work >> /src/deps/petsc/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: #7 KSPSolve() line 602 in /Users/lmitche1/Documents/work >> /src/deps/petsc/src/ksp/ksp/interface/itfunc.c >> [0]PETSC ERROR: #8 main() line 158 in /Users/lmitche1/Documents/work >> /src/deps/petsc/src/ksp/ksp/examples/tutorials/ex23.c >> [0]PETSC ERROR: PETSc Option Table entries: >> [0]PETSC ERROR: -mat_type baij >> [0]PETSC ERROR: -pc_type gamg >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- >> >> Looking in PCGAMGFilterGraph I see: >> >> 
ierr = MatGetType(Gmat,&mtype);CHKERRQ(ierr); >> ierr = MatCreate(comm, &tGmat);CHKERRQ(ierr); >> ierr = MatSetSizes(tGmat,nloc,nloc,MM,MM);CHKERRQ(ierr); >> ierr = MatSetBlockSizes(tGmat, 1, 1);CHKERRQ(ierr); >> ierr = MatSetType(tGmat, mtype);CHKERRQ(ierr); >> ierr = MatSeqAIJSetPreallocation(tGmat,0,d_nnz);CHKERRQ(ierr); >> ierr = MatMPIAIJSetPreallocation(tGmat,0,d_nnz,0,o_nnz);CHKERRQ(ierr); >> >> >> ... >> ierr = MatSetValues(tGmat,1,&Ii,1,&id >> x[jj],&sv,ADD_VALUES);CHKERRQ(ierr); >> >> >> So if Gmat is neither SEQAIJ nor MPIAIJ, then no preallocation has >> happened (and MatSetUp is not called). >> >> Fixing the few instances here by just changing the type of these matrices >> to AIJ. One runs into to the problem that creating the coarse grid >> operators doesn't work, since MatMatMult and friends do not exist for BAIJ >> matrices. >> >> I guess GAMG could MatConvert from BAIJ to AIJ (but this uses extra >> memory). >> >> But it should probably barf with a comprehensible error message. >> >> Thoughts? >> >> Lawrence >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Aug 12 10:49:15 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 12 Aug 2016 11:49:15 -0400 Subject: [petsc-users] [petsc-maint] GAMG with PETSc In-Reply-To: References: <915BA6A8-DBAB-49B3-B27C-D7D4009974EB@imperial.ac.uk> Message-ID: Gaetan, This was simple, if you are setup to easily check this you can test this in my branch mark/gamg-aijcheck -- you should get an error message "Require AIJ matrix." Thanks again, On Fri, Aug 12, 2016 at 11:08 AM, Gaetan Kenway wrote: > Thanks. > > When I put the type back to mpiaij, my code runs fine without segfaults > with GAMG. It doesn't work very well yet, but that is a separate issue. > > Gaetan > > On Fri, Aug 12, 2016 at 11:03 AM, Mark Adams wrote: > >> Yes, we should check the type. >> >> #define MATAIJ "aij" >> #define MATSEQAIJ "seqaij" >> #define MATMPIAIJ "mpiaij" >> >> I assume that MATAIJ is mapped to SEQ or MPI, and so just need to check >> for the latter two. I'll do that now. >> >> Thanks, >> Mark >> >> On Fri, Aug 12, 2016 at 7:02 AM, Lawrence Mitchell < >> lawrence.mitchell at imperial.ac.uk> wrote: >> >>> [Added petsc-maint to cc, since I think this is an actual bug] >>> >>> > On 12 Aug 2016, at 01:16, Gaetan Kenway wrote: >>> > >>> > Hi >>> > >>> > I'm attempting to try out using GAMG for a preconditioner for my >>> compressible CFD problem. However, I'm getting segfaults when trying to run >>> the code. The code is based on ksp ex23.c which is attached. It just reads >>> in two precomputed matrices (the actual jacobian and an approximate >>> jacobian used to build the PC) and solves with a RHS of ones. >>> > >>> > My normal approach to solving the system is with ASM+ILU. With the >>> following options, everything works fine. >>> >>> This appears to be a problem that GAMG doesn't work with BAIJ matrices. >>> >>> But there is no checking of the input matrix type anywhere. >>> >>> For example, with a debug PETSc: >>> >>> cd src/ksp/ksp/examples/tutorials >>> make ex23 >>> ./ex23 -pc_type gamg -mat_type baij >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [0]PETSC ERROR: Object is in wrong state >>> [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on >>> argument 1 "mat" before MatSetValues() >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1123-g21143a8 >>> GIT Date: 2016-08-08 17:24:17 -0700 >>> [0]PETSC ERROR: ./ex23 on a arch-darwin-c-dbg named yam-laptop.local by >>> lmitche1 Fri Aug 12 11:51:17 2016 >>> [0]PETSC ERROR: Configure options --download-chaco=1 >>> --download-ctetgen=1 --download-exodusii=1 --download-hdf5=1 >>> --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 >>> --download-netcdf=1 --download-parmetis=1 --download-ptscotch=1 >>> --download-scalapack=1 --download-triangle=1 --with-c2html=0 >>> --with-debugging=1 --with-shared-libraries=1 PETSC_ARCH=arch-darwin-c-dbg >>> [0]PETSC ERROR: #1 MatSetValues() line 1195 in >>> /Users/lmitche1/Documents/work/src/deps/petsc/src/mat/interface/matrix.c >>> [0]PETSC ERROR: #2 PCGAMGFilterGraph() line 342 in >>> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/imp >>> ls/gamg/util.c >>> [0]PETSC ERROR: #3 PCGAMGGraph_AGG() line 908 in >>> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/imp >>> ls/gamg/agg.c >>> [0]PETSC ERROR: #4 PCSetUp_GAMG() line 525 in >>> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/imp >>> ls/gamg/gamg.c >>> [0]PETSC ERROR: #5 PCSetUp() line 968 in /Users/lmitche1/Documents/work >>> /src/deps/petsc/src/ksp/pc/interface/precon.c >>> [0]PETSC ERROR: #6 KSPSetUp() line 393 in /Users/lmitche1/Documents/work >>> /src/deps/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: #7 KSPSolve() line 602 in /Users/lmitche1/Documents/work >>> /src/deps/petsc/src/ksp/ksp/interface/itfunc.c >>> [0]PETSC ERROR: #8 main() line 158 in /Users/lmitche1/Documents/work >>> /src/deps/petsc/src/ksp/ksp/examples/tutorials/ex23.c >>> [0]PETSC ERROR: PETSc Option Table entries: >>> [0]PETSC ERROR: -mat_type baij >>> [0]PETSC ERROR: -pc_type gamg >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire >>> error message to petsc-maint at mcs.anl.gov---------- >>> >>> Looking in PCGAMGFilterGraph I see: >>> >>> ierr = MatGetType(Gmat,&mtype);CHKERRQ(ierr); >>> ierr = MatCreate(comm, &tGmat);CHKERRQ(ierr); >>> ierr = MatSetSizes(tGmat,nloc,nloc,MM,MM);CHKERRQ(ierr); >>> ierr = MatSetBlockSizes(tGmat, 1, 1);CHKERRQ(ierr); >>> ierr = MatSetType(tGmat, mtype);CHKERRQ(ierr); >>> ierr = MatSeqAIJSetPreallocation(tGmat,0,d_nnz);CHKERRQ(ierr); >>> ierr = MatMPIAIJSetPreallocation(tGmat,0,d_nnz,0,o_nnz);CHKERRQ(ierr); >>> >>> >>> ... >>> ierr = MatSetValues(tGmat,1,&Ii,1,&id >>> x[jj],&sv,ADD_VALUES);CHKERRQ(ierr); >>> >>> >>> So if Gmat is neither SEQAIJ nor MPIAIJ, then no preallocation has >>> happened (and MatSetUp is not called). >>> >>> Fixing the few instances here by just changing the type of these >>> matrices to AIJ. One runs into to the problem that creating the coarse >>> grid operators doesn't work, since MatMatMult and friends do not exist for >>> BAIJ matrices. >>> >>> I guess GAMG could MatConvert from BAIJ to AIJ (but this uses extra >>> memory). >>> >>> But it should probably barf with a comprehensible error message. >>> >>> Thoughts? >>> >>> Lawrence >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 12 12:10:10 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 12 Aug 2016 12:10:10 -0500 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: Message-ID: On Thu, Aug 11, 2016 at 8:00 PM, leejearl wrote: > Thank you for your reply. I have attached the code, grid and the error > message. 
> > cavity.c is the code file, cavity.exo is the grid, and error.dat is the > error message. > > The command is "mpirun -n 2 ./cavity > Can you verify that you are running the master branch? I just ran this and got DM Object: 2 MPI processes type: plex DM_0x84000004_0 in 2 dimensions: 0-cells: 5253 5252 1-cells: 10352 10350 2-cells: 5298 (198) 5297 (198) Labels: ghost: 2 strata of sizes (199, 400) vtk: 1 strata of sizes (4901) Cell Sets: 1 strata of sizes (5100) Face Sets: 3 strata of sizes (53, 99, 50) depth: 3 strata of sizes (5253, 10352, 5298) Thanks, Matt > On 2016?08?11? 23:29, Matthew Knepley wrote: > > On Thu, Aug 11, 2016 at 3:14 AM, leejearl wrote: > >> Hi, >> Thank you for your reply. It help me very much. >> But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I set >> the overlap to 2 levels, the command is >> "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 -physics sw", >> it suffers a error. >> It seems to me that setting overlap to 2 is very common. Are there >> issues that I have not take into consideration? >> Any help are appreciated. >> > I will check this out. I have not tested an overlap of 2 here since I > generally use nearest neighbor FV methods for > unstructured stuff. I have test examples that run fine for overlap > 1. > Can you send the entire error message? > > If the error is not in the distribution, but rather in the analytics, that > is understandable because this example is only > intended to be run using a nearest neighbor FV method, and thus might be > confused if we give it two layers of ghost > cells. > > Matt > > >> >> leejearl >> >> On 2016?08?11? 14:57, Julian Andrej wrote: >> >> Hi, >> >> take a look at slide 10 of [1], there is visually explained what the >> overlap between partitions is. >> >> [1] https://www.archer.ac.uk/training/virtual/files/2015/06- >> PETSc/slides.pdf >> >> On Thu, Aug 11, 2016 at 8:48 AM, leejearl wrote: >> >>> Hi, all: >>> I want to use PETSc to build my FVM code. Now, I have a question >>> about >>> the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF *sf, DM >>> *dmOverlap) . >>> >>> In the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". >>> When I set the overlap >>> as 0 or 1, it works well. But, if I set the overlap as 2, it suffers a >>> problem. >>> I am confused about the value of overlap. Can it be set as 2? What >>> is the meaning of >>> the parameter overlap? >>> Any helps are appreciated! >>> >>> leejearl >>> >>> >>> >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Fri Aug 12 19:41:31 2016 From: leejearl at 126.com (leejearl) Date: Sat, 13 Aug 2016 08:41:31 +0800 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: Message-ID: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> Hi, Matt: > Can you verify that you are running the master branch? I am not sure, how can I verify this? 
And I configure PETSc with this command "./configure --prefix=$HOME/Install/petsc-openmpi --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 --download-exodusii=yes --download-netcdf --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 --download-metis=yes". Is there some problem? Can you show me your command for configuring PETSc? Thanks leejearl On 2016?08?13? 01:10, Matthew Knepley wrote: > On Thu, Aug 11, 2016 at 8:00 PM, leejearl > wrote: > > Thank you for your reply. I have attached the code, grid and the > error message. > > cavity.c is the code file, cavity.exo is the grid, and error.dat > is the error message. > > The command is "mpirun -n 2 ./cavity > > > Can you verify that you are running the master branch? I just ran this > and got > > DM Object: 2 MPI processes > type: plex > DM_0x84000004_0 in 2 dimensions: > 0-cells: 5253 5252 > 1-cells: 10352 10350 > 2-cells: 5298 (198) 5297 (198) > Labels: > ghost: 2 strata of sizes (199, 400) > vtk: 1 strata of sizes (4901) > Cell Sets: 1 strata of sizes (5100) > Face Sets: 3 strata of sizes (53, 99, 50) > depth: 3 strata of sizes (5253, 10352, 5298) > > Thanks, > > Matt > > On 2016?08?11? 23:29, Matthew Knepley wrote: >> On Thu, Aug 11, 2016 at 3:14 AM, leejearl > > wrote: >> >> Hi, >> Thank you for your reply. It help me very much. >> But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", >> when I set the overlap to 2 levels, the command is >> "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 >> -physics sw", it suffers a error. >> It seems to me that setting overlap to 2 is very common. >> Are there issues that I have not take into consideration? >> Any help are appreciated. >> >> I will check this out. I have not tested an overlap of 2 here >> since I generally use nearest neighbor FV methods for >> unstructured stuff. I have test examples that run fine for >> overlap > 1. Can you send the entire error message? >> >> If the error is not in the distribution, but rather in the >> analytics, that is understandable because this example is only >> intended to be run using a nearest neighbor FV method, and thus >> might be confused if we give it two layers of ghost >> cells. >> >> Matt >> >> >> leejearl >> >> >> On 2016?08?11? 14:57, Julian Andrej wrote: >>> Hi, >>> >>> take a look at slide 10 of [1], there is visually explained >>> what the overlap between partitions is. >>> >>> [1] >>> https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf >>> >>> >>> On Thu, Aug 11, 2016 at 8:48 AM, leejearl >> > wrote: >>> >>> Hi, all: >>> I want to use PETSc to build my FVM code. Now, I >>> have a question about >>> the function DMPlexDistribute(DM dm, PetscInt overlap, >>> PetscSF *sf, DM *dmOverlap) . >>> >>> In the example >>> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When I >>> set the overlap >>> as 0 or 1, it works well. But, if I set the overlap as >>> 2, it suffers a problem. >>> I am confused about the value of overlap. Can it be >>> set as 2? What is the meaning of >>> the parameter overlap? >>> Any helps are appreciated! >>> >>> leejearl >>> >>> >>> >>> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From oxberry1 at llnl.gov Fri Aug 12 19:49:07 2016 From: oxberry1 at llnl.gov (Oxberry, Geoffrey Malcolm) Date: Sat, 13 Aug 2016 00:49:07 +0000 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> References: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> Message-ID: On Aug 12, 2016, at 5:41 PM, leejearl > wrote: Hi, Matt: > Can you verify that you are running the master branch? cd ${PETSC_DIR} git branch The last command should return something like a list of branch names, and the branch name with an asterisk to the left of it will be the branch you are currently on. Geoff I am not sure, how can I verify this? And I configure PETSc with this command "./configure --prefix=$HOME/Install/petsc-openmpi --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 --download-exodusii=yes --download-netcdf --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 --download-metis=yes". Is there some problem? Can you show me your command for configuring PETSc? Thanks leejearl On 2016?08?13? 01:10, Matthew Knepley wrote: On Thu, Aug 11, 2016 at 8:00 PM, leejearl > wrote: Thank you for your reply. I have attached the code, grid and the error message. cavity.c is the code file, cavity.exo is the grid, and error.dat is the error message. The command is "mpirun -n 2 ./cavity Can you verify that you are running the master branch? I just ran this and got DM Object: 2 MPI processes type: plex DM_0x84000004_0 in 2 dimensions: 0-cells: 5253 5252 1-cells: 10352 10350 2-cells: 5298 (198) 5297 (198) Labels: ghost: 2 strata of sizes (199, 400) vtk: 1 strata of sizes (4901) Cell Sets: 1 strata of sizes (5100) Face Sets: 3 strata of sizes (53, 99, 50) depth: 3 strata of sizes (5253, 10352, 5298) Thanks, Matt On 2016?08?11? 23:29, Matthew Knepley wrote: On Thu, Aug 11, 2016 at 3:14 AM, leejearl > wrote: Hi, Thank you for your reply. It help me very much. But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I set the overlap to 2 levels, the command is "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 -physics sw", it suffers a error. It seems to me that setting overlap to 2 is very common. Are there issues that I have not take into consideration? Any help are appreciated. I will check this out. I have not tested an overlap of 2 here since I generally use nearest neighbor FV methods for unstructured stuff. I have test examples that run fine for overlap > 1. Can you send the entire error message? If the error is not in the distribution, but rather in the analytics, that is understandable because this example is only intended to be run using a nearest neighbor FV method, and thus might be confused if we give it two layers of ghost cells. Matt leejearl On 2016?08?11? 14:57, Julian Andrej wrote: Hi, take a look at slide 10 of [1], there is visually explained what the overlap between partitions is. [1] https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf On Thu, Aug 11, 2016 at 8:48 AM, leejearl > wrote: Hi, all: I want to use PETSc to build my FVM code. Now, I have a question about the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF *sf, DM *dmOverlap) . In the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When I set the overlap as 0 or 1, it works well. But, if I set the overlap as 2, it suffers a problem. I am confused about the value of overlap. Can it be set as 2? What is the meaning of the parameter overlap? 
Any helps are appreciated! leejearl -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Fri Aug 12 20:04:39 2016 From: leejearl at 126.com (leejearl) Date: Sat, 13 Aug 2016 09:04:39 +0800 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> Message-ID: Thank you for your reply. The source code I have used is from the website of PETSc, not from the git repository. I will have a test with the code from git repository. leejearl On 2016?08?13? 08:49, Oxberry, Geoffrey Malcolm wrote: > >> On Aug 12, 2016, at 5:41 PM, leejearl > > wrote: >> >> Hi, Matt: >> >> >> >> > Can you verify that you are running the master branch? > > cd ${PETSC_DIR} > git branch > > The last command should return something like a list of branch names, > and the branch name with an asterisk to the left of it will be the > branch you are currently on. > > Geoff > >> I am not sure, how can I verify this? >> And I configure PETSc with this command >> "./configure --prefix=$HOME/Install/petsc-openmpi >> --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 >> --download-exodusii=yes --download-netcdf >> --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 --download-metis=yes". >> Is there some problem? Can you show me your command for configuring >> PETSc? >> >> >> Thanks >> >> leejearl >> >> >> >> >> >> On 2016?08?13? 01:10, Matthew Knepley wrote: >>> On Thu, Aug 11, 2016 at 8:00 PM, leejearl >> > wrote: >>> >>> Thank you for your reply. I have attached the code, grid and the >>> error message. >>> >>> cavity.c is the code file, cavity.exo is the grid, and error.dat >>> is the error message. >>> >>> The command is "mpirun -n 2 ./cavity >>> >>> >>> Can you verify that you are running the master branch? I just ran >>> this and got >>> >>> DM Object: 2 MPI processes >>> type: plex >>> DM_0x84000004_0 in 2 dimensions: >>> 0-cells: 5253 5252 >>> 1-cells: 10352 10350 >>> 2-cells: 5298 (198) 5297 (198) >>> Labels: >>> ghost: 2 strata of sizes (199, 400) >>> vtk: 1 strata of sizes (4901) >>> Cell Sets: 1 strata of sizes (5100) >>> Face Sets: 3 strata of sizes (53, 99, 50) >>> depth: 3 strata of sizes (5253, 10352, 5298) >>> >>> Thanks, >>> >>> Matt >>> >>> On 2016?08?11? 23:29, Matthew Knepley wrote: >>>> On Thu, Aug 11, 2016 at 3:14 AM, leejearl >>> > wrote: >>>> >>>> Hi, >>>> Thank you for your reply. It help me very much. >>>> But, for >>>> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I set >>>> the overlap to 2 levels, the command is >>>> "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 >>>> -physics sw", it suffers a error. >>>> It seems to me that setting overlap to 2 is very >>>> common. Are there issues that I have not take into >>>> consideration? >>>> Any help are appreciated. >>>> >>>> I will check this out. I have not tested an overlap of 2 here >>>> since I generally use nearest neighbor FV methods for >>>> unstructured stuff. I have test examples that run fine for >>>> overlap > 1. Can you send the entire error message? 
>>>> >>>> If the error is not in the distribution, but rather in the >>>> analytics, that is understandable because this example is only >>>> intended to be run using a nearest neighbor FV method, and thus >>>> might be confused if we give it two layers of ghost >>>> cells. >>>> >>>> Matt >>>> >>>> >>>> leejearl >>>> >>>> >>>> On 2016?08?11? 14:57, Julian Andrej wrote: >>>>> Hi, >>>>> >>>>> take a look at slide 10 of [1], there is visually >>>>> explained what the overlap between partitions is. >>>>> >>>>> [1] >>>>> https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf >>>>> >>>>> >>>>> On Thu, Aug 11, 2016 at 8:48 AM, leejearl >>>>> > wrote: >>>>> >>>>> Hi, all: >>>>> I want to use PETSc to build my FVM code. Now, I >>>>> have a question about >>>>> the function DMPlexDistribute(DM dm, PetscInt overlap, >>>>> PetscSF *sf, DM *dmOverlap) . >>>>> >>>>> In the example >>>>> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When >>>>> I set the overlap >>>>> as 0 or 1, it works well. But, if I set the overlap as >>>>> 2, it suffers a problem. >>>>> I am confused about the value of overlap. Can it >>>>> be set as 2? What is the meaning of >>>>> the parameter overlap? >>>>> Any helps are appreciated! >>>>> >>>>> leejearl >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their experiments is infinitely more interesting than any >>>> results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >> > -- ?? ??????????????? Phone: 13324530085 QQ: 188524324 -------------- next part -------------- An HTML attachment was scrubbed... URL: From fsantost at student.ethz.ch Sun Aug 14 13:49:31 2016 From: fsantost at student.ethz.ch (Santos Teixeira Frederico) Date: Sun, 14 Aug 2016 18:49:31 +0000 Subject: [petsc-users] Problems with Gmsh Message-ID: <682CC3CD7A208742B8C2D116C67199013AE6259C@MBX13.d.ethz.ch> Hi folks, Based on ex62, I started a project and I would like to use gmsh. It seems however that I am missing something, because it works very well with both 2D and 3D meshes from PETSc (with DMPlexCreateBoxMesh), it works quite well with 2D meshes from gmsh, but it fails with the 3D gmsh. Also, the Initial Residual vector for this latter case contains big values, while it is almost zero for the other successful cases. The full output is attached, along with the code, .msh files and command-line options, where these issues can hopefully be reproduced. Any hint is highly appreciated. Regards, Frederico. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: fmcm.tar.gz Type: application/x-gzip Size: 12781 bytes Desc: fmcm.tar.gz URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: error.txt URL: From Sander.Arens at ugent.be Mon Aug 15 08:24:55 2016 From: Sander.Arens at ugent.be (Sander Arens) Date: Mon, 15 Aug 2016 15:24:55 +0200 Subject: [petsc-users] Problems with Gmsh In-Reply-To: <682CC3CD7A208742B8C2D116C67199013AE6259C@MBX13.d.ethz.ch> References: <682CC3CD7A208742B8C2D116C67199013AE6259C@MBX13.d.ethz.ch> Message-ID: Are you using the master branch? 
A while ago I also had this problem which was caused by different vertex numberings being used by DMPlex and gmsh. But this is fixed now if you use master. Thanks, Sander On 14 August 2016 at 20:49, Santos Teixeira Frederico < fsantost at student.ethz.ch> wrote: > Hi folks, > > Based on ex62, I started a project and I would like to use gmsh. > It seems however that I am missing something, because it works very well > with both 2D and 3D meshes from PETSc (with DMPlexCreateBoxMesh), it > works quite well with 2D meshes from gmsh, but it fails with the 3D gmsh. > Also, the Initial Residual vector for this latter case contains big values, > while it is almost zero for the other successful cases. > > The full output is attached, along with the code, .msh files and > command-line options, where these issues can hopefully be reproduced. Any > hint is highly appreciated. > > Regards, > Frederico. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaetank at gmail.com Mon Aug 15 09:11:39 2016 From: gaetank at gmail.com (Gaetan Kenway) Date: Mon, 15 Aug 2016 10:11:39 -0400 Subject: [petsc-users] [petsc-maint] GAMG with PETSc In-Reply-To: References: <915BA6A8-DBAB-49B3-B27C-D7D4009974EB@imperial.ac.uk> Message-ID: Thanks Mark I haven't been able to try it, but I trust the error message will show up. That's always better than I seg fault. I may have a few more questions later when I try to make it work well, but Ill leave that for another thread Thanks, Gaetan On Fri, Aug 12, 2016 at 11:49 AM, Mark Adams wrote: > Gaetan, This was simple, if you are setup to easily check this you can > test this in my branch mark/gamg-aijcheck -- you should get an error > message "Require AIJ matrix." > Thanks again, > > > On Fri, Aug 12, 2016 at 11:08 AM, Gaetan Kenway wrote: > >> Thanks. >> >> When I put the type back to mpiaij, my code runs fine without segfaults >> with GAMG. It doesn't work very well yet, but that is a separate issue. >> >> Gaetan >> >> On Fri, Aug 12, 2016 at 11:03 AM, Mark Adams wrote: >> >>> Yes, we should check the type. >>> >>> #define MATAIJ "aij" >>> #define MATSEQAIJ "seqaij" >>> #define MATMPIAIJ "mpiaij" >>> >>> I assume that MATAIJ is mapped to SEQ or MPI, and so just need to check >>> for the latter two. I'll do that now. >>> >>> Thanks, >>> Mark >>> >>> On Fri, Aug 12, 2016 at 7:02 AM, Lawrence Mitchell < >>> lawrence.mitchell at imperial.ac.uk> wrote: >>> >>>> [Added petsc-maint to cc, since I think this is an actual bug] >>>> >>>> > On 12 Aug 2016, at 01:16, Gaetan Kenway wrote: >>>> > >>>> > Hi >>>> > >>>> > I'm attempting to try out using GAMG for a preconditioner for my >>>> compressible CFD problem. However, I'm getting segfaults when trying to run >>>> the code. The code is based on ksp ex23.c which is attached. It just reads >>>> in two precomputed matrices (the actual jacobian and an approximate >>>> jacobian used to build the PC) and solves with a RHS of ones. >>>> > >>>> > My normal approach to solving the system is with ASM+ILU. With the >>>> following options, everything works fine. >>>> >>>> This appears to be a problem that GAMG doesn't work with BAIJ matrices. >>>> >>>> But there is no checking of the input matrix type anywhere. 
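For illustration, a guard of the kind Mark describes could look like the sketch below; it is not the actual code in the mark/gamg-aijcheck branch, and the helper name CheckAIJ is made up here:

#include <petscmat.h>

/* Hypothetical helper, for illustration only: fail early with a clear
   message when the (preconditioning) matrix handed to GAMG is not AIJ. */
static PetscErrorCode CheckAIJ(Mat Pmat)
{
  PetscBool      isAIJ;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscObjectTypeCompareAny((PetscObject)Pmat,&isAIJ,MATSEQAIJ,MATMPIAIJ,"");CHKERRQ(ierr);
  if (!isAIJ) SETERRQ(PetscObjectComm((PetscObject)Pmat),PETSC_ERR_ARG_WRONG,"Require AIJ matrix");
  PetscFunctionReturn(0);
}

Since MATAIJ is only a constructor name that resolves to the seq or mpi variant once the matrix is set up, testing for MATSEQAIJ and MATMPIAIJ covers it.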
>>>> >>>> For example, with a debug PETSc: >>>> >>>> cd src/ksp/ksp/examples/tutorials >>>> make ex23 >>>> ./ex23 -pc_type gamg -mat_type baij >>>> >>>> [0]PETSC ERROR: --------------------- Error Message >>>> -------------------------------------------------------------- >>>> [0]PETSC ERROR: Object is in wrong state >>>> [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on >>>> argument 1 "mat" before MatSetValues() >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>>> for trouble shooting. >>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1123-g21143a8 >>>> GIT Date: 2016-08-08 17:24:17 -0700 >>>> [0]PETSC ERROR: ./ex23 on a arch-darwin-c-dbg named yam-laptop.local by >>>> lmitche1 Fri Aug 12 11:51:17 2016 >>>> [0]PETSC ERROR: Configure options --download-chaco=1 >>>> --download-ctetgen=1 --download-exodusii=1 --download-hdf5=1 >>>> --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 >>>> --download-netcdf=1 --download-parmetis=1 --download-ptscotch=1 >>>> --download-scalapack=1 --download-triangle=1 --with-c2html=0 >>>> --with-debugging=1 --with-shared-libraries=1 PETSC_ARCH=arch-darwin-c-dbg >>>> [0]PETSC ERROR: #1 MatSetValues() line 1195 in >>>> /Users/lmitche1/Documents/work/src/deps/petsc/src/mat/interf >>>> ace/matrix.c >>>> [0]PETSC ERROR: #2 PCGAMGFilterGraph() line 342 in >>>> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/imp >>>> ls/gamg/util.c >>>> [0]PETSC ERROR: #3 PCGAMGGraph_AGG() line 908 in >>>> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/imp >>>> ls/gamg/agg.c >>>> [0]PETSC ERROR: #4 PCSetUp_GAMG() line 525 in >>>> /Users/lmitche1/Documents/work/src/deps/petsc/src/ksp/pc/imp >>>> ls/gamg/gamg.c >>>> [0]PETSC ERROR: #5 PCSetUp() line 968 in /Users/lmitche1/Documents/work >>>> /src/deps/petsc/src/ksp/pc/interface/precon.c >>>> [0]PETSC ERROR: #6 KSPSetUp() line 393 in /Users/lmitche1/Documents/work >>>> /src/deps/petsc/src/ksp/ksp/interface/itfunc.c >>>> [0]PETSC ERROR: #7 KSPSolve() line 602 in /Users/lmitche1/Documents/work >>>> /src/deps/petsc/src/ksp/ksp/interface/itfunc.c >>>> [0]PETSC ERROR: #8 main() line 158 in /Users/lmitche1/Documents/work >>>> /src/deps/petsc/src/ksp/ksp/examples/tutorials/ex23.c >>>> [0]PETSC ERROR: PETSc Option Table entries: >>>> [0]PETSC ERROR: -mat_type baij >>>> [0]PETSC ERROR: -pc_type gamg >>>> [0]PETSC ERROR: ----------------End of Error Message -------send entire >>>> error message to petsc-maint at mcs.anl.gov---------- >>>> >>>> Looking in PCGAMGFilterGraph I see: >>>> >>>> ierr = MatGetType(Gmat,&mtype);CHKERRQ(ierr); >>>> ierr = MatCreate(comm, &tGmat);CHKERRQ(ierr); >>>> ierr = MatSetSizes(tGmat,nloc,nloc,MM,MM);CHKERRQ(ierr); >>>> ierr = MatSetBlockSizes(tGmat, 1, 1);CHKERRQ(ierr); >>>> ierr = MatSetType(tGmat, mtype);CHKERRQ(ierr); >>>> ierr = MatSeqAIJSetPreallocation(tGmat,0,d_nnz);CHKERRQ(ierr); >>>> ierr = MatMPIAIJSetPreallocation(tGmat,0,d_nnz,0,o_nnz);CHKERRQ(ier >>>> r); >>>> >>>> >>>> ... >>>> ierr = MatSetValues(tGmat,1,&Ii,1,&id >>>> x[jj],&sv,ADD_VALUES);CHKERRQ(ierr); >>>> >>>> >>>> So if Gmat is neither SEQAIJ nor MPIAIJ, then no preallocation has >>>> happened (and MatSetUp is not called). >>>> >>>> Fixing the few instances here by just changing the type of these >>>> matrices to AIJ. One runs into to the problem that creating the coarse >>>> grid operators doesn't work, since MatMatMult and friends do not exist for >>>> BAIJ matrices. 
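On the user side, one possible workaround is to convert the BAIJ preconditioning matrix to AIJ before GAMG ever sees it. A sketch, assuming ierr, ksp, A and P already exist as in an ex23-style driver, and that memory for the extra copy is available:

  Mat Paij;
  /* AIJ copy of the BAIJ preconditioning matrix, used only to build the PC */
  ierr = MatConvert(P,MATAIJ,MAT_INITIAL_MATRIX,&Paij);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,Paij);CHKERRQ(ierr);
  /* ... KSPSolve() as before ... */
  ierr = MatDestroy(&Paij);CHKERRQ(ierr);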
>>>> >>>> I guess GAMG could MatConvert from BAIJ to AIJ (but this uses extra >>>> memory). >>>> >>>> But it should probably barf with a comprehensible error message. >>>> >>>> Thoughts? >>>> >>>> Lawrence >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Mon Aug 15 21:28:11 2016 From: leejearl at 126.com (leejearl) Date: Tue, 16 Aug 2016 10:28:11 +0800 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> Message-ID: <220b6363-1a7f-50f5-0a46-2404eab42208@126.com> Thank you for all your helps. I have tried and reinstalled PETSc lots of times. The error are still existed. I can not find out the reasons. Now, I give the some messages in this letter. 1> The source code is downloaded from the website of PETSc, and the version is 3.7.2. 2> Configure: >export PETSC_DIR=./ >export PETSC_ARCH=arch >./configure --prefix=$HOME/Install/petsc --with-mpi-dir=/home/leejearl/Install/mpich_3.1.4/gnu --download-exodusii=../externalpackages/exodus-5.24.tar.bz2 --download-netcdf=../externalpackages/netcdf-4.3.2.tar.gz --download-hdf5=../externalpackages/hdf5-1.8.12.tar.gz --download-metis=../externalpackages/git.metis.tar.gz --download-parmetis=yes 3> The process of installation has no error. 4> After the installation, I added the following statement into the file ~/.bashrc: export PETSC_ARCH="" export PETSC_DIR=$HOME/Install/pets/ I wish to get some helps as follows: 1> Is there any problems in my installation? 2> Can any one help me a simple code in which the value of overlap used in DMPlexDistribute function is greater than 1. 3> I attach the code, makefile, grid and the error messages again, I hope some one can help me to figure out the problems. 3.1> code: cavity.c 3.2> makefile: makefile 3.3> grid: cavity.exo 3.4> error messages: error.dat It is very strange that there is no error message when I run it using "mpirun -n 3 ./cavity", but when I run it using "mpirun -n 2 ./cavity", the errors happed. The error messages are shown in the file error.dat. Any helps are appreciated. On 2016?08?13? 09:04, leejearl wrote: > > Thank you for your reply. The source code I have used is from the > website of PETSc, not from the git repository. > > I will have a test with the code from git repository. > > > leejearl > > > On 2016?08?13? 08:49, Oxberry, Geoffrey Malcolm wrote: >> >>> On Aug 12, 2016, at 5:41 PM, leejearl >> > wrote: >>> >>> Hi, Matt: >>> >>> >>> >>> > Can you verify that you are running the master branch? >> >> cd ${PETSC_DIR} >> git branch >> >> The last command should return something like a list of branch names, >> and the branch name with an asterisk to the left of it will be the >> branch you are currently on. >> >> Geoff >> >>> I am not sure, how can I verify this? >>> And I configure PETSc with this command >>> "./configure --prefix=$HOME/Install/petsc-openmpi >>> --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 >>> --download-exodusii=yes --download-netcdf >>> --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 >>> --download-metis=yes". >>> Is there some problem? Can you show me your command for configuring >>> PETSc? >>> >>> >>> Thanks >>> >>> leejearl >>> >>> >>> >>> >>> >>> On 2016?08?13? 01:10, Matthew Knepley wrote: >>>> On Thu, Aug 11, 2016 at 8:00 PM, leejearl >>> > wrote: >>>> >>>> Thank you for your reply. I have attached the code, grid and >>>> the error message. 
>>>> >>>> cavity.c is the code file, cavity.exo is the grid, and >>>> error.dat is the error message. >>>> >>>> The command is "mpirun -n 2 ./cavity >>>> >>>> >>>> Can you verify that you are running the master branch? I just ran >>>> this and got >>>> >>>> DM Object: 2 MPI processes >>>> type: plex >>>> DM_0x84000004_0 in 2 dimensions: >>>> 0-cells: 5253 5252 >>>> 1-cells: 10352 10350 >>>> 2-cells: 5298 (198) 5297 (198) >>>> Labels: >>>> ghost: 2 strata of sizes (199, 400) >>>> vtk: 1 strata of sizes (4901) >>>> Cell Sets: 1 strata of sizes (5100) >>>> Face Sets: 3 strata of sizes (53, 99, 50) >>>> depth: 3 strata of sizes (5253, 10352, 5298) >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> On 2016?08?11? 23:29, Matthew Knepley wrote: >>>>> On Thu, Aug 11, 2016 at 3:14 AM, leejearl >>>> > wrote: >>>>> >>>>> Hi, >>>>> Thank you for your reply. It help me very much. >>>>> But, for >>>>> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I >>>>> set the overlap to 2 levels, the command is >>>>> "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 >>>>> -physics sw", it suffers a error. >>>>> It seems to me that setting overlap to 2 is very >>>>> common. Are there issues that I have not take into >>>>> consideration? >>>>> Any help are appreciated. >>>>> >>>>> I will check this out. I have not tested an overlap of 2 here >>>>> since I generally use nearest neighbor FV methods for >>>>> unstructured stuff. I have test examples that run fine for >>>>> overlap > 1. Can you send the entire error message? >>>>> >>>>> If the error is not in the distribution, but rather in the >>>>> analytics, that is understandable because this example is only >>>>> intended to be run using a nearest neighbor FV method, and >>>>> thus might be confused if we give it two layers of ghost >>>>> cells. >>>>> >>>>> Matt >>>>> >>>>> >>>>> leejearl >>>>> >>>>> >>>>> On 2016?08?11? 14:57, Julian Andrej wrote: >>>>>> Hi, >>>>>> >>>>>> take a look at slide 10 of [1], there is visually >>>>>> explained what the overlap between partitions is. >>>>>> >>>>>> [1] >>>>>> https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf >>>>>> >>>>>> >>>>>> On Thu, Aug 11, 2016 at 8:48 AM, leejearl >>>>>> > wrote: >>>>>> >>>>>> Hi, all: >>>>>> I want to use PETSc to build my FVM code. Now, I >>>>>> have a question about >>>>>> the function DMPlexDistribute(DM dm, PetscInt >>>>>> overlap, PetscSF *sf, DM *dmOverlap) . >>>>>> >>>>>> In the example >>>>>> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". When >>>>>> I set the overlap >>>>>> as 0 or 1, it works well. But, if I set the overlap >>>>>> as 2, it suffers a problem. >>>>>> I am confused about the value of overlap. Can it >>>>>> be set as 2? What is the meaning of >>>>>> the parameter overlap? >>>>>> Any helps are appreciated! >>>>>> >>>>>> leejearl >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin >>>>> their experiments is infinitely more interesting than any >>>>> results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to >>>> which their experiments lead. >>>> -- Norbert Wiener >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: cavity.c Type: text/x-csrc Size: 1918 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cavity.exo Type: application/octet-stream Size: 344931 bytes Desc: not available URL: -------------- next part -------------- ALL: cavity CFLAGS = FFLAGS = CPPFLAGS = FPPFLAGS = CLEANFILES = cavity include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules cavity: cavity.o chkopts ${CLINKER}$ -o cavity cavity.o ${PETSC_LIB}$ ${RM} cavity.o -------------- next part -------------- $ make $ mpirun -n 3 ./cavity overlap = 2 overlap = 2 overlap = 2 $ mpirun -n 2 ./cavity overlap = 2 overlap = 2 [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: key <= 0 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [1]PETSC ERROR: ./cavity on a arch named leejearl by leejearl Tue Aug 16 10:19:57 2016 [1]PETSC ERROR: Configure options --prefix=/home/leejearl/Install/petsc --with-mpi-dir=/home/leejearl/Install/mpich_3.1.4/gnu --download-exodusii=../externalpackages/exodus-5.24.tar.bz2 --download-netcdf=../externalpackages/netcdf-4.3.2.tar.gz --download-hdf5=../externalpackages/hdf5-1.8.12.tar.gz --download-metis=../externalpackages/git.metis.tar.gz --download-parmetis=yes [1]PETSC ERROR: #1 PetscTableAdd() line 45 in ./include/petscctable.h [1]PETSC ERROR: #2 PetscSFSetGraph() line 347 in /home/leejearl/Software/petsc/petsc-3.7.2/src/vec/is/sf/interface/sf.c [1]PETSC ERROR: #3 DMLabelGather() line 1092 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/label/dmlabel.c [1]PETSC ERROR: #4 DMPlexPartitionLabelPropagate() line 1633 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexpartition.c [1]PETSC ERROR: #5 DMPlexCreateOverlap() line 615 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [1]PETSC ERROR: #6 DMPlexDistributeOverlap() line 1729 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [1]PETSC ERROR: #7 DMPlexDistribute() line 1635 in /home/leejearl/Software/petsc/petsc-3.7.2/src/dm/impls/plex/plexdistribute.c [1]PETSC ERROR: #8 main() line 60 in /home/leejearl/Desktop/PETSc/gks_cavity/cavity.c [1]PETSC ERROR: No PETSc Option Table entries [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- application called MPI_Abort(MPI_COMM_WORLD, 63) - process 1 =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = PID 24134 RUNNING AT leejearl = EXIT CODE: 63 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES =================================================================================== From knepley at gmail.com Tue Aug 16 06:34:03 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 16 Aug 2016 06:34:03 -0500 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: <220b6363-1a7f-50f5-0a46-2404eab42208@126.com> References: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> <220b6363-1a7f-50f5-0a46-2404eab42208@126.com> Message-ID: On Mon, Aug 15, 2016 at 9:28 PM, leejearl wrote: > Thank you for all your helps. I have tried and reinstalled PETSc lots of > times. The error are still existed. > > I can not find out the reasons. 
Now, I give the some messages in this > letter. > > 1> The source code is downloaded from the website of PETSc, and the > version is 3.7.2. > > You will need to run in the 'master' branch, not the release since we have fixed some bugs. It is best to use master for very new features like this. The instructions are here: http://www.mcs.anl.gov/petsc/developers/index.html Thanks, Matt > 2> Configure: > > >export PETSC_DIR=./ > > >export PETSC_ARCH=arch > > >./configure --prefix=$HOME/Install/petsc --with-mpi-dir=/home/leejearl/Install/mpich_3.1.4/gnu > --download-exodusii=../externalpackages/exodus-5.24.tar.bz2 > --download-netcdf=../externalpackages/netcdf-4.3.2.tar.gz > --download-hdf5=../externalpackages/hdf5-1.8.12.tar.gz > --download-metis=../externalpackages/git.metis.tar.gz > --download-parmetis=yes > > 3> The process of installation has no error. > > 4> After the installation, I added the following statement into the file > ~/.bashrc: > > export PETSC_ARCH="" > > export PETSC_DIR=$HOME/Install/pets/ > > I wish to get some helps as follows: > > 1> Is there any problems in my installation? > > 2> Can any one help me a simple code in which the value of overlap used in > DMPlexDistribute function is greater than 1. > > 3> I attach the code, makefile, grid and the error messages again, I hope > some one can help me to figure out the problems. > > 3.1> code: cavity.c > > 3.2> makefile: makefile > > 3.3> grid: cavity.exo > > 3.4> error messages: error.dat > > It is very strange that there is no error message when I run it using > "mpirun -n 3 ./cavity", but when I run it using "mpirun -n 2 ./cavity", the > errors happed. > > The error messages are shown in the file error.dat. > > > Any helps are appreciated. > > > > On 2016?08?13? 09:04, leejearl wrote: > > Thank you for your reply. The source code I have used is from the website > of PETSc, not from the git repository. > > I will have a test with the code from git repository. > > > leejearl > > On 2016?08?13? 08:49, Oxberry, Geoffrey Malcolm wrote: > > > On Aug 12, 2016, at 5:41 PM, leejearl wrote: > > Hi, Matt: > > > > Can you verify that you are running the master branch? > > > cd ${PETSC_DIR} > git branch > > The last command should return something like a list of branch names, and > the branch name with an asterisk to the left of it will be the branch you > are currently on. > > Geoff > > I am not sure, how can I verify this? > And I configure PETSc with this command > "./configure --prefix=$HOME/Install/petsc-openmpi > --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 > --download-exodusii=yes --download-netcdf --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 > --download-metis=yes". > Is there some problem? Can you show me your command for configuring PETSc? > > > Thanks > > leejearl > > > > > > On 2016?08?13? 01:10, Matthew Knepley wrote: > > On Thu, Aug 11, 2016 at 8:00 PM, leejearl wrote: > >> Thank you for your reply. I have attached the code, grid and the error >> message. >> >> cavity.c is the code file, cavity.exo is the grid, and error.dat is the >> error message. >> >> The command is "mpirun -n 2 ./cavity >> > > Can you verify that you are running the master branch? 
I just ran this and > got > > DM Object: 2 MPI processes > type: plex > DM_0x84000004_0 in 2 dimensions: > 0-cells: 5253 5252 > 1-cells: 10352 10350 > 2-cells: 5298 (198) 5297 (198) > Labels: > ghost: 2 strata of sizes (199, 400) > vtk: 1 strata of sizes (4901) > Cell Sets: 1 strata of sizes (5100) > Face Sets: 3 strata of sizes (53, 99, 50) > depth: 3 strata of sizes (5253, 10352, 5298) > > Thanks, > > Matt > > >> On 2016?08?11? 23:29, Matthew Knepley wrote: >> >> On Thu, Aug 11, 2016 at 3:14 AM, leejearl wrote: >> >>> Hi, >>> Thank you for your reply. It help me very much. >>> But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I >>> set the overlap to 2 levels, the command is >>> "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 -physics sw", >>> it suffers a error. >>> It seems to me that setting overlap to 2 is very common. Are there >>> issues that I have not take into consideration? >>> Any help are appreciated. >>> >> I will check this out. I have not tested an overlap of 2 here since I >> generally use nearest neighbor FV methods for >> unstructured stuff. I have test examples that run fine for overlap > 1. >> Can you send the entire error message? >> >> If the error is not in the distribution, but rather in the analytics, >> that is understandable because this example is only >> intended to be run using a nearest neighbor FV method, and thus might be >> confused if we give it two layers of ghost >> cells. >> >> Matt >> >> >>> >>> leejearl >>> >>> On 2016?08?11? 14:57, Julian Andrej wrote: >>> >>> Hi, >>> >>> take a look at slide 10 of [1], there is visually explained what the >>> overlap between partitions is. >>> >>> [1] https://www.archer.ac.uk/training/virtual/files/2015/06- >>> PETSc/slides.pdf >>> >>> On Thu, Aug 11, 2016 at 8:48 AM, leejearl wrote: >>> >>>> Hi, all: >>>> I want to use PETSc to build my FVM code. Now, I have a question >>>> about >>>> the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF *sf, DM >>>> *dmOverlap) . >>>> >>>> In the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". >>>> When I set the overlap >>>> as 0 or 1, it works well. But, if I set the overlap as 2, it suffers a >>>> problem. >>>> I am confused about the value of overlap. Can it be set as 2? What >>>> is the meaning of >>>> the parameter overlap? >>>> Any helps are appreciated! >>>> >>>> leejearl >>>> >>>> >>>> >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aks084000 at utdallas.edu Tue Aug 16 17:21:29 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Tue, 16 Aug 2016 22:21:29 +0000 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: References: Message-ID: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu> Barry, Mark, Apologies for taking a while to respond. Are you saying it works for a while but fails when the problem is large, or that it never works with fieldsplit_1? 
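One quick way to confirm at run time which PETSc build an executable is actually linked against is to print the version string; the small standalone sketch below is an illustration, not one of the attachments in this thread:

#include <petscsys.h>

int main(int argc,char **argv)
{
  char           version[256];
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* Prints e.g. "Petsc Release Version 3.7.2, ..." or, for a git build,
     "Petsc Development GIT revision: ..." including the commit hash.    */
  ierr = PetscGetVersion(version,sizeof(version));CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"%s\n",version);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

For a git-based install the string includes the commit, which answers the "which branch or revision am I really running" question directly.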
And how many processors are you using? All of these are serial runs. The preconditioner always fails for large problems; but I also found an example where it fails for a fairly small matrix. fieldsplit_1 is the one that fails, although fieldpslit_0 always works. This should be fixed in the master branch I tried the developmental version (from today), but the same bug persists. I have some code that reproduces the example. The index sets that I use are non-overlapping; I also checked with matlab that each fieldsplit submatrix is non-singular. The issue is with fieldsplit_TA: if I do not use -fieldsplit_TA-pc_type_gamg, then the solver converges. Otherwise, it crashes with the same message. Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: A Type: application/octet-stream Size: 781904 bytes Desc: A URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: A.info Type: application/octet-stream Size: 23 bytes Desc: A.info URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex.c Type: text/x-csrc Size: 7967 bytes Desc: ex.c URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: run.sh Type: application/x-shellscript Size: 559 bytes Desc: run.sh URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From leejearl at 126.com Tue Aug 16 20:48:19 2016 From: leejearl at 126.com (leejearl) Date: Wed, 17 Aug 2016 09:48:19 +0800 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: References: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> <220b6363-1a7f-50f5-0a46-2404eab42208@126.com> Message-ID: <615f4b68-9a80-906a-0385-4b2c586c6184@126.com> Thank you for your help. The problem has been overcome following your advices. Now, I give some notes for this problem which might be useful for others. 1. Up to now, we must use the code in the master branch. The source code can be downloaded using git with: git clone https://bitbucket.org/petsc/petsc petsc. 2. Just the other day, I download the code using git with: git clone -b maint https://bitbucket.org/petsc/petsc petsc. But, this copy of code still has such a problem. I am not sure whether it is available now. 3. The code downloaded from website of PETSc(Version of 3.7.3) is not available for this problem, although it is in the branch master. Thanks for all helps again. Best wishes to you. leejearl On 2016?08?16? 19:34, Matthew Knepley wrote: > On Mon, Aug 15, 2016 at 9:28 PM, leejearl > wrote: > > Thank you for all your helps. I have tried and reinstalled PETSc > lots of times. The error are still existed. > > I can not find out the reasons. Now, I give the some messages in > this letter. > > 1> The source code is downloaded from the website of PETSc, and > the version is 3.7.2. > > You will need to run in the 'master' branch, not the release since we > have fixed some bugs. It is best to > use master for very new features like this. 
The instructions are here: > http://www.mcs.anl.gov/petsc/developers/index.html > > Thanks, > > Matt > > 2> Configure: > > >export PETSC_DIR=./ > > >export PETSC_ARCH=arch > > >./configure --prefix=$HOME/Install/petsc > --with-mpi-dir=/home/leejearl/Install/mpich_3.1.4/gnu > --download-exodusii=../externalpackages/exodus-5.24.tar.bz2 > --download-netcdf=../externalpackages/netcdf-4.3.2.tar.gz > --download-hdf5=../externalpackages/hdf5-1.8.12.tar.gz > --download-metis=../externalpackages/git.metis.tar.gz > --download-parmetis=yes > > 3> The process of installation has no error. > > 4> After the installation, I added the following statement into > the file ~/.bashrc: > > export PETSC_ARCH="" > > export PETSC_DIR=$HOME/Install/pets/ > > I wish to get some helps as follows: > > 1> Is there any problems in my installation? > > 2> Can any one help me a simple code in which the value of overlap > used in DMPlexDistribute function is greater than 1. > > 3> I attach the code, makefile, grid and the error messages again, > I hope some one can help me to figure out the problems. > > 3.1> code: cavity.c > > 3.2> makefile: makefile > > 3.3> grid: cavity.exo > > 3.4> error messages: error.dat > > It is very strange that there is no error message when I run > it using "mpirun -n 3 ./cavity", but when I run it using "mpirun > -n 2 ./cavity", the errors happed. > > The error messages are shown in the file error.dat. > > > Any helps are appreciated. > > > > On 2016?08?13? 09:04, leejearl wrote: >> >> Thank you for your reply. The source code I have used is from the >> website of PETSc, not from the git repository. >> >> I will have a test with the code from git repository. >> >> >> leejearl >> >> >> On 2016?08?13? 08:49, Oxberry, Geoffrey Malcolm wrote: >>> >>>> On Aug 12, 2016, at 5:41 PM, leejearl >>> > wrote: >>>> >>>> Hi, Matt: >>>> >>>> >>>> >>>> > Can you verify that you are running the master branch? >>> >>> cd ${PETSC_DIR} >>> git branch >>> >>> The last command should return something like a list of branch >>> names, and the branch name with an asterisk to the left of it >>> will be the branch you are currently on. >>> >>> Geoff >>> >>>> I am not sure, how can I verify this? >>>> And I configure PETSc with this command >>>> "./configure --prefix=$HOME/Install/petsc-openmpi >>>> --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 >>>> --download-exodusii=yes --download-netcdf >>>> --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 >>>> --download-metis=yes". >>>> Is there some problem? Can you show me your command for >>>> configuring PETSc? >>>> >>>> >>>> Thanks >>>> >>>> leejearl >>>> >>>> >>>> >>>> >>>> >>>> On 2016?08?13? 01:10, Matthew Knepley wrote: >>>>> On Thu, Aug 11, 2016 at 8:00 PM, leejearl >>>> > wrote: >>>>> >>>>> Thank you for your reply. I have attached the code, grid >>>>> and the error message. >>>>> >>>>> cavity.c is the code file, cavity.exo is the grid, and >>>>> error.dat is the error message. >>>>> >>>>> The command is "mpirun -n 2 ./cavity >>>>> >>>>> >>>>> Can you verify that you are running the master branch? 
I just >>>>> ran this and got >>>>> >>>>> DM Object: 2 MPI processes >>>>> type: plex >>>>> DM_0x84000004_0 in 2 dimensions: >>>>> 0-cells: 5253 5252 >>>>> 1-cells: 10352 10350 >>>>> 2-cells: 5298 (198) 5297 (198) >>>>> Labels: >>>>> ghost: 2 strata of sizes (199, 400) >>>>> vtk: 1 strata of sizes (4901) >>>>> Cell Sets: 1 strata of sizes (5100) >>>>> Face Sets: 3 strata of sizes (53, 99, 50) >>>>> depth: 3 strata of sizes (5253, 10352, 5298) >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> On 2016?08?11? 23:29, Matthew Knepley wrote: >>>>>> On Thu, Aug 11, 2016 at 3:14 AM, leejearl >>>>>> > wrote: >>>>>> >>>>>> Hi, >>>>>> Thank you for your reply. It help me very much. >>>>>> But, for >>>>>> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when >>>>>> I set the overlap to 2 levels, the command is >>>>>> "mpirun -n 3 ./ex11 -f annulus-20.exo >>>>>> -ufv_mesh_overlap 2 -physics sw", it suffers a error. >>>>>> It seems to me that setting overlap to 2 is very >>>>>> common. Are there issues that I have not take into >>>>>> consideration? >>>>>> Any help are appreciated. >>>>>> >>>>>> I will check this out. I have not tested an overlap of 2 >>>>>> here since I generally use nearest neighbor FV methods for >>>>>> unstructured stuff. I have test examples that run fine >>>>>> for overlap > 1. Can you send the entire error message? >>>>>> >>>>>> If the error is not in the distribution, but rather in >>>>>> the analytics, that is understandable because this >>>>>> example is only >>>>>> intended to be run using a nearest neighbor FV method, >>>>>> and thus might be confused if we give it two layers of ghost >>>>>> cells. >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>> leejearl >>>>>> >>>>>> >>>>>> On 2016?08?11? 14:57, Julian Andrej wrote: >>>>>>> Hi, >>>>>>> >>>>>>> take a look at slide 10 of [1], there is visually >>>>>>> explained what the overlap between partitions is. >>>>>>> >>>>>>> [1] >>>>>>> https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf >>>>>>> >>>>>>> >>>>>>> On Thu, Aug 11, 2016 at 8:48 AM, leejearl >>>>>>> > wrote: >>>>>>> >>>>>>> Hi, all: >>>>>>> I want to use PETSc to build my FVM code. >>>>>>> Now, I have a question about >>>>>>> the function DMPlexDistribute(DM dm, PetscInt >>>>>>> overlap, PetscSF *sf, DM *dmOverlap) . >>>>>>> >>>>>>> In the example >>>>>>> "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". >>>>>>> When I set the overlap >>>>>>> as 0 or 1, it works well. But, if I set the >>>>>>> overlap as 2, it suffers a problem. >>>>>>> I am confused about the value of overlap. >>>>>>> Can it be set as 2? What is the meaning of >>>>>>> the parameter overlap? >>>>>>> Any helps are appreciated! >>>>>>> >>>>>>> leejearl >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they >>>>>> begin their experiments is infinitely more interesting >>>>>> than any results to which their experiments lead. >>>>>> -- Norbert Wiener >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin >>>>> their experiments is infinitely more interesting than any >>>>> results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>> > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
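For completeness, the kind of minimal driver asked for earlier in the thread could be sketched as follows; it is an illustration only (the Exodus file name is just the attached cavity.exo) that reads a mesh, distributes it with two levels of overlap, and views the result. As noted above, overlap > 1 currently requires the master branch.

#include <petscdmplex.h>

int main(int argc,char **argv)
{
  DM             dm, dmDist;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* Read and interpolate the Exodus mesh (file name is only an example)   */
  ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD,"cavity.exo",PETSC_TRUE,&dm);CHKERRQ(ierr);
  /* Distribute with two levels of overlap, i.e. two layers of ghost cells */
  ierr = DMPlexDistribute(dm,2,NULL,&dmDist);CHKERRQ(ierr);
  if (dmDist) {ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist;}
  ierr = DMView(dm,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = DMDestroy(&dm);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}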
URL: From juan at tf.uni-kiel.de Tue Aug 16 23:38:07 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Wed, 17 Aug 2016 06:38:07 +0200 Subject: [petsc-users] A question about DMPlexDistribute In-Reply-To: <615f4b68-9a80-906a-0385-4b2c586c6184@126.com> References: <523b459b-1af9-1a81-55b3-76d5a651fe76@126.com> <220b6363-1a7f-50f5-0a46-2404eab42208@126.com> <615f4b68-9a80-906a-0385-4b2c586c6184@126.com> Message-ID: The maint branch is a more or less "stable" branch. Features go into "master" at first then get merged into "maint". So git clone -b master https://bitbucket.org/petsc/petsc petsc will always be the most recent code you can possibly obtain. On Wed, Aug 17, 2016 at 3:48 AM, leejearl wrote: > Thank you for your help. The problem has been overcome following your > advices. > > Now, I give some notes for this problem which might be useful for others. > > 1. Up to now, we must use the code in the master branch. The source code can > be downloaded using git with: > > git clone https://bitbucket.org/petsc/petsc petsc. > > 2. Just the other day, I download the code using git with: git clone -b > maint https://bitbucket.org/petsc/petsc petsc. > > But, this copy of code still has such a problem. I am not sure whether > it is available now. > > 3. The code downloaded from website of PETSc(Version of 3.7.3) is not > available for this problem, although it is in the branch master. > > > Thanks for all helps again. Best wishes to you. > > leejearl > > > > > On 2016?08?16? 19:34, Matthew Knepley wrote: > > On Mon, Aug 15, 2016 at 9:28 PM, leejearl wrote: >> >> Thank you for all your helps. I have tried and reinstalled PETSc lots of >> times. The error are still existed. >> >> I can not find out the reasons. Now, I give the some messages in this >> letter. >> >> 1> The source code is downloaded from the website of PETSc, and the >> version is 3.7.2. > > You will need to run in the 'master' branch, not the release since we have > fixed some bugs. It is best to > use master for very new features like this. The instructions are here: > http://www.mcs.anl.gov/petsc/developers/index.html > > Thanks, > > Matt >> >> 2> Configure: >> >> >export PETSC_DIR=./ >> >> >export PETSC_ARCH=arch >> >> >./configure --prefix=$HOME/Install/petsc >> > --with-mpi-dir=/home/leejearl/Install/mpich_3.1.4/gnu >> > --download-exodusii=../externalpackages/exodus-5.24.tar.bz2 >> > --download-netcdf=../externalpackages/netcdf-4.3.2.tar.gz >> > --download-hdf5=../externalpackages/hdf5-1.8.12.tar.gz >> > --download-metis=../externalpackages/git.metis.tar.gz >> > --download-parmetis=yes >> >> 3> The process of installation has no error. >> >> 4> After the installation, I added the following statement into the file >> ~/.bashrc: >> >> export PETSC_ARCH="" >> >> export PETSC_DIR=$HOME/Install/pets/ >> >> I wish to get some helps as follows: >> >> 1> Is there any problems in my installation? >> >> 2> Can any one help me a simple code in which the value of overlap used in >> DMPlexDistribute function is greater than 1. >> >> 3> I attach the code, makefile, grid and the error messages again, I hope >> some one can help me to figure out the problems. >> >> 3.1> code: cavity.c >> >> 3.2> makefile: makefile >> >> 3.3> grid: cavity.exo >> >> 3.4> error messages: error.dat >> >> It is very strange that there is no error message when I run it using >> "mpirun -n 3 ./cavity", but when I run it using "mpirun -n 2 ./cavity", the >> errors happed. >> >> The error messages are shown in the file error.dat. 
>> >> >> Any helps are appreciated. >> >> >> >> On 2016?08?13? 09:04, leejearl wrote: >> >> Thank you for your reply. The source code I have used is from the website >> of PETSc, not from the git repository. >> >> I will have a test with the code from git repository. >> >> >> leejearl >> >> >> On 2016?08?13? 08:49, Oxberry, Geoffrey Malcolm wrote: >> >> >> On Aug 12, 2016, at 5:41 PM, leejearl wrote: >> >> Hi, Matt: >> >> >> >> > Can you verify that you are running the master branch? >> >> >> cd ${PETSC_DIR} >> git branch >> >> The last command should return something like a list of branch names, and >> the branch name with an asterisk to the left of it will be the branch you >> are currently on. >> >> Geoff >> >> I am not sure, how can I verify this? >> And I configure PETSc with this command >> "./configure --prefix=$HOME/Install/petsc-openmpi >> --with-mpi=/home/leejearl/Install/openmpi/gnu/1.8.4 --download-exodusii=yes >> --download-netcdf --with-hdf5-dir=/home/leejearl/Install/hdf5-1.8.14 >> --download-metis=yes". >> Is there some problem? Can you show me your command for configuring PETSc? >> >> >> Thanks >> >> leejearl >> >> >> >> >> >> On 2016?08?13? 01:10, Matthew Knepley wrote: >> >> On Thu, Aug 11, 2016 at 8:00 PM, leejearl wrote: >>> >>> Thank you for your reply. I have attached the code, grid and the error >>> message. >>> >>> cavity.c is the code file, cavity.exo is the grid, and error.dat is the >>> error message. >>> >>> The command is "mpirun -n 2 ./cavity >> >> >> Can you verify that you are running the master branch? I just ran this and >> got >> >> DM Object: 2 MPI processes >> type: plex >> DM_0x84000004_0 in 2 dimensions: >> 0-cells: 5253 5252 >> 1-cells: 10352 10350 >> 2-cells: 5298 (198) 5297 (198) >> Labels: >> ghost: 2 strata of sizes (199, 400) >> vtk: 1 strata of sizes (4901) >> Cell Sets: 1 strata of sizes (5100) >> Face Sets: 3 strata of sizes (53, 99, 50) >> depth: 3 strata of sizes (5253, 10352, 5298) >> >> Thanks, >> >> Matt >> >>> >>> On 2016?08?11? 23:29, Matthew Knepley wrote: >>> >>> On Thu, Aug 11, 2016 at 3:14 AM, leejearl wrote: >>>> >>>> Hi, >>>> Thank you for your reply. It help me very much. >>>> But, for "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c", when I set >>>> the overlap to 2 levels, the command is >>>> "mpirun -n 3 ./ex11 -f annulus-20.exo -ufv_mesh_overlap 2 -physics sw", >>>> it suffers a error. >>>> It seems to me that setting overlap to 2 is very common. Are there >>>> issues that I have not take into consideration? >>>> Any help are appreciated. >>> >>> I will check this out. I have not tested an overlap of 2 here since I >>> generally use nearest neighbor FV methods for >>> unstructured stuff. I have test examples that run fine for overlap > 1. >>> Can you send the entire error message? >>> >>> If the error is not in the distribution, but rather in the analytics, >>> that is understandable because this example is only >>> intended to be run using a nearest neighbor FV method, and thus might be >>> confused if we give it two layers of ghost >>> cells. >>> >>> Matt >>> >>>> >>>> >>>> leejearl >>>> >>>> >>>> On 2016?08?11? 14:57, Julian Andrej wrote: >>>> >>>> Hi, >>>> >>>> take a look at slide 10 of [1], there is visually explained what the >>>> overlap between partitions is. >>>> >>>> [1] >>>> https://www.archer.ac.uk/training/virtual/files/2015/06-PETSc/slides.pdf >>>> >>>> On Thu, Aug 11, 2016 at 8:48 AM, leejearl wrote: >>>>> >>>>> Hi, all: >>>>> I want to use PETSc to build my FVM code. 
Now, I have a question >>>>> about >>>>> the function DMPlexDistribute(DM dm, PetscInt overlap, PetscSF *sf, DM >>>>> *dmOverlap) . >>>>> >>>>> In the example "/petsc-3.7.2/src/ts/examples/tutorials/ex11.c". >>>>> When I set the overlap >>>>> as 0 or 1, it works well. But, if I set the overlap as 2, it suffers a >>>>> problem. >>>>> I am confused about the value of overlap. Can it be set as 2? What >>>>> is the meaning of >>>>> the parameter overlap? >>>>> Any helps are appreciated! >>>>> >>>>> leejearl >>>>> >>>>> >>>>> >>>> >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> >> > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > From bsmith at mcs.anl.gov Wed Aug 17 00:44:25 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 Aug 2016 00:44:25 -0500 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu> References: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu> Message-ID: I first set the option -ksp_converged_reason and obtained the message $ ./ex1 Residual norms for fieldsplit_PA_ solve. 0 KSP unpreconditioned resid norm 2.288897733893e+03 true resid norm 2.288897733893e+03 ||r(i)||/||b|| 1.000000000000e+00 Linear solve did not converge due to DIVERGED_PCSETUP_FAILED iterations 0 PCSETUP_FAILED due to SUBPC_ERROR so it is failing in building the preconditioner since it didn't even work for the first fieldsplit I guessed that it failed in the gamg on the first fieldsplit so ran with -fieldsplit_PA_ksp_error_if_not_converged and got $ ./ex1 Residual norms for fieldsplit_PA_ solve. 0 KSP unpreconditioned resid norm 2.288897733893e+03 true resid norm 2.288897733893e+03 ||r(i)||/||b|| 1.000000000000e+00 Linear solve did not converge due to DIVERGED_PCSETUP_FAILED iterations 0 PCSETUP_FAILED due to SUBPC_ERROR ~/Src/petsc/test-dir (master=) arch-master-basic $ ./ex1 -fieldsplit_PA_ksp_error_if_not_converged [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: KSPSolve has not converged [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1182-g00a02d5 GIT Date: 2016-08-16 15:09:17 -0500 [0]PETSC ERROR: ./ex1 on a arch-master-basic named Barrys-MacBook-Pro.local by barrysmith Wed Aug 17 00:35:16 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/Users/barrysmith/libraries [0]PETSC ERROR: #1 KSPSolve() line 850 in /Users/barrysmith/Src/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #2 PCGAMGOptProlongator_AGG() line 1221 in /Users/barrysmith/Src/petsc/src/ksp/pc/impls/gamg/agg.c looking at the code and seeing the options prefix for this KSP solve I ran with -fieldsplit_PA_ksp_error_if_not_converged -fieldsplit_PA_gamg_est_ksp_monitor_true_residual and got $ ./ex1 -fieldsplit_PA_ksp_error_if_not_converged -fieldsplit_PA_gamg_est_ksp_monitor_true_residual Residual norms for fieldsplit_PA_gamg_est_ solve. 0 KSP none resid norm 7.030417576826e+07 true resid norm 1.006594247197e+01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP none resid norm 6.979279406029e+07 true resid norm 1.150138107009e+01 ||r(i)||/||b|| 1.142603497100e+00 2 KSP none resid norm 6.979246564783e+07 true resid norm 6.970771666727e+03 ||r(i)||/||b|| 6.925105807166e+02 3 KSP none resid norm 6.978033367036e+07 true resid norm 9.958555706490e+02 ||r(i)||/||b|| 9.893316730368e+01 4 KSP none resid norm 6.977995917588e+07 true resid norm 1.095475380870e+04 ||r(i)||/||b|| 1.088298869103e+03 5 KSP none resid norm 6.954940040289e+07 true resid norm 2.182804459638e+04 ||r(i)||/||b|| 2.168504802919e+03 6 KSP none resid norm 6.905975832912e+07 true resid norm 4.557801389945e+04 ||r(i)||/||b|| 4.527943014412e+03 7 KSP none resid norm 6.905788649989e+07 true resid norm 4.476060162996e+04 ||r(i)||/||b|| 4.446737278162e+03 8 KSP none resid norm 5.464732207984e+07 true resid norm 7.801211607942e+04 ||r(i)||/||b|| 7.750105496491e+03 9 KSP none resid norm 5.393328767072e+07 true resid norm 8.529925739695e+04 ||r(i)||/||b|| 8.474045787013e+03 10 KSP none resid norm 5.294387823310e+07 true resid norm 8.380999411358e+04 ||r(i)||/||b|| 8.326095082197e+03 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: [0]PETSC ERROR: KSPSolve has not converged the smoother is simply not working AT ALL on your first sub matrix so I ran with -fieldsplit_PA_gamg_est_ksp_pmat_view and loaded the resulting matrix from the file binaryoutput into matlab and checked its eigenvalues >> e = eig(full(a)) e = 5.5260 +85.4938i 5.5260 -85.4938i -11.6673 + 0.0000i -6.5409 + 0.0000i 6.5240 + 0.0000i -2.5377 + 2.0951i -2.5377 - 2.0951i 3.0712 + 0.0000i -0.5365 + 0.9521i -0.5365 - 0.9521i 1.0710 + 0.0000i -0.9334 + 0.0000i 0.7608 + 0.0000i 0.4337 + 0.0000i -0.4558 + 0.0000i -0.4011 + 0.0000i 0.1212 + 0.0000i 0.0327 + 0.0338i 0.0327 - 0.0338i -0.0426 + 0.0000i -0.0360 + 0.0334i -0.0360 - 0.0334i 0.0308 + 0.0000i -0.0003 + 0.0327i -0.0003 - 0.0327i -0.0323 + 0.0000i 0.0120 + 0.0185i 0.0120 - 0.0185i -0.0256 + 0.0000i -0.0225 + 0.0000i 0.0152 + 0.0000i -0.0083 + 0.0125i -0.0083 - 0.0125i -0.0177 + 0.0000i -0.0175 + 0.0000i -0.0177 + 0.0000i -0.0158 + 0.0000i -0.0176 + 0.0000i -0.0136 + 0.0038i -0.0136 - 0.0038i 0.0125 + 0.0000i -0.0080 + 0.0069i -0.0080 - 0.0069i 0.0066 + 0.0075i 0.0066 - 0.0075i 0.0097 + 0.0000i 0.0039 + 0.0085i 0.0039 - 0.0085i 0.0070 + 0.0000i -0.0095 + 0.0011i -0.0095 - 0.0011i -0.0064 + 0.0000i 0.0024 + 0.0036i 0.0024 - 0.0036i 0.0042 + 0.0000i 0.0042 - 0.0000i 0.0040 + 0.0000i -0.0035 + 0.0021i -0.0035 - 0.0021i -0.0004 + 0.0038i -0.0004 - 0.0038i -0.0040 + 
0.0000i -0.0037 + 0.0000i -0.0036 + 0.0000i -0.0037 + 0.0000i -0.0033 + 0.0000i -0.0033 + 0.0000i -0.0033 + 0.0000i -0.0029 + 0.0000i 0.0015 + 0.0014i 0.0015 - 0.0014i 0.0004 + 0.0022i 0.0004 - 0.0022i 0.0003 + 0.0024i 0.0003 - 0.0024i -0.0012 + 0.0018i -0.0012 - 0.0018i 0.0003 + 0.0023i 0.0003 - 0.0023i -0.0022 + 0.0000i -0.0024 + 0.0000i -0.0024 - 0.0000i -0.0021 + 0.0000i 0.0010 + 0.0010i 0.0010 - 0.0010i 0.0009 + 0.0009i 0.0009 - 0.0009i -0.0002 + 0.0014i -0.0002 - 0.0014i -0.0013 + 0.0011i -0.0013 - 0.0011i -0.0011 + 0.0012i -0.0011 - 0.0012i -0.0014 + 0.0008i -0.0014 - 0.0008i -0.0012 + 0.0008i -0.0012 - 0.0008i -0.0014 + 0.0004i -0.0014 - 0.0004i -0.0005 + 0.0011i -0.0005 - 0.0011i 0.0007 + 0.0009i 0.0007 - 0.0009i -0.0010 + 0.0001i -0.0010 - 0.0001i 0.0008 + 0.0006i 0.0008 - 0.0006i 0.0011 + 0.0001i 0.0011 - 0.0001i 0.0010 + 0.0001i 0.0010 - 0.0001i 0.0010 + 0.0000i 0.0010 - 0.0000i 0.0003 + 0.0008i 0.0003 - 0.0008i -0.0001 + 0.0009i -0.0001 - 0.0009i -0.0008 + 0.0000i -0.0008 + 0.0005i -0.0008 - 0.0005i -0.0004 + 0.0007i -0.0004 - 0.0007i -0.0008 + 0.0002i -0.0008 - 0.0002i -0.0000 + 0.0008i -0.0000 - 0.0008i 0.0008 + 0.0000i 0.0008 - 0.0000i -0.0008 + 0.0000i -0.0007 + 0.0000i -0.0004 + 0.0005i -0.0004 - 0.0005i 0.0005 + 0.0003i 0.0005 - 0.0003i 0.0006 + 0.0001i 0.0006 - 0.0001i 0.0006 + 0.0000i -0.0005 + 0.0003i -0.0005 - 0.0003i 0.0003 + 0.0005i 0.0003 - 0.0005i -0.0001 + 0.0005i -0.0001 - 0.0005i 0.0000 + 0.0006i 0.0000 - 0.0006i -0.0004 + 0.0000i -0.0004 + 0.0000i 0.0003 + 0.0003i 0.0003 - 0.0003i 0.0003 + 0.0002i 0.0003 - 0.0002i -0.0004 + 0.0000i -0.0000 + 0.0004i -0.0000 - 0.0004i -0.0002 + 0.0003i -0.0002 - 0.0003i -0.0001 + 0.0003i -0.0001 - 0.0003i 0.0003 + 0.0001i 0.0003 - 0.0001i -0.0003 + 0.0000i -0.0003 - 0.0000i 0.0001 + 0.0002i 0.0001 - 0.0002i 0.0002 + 0.0000i 0.0002 + 0.0001i 0.0002 - 0.0001i 0.0001 + 0.0002i 0.0001 - 0.0002i 0.0002 + 0.0000i -0.0002 + 0.0000i -0.0002 - 0.0000i 0.0001 + 0.0000i 0.0000 + 0.0001i 0.0000 - 0.0001i 0.0001 + 0.0001i 0.0001 - 0.0001i -0.0001 + 0.0000i -0.0001 + 0.0000i -0.0001 - 0.0000i 0.0001 + 0.0001i 0.0001 - 0.0001i 0.0001 + 0.0000i 0.0001 - 0.0000i -0.0001 + 0.0001i -0.0001 - 0.0001i 0.0001 + 0.0001i 0.0001 - 0.0001i 0.0001 + 0.0000i 0.0000 + 0.0001i 0.0000 - 0.0001i 0.0000 + 0.0001i 0.0000 - 0.0001i -0.0000 + 0.0001i -0.0000 - 0.0001i -0.0001 + 0.0000i -0.0001 - 0.0000i -0.0001 + 0.0000i -0.0001 - 0.0000i -0.0001 + 0.0000i -0.0001 - 0.0000i 0.0001 + 0.0000i 0.0001 - 0.0000i -0.0001 + 0.0000i -0.0001 - 0.0000i -0.0001 + 0.0000i 0.0000 + 0.0001i 0.0000 - 0.0001i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 
0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i -0.0000 + 0.0000i -0.0000 - 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 - 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i -0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i 0.0000 + 0.0000i The first matrix you are handing off to GAMG is totally inappropriate for GAMG it has a huge range of eigenvalue scales both positive and negative and many zero eigenvectors. The smoother GAMG uses does not work on the matrix nor will GAMG work on the matrix. Hence the entire nested fieldsplit preconditioner cannot work on the entire matrix since the inner GAMG preconditioner cannot work on the inner matrix, in fact probably next to no preconditioner will work on the inner matrix. You need to go back and make sure that either 1) the matrices being generated are what they should be and 2) your assumption that GAMG would work on the first submatrix makes sense Barry > On Aug 16, 2016, at 5:21 PM, Safin, Artur wrote: > > Barry, Mark, > > Apologies for taking a while to respond. > >> Are you saying it works for a while but fails when the problem is large, or that it never works with fieldsplit_1? And how many processors are you using? > > All of these are serial runs. The preconditioner always fails for large problems; but I also found an example where it fails for a fairly small matrix. fieldsplit_1 is the one that fails, although fieldpslit_0 always works. > >> This should be fixed in the master branch > > I tried the developmental version (from today), but the same bug persists. > > > > I have some code that reproduces the example. The index sets that I use are non-overlapping; I also checked with matlab that each fieldsplit submatrix is non-singular. > The issue is with fieldsplit_TA: if I do not use -fieldsplit_TA-pc_type_gamg, then the solver converges. Otherwise, it crashes with the same message. > > Artur > > > From jychang48 at gmail.com Wed Aug 17 05:23:39 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 17 Aug 2016 05:23:39 -0500 Subject: [petsc-users] What exactly goes into DMPlexSetRefinementLimit Message-ID: Hi all, Playing around with SNES ex12.c and I am attempting to tinker around with 3D options. I am trying to understand what kind of values go into -refinement_limit for 3D simplices. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From knepley at gmail.com  Wed Aug 17 09:38:22 2016
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 17 Aug 2016 09:38:22 -0500
Subject: [petsc-users] What exactly goes into DMPlexSetRefinementLimit
In-Reply-To:
References:
Message-ID:

On Wed, Aug 17, 2016 at 5:23 AM, Justin Chang wrote:

> Hi all,
>
> Playing around with SNES ex12.c and I am attempting to tinker around with
> 3D options. I am trying to understand what kind of values go into
> -refinement_limit for 3D simplices.
>

The cell volume limits for any cells created out of the existing cells. This is
how TetGen understands refinement. I think a better way is to use a metric
tensor field, and we now have an interface to pragmatic for this (I think
currently I only hooked it up to DMCoarsen() but it does both). Clearly the
interface is immature, but this is the way we are headed.

   Matt

> Thanks,
> Justin
>

--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aks084000 at utdallas.edu  Wed Aug 17 11:02:16 2016
From: aks084000 at utdallas.edu (Safin, Artur)
Date: Wed, 17 Aug 2016 16:02:16 +0000
Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related)
In-Reply-To:
References: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu>,
Message-ID: <5889439212f64929a139951a1b68b523@utdallas.edu>

Hi Barry,

Well, this is a bit strange: my run produces a completely different output. What I have is

--------------------------------------------------------------------------------------------------------------------------------------------------------
artur at artur-ubuntu:~/Downloads/temp$ ./ex -options_file options.txt

Residual norms for fieldsplit_PA_ solve.
0 KSP unpreconditioned resid norm 3.037294001981e-01 true resid norm 3.037294001981e-01 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 2.205402771720e-02 true resid norm 2.205402771720e-02 ||r(i)||/||b|| 7.261077690475e-02
2 KSP unpreconditioned resid norm 2.342871234277e-03 true resid norm 2.342871234279e-03 ||r(i)||/||b|| 7.713679455299e-03
3 KSP unpreconditioned resid norm 2.847545158015e-04 true resid norm 2.847545157994e-04 ||r(i)||/||b|| 9.375270079671e-04
4 KSP unpreconditioned resid norm 8.896004812140e-05 true resid norm 8.896004811976e-05 ||r(i)||/||b|| 2.928924498640e-04
5 KSP unpreconditioned resid norm 4.813451098741e-06 true resid norm 4.813451098816e-06 ||r(i)||/||b|| 1.584782736105e-05
6 KSP unpreconditioned resid norm 1.029168681786e-06 true resid norm 1.029168680690e-06 ||r(i)||/||b|| 3.388439446491e-06
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: Petsc has generated inconsistent data
[0]PETSC ERROR: !(matA_1 && !matA_1->compressedrow.use)
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1165-gfeaa1dd  GIT Date: 2016-08-16 11:58:28 -0500
[0]PETSC ERROR: ./ex on a x86_64 named artur-ubuntu by artur Wed Aug 17 10:47:27 2016
[0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack
[0]PETSC ERROR: #1 smoothAggs() line 336 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/agg.c
[0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 1004 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/agg.c
[0]PETSC ERROR: #3 PCSetUp_GAMG() line 526 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/gamg.c
[0]PETSC ERROR: #4 PCSetUp() line 968 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #5 KSPSetUp() line 393 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #6 KSPSolve() line 602 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #7 PCApply_FieldSplit() line 1040 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c
[0]PETSC ERROR: #8 PCApply() line 482 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #9 KSP_PCApply() line 245 in /home/artur/Rorsrach/Packages/petsc/include/petsc/private/kspimpl.h
[0]PETSC ERROR: #10 KSPInitialResidual() line 69 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itres.c
[0]PETSC ERROR: #11 KSPSolve_GMRES() line 239 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: #12 KSPSolve() line 659 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #13 main() line 67 in /home/artur/Downloads/temp/ex.c
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -fieldsplit_PA_gamg_est_ksp_pmat_view
[0]PETSC ERROR: -fieldsplit_PA_ksp_monitor_true_residual
[0]PETSC ERROR: -fieldsplit_PA_ksp_type fgmres
[0]PETSC ERROR: -fieldsplit_PA_pc_type gamg
[0]PETSC ERROR: -fieldsplit_PB_ksp_monitor_true_residual
[0]PETSC ERROR: -fieldsplit_PB_ksp_type fgmres
[0]PETSC ERROR: -fieldsplit_PB_pc_type gamg
[0]PETSC ERROR: -fieldsplit_TA_ksp_monitor_true_residual
[0]PETSC ERROR: -fieldsplit_TA_ksp_type fgmres
[0]PETSC ERROR: -fieldsplit_TA_pc_type gamg
[0]PETSC ERROR: -fieldsplit_TB_ksp_monitor_true_residual
[0]PETSC ERROR: -fieldsplit_TB_ksp_type fgmres
[0]PETSC ERROR: -fieldsplit_TB_pc_type gamg
[0]PETSC ERROR: -ksp_converged_reason
[0]PETSC ERROR: -pc_fieldsplit_type multiplicative
[0]PETSC ERROR: -pc_type fieldsplit
[0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
--------------------------------------------------------------------------------------------------------------------------------------------------------

Do you have any idea as to why our results are completely different?

Artur

From rlmackie862 at gmail.com  Wed Aug 17 13:13:32 2016
From: rlmackie862 at gmail.com (Randall Mackie)
Date: Wed, 17 Aug 2016 11:13:32 -0700
Subject: [petsc-users] Hypre - Euclid
Message-ID: <688EB085-0EBB-49AA-AECC-341250F73FFB@gmail.com>

It seems that Euclid is not available as a Hypre PC unless it is called as part of BoomerAMG.

However, there are many older posts that mention -pc_hypre_type euclid, so I'm wondering why, or if there is some other way to access the parallel ILU(k) of Euclid?
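For reference, the only route I can see from the current option names is Euclid as a smoother inside BoomerAMG, along the lines of

   -pc_type hypre -pc_hypre_type boomeramg
   -pc_hypre_boomeramg_smooth_type Euclid -pc_hypre_boomeramg_smooth_num_levels 25

so that the ILU(k) is applied only as the level smoother rather than as a standalone PC. I am guessing at those flag names from the BoomerAMG option list and have not tested them, so please treat this as a sketch.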
Thanks, Randy From jychang48 at gmail.com Wed Aug 17 14:04:06 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 17 Aug 2016 14:04:06 -0500 Subject: [petsc-users] What exactly goes into DMPlexSetRefinementLimit In-Reply-To: References: Message-ID: When I enter values like 1/16, 1/12, 1/24, and so on, I was expecting to get roughly the same dm object as if I simply did -dm_refine <0/1/2/3>. Instead it seems I get highly unstructured grids, and the smaller the number gets, the fewer additional cells I get. Is there a way to make the refinement limit uniform? On Wed, Aug 17, 2016 at 9:38 AM, Matthew Knepley wrote: > On Wed, Aug 17, 2016 at 5:23 AM, Justin Chang wrote: > >> Hi all, >> >> Playing around with SNES ex12.c and I am attempting to tinker around with >> 3D options. I am trying to understand what kind of values go into >> -refinement_limit for 3D simplices. >> > > The cell volume limits for any cells created out of the existing cells. > This is how TetGen understands refinement. I think a > better way is to use a metric tensor field, and we now have an interface > to pragmatic for this (I think currently I only hooked it up > to DMCoarsen() but it does both). Clearly the interface is immature, but > this is the way we are headed. > > Matt > > >> Thanks, >> Justin >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 17 14:13:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 Aug 2016 14:13:14 -0500 Subject: [petsc-users] Hypre - Euclid In-Reply-To: <688EB085-0EBB-49AA-AECC-341250F73FFB@gmail.com> References: <688EB085-0EBB-49AA-AECC-341250F73FFB@gmail.com> Message-ID: Yeah I remove the interface because of some bugs in Euclid that the Hypre people didn't seem to want to fix and it was a nuisance to deal with the bugs. Plus I was never particularly impressed with its performance. Barry > On Aug 17, 2016, at 1:13 PM, Randall Mackie wrote: > > It seems that Euclid is not available as a Hypre PC unless it is called as part of BoomerAMG. > > However, there are many older posts that mention -pc_hypre_type euclid, so I?m wondering why, or if there is some other way to access the parallel ILU(k) of Euclid? > > > Thanks, Randy From knepley at gmail.com Wed Aug 17 14:28:54 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Aug 2016 14:28:54 -0500 Subject: [petsc-users] What exactly goes into DMPlexSetRefinementLimit In-Reply-To: References: Message-ID: On Wed, Aug 17, 2016 at 2:04 PM, Justin Chang wrote: > When I enter values like 1/16, 1/12, 1/24, and so on, I was expecting to > get roughly the same dm object as if I simply did -dm_refine <0/1/2/3>. > Instead it seems I get highly unstructured grids, and the smaller the > number gets, the fewer additional cells I get. Is there a way to make the > refinement limit uniform? > Why not just use -dm_refine ? Matt > On Wed, Aug 17, 2016 at 9:38 AM, Matthew Knepley > wrote: > >> On Wed, Aug 17, 2016 at 5:23 AM, Justin Chang >> wrote: >> >>> Hi all, >>> >>> Playing around with SNES ex12.c and I am attempting to tinker around >>> with 3D options. I am trying to understand what kind of values go into >>> -refinement_limit for 3D simplices. >>> >> >> The cell volume limits for any cells created out of the existing cells. 
>> This is how TetGen understands refinement. I think a >> better way is to use a metric tensor field, and we now have an interface >> to pragmatic for this (I think currently I only hooked it up >> to DMCoarsen() but it does both). Clearly the interface is immature, but >> this is the way we are headed. >> >> Matt >> >> >>> Thanks, >>> Justin >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 17 14:35:45 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 17 Aug 2016 14:35:45 -0500 Subject: [petsc-users] What exactly goes into DMPlexSetRefinementLimit In-Reply-To: References: Message-ID: Because the base mesh starts off with 6 elements and 8 vertices. This is enough data for one cell per MPI process. Refinement is done after DMPlexDistribute(...). If I use anymore than 6 cores, some of the MPI ranks will have an empty DM Object. For example: $ mpirun -n 8 --bind-to-core --bysocket ./ex12 -dim 3 -run_type full -interpolate 1 -petscspace_order 1 -bc_type dirichlet -ksp_rtol 1.0e-7 -pc_type ml -refinement_limit 1 -dm_refine 1 -dm_view DM Object: Parallel Mesh 8 MPI processes type: plex Parallel Mesh in 3 dimensions: 0-cells: 10 0 10 10 10 0 10 10 1-cells: 25 0 25 25 25 0 25 25 2-cells: 24 0 24 24 24 0 24 24 3-cells: 8 0 8 8 8 0 8 8 Labels: marker: 1 strata of sizes (18) depth: 4 strata of sizes (10, 25, 24, 8) Number of SNES iterations = 1 L_2 Error: 0.118178 On Wed, Aug 17, 2016 at 2:28 PM, Matthew Knepley wrote: > On Wed, Aug 17, 2016 at 2:04 PM, Justin Chang wrote: > >> When I enter values like 1/16, 1/12, 1/24, and so on, I was expecting to >> get roughly the same dm object as if I simply did -dm_refine <0/1/2/3>. >> Instead it seems I get highly unstructured grids, and the smaller the >> number gets, the fewer additional cells I get. Is there a way to make the >> refinement limit uniform? >> > > Why not just use -dm_refine ? > > Matt > > >> On Wed, Aug 17, 2016 at 9:38 AM, Matthew Knepley >> wrote: >> >>> On Wed, Aug 17, 2016 at 5:23 AM, Justin Chang >>> wrote: >>> >>>> Hi all, >>>> >>>> Playing around with SNES ex12.c and I am attempting to tinker around >>>> with 3D options. I am trying to understand what kind of values go into >>>> -refinement_limit for 3D simplices. >>>> >>> >>> The cell volume limits for any cells created out of the existing cells. >>> This is how TetGen understands refinement. I think a >>> better way is to use a metric tensor field, and we now have an interface >>> to pragmatic for this (I think currently I only hooked it up >>> to DMCoarsen() but it does both). Clearly the interface is immature, but >>> this is the way we are headed. >>> >>> Matt >>> >>> >>>> Thanks, >>>> Justin >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 17 14:41:49 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 Aug 2016 14:41:49 -0500 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: <5889439212f64929a139951a1b68b523@utdallas.edu> References: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu> <5889439212f64929a139951a1b68b523@utdallas.edu> Message-ID: <1C911215-DDFB-4400-B830-B5E2D6889203@mcs.anl.gov> Strange, please email options.txt > On Aug 17, 2016, at 11:02 AM, Safin, Artur wrote: > > Hi Barry, > > Well, this is a bit strange: my run produces a completely different output. What I have is > > -------------------------------------------------------------------------------------------------------------------------------------------------------- > artur at artur-ubuntu:~/Downloads/temp$ ./ex -options_file options.txt > > Residual norms for fieldsplit_PA_ solve. > 0 KSP unpreconditioned resid norm 3.037294001981e-01 true resid norm 3.037294001981e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.205402771720e-02 true resid norm 2.205402771720e-02 ||r(i)||/||b|| 7.261077690475e-02 > 2 KSP unpreconditioned resid norm 2.342871234277e-03 true resid norm 2.342871234279e-03 ||r(i)||/||b|| 7.713679455299e-03 > 3 KSP unpreconditioned resid norm 2.847545158015e-04 true resid norm 2.847545157994e-04 ||r(i)||/||b|| 9.375270079671e-04 > 4 KSP unpreconditioned resid norm 8.896004812140e-05 true resid norm 8.896004811976e-05 ||r(i)||/||b|| 2.928924498640e-04 > 5 KSP unpreconditioned resid norm 4.813451098741e-06 true resid norm 4.813451098816e-06 ||r(i)||/||b|| 1.584782736105e-05 > 6 KSP unpreconditioned resid norm 1.029168681786e-06 true resid norm 1.029168680690e-06 ||r(i)||/||b|| 3.388439446491e-06 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Petsc has generated inconsistent data > [0]PETSC ERROR: !(matA_1 && !matA_1->compressedrow.use) > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1165-gfeaa1dd GIT Date: 2016-08-16 11:58:28 -0500 > [0]PETSC ERROR: ./ex on a x86_64 named artur-ubuntu by artur Wed Aug 17 10:47:27 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #1 smoothAggs() line 336 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 1004 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #3 PCSetUp_GAMG() line 526 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #5 KSPSetUp() line 393 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 KSPSolve() line 602 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #7 PCApply_FieldSplit() line 1040 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: #8 PCApply() line 482 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #9 KSP_PCApply() line 245 in /home/artur/Rorsrach/Packages/petsc/include/petsc/private/kspimpl.h > [0]PETSC ERROR: #10 KSPInitialResidual() line 69 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itres.c > [0]PETSC ERROR: #11 KSPSolve_GMRES() line 239 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: #12 KSPSolve() line 659 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #13 main() line 67 in /home/artur/Downloads/temp/ex.c > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -fieldsplit_PA_gamg_est_ksp_pmat_view > [0]PETSC ERROR: -fieldsplit_PA_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_PA_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_PA_pc_type gamg > [0]PETSC ERROR: -fieldsplit_PB_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_PB_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_PB_pc_type gamg > [0]PETSC ERROR: -fieldsplit_TA_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_TA_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_TA_pc_type gamg > [0]PETSC ERROR: -fieldsplit_TB_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_TB_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_TB_pc_type gamg > [0]PETSC ERROR: -ksp_converged_reason > [0]PETSC ERROR: -pc_fieldsplit_type multiplicative > [0]PETSC ERROR: -pc_type fieldsplit > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > -------------------------------------------------------------------------------------------------------------------------------------------------------- > > Do you have any idea as to why our results are completely different? 
> > Artur From aks084000 at utdallas.edu Wed Aug 17 14:45:06 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Wed, 17 Aug 2016 19:45:06 +0000 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: <1C911215-DDFB-4400-B830-B5E2D6889203@mcs.anl.gov> References: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu> <5889439212f64929a139951a1b68b523@utdallas.edu>, <1C911215-DDFB-4400-B830-B5E2D6889203@mcs.anl.gov> Message-ID: <2aa6377b2d5a49208fc67d7dbd4b8b8a@utdallas.edu> Barry, Options file is attached. Thank you for your help, Artur -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: options.txt URL: From knepley at gmail.com Wed Aug 17 14:46:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 17 Aug 2016 14:46:01 -0500 Subject: [petsc-users] What exactly goes into DMPlexSetRefinementLimit In-Reply-To: References: Message-ID: On Wed, Aug 17, 2016 at 2:35 PM, Justin Chang wrote: > Because the base mesh starts off with 6 elements and 8 vertices. This is > enough data for one cell per MPI process. Refinement is done after > DMPlexDistribute(...). If I use anymore than 6 cores, some of the MPI ranks > will have an empty DM Object. For example: > You could 1) Just stick an explicit call to uniform refinement in for the serial mesh 2) Use a cell volume that is 1/N, where N is the number of cells you want Matt > $ mpirun -n 8 --bind-to-core --bysocket ./ex12 -dim 3 -run_type full > -interpolate 1 -petscspace_order 1 -bc_type dirichlet -ksp_rtol 1.0e-7 > -pc_type ml -refinement_limit 1 -dm_refine 1 -dm_view > > DM Object: Parallel Mesh 8 MPI processes > > type: plex > > Parallel Mesh in 3 dimensions: > > 0-cells: 10 0 10 10 10 0 10 10 > > 1-cells: 25 0 25 25 25 0 25 25 > > 2-cells: 24 0 24 24 24 0 24 24 > > 3-cells: 8 0 8 8 8 0 8 8 > > Labels: > > marker: 1 strata of sizes (18) > > depth: 4 strata of sizes (10, 25, 24, 8) > > Number of SNES iterations = 1 > > L_2 Error: 0.118178 > > On Wed, Aug 17, 2016 at 2:28 PM, Matthew Knepley > wrote: > >> On Wed, Aug 17, 2016 at 2:04 PM, Justin Chang >> wrote: >> >>> When I enter values like 1/16, 1/12, 1/24, and so on, I was expecting >>> to get roughly the same dm object as if I simply did -dm_refine <0/1/2/3>. >>> Instead it seems I get highly unstructured grids, and the smaller the >>> number gets, the fewer additional cells I get. Is there a way to make the >>> refinement limit uniform? >>> >> >> Why not just use -dm_refine ? >> >> Matt >> >> >>> On Wed, Aug 17, 2016 at 9:38 AM, Matthew Knepley >>> wrote: >>> >>>> On Wed, Aug 17, 2016 at 5:23 AM, Justin Chang >>>> wrote: >>>> >>>>> Hi all, >>>>> >>>>> Playing around with SNES ex12.c and I am attempting to tinker around >>>>> with 3D options. I am trying to understand what kind of values go into >>>>> -refinement_limit for 3D simplices. >>>>> >>>> >>>> The cell volume limits for any cells created out of the existing cells. >>>> This is how TetGen understands refinement. I think a >>>> better way is to use a metric tensor field, and we now have an >>>> interface to pragmatic for this (I think currently I only hooked it up >>>> to DMCoarsen() but it does both). Clearly the interface is immature, >>>> but this is the way we are headed. >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Justin >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 17 15:04:12 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 17 Aug 2016 15:04:12 -0500 Subject: [petsc-users] What exactly goes into DMPlexSetRefinementLimit In-Reply-To: References: Message-ID: Ah I didn't realize that in DMPlexCreateBoxMesh(...) one of the fields was to define number of faces in each spatial direction. Right now it is set to 1 for 3D, so i guess I will just have to change this Thanks! On Wed, Aug 17, 2016 at 2:46 PM, Matthew Knepley wrote: > On Wed, Aug 17, 2016 at 2:35 PM, Justin Chang wrote: > >> Because the base mesh starts off with 6 elements and 8 vertices. This is >> enough data for one cell per MPI process. Refinement is done after >> DMPlexDistribute(...). If I use anymore than 6 cores, some of the MPI ranks >> will have an empty DM Object. For example: >> > > You could > > 1) Just stick an explicit call to uniform refinement in for the serial mesh > > 2) Use a cell volume that is 1/N, where N is the number of cells you want > > Matt > > >> $ mpirun -n 8 --bind-to-core --bysocket ./ex12 -dim 3 -run_type full >> -interpolate 1 -petscspace_order 1 -bc_type dirichlet -ksp_rtol 1.0e-7 >> -pc_type ml -refinement_limit 1 -dm_refine 1 -dm_view >> >> DM Object: Parallel Mesh 8 MPI processes >> >> type: plex >> >> Parallel Mesh in 3 dimensions: >> >> 0-cells: 10 0 10 10 10 0 10 10 >> >> 1-cells: 25 0 25 25 25 0 25 25 >> >> 2-cells: 24 0 24 24 24 0 24 24 >> >> 3-cells: 8 0 8 8 8 0 8 8 >> >> Labels: >> >> marker: 1 strata of sizes (18) >> >> depth: 4 strata of sizes (10, 25, 24, 8) >> >> Number of SNES iterations = 1 >> >> L_2 Error: 0.118178 >> >> On Wed, Aug 17, 2016 at 2:28 PM, Matthew Knepley >> wrote: >> >>> On Wed, Aug 17, 2016 at 2:04 PM, Justin Chang >>> wrote: >>> >>>> When I enter values like 1/16, 1/12, 1/24, and so on, I was expecting >>>> to get roughly the same dm object as if I simply did -dm_refine <0/1/2/3>. >>>> Instead it seems I get highly unstructured grids, and the smaller the >>>> number gets, the fewer additional cells I get. Is there a way to make the >>>> refinement limit uniform? >>>> >>> >>> Why not just use -dm_refine ? >>> >>> Matt >>> >>> >>>> On Wed, Aug 17, 2016 at 9:38 AM, Matthew Knepley >>>> wrote: >>>> >>>>> On Wed, Aug 17, 2016 at 5:23 AM, Justin Chang >>>>> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> Playing around with SNES ex12.c and I am attempting to tinker around >>>>>> with 3D options. I am trying to understand what kind of values go into >>>>>> -refinement_limit for 3D simplices. >>>>>> >>>>> >>>>> The cell volume limits for any cells created out of the existing >>>>> cells. This is how TetGen understands refinement. I think a >>>>> better way is to use a metric tensor field, and we now have an >>>>> interface to pragmatic for this (I think currently I only hooked it up >>>>> to DMCoarsen() but it does both). Clearly the interface is immature, >>>>> but this is the way we are headed. 
>>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thanks, >>>>>> Justin >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 17 16:04:44 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 17 Aug 2016 16:04:44 -0500 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: <5889439212f64929a139951a1b68b523@utdallas.edu> References: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu> <5889439212f64929a139951a1b68b523@utdallas.edu> Message-ID: <6513A82A-D072-459D-8FB4-5D60750ADFDC@mcs.anl.gov> > On Aug 17, 2016, at 11:02 AM, Safin, Artur wrote: > > Hi Barry, > > Well, this is a bit strange: my run produces a completely different output. I am a complete idiot. I did not notice you were running with complex. I switched to complex and get the same error you got. I've attached a patch that will remove the problem; it was some unneeded checks that did not belong there. This is also fixed in the master branch. I can't fix it in maint because of code changes since the maint release. -------------- next part -------------- A non-text attachment was scrubbed... Name: gamg.patch Type: application/octet-stream Size: 918 bytes Desc: not available URL: -------------- next part -------------- Barry > What I have is > > -------------------------------------------------------------------------------------------------------------------------------------------------------- > artur at artur-ubuntu:~/Downloads/temp$ ./ex -options_file options.txt > > Residual norms for fieldsplit_PA_ solve. > 0 KSP unpreconditioned resid norm 3.037294001981e-01 true resid norm 3.037294001981e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.205402771720e-02 true resid norm 2.205402771720e-02 ||r(i)||/||b|| 7.261077690475e-02 > 2 KSP unpreconditioned resid norm 2.342871234277e-03 true resid norm 2.342871234279e-03 ||r(i)||/||b|| 7.713679455299e-03 > 3 KSP unpreconditioned resid norm 2.847545158015e-04 true resid norm 2.847545157994e-04 ||r(i)||/||b|| 9.375270079671e-04 > 4 KSP unpreconditioned resid norm 8.896004812140e-05 true resid norm 8.896004811976e-05 ||r(i)||/||b|| 2.928924498640e-04 > 5 KSP unpreconditioned resid norm 4.813451098741e-06 true resid norm 4.813451098816e-06 ||r(i)||/||b|| 1.584782736105e-05 > 6 KSP unpreconditioned resid norm 1.029168681786e-06 true resid norm 1.029168680690e-06 ||r(i)||/||b|| 3.388439446491e-06 > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Petsc has generated inconsistent data > [0]PETSC ERROR: !(matA_1 && !matA_1->compressedrow.use) > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1165-gfeaa1dd GIT Date: 2016-08-16 11:58:28 -0500 > [0]PETSC ERROR: ./ex on a x86_64 named artur-ubuntu by artur Wed Aug 17 10:47:27 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #1 smoothAggs() line 336 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #2 PCGAMGCoarsen_AGG() line 1004 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/agg.c > [0]PETSC ERROR: #3 PCSetUp_GAMG() line 526 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/gamg/gamg.c > [0]PETSC ERROR: #4 PCSetUp() line 968 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #5 KSPSetUp() line 393 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #6 KSPSolve() line 602 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #7 PCApply_FieldSplit() line 1040 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: #8 PCApply() line 482 in /home/artur/Rorsrach/Packages/petsc/src/ksp/pc/interface/precon.c > [0]PETSC ERROR: #9 KSP_PCApply() line 245 in /home/artur/Rorsrach/Packages/petsc/include/petsc/private/kspimpl.h > [0]PETSC ERROR: #10 KSPInitialResidual() line 69 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itres.c > [0]PETSC ERROR: #11 KSPSolve_GMRES() line 239 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: #12 KSPSolve() line 659 in /home/artur/Rorsrach/Packages/petsc/src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: #13 main() line 67 in /home/artur/Downloads/temp/ex.c > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -fieldsplit_PA_gamg_est_ksp_pmat_view > [0]PETSC ERROR: -fieldsplit_PA_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_PA_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_PA_pc_type gamg > [0]PETSC ERROR: -fieldsplit_PB_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_PB_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_PB_pc_type gamg > [0]PETSC ERROR: -fieldsplit_TA_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_TA_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_TA_pc_type gamg > [0]PETSC ERROR: -fieldsplit_TB_ksp_monitor_true_residual > [0]PETSC ERROR: -fieldsplit_TB_ksp_type fgmres > [0]PETSC ERROR: -fieldsplit_TB_pc_type gamg > [0]PETSC ERROR: -ksp_converged_reason > [0]PETSC ERROR: -pc_fieldsplit_type multiplicative > [0]PETSC ERROR: -pc_type fieldsplit > [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- > -------------------------------------------------------------------------------------------------------------------------------------------------------- > > Do you have any idea as to why our results are completely different? > > Artur From dalcinl at gmail.com Thu Aug 18 02:20:59 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 18 Aug 2016 10:20:59 +0300 Subject: [petsc-users] [petsc4py] a problem with computeRHSFunctionLinear interface? In-Reply-To: <4270161.snsPm0L6UZ@wotan> References: <4270161.snsPm0L6UZ@wotan> Message-ID: Dear Francesco, sorry for the late answer, I missed your email. 
You have to do it this way:

ts.setRHSFunction(PETSc.TS.computeRHSFunctionLinear)
ts.setRHSJacobian(PETSc.TS.computeRHSJacobianConstant, J=A, P=A)

I.e., you have to set unbound methods, not instance methods as you did in your
original code. Additionally, do not pass "args" nor "kargs".

On 11 August 2016 at 10:36, Francesco Caimmi wrote:
> Dear all,
>
> I was trying to reproduce /ts/examples/tutorials/ex4.c in python to learn how
> to use TS solvers; the example uses the function TSComputeRHSFunctionLinear.
> However I get an error when running my code (attached in case you want to look
> at it), when I call ts.solve.
>
> Here is the trace:
> [fcaimmi at Wotan 2645] > ./ts_ex4.py
> Solving a linear TS problem, number of processors = 1
> Timestep 0 : time = 0.0 2-norm error = 1.14956855594e-08 max norm error = 0
> Traceback (most recent call last):
>   File "./ts_ex4.py", line 473, in
>     main(m = m, debug = debug)
>   File "./ts_ex4.py", line 340, in main
>     ts.solve(u)
>   File "PETSc/TS.pyx", line 568, in petsc4py.PETSc.TS.solve (src/petsc4py.PETSc.c:188927)
>   File "PETSc/petscts.pxi", line 221, in petsc4py.PETSc.TS_RHSFunction (src/petsc4py.PETSc.c:35490)
>   File "PETSc/TS.pyx", line 189, in petsc4py.PETSc.TS.computeRHSFunctionLinear (src/petsc4py.PETSc.c:181611)
> TypeError: computeRHSFunctionLinear() takes exactly 3 positional arguments (5 given)
>
> I cannot understand if there is a problem with my code or if the problem is in
> computeRHSFunctionLinear interface.
>
> I checked https://bitbucket.org/petsc/petsc4py/ and the interface to
> computeRHSFunctionLinear has three arguments, however I am not that much into
> petsc4py to tell how it gets called.
>
> I am on Petsc Release Version 3.7.3
>
> Thank you for your time.
>
> Best,
> --
> Francesco Caimmi
>
> Laboratorio di Ingegneria dei Polimeri
> http://www.chem.polimi.it/polyenglab/
>
> Politecnico di Milano - Dipartimento di Chimica,
> Materiali e Ingegneria Chimica "Giulio Natta"
>
> P.zza Leonardo da Vinci, 32
> I-20133 Milano
> Tel. +39.02.2399.4711
> Fax +39.02.7063.8173
>
> francesco.caimmi at polimi.it
> Skype: fmglcaimmi (please arrange meetings by e-mail)

--
Lisandro Dalcin
============
Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459

From rongliang.chan at gmail.com  Thu Aug 18 03:59:14 2016
From: rongliang.chan at gmail.com (Rongliang Chen)
Date: Thu, 18 Aug 2016 16:59:14 +0800
Subject: [petsc-users] dmplex with block size error
Message-ID: <57B578E2.9020201@gmail.com>

Dear All,

I try to use the block matrix (BAIJ) for the dmplex data structure with the
option "-dm_mat_type baij" (the block size is 7). The code works fine when
np = 1 but the following error comes up when np>1. And the code also works
fine for np>1 if I set the block size to be 1. Any suggestions are highly
appreciated.
---------------------------------------------------- [1]PETSC ERROR: PetscMallocValidate: error detected at VecAXPY_Seq() line 89 in /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c [1]PETSC ERROR: Memory at address 0x1332571 is corrupted [1]PETSC ERROR: Probably write past beginning or end of array [1]PETSC ERROR: Last intact block allocated in PetscStrallocpy() line 188 in /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Memory corruption: http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind [1]PETSC ERROR: [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu Aug 18 16:42:34 2016 [1]PETSC ERROR: Configure options --download-blacs --download-scalapack --download-metis --download-parmetis --download-exodusii --download-netcdf --download-hdf5 --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared --with-debugging=1 --download-fblaslapack --with-64-bit-indices [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c [1]PETSC ERROR: #3 VecAXPY() line 640 in /home/rlchen/soft/petsc-3.6.3/src/vec/vec/interface/rvector.c --------------------------------------------------- Best regards, Rongliang From knepley at gmail.com Thu Aug 18 04:52:25 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 04:52:25 -0500 Subject: [petsc-users] dmplex with block size error In-Reply-To: <57B578E2.9020201@gmail.com> References: <57B578E2.9020201@gmail.com> Message-ID: Run with valgrind and send the log. Thanks, Matt On Thu, Aug 18, 2016 at 3:59 AM, Rongliang Chen wrote: > Dear All, > > I try to use the block matrix (BAIJ) for the dmplex data structure with > the option "-dm_mat_type baij" (the block size is 7). The code works fine > when np = 1 but the following error comes up when np>1. And the code also > works fine for np>1 if I set the block size to be 1. Any suggestions are > highly appreciated. > > ---------------------------------------------------- > [1]PETSC ERROR: PetscMallocValidate: error detected at VecAXPY_Seq() line > 89 in /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c > [1]PETSC ERROR: Memory at address 0x1332571 is corrupted > [1]PETSC ERROR: Probably write past beginning or end of array > [1]PETSC ERROR: Last intact block allocated in PetscStrallocpy() line 188 > in /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Memory corruption: http://www.mcs.anl.gov/petsc/d > ocumentation/installation.html#valgrind > [1]PETSC ERROR: > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 > [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu Aug 18 > 16:42:34 2016 > [1]PETSC ERROR: Configure options --download-blacs --download-scalapack > --download-metis --download-parmetis --download-exodusii --download-netcdf > --download-hdf5 --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared > --with-debugging=1 --download-fblaslapack --with-64-bit-indices > [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in > /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c > [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in /home/rlchen/soft/petsc-3.6.3/ > src/vec/vec/impls/seq/bvec1.c > [1]PETSC ERROR: #3 VecAXPY() line 640 in /home/rlchen/soft/petsc-3.6.3/ > src/vec/vec/interface/rvector.c > --------------------------------------------------- > > Best regards, > Rongliang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Thu Aug 18 05:05:09 2016 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Thu, 18 Aug 2016 18:05:09 +0800 Subject: [petsc-users] dmplex with block size error In-Reply-To: References: <57B578E2.9020201@gmail.com> Message-ID: <57B58855.7060702@gmail.com> Hi Matt, The log of the valgrind is attached. When I run with valgrind, the following error message comes up. ------------------------- [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Arguments are incompatible [1]PETSC ERROR: Cannot change block size 3670016 to 7 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu Aug 18 17:59:43 2016 [1]PETSC ERROR: Configure options --download-blacs --download-scalapack --download-metis --download-parmetis --download-exodusii --download-netcdf --download-hdf5 --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared --with-debugging=1 --download-fblaslapack --with-64-bit-indices [1]PETSC ERROR: #1 PetscLayoutSetBlockSize() line 424 in /home/rlchen/soft/petsc-3.6.3/src/vec/is/utils/pmap.c [1]PETSC ERROR: #2 MatSetBlockSize() line 6920 in /home/rlchen/soft/petsc-3.6.3/src/mat/interface/matrix.c [1]PETSC ERROR: #3 MatXAIJSetPreallocation() line 282 in /home/rlchen/soft/3D_fluid/FSI/Spmcs-v1.5/Fluid-petsc-3.6/src/application/Fluid/gcreate.c ---------------------- Best regards, Rongliang On 08/18/2016 05:52 PM, Matthew Knepley wrote: > Run with valgrind and send the log. > > Thanks, > > Matt > > On Thu, Aug 18, 2016 at 3:59 AM, Rongliang Chen > > wrote: > > Dear All, > > I try to use the block matrix (BAIJ) for the dmplex data structure > with the option "-dm_mat_type baij" (the block size is 7). The > code works fine when np = 1 but the following error comes up when > np>1. And the code also works fine for np>1 if I set the block > size to be 1. Any suggestions are highly appreciated. 
> > ---------------------------------------------------- > [1]PETSC ERROR: PetscMallocValidate: error detected at > VecAXPY_Seq() line 89 in > /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c > [1]PETSC ERROR: Memory at address 0x1332571 is corrupted > [1]PETSC ERROR: Probably write past beginning or end of array > [1]PETSC ERROR: Last intact block allocated in PetscStrallocpy() > line 188 in /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Memory corruption: > http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind > > [1]PETSC ERROR: > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble > shooting. > [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 > [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu > Aug 18 16:42:34 2016 > [1]PETSC ERROR: Configure options --download-blacs > --download-scalapack --download-metis --download-parmetis > --download-exodusii --download-netcdf --download-hdf5 > --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared > --with-debugging=1 --download-fblaslapack --with-64-bit-indices > [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in > /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c > [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in > /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c > [1]PETSC ERROR: #3 VecAXPY() line 640 in > /home/rlchen/soft/petsc-3.6.3/src/vec/vec/interface/rvector.c > --------------------------------------------------- > > Best regards, > Rongliang > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- ==13325== Invalid read of size 8 ==13325== at 0x5A3650: SpmcsDMMeshPreallocateOperator (spmcsdmmeshfem.cpp:2557) ==13325== by 0x59DC4B: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2133) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== Address 0x9d6c970 is 16 bytes before a block of size 22 alloc'd ==13325== at 0x4A0688A: memalign (vg_replace_malloc.c:727) ==13325== by 0x4CDEE7F: PetscMallocAlign (mal.c:28) ==13325== by 0x4D213E2: PetscStrallocpy (str.c:188) ==13325== by 0x4D5417A: PetscFunctionListAdd_Private (reg.c:251) ==13325== by 0x4D69EFC: PetscObjectComposeFunction_Petsc (inherit.c:663) ==13325== by 0x4D6BA29: PetscObjectComposeFunction_Private (inherit.c:830) ==13325== by 0x52D75E9: MatCreate_MPIBAIJ (mpibaij.c:3180) ==13325== by 0x51415CF: MatSetType (matreg.c:94) ==13325== by 0x59D03E: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2045) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== ==13325== Invalid write of size 8 ==13325== at 0x5A3657: SpmcsDMMeshPreallocateOperator (spmcsdmmeshfem.cpp:2557) ==13325== by 0x59DC4B: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2133) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== Address 0x9d6c970 is 16 bytes before a block of size 22 alloc'd ==13325== at 0x4A0688A: memalign (vg_replace_malloc.c:727) ==13325== by 0x4CDEE7F: PetscMallocAlign (mal.c:28) ==13325== by 0x4D213E2: PetscStrallocpy (str.c:188) ==13325== by 0x4D5417A: PetscFunctionListAdd_Private (reg.c:251) ==13325== by 0x4D69EFC: PetscObjectComposeFunction_Petsc (inherit.c:663) ==13325== by 0x4D6BA29: PetscObjectComposeFunction_Private (inherit.c:830) ==13325== by 0x52D75E9: MatCreate_MPIBAIJ (mpibaij.c:3180) ==13325== by 0x51415CF: MatSetType (matreg.c:94) ==13325== by 0x59D03E: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2045) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== ==13325== Invalid read of size 8 ==13325== at 0x5A35FF: SpmcsDMMeshPreallocateOperator (spmcsdmmeshfem.cpp:2556) ==13325== by 0x59DC4B: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2133) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) 
(timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== Address 0x9d6c160 is 0 bytes after a block of size 2,480 alloc'd ==13325== at 0x4A0688A: memalign (vg_replace_malloc.c:727) ==13325== by 0x4CDEE7F: PetscMallocAlign (mal.c:28) ==13325== by 0x4111C1: MatCreate (gcreate.c:84) ==13325== by 0x59CF5D: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2043) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== ==13325== Invalid write of size 8 ==13325== at 0x5A3606: SpmcsDMMeshPreallocateOperator (spmcsdmmeshfem.cpp:2556) ==13325== by 0x59DC4B: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2133) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== Address 0x9d6c160 is 0 bytes after a block of size 2,480 alloc'd ==13325== at 0x4A0688A: memalign (vg_replace_malloc.c:727) ==13325== by 0x4CDEE7F: PetscMallocAlign (mal.c:28) ==13325== by 0x4111C1: MatCreate (gcreate.c:84) ==13325== by 0x59CF5D: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2043) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== ==13325== Invalid read of size 8 ==13325== at 0x4EB3B1A: PetscLayoutSetBlockSize (pmap.c:423) ==13325== by 0x51194D7: MatSetBlockSize (matrix.c:6920) ==13325== by 0x4133D4: MatXAIJSetPreallocation (gcreate.c:282) ==13325== by 0x5A3B3B: SpmcsDMMeshPreallocateOperator (spmcsdmmeshfem.cpp:2588) ==13325== by 0x59DC4B: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2133) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== Address 0x9d6c226 is 10 bytes before a block of size 80 alloc'd ==13325== at 0x4A0688A: memalign (vg_replace_malloc.c:727) ==13325== by 0x4CDEE7F: PetscMallocAlign (mal.c:28) ==13325== by 0x4EB0DDA: PetscLayoutCreate (pmap.c:53) ==13325== by 0x411352: MatCreate (gcreate.c:86) ==13325== by 0x59CF5D: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2043) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== ==13325== Invalid read of size 8 ==13325== at 0x4EB3B27: PetscLayoutSetBlockSize (pmap.c:423) ==13325== by 0x51194D7: MatSetBlockSize (matrix.c:6920) ==13325== by 
0x4133D4: MatXAIJSetPreallocation (gcreate.c:282) ==13325== by 0x5A3B3B: SpmcsDMMeshPreallocateOperator (spmcsdmmeshfem.cpp:2588) ==13325== by 0x59DC4B: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2133) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== Address 0x9d6c226 is 10 bytes before a block of size 80 alloc'd ==13325== at 0x4A0688A: memalign (vg_replace_malloc.c:727) ==13325== by 0x4CDEE7F: PetscMallocAlign (mal.c:28) ==13325== by 0x4EB0DDA: PetscLayoutCreate (pmap.c:53) ==13325== by 0x411352: MatCreate (gcreate.c:86) ==13325== by 0x59CF5D: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2043) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== ==13325== Invalid read of size 4 ==13325== at 0x4EB3BC9: PetscLayoutSetBlockSize (pmap.c:424) ==13325== by 0x51194D7: MatSetBlockSize (matrix.c:6920) ==13325== by 0x4133D4: MatXAIJSetPreallocation (gcreate.c:282) ==13325== by 0x5A3B3B: SpmcsDMMeshPreallocateOperator (spmcsdmmeshfem.cpp:2588) ==13325== by 0x59DC4B: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2133) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== Address 0x9d6c21e is 18 bytes before a block of size 80 alloc'd ==13325== at 0x4A0688A: memalign (vg_replace_malloc.c:727) ==13325== by 0x4CDEE7F: PetscMallocAlign (mal.c:28) ==13325== by 0x4EB0DDA: PetscLayoutCreate (pmap.c:53) ==13325== by 0x411352: MatCreate (gcreate.c:86) ==13325== by 0x59CF5D: DMCreateMatrix_SpmcsDMMesh (spmcsdmmeshfem.cpp:2043) ==13325== by 0x5919B69: DMCreateMatrix (dm.c:956) ==13325== by 0x5D48905: SNESSetUpMatrices (snes.c:604) ==13325== by 0x5DE59B8: SNESSetUp_NEWTONLS (ls.c:307) ==13325== by 0x5D5F24C: SNESSetUp (snes.c:2659) ==13325== by 0x48257A: TimeStepperSolve(_p_DM*, void*) (timestepper.cpp:344) ==13325== by 0x40B47F: main (fluidstructureinteraction.cpp:97) ==13325== From knepley at gmail.com Thu Aug 18 05:08:00 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 05:08:00 -0500 Subject: [petsc-users] dmplex with block size error In-Reply-To: <57B58855.7060702@gmail.com> References: <57B578E2.9020201@gmail.com> <57B58855.7060702@gmail.com> Message-ID: On Thu, Aug 18, 2016 at 5:05 AM, Rongliang Chen wrote: > Hi Matt, > > The log of the valgrind is attached. When I run with valgrind, the > following error message comes up. > The valgrind log says your code is writing over memory. Fix that first. 
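For reference, the valgrind hits above point at SpmcsDMMeshPreallocateOperator() (spmcsdmmeshfem.cpp:2556-2557) writing just outside blocks that were allocated for the matrix, so the first thing I would check is the length of the preallocation arrays filled there. A minimal sketch of the sizing convention MatXAIJSetPreallocation() expects -- this is not your code; the function name and the 27-nonzeros-per-row guess are made up for illustration -- is

static PetscErrorCode PreallocateBlockOperator(Mat A, PetscInt bs)
{
  PetscErrorCode ierr;
  PetscInt       rstart, rend, nbrows, i, *dnz, *onz;

  PetscFunctionBegin;
  ierr   = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  nbrows = (rend - rstart)/bs;                /* one entry per local BLOCK row, not per scalar row */
  ierr   = PetscCalloc2(nbrows, &dnz, nbrows, &onz);CHKERRQ(ierr);
  for (i = 0; i < nbrows; i++) {              /* never index past nbrows-1 */
    dnz[i] = 27;                              /* block nonzeros in the diagonal part (placeholder) */
    onz[i] = 27;                              /* block nonzeros in the off-diagonal part (placeholder) */
  }
  ierr = MatXAIJSetPreallocation(A, bs, dnz, onz, NULL, NULL);CHKERRQ(ierr);
  ierr = PetscFree2(dnz, onz);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If the arrays at spmcsdmmeshfem.cpp:2556-2557 are allocated with one length but filled by a longer loop (for example sized per block row but written per scalar row or per mesh point), you would see exactly this kind of invalid write, followed later by the nonsense block size reported in PetscLayoutSetBlockSize().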
Matt > ------------------------- > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Arguments are incompatible > [1]PETSC ERROR: Cannot change block size 3670016 to 7 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 > [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu Aug 18 > 17:59:43 2016 > [1]PETSC ERROR: Configure options --download-blacs --download-scalapack > --download-metis --download-parmetis --download-exodusii --download-netcdf > --download-hdf5 --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared > --with-debugging=1 --download-fblaslapack --with-64-bit-indices > [1]PETSC ERROR: #1 PetscLayoutSetBlockSize() line 424 in > /home/rlchen/soft/petsc-3.6.3/src/vec/is/utils/pmap.c > [1]PETSC ERROR: #2 MatSetBlockSize() line 6920 in > /home/rlchen/soft/petsc-3.6.3/src/mat/interface/matrix.c > [1]PETSC ERROR: #3 MatXAIJSetPreallocation() line 282 in > /home/rlchen/soft/3D_fluid/FSI/Spmcs-v1.5/Fluid-petsc-3. > 6/src/application/Fluid/gcreate.c > ---------------------- > > Best regards, > Rongliang > > On 08/18/2016 05:52 PM, Matthew Knepley wrote: > > Run with valgrind and send the log. > > Thanks, > > Matt > > On Thu, Aug 18, 2016 at 3:59 AM, Rongliang Chen > wrote: > >> Dear All, >> >> I try to use the block matrix (BAIJ) for the dmplex data structure with >> the option "-dm_mat_type baij" (the block size is 7). The code works fine >> when np = 1 but the following error comes up when np>1. And the code also >> works fine for np>1 if I set the block size to be 1. Any suggestions are >> highly appreciated. >> >> ---------------------------------------------------- >> [1]PETSC ERROR: PetscMallocValidate: error detected at VecAXPY_Seq() line >> 89 in /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >> [1]PETSC ERROR: Memory at address 0x1332571 is corrupted >> [1]PETSC ERROR: Probably write past beginning or end of array >> [1]PETSC ERROR: Last intact block allocated in PetscStrallocpy() line 188 >> in /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Memory corruption: http://www.mcs.anl.gov/petsc/d >> ocumentation/installation.html#valgrind >> [1]PETSC ERROR: >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >> [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu Aug 18 >> 16:42:34 2016 >> [1]PETSC ERROR: Configure options --download-blacs --download-scalapack >> --download-metis --download-parmetis --download-exodusii --download-netcdf >> --download-hdf5 --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared >> --with-debugging=1 --download-fblaslapack --with-64-bit-indices >> [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in >> /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c >> [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in /home/rlchen/soft/petsc-3.6.3/ >> src/vec/vec/impls/seq/bvec1.c >> [1]PETSC ERROR: #3 VecAXPY() line 640 in /home/rlchen/soft/petsc-3.6.3/ >> src/vec/vec/interface/rvector.c >> --------------------------------------------------- >> >> Best regards, >> Rongliang >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Thu Aug 18 05:49:11 2016 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Thu, 18 Aug 2016 18:49:11 +0800 Subject: [petsc-users] dmplex with block size error In-Reply-To: References: <57B578E2.9020201@gmail.com> <57B58855.7060702@gmail.com> Message-ID: <57B592A7.9000207@gmail.com> Hi Matt, Thanks for your help. I will fix this first. Best, Rongliang On 08/18/2016 06:08 PM, Matthew Knepley wrote: > On Thu, Aug 18, 2016 at 5:05 AM, Rongliang Chen > > wrote: > > Hi Matt, > > The log of the valgrind is attached. When I run with valgrind, the > following error message comes up. > > > The valgrind log says your code is writing over memory. Fix that first. > > Matt > > ------------------------- > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Arguments are incompatible > [1]PETSC ERROR: Cannot change block size 3670016 to 7 > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble > shooting. > [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 > [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu > Aug 18 17:59:43 2016 > [1]PETSC ERROR: Configure options --download-blacs > --download-scalapack --download-metis --download-parmetis > --download-exodusii --download-netcdf --download-hdf5 > --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared > --with-debugging=1 --download-fblaslapack --with-64-bit-indices > [1]PETSC ERROR: #1 PetscLayoutSetBlockSize() line 424 in > /home/rlchen/soft/petsc-3.6.3/src/vec/is/utils/pmap.c > [1]PETSC ERROR: #2 MatSetBlockSize() line 6920 in > /home/rlchen/soft/petsc-3.6.3/src/mat/interface/matrix.c > [1]PETSC ERROR: #3 MatXAIJSetPreallocation() line 282 in > /home/rlchen/soft/3D_fluid/FSI/Spmcs-v1.5/Fluid-petsc-3.6/src/application/Fluid/gcreate.c > ---------------------- > > Best regards, > Rongliang > > On 08/18/2016 05:52 PM, Matthew Knepley wrote: >> Run with valgrind and send the log. 
>> >> Thanks, >> >> Matt >> >> On Thu, Aug 18, 2016 at 3:59 AM, Rongliang Chen >> > wrote: >> >> Dear All, >> >> I try to use the block matrix (BAIJ) for the dmplex data >> structure with the option "-dm_mat_type baij" (the block size >> is 7). The code works fine when np = 1 but the following >> error comes up when np>1. And the code also works fine for >> np>1 if I set the block size to be 1. Any suggestions are >> highly appreciated. >> >> ---------------------------------------------------- >> [1]PETSC ERROR: PetscMallocValidate: error detected at >> VecAXPY_Seq() line 89 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >> [1]PETSC ERROR: Memory at address 0x1332571 is corrupted >> [1]PETSC ERROR: Probably write past beginning or end of array >> [1]PETSC ERROR: Last intact block allocated in >> PetscStrallocpy() line 188 in >> /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Memory corruption: >> http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind >> >> [1]PETSC ERROR: >> [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html >> for >> trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >> [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen >> Thu Aug 18 16:42:34 2016 >> [1]PETSC ERROR: Configure options --download-blacs >> --download-scalapack --download-metis --download-parmetis >> --download-exodusii --download-netcdf --download-hdf5 >> --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared >> --with-debugging=1 --download-fblaslapack --with-64-bit-indices >> [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in >> /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c >> [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >> [1]PETSC ERROR: #3 VecAXPY() line 640 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/interface/rvector.c >> --------------------------------------------------- >> >> Best regards, >> Rongliang >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Thu Aug 18 08:35:12 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Thu, 18 Aug 2016 22:35:12 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU Message-ID: Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of computing h value. I found above command (MatSNESMFWPSetComputeNormU) but my fortran compiler couldn't fine any reference of that command. I checked Petsc changes log, but there weren't any mentions about that command. Should I have to include another specific header file? Thank you always. -------------- next part -------------- An HTML attachment was scrubbed... 
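As a sketch of the setup being asked about: one way to request the Walker & Pernice h computation from Fortran without calling the C routine directly is to push the corresponding options into the database before the solver objects read them. Variable names below are placeholders, the four-argument PetscOptionsSetValue() calling sequence of PETSc 3.7 is assumed, and the reply that follows confirms the -mat_mffd_compute_normu option itself.

  ! sketch only: select the WP h-rule and its ||U|| computation via the options database
  call PetscOptionsSetValue(PETSC_NULL_OBJECT,'-mat_mffd_type','wp',ierr)
  call PetscOptionsSetValue(PETSC_NULL_OBJECT,'-mat_mffd_compute_normu','true',ierr)

  call MatCreateSNESMF(snes,J,ierr)
  call MatSetFromOptions(J,ierr)      ! let the mffd matrix read the -mat_mffd_* options
  call SNESSetFromOptions(snes,ierr)

The same effect should be obtained by leaving the source untouched and passing -mat_mffd_type wp -mat_mffd_compute_normu on the command line.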
URL: From knepley at gmail.com Thu Aug 18 09:18:12 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 09:18:12 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of > computing h value. > > I found above command (MatSNESMFWPSetComputeNormU) but my fortran compiler > couldn't fine any reference of that command. > > I checked Petsc changes log, but there weren't any mentions about that > command. > > Should I have to include another specific header file? > We have this function http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMFFDWPSetComputeNormU.html but I would recommend using the command line option *-mat_mffd_compute_normu* Thanks, Matt Thank you always. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Thu Aug 18 10:39:42 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 00:39:42 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: 1) I wanna know the difference between applying option with command line and within source code. >From my experience, command line option helps set other default settings that I didn't applied, I guess. 2) I made a matrix-free matrix with MatCreateSNESMF function, and every time I check my snes context with SNESView, Mat Object: 1 MPI processes type: mffd rows=11616, cols=11616 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) The compute h routine has not yet been set at the end of line shows there's no routine for computing h value. I used MatMFFDWPSetComputeNormU function, but it didn't work I think. Is it ok if I leave the h value that way? Or should I have to set h computing routine? Kyungjun. 2016-08-18 23:18 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: > >> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of >> computing h value. >> >> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >> compiler couldn't fine any reference of that command. >> >> I checked Petsc changes log, but there weren't any mentions about that >> command. >> >> Should I have to include another specific header file? >> > > We have this function > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatMFFDWPSetComputeNormU.html > > but I would recommend using the command line option > > *-mat_mffd_compute_normu* > > Thanks, > > Matt > > Thank you always. >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 18 10:54:53 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 10:54:53 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: On Thu, Aug 18, 2016 at 10:39 AM, ??? wrote: > 1) I wanna know the difference between applying option with command line > and within source code. 
> From my experience, command line option helps set other default settings > that I didn't applied, I guess. > The command line arguments are applied to an object when *SetFromOptions() is called, so in this case you want SNESSetFromOptions() on the solver. There should be no difference from using the API. > 2) I made a matrix-free matrix with MatCreateSNESMF function, and every > time I check my snes context with SNESView, > > Mat Object: 1 MPI processes > type: mffd > rows=11616, cols=11616 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > The compute h routine has not yet been set > > at the end of line shows there's no routine for computing h value. > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. > Is it ok if I leave the h value that way? Or should I have to set h > computing routine? > I am guessing you are calling the function on a different object from the one that is viewed here. However, there will always be a default function for computing h. Thanks, Matt > Kyungjun. > > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : > >> On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: >> >>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of >>> computing h value. >>> >>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>> compiler couldn't fine any reference of that command. >>> >>> I checked Petsc changes log, but there weren't any mentions about that >>> command. >>> >>> Should I have to include another specific header file? >>> >> >> We have this function >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >> s/Mat/MatMFFDWPSetComputeNormU.html >> >> but I would recommend using the command line option >> >> *-mat_mffd_compute_normu* >> >> Thanks, >> >> Matt >> >> Thank you always. >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Thu Aug 18 11:42:02 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 01:42:02 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: Thanks for your helpful answers. Here's another question... As I read some example PETSc codes, I noticed that there should be a preconditioning matrix (e.g. approx. jacobian matrix) when using MatCreateSNESMF(). I mean, after calling MatCreateSNESMF(snes, A, ier), there should be another matrix preA(preconditioning matrix) to use SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). 1) Is there any way that I can use matrix-free method without making preconditioning matrix? 2) I have a reference code, and the code adopts MatFDColoringCreate() and finally uses SNESComputeJacobianDefaultColor() at FormJacobian stage. But I can't see the inside of the fdcolor and I'm curious of this mechanism. Can you explain this very briefly or tell me an example code that I can refer to. ( I think none of PETSc example code is using fdcolor..) Best, Kyungjun. 2016-08-19 0:54 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 10:39 AM, ??? 
wrote: > >> 1) I wanna know the difference between applying option with command line >> and within source code. >> From my experience, command line option helps set other default settings >> that I didn't applied, I guess. >> > > The command line arguments are applied to an object when *SetFromOptions() > is called, so in this case > you want SNESSetFromOptions() on the solver. There should be no difference > from using the API. > > >> 2) I made a matrix-free matrix with MatCreateSNESMF function, and every >> time I check my snes context with SNESView, >> >> Mat Object: 1 MPI processes >> type: mffd >> rows=11616, cols=11616 >> Matrix-free approximation: >> err=1.49012e-08 (relative error in function evaluation) >> The compute h routine has not yet been set >> >> at the end of line shows there's no routine for computing h value. >> I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >> Is it ok if I leave the h value that way? Or should I have to set h >> computing routine? >> > > I am guessing you are calling the function on a different object from the > one that is viewed here. > However, there will always be a default function for computing h. > > Thanks, > > Matt > > >> Kyungjun. >> >> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >> >>> On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: >>> >>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of >>>> computing h value. >>>> >>>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>> compiler couldn't fine any reference of that command. >>>> >>>> I checked Petsc changes log, but there weren't any mentions about that >>>> command. >>>> >>>> Should I have to include another specific header file? >>>> >>> >>> We have this function >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>> s/Mat/MatMFFDWPSetComputeNormU.html >>> >>> but I would recommend using the command line option >>> >>> *-mat_mffd_compute_normu* >>> >>> Thanks, >>> >>> Matt >>> >>> Thank you always. >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 18 11:44:29 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 11:44:29 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: > Thanks for your helpful answers. > > Here's another question... > > As I read some example PETSc codes, I noticed that there should be a > preconditioning matrix (e.g. approx. jacobian matrix) when using > MatCreateSNESMF(). > > I mean, > after calling MatCreateSNESMF(snes, A, ier), > there should be another matrix preA(preconditioning matrix) to use > SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). > > > 1) Is there any way that I can use matrix-free method without making > preconditioning matrix? > Don't use a preconditioner. As you might expect, this does not often work out well. > 2) I have a reference code, and the code adopts > > MatFDColoringCreate() > and finally uses > SNESComputeJacobianDefaultColor() at FormJacobian stage. 
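Since the message above asks for a reference: below is a rough Fortran sketch of the explicit coloring-based finite-difference Jacobian setup that such reference codes typically use. All names are placeholders, the calls are modeled on the PETSc 3.7 C calling sequences (the corresponding Fortran stubs are assumed to be present in the build), and J must already carry the correct nonzero pattern. As the reply that follows points out, SNES can do all of this by default, so the explicit calls are only needed when managing the coloring by hand.

  MatColoring   coloring
  ISColoring    iscoloring
  MatFDColoring fdcoloring

  ! color the columns of the preallocated Jacobian pattern J
  call MatColoringCreate(J,coloring,ierr)
  call MatColoringSetFromOptions(coloring,ierr)
  call MatColoringApply(coloring,iscoloring,ierr)
  call MatColoringDestroy(coloring,ierr)

  ! FD context that reuses the residual routine to fill J
  call MatFDColoringCreate(J,iscoloring,fdcoloring,ierr)
  call MatFDColoringSetFunction(fdcoloring,FormResidual,userctx,ierr)
  call MatFDColoringSetFromOptions(fdcoloring,ierr)
  call MatFDColoringSetUp(J,iscoloring,fdcoloring,ierr)
  call ISColoringDestroy(iscoloring,ierr)

  ! SNESComputeJacobianDefaultColor then fills J one color group at a time
  call SNESSetJacobian(snes,J,J,SNESComputeJacobianDefaultColor,fdcoloring,ierr)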
> > But I can't see the inside of the fdcolor and I'm curious of this > mechanism. Can you explain this very briefly or tell me an example code > that I can refer to. ( I think none of PETSc example code is using > fdcolor..) > This is the default, so there is no need for all that code. We use naive graph 2-coloring. I think there might be a review article by Alex Pothen about that. Thanks, Matt > > Best, > > Kyungjun. > > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : > >> On Thu, Aug 18, 2016 at 10:39 AM, ??? wrote: >> >>> 1) I wanna know the difference between applying option with command line >>> and within source code. >>> From my experience, command line option helps set other default settings >>> that I didn't applied, I guess. >>> >> >> The command line arguments are applied to an object when >> *SetFromOptions() is called, so in this case >> you want SNESSetFromOptions() on the solver. There should be no >> difference from using the API. >> >> >>> 2) I made a matrix-free matrix with MatCreateSNESMF function, and every >>> time I check my snes context with SNESView, >>> >>> Mat Object: 1 MPI processes >>> type: mffd >>> rows=11616, cols=11616 >>> Matrix-free approximation: >>> err=1.49012e-08 (relative error in function evaluation) >>> The compute h routine has not yet been set >>> >>> at the end of line shows there's no routine for computing h value. >>> I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >>> Is it ok if I leave the h value that way? Or should I have to set h >>> computing routine? >>> >> >> I am guessing you are calling the function on a different object from the >> one that is viewed here. >> However, there will always be a default function for computing h. >> >> Thanks, >> >> Matt >> >> >>> Kyungjun. >>> >>> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>> >>>> On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: >>>> >>>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of >>>>> computing h value. >>>>> >>>>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>>> compiler couldn't fine any reference of that command. >>>>> >>>>> I checked Petsc changes log, but there weren't any mentions about that >>>>> command. >>>>> >>>>> Should I have to include another specific header file? >>>>> >>>> >>>> We have this function >>>> >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>>> s/Mat/MatMFFDWPSetComputeNormU.html >>>> >>>> but I would recommend using the command line option >>>> >>>> *-mat_mffd_compute_normu* >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Thank you always. >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kyungjun.choi92 at gmail.com Thu Aug 18 12:04:24 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 02:04:24 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: Then in order not to use preconditioner, is it ok if I just put A matrix-free matrix (made from MatCreateSNESMF()) into the place where preA should be? The flow goes like this - call SNESCreate - call SNESSetFunction(snes, r, FormResidual, userctx, ier) - call MatCreateSNESMF(snes, A, ier) - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) - call SNESSetFromOptions() - call SNESGetKSP(snes, ksp, ier) - call KSPSetType(ksp, KSPGMRES, ier) - call KSPGetPC(ksp, pc, ier) - call PCSetType(pc, PCNONE, ier) - call KSPGMRESSetRestart(ksp, 30, ier) - call SNESSolve() . . and inside the FormJacobian routine - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ must be pointed with A and A. Thank you again, Kyungjun. 2016-08-19 1:44 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: > >> Thanks for your helpful answers. >> >> Here's another question... >> >> As I read some example PETSc codes, I noticed that there should be a >> preconditioning matrix (e.g. approx. jacobian matrix) when using >> MatCreateSNESMF(). >> >> I mean, >> after calling MatCreateSNESMF(snes, A, ier), >> there should be another matrix preA(preconditioning matrix) to use >> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >> >> >> 1) Is there any way that I can use matrix-free method without making >> preconditioning matrix? >> > > Don't use a preconditioner. As you might expect, this does not often work > out well. > > >> 2) I have a reference code, and the code adopts >> >> MatFDColoringCreate() >> and finally uses >> SNESComputeJacobianDefaultColor() at FormJacobian stage. >> >> But I can't see the inside of the fdcolor and I'm curious of this >> mechanism. Can you explain this very briefly or tell me an example code >> that I can refer to. ( I think none of PETSc example code is using >> fdcolor..) >> > > This is the default, so there is no need for all that code. We use naive > graph 2-coloring. I think there might be a review article by Alex Pothen > about that. > > Thanks, > > Matt > > >> >> Best, >> >> Kyungjun. >> >> 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >> >>> On Thu, Aug 18, 2016 at 10:39 AM, ??? wrote: >>> >>>> 1) I wanna know the difference between applying option with command >>>> line and within source code. >>>> From my experience, command line option helps set other default >>>> settings that I didn't applied, I guess. >>>> >>> >>> The command line arguments are applied to an object when >>> *SetFromOptions() is called, so in this case >>> you want SNESSetFromOptions() on the solver. There should be no >>> difference from using the API. >>> >>> >>>> 2) I made a matrix-free matrix with MatCreateSNESMF function, and every >>>> time I check my snes context with SNESView, >>>> >>>> Mat Object: 1 MPI processes >>>> type: mffd >>>> rows=11616, cols=11616 >>>> Matrix-free approximation: >>>> err=1.49012e-08 (relative error in function evaluation) >>>> The compute h routine has not yet been set >>>> >>>> at the end of line shows there's no routine for computing h value. >>>> I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >>>> Is it ok if I leave the h value that way? Or should I have to set h >>>> computing routine? 
>>>> >>> >>> I am guessing you are calling the function on a different object from >>> the one that is viewed here. >>> However, there will always be a default function for computing h. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Kyungjun. >>>> >>>> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>>> >>>>> On Thu, Aug 18, 2016 at 8:35 AM, ??? >>>>> wrote: >>>>> >>>>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way >>>>>> of computing h value. >>>>>> >>>>>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>>>> compiler couldn't fine any reference of that command. >>>>>> >>>>>> I checked Petsc changes log, but there weren't any mentions about >>>>>> that command. >>>>>> >>>>>> Should I have to include another specific header file? >>>>>> >>>>> >>>>> We have this function >>>>> >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>>>> s/Mat/MatMFFDWPSetComputeNormU.html >>>>> >>>>> but I would recommend using the command line option >>>>> >>>>> *-mat_mffd_compute_normu* >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> Thank you always. >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 18 12:05:29 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 12:05:29 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: > Then in order not to use preconditioner, > > is it ok if I just put A matrix-free matrix (made from MatCreateSNESMF()) > into the place where preA should be? > Yes, but again the solve will likely perform very poorly. Thanks, Matt > The flow goes like this > - call SNESCreate > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) > - call MatCreateSNESMF(snes, A, ier) > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) > - call SNESSetFromOptions() > > - call SNESGetKSP(snes, ksp, ier) > - call KSPSetType(ksp, KSPGMRES, ier) > - call KSPGetPC(ksp, pc, ier) > - call PCSetType(pc, PCNONE, ier) > - call KSPGMRESSetRestart(ksp, 30, ier) > > - call SNESSolve() > . > . > > > and inside the FormJacobian routine > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ > must be pointed with A and A. > > > > Thank you again, > > Kyungjun. > > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : > >> On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: >> >>> Thanks for your helpful answers. >>> >>> Here's another question... >>> >>> As I read some example PETSc codes, I noticed that there should be a >>> preconditioning matrix (e.g. approx. jacobian matrix) when using >>> MatCreateSNESMF(). >>> >>> I mean, >>> after calling MatCreateSNESMF(snes, A, ier), >>> there should be another matrix preA(preconditioning matrix) to use >>> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). 
>>> >>> >>> 1) Is there any way that I can use matrix-free method without making >>> preconditioning matrix? >>> >> >> Don't use a preconditioner. As you might expect, this does not often work >> out well. >> >> >>> 2) I have a reference code, and the code adopts >>> >>> MatFDColoringCreate() >>> and finally uses >>> SNESComputeJacobianDefaultColor() at FormJacobian stage. >>> >>> But I can't see the inside of the fdcolor and I'm curious of this >>> mechanism. Can you explain this very briefly or tell me an example code >>> that I can refer to. ( I think none of PETSc example code is using >>> fdcolor..) >>> >> >> This is the default, so there is no need for all that code. We use naive >> graph 2-coloring. I think there might be a review article by Alex Pothen >> about that. >> >> Thanks, >> >> Matt >> >> >>> >>> Best, >>> >>> Kyungjun. >>> >>> 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >>> >>>> On Thu, Aug 18, 2016 at 10:39 AM, ??? >>>> wrote: >>>> >>>>> 1) I wanna know the difference between applying option with command >>>>> line and within source code. >>>>> From my experience, command line option helps set other default >>>>> settings that I didn't applied, I guess. >>>>> >>>> >>>> The command line arguments are applied to an object when >>>> *SetFromOptions() is called, so in this case >>>> you want SNESSetFromOptions() on the solver. There should be no >>>> difference from using the API. >>>> >>>> >>>>> 2) I made a matrix-free matrix with MatCreateSNESMF function, and >>>>> every time I check my snes context with SNESView, >>>>> >>>>> Mat Object: 1 MPI processes >>>>> type: mffd >>>>> rows=11616, cols=11616 >>>>> Matrix-free approximation: >>>>> err=1.49012e-08 (relative error in function evaluation) >>>>> The compute h routine has not yet been set >>>>> >>>>> at the end of line shows there's no routine for computing h value. >>>>> I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >>>>> Is it ok if I leave the h value that way? Or should I have to set h >>>>> computing routine? >>>>> >>>> >>>> I am guessing you are calling the function on a different object from >>>> the one that is viewed here. >>>> However, there will always be a default function for computing h. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Kyungjun. >>>>> >>>>> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>>>> >>>>>> On Thu, Aug 18, 2016 at 8:35 AM, ??? >>>>>> wrote: >>>>>> >>>>>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way >>>>>>> of computing h value. >>>>>>> >>>>>>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>>>>> compiler couldn't fine any reference of that command. >>>>>>> >>>>>>> I checked Petsc changes log, but there weren't any mentions about >>>>>>> that command. >>>>>>> >>>>>>> Should I have to include another specific header file? >>>>>>> >>>>>> >>>>>> We have this function >>>>>> >>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>>>>> s/Mat/MatMFFDWPSetComputeNormU.html >>>>>> >>>>>> but I would recommend using the command line option >>>>>> >>>>>> *-mat_mffd_compute_normu* >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> Thank you always. >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. 
>>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Thu Aug 18 14:03:28 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 04:03:28 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: I got stuck at FormJacobian stage. - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) --> J & pJ are same with A matrix-free matrix (input argument) with these kind of messages.. [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Mat type mffd Guess it's because I used A matrix-free matrix (which is mffd type) into pJ position. Is there any solution for this kind of situation? 2016-08-19 2:05 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: > >> Then in order not to use preconditioner, >> >> is it ok if I just put A matrix-free matrix (made from MatCreateSNESMF()) >> into the place where preA should be? >> > > Yes, but again the solve will likely perform very poorly. > > Thanks, > > Matt > > >> The flow goes like this >> - call SNESCreate >> - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >> - call MatCreateSNESMF(snes, A, ier) >> - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >> - call SNESSetFromOptions() >> >> - call SNESGetKSP(snes, ksp, ier) >> - call KSPSetType(ksp, KSPGMRES, ier) >> - call KSPGetPC(ksp, pc, ier) >> - call PCSetType(pc, PCNONE, ier) >> - call KSPGMRESSetRestart(ksp, 30, ier) >> >> - call SNESSolve() >> . >> . >> >> >> and inside the FormJacobian routine >> - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ >> must be pointed with A and A. >> >> >> >> Thank you again, >> >> Kyungjun. >> >> 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >> >>> On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: >>> >>>> Thanks for your helpful answers. >>>> >>>> Here's another question... >>>> >>>> As I read some example PETSc codes, I noticed that there should be a >>>> preconditioning matrix (e.g. approx. jacobian matrix) when using >>>> MatCreateSNESMF(). >>>> >>>> I mean, >>>> after calling MatCreateSNESMF(snes, A, ier), >>>> there should be another matrix preA(preconditioning matrix) to use >>>> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >>>> >>>> >>>> 1) Is there any way that I can use matrix-free method without making >>>> preconditioning matrix? >>>> >>> >>> Don't use a preconditioner. As you might expect, this does not often >>> work out well. >>> >>> >>>> 2) I have a reference code, and the code adopts >>>> >>>> MatFDColoringCreate() >>>> and finally uses >>>> SNESComputeJacobianDefaultColor() at FormJacobian stage. >>>> >>>> But I can't see the inside of the fdcolor and I'm curious of this >>>> mechanism. 
Can you explain this very briefly or tell me an example code >>>> that I can refer to. ( I think none of PETSc example code is using >>>> fdcolor..) >>>> >>> >>> This is the default, so there is no need for all that code. We use naive >>> graph 2-coloring. I think there might be a review article by Alex Pothen >>> about that. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> >>>> Best, >>>> >>>> Kyungjun. >>>> >>>> 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >>>> >>>>> On Thu, Aug 18, 2016 at 10:39 AM, ??? >>>>> wrote: >>>>> >>>>>> 1) I wanna know the difference between applying option with command >>>>>> line and within source code. >>>>>> From my experience, command line option helps set other default >>>>>> settings that I didn't applied, I guess. >>>>>> >>>>> >>>>> The command line arguments are applied to an object when >>>>> *SetFromOptions() is called, so in this case >>>>> you want SNESSetFromOptions() on the solver. There should be no >>>>> difference from using the API. >>>>> >>>>> >>>>>> 2) I made a matrix-free matrix with MatCreateSNESMF function, and >>>>>> every time I check my snes context with SNESView, >>>>>> >>>>>> Mat Object: 1 MPI processes >>>>>> type: mffd >>>>>> rows=11616, cols=11616 >>>>>> Matrix-free approximation: >>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>> The compute h routine has not yet been set >>>>>> >>>>>> at the end of line shows there's no routine for computing h value. >>>>>> I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >>>>>> Is it ok if I leave the h value that way? Or should I have to set h >>>>>> computing routine? >>>>>> >>>>> >>>>> I am guessing you are calling the function on a different object from >>>>> the one that is viewed here. >>>>> However, there will always be a default function for computing h. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Kyungjun. >>>>>> >>>>>> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>>>>> >>>>>>> On Thu, Aug 18, 2016 at 8:35 AM, ??? >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way >>>>>>>> of computing h value. >>>>>>>> >>>>>>>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>>>>>> compiler couldn't fine any reference of that command. >>>>>>>> >>>>>>>> I checked Petsc changes log, but there weren't any mentions about >>>>>>>> that command. >>>>>>>> >>>>>>>> Should I have to include another specific header file? >>>>>>>> >>>>>>> >>>>>>> We have this function >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>>>>>> s/Mat/MatMFFDWPSetComputeNormU.html >>>>>>> >>>>>>> but I would recommend using the command line option >>>>>>> >>>>>>> *-mat_mffd_compute_normu* >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> Thank you always. >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 18 14:05:19 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 14:05:19 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: On Thu, Aug 18, 2016 at 2:03 PM, ??? wrote: > I got stuck at FormJacobian stage. > > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) --> > J & pJ are same with A matrix-free matrix (input argument) > > > > with these kind of messages.. > > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Mat type mffd > 1) Always give the ENTIRE error message 2) As I said in the last two messages, you can only use this without a preconditioner, so you need -pc_type none Matt > > Guess it's because I used A matrix-free matrix (which is mffd type) into > pJ position. > > Is there any solution for this kind of situation? > > > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : > >> On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: >> >>> Then in order not to use preconditioner, >>> >>> is it ok if I just put A matrix-free matrix (made from >>> MatCreateSNESMF()) into the place where preA should be? >>> >> >> Yes, but again the solve will likely perform very poorly. >> >> Thanks, >> >> Matt >> >> >>> The flow goes like this >>> - call SNESCreate >>> - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >>> - call MatCreateSNESMF(snes, A, ier) >>> - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >>> - call SNESSetFromOptions() >>> >>> - call SNESGetKSP(snes, ksp, ier) >>> - call KSPSetType(ksp, KSPGMRES, ier) >>> - call KSPGetPC(ksp, pc, ier) >>> - call PCSetType(pc, PCNONE, ier) >>> - call KSPGMRESSetRestart(ksp, 30, ier) >>> >>> - call SNESSolve() >>> . >>> . >>> >>> >>> and inside the FormJacobian routine >>> - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ >>> must be pointed with A and A. >>> >>> >>> >>> Thank you again, >>> >>> Kyungjun. >>> >>> 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >>> >>>> On Thu, Aug 18, 2016 at 11:42 AM, ??? >>>> wrote: >>>> >>>>> Thanks for your helpful answers. >>>>> >>>>> Here's another question... >>>>> >>>>> As I read some example PETSc codes, I noticed that there should be a >>>>> preconditioning matrix (e.g. approx. jacobian matrix) when using >>>>> MatCreateSNESMF(). >>>>> >>>>> I mean, >>>>> after calling MatCreateSNESMF(snes, A, ier), >>>>> there should be another matrix preA(preconditioning matrix) to use >>>>> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >>>>> >>>>> >>>>> 1) Is there any way that I can use matrix-free method without making >>>>> preconditioning matrix? >>>>> >>>> >>>> Don't use a preconditioner. As you might expect, this does not often >>>> work out well. >>>> >>>> >>>>> 2) I have a reference code, and the code adopts >>>>> >>>>> MatFDColoringCreate() >>>>> and finally uses >>>>> SNESComputeJacobianDefaultColor() at FormJacobian stage. >>>>> >>>>> But I can't see the inside of the fdcolor and I'm curious of this >>>>> mechanism. Can you explain this very briefly or tell me an example code >>>>> that I can refer to. ( I think none of PETSc example code is using >>>>> fdcolor..) 
>>>>> >>>> >>>> This is the default, so there is no need for all that code. We use >>>> naive graph 2-coloring. I think there might be a review article by Alex >>>> Pothen about that. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> >>>>> Best, >>>>> >>>>> Kyungjun. >>>>> >>>>> 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >>>>> >>>>>> On Thu, Aug 18, 2016 at 10:39 AM, ??? >>>>>> wrote: >>>>>> >>>>>>> 1) I wanna know the difference between applying option with command >>>>>>> line and within source code. >>>>>>> From my experience, command line option helps set other default >>>>>>> settings that I didn't applied, I guess. >>>>>>> >>>>>> >>>>>> The command line arguments are applied to an object when >>>>>> *SetFromOptions() is called, so in this case >>>>>> you want SNESSetFromOptions() on the solver. There should be no >>>>>> difference from using the API. >>>>>> >>>>>> >>>>>>> 2) I made a matrix-free matrix with MatCreateSNESMF function, and >>>>>>> every time I check my snes context with SNESView, >>>>>>> >>>>>>> Mat Object: 1 MPI processes >>>>>>> type: mffd >>>>>>> rows=11616, cols=11616 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> The compute h routine has not yet been set >>>>>>> >>>>>>> at the end of line shows there's no routine for computing h value. >>>>>>> I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >>>>>>> Is it ok if I leave the h value that way? Or should I have to set h >>>>>>> computing routine? >>>>>>> >>>>>> >>>>>> I am guessing you are calling the function on a different object from >>>>>> the one that is viewed here. >>>>>> However, there will always be a default function for computing h. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Kyungjun. >>>>>>> >>>>>>> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>>>>>> >>>>>>>> On Thu, Aug 18, 2016 at 8:35 AM, ??? >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice >>>>>>>>> way of computing h value. >>>>>>>>> >>>>>>>>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>>>>>>> compiler couldn't fine any reference of that command. >>>>>>>>> >>>>>>>>> I checked Petsc changes log, but there weren't any mentions about >>>>>>>>> that command. >>>>>>>>> >>>>>>>>> Should I have to include another specific header file? >>>>>>>>> >>>>>>>> >>>>>>>> We have this function >>>>>>>> >>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>>>>>>> s/Mat/MatMFFDWPSetComputeNormU.html >>>>>>>> >>>>>>>> but I would recommend using the command line option >>>>>>>> >>>>>>>> *-mat_mffd_compute_normu* >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> Thank you always. >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Thu Aug 18 14:14:18 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 04:14:18 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: I called SNESView() at right above call SNESSolve() to check options that I applied. SNES Object: 1 MPI processes type: newtonls SNES has not been set up so information may be incomplete maximum iterations=1, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-32, solution=1e-08 total number of linear solver iterations=0 total number of function evaluations=0 norm schedule ALWAYS SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=0.001, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using DEFAULT norm type for convergence test PC Object: 1 MPI processes type: none PC has not been set up so information may be incomplete linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mffd rows=64, cols=64 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) The compute h routine has not yet been set ------------------------------------------------------------------------------------- And this is the whole error message [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Mat type mffd [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./flus ? ??? 
on a arch-linux2-c-debug named ckj-System by ckj Fri Aug 19 04:09:56 2016 [0]PETSC ERROR: Configure options --with-cc=icc --with-cxx=icpc --with-fc=ifort --with-mpi-include=/opt/intel/impi/5.0.1.035/intel64/include --with-mpi-lib="-L/opt/intel//impi/5.0.1.035/intel64/release -L/opt/intel//impi/5.0.1.035/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread" --with-blas-lapack-dir=/opt/intel/mkl --prefix=/opt/petsc/3.7.3 [0]PETSC ERROR: #1 MatZeroEntries() line 5511 in /home/ckj/Repository/petsc-3.7.3/src/mat/interface/matrix.c [0]PETSC ERROR: #2 SNESComputeJacobianDefault() line 65 in /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snesj.c [0]PETSC ERROR: #3 oursnesjacobian() line 105 in /home/ckj/Repository/petsc-3.7.3/src/snes/interface/ftn-custom/zsnesf.c [0]PETSC ERROR: #4 SNESComputeJacobian() line 2312 in /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c [0]PETSC ERROR: #5 SNESSolve_NEWTONLS() line 228 in /home/ckj/Repository/petsc-3.7.3/src/snes/impls/ls/ls.c [0]PETSC ERROR: #6 SNESSolve() line 4005 in /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c --------------------------------------------------------------------------------------------- I checked that the program got out of calling SNESSolve() after this error message. But I can't figure out the reason for this error. Best, Kyungjun 2016-08-19 4:05 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 2:03 PM, ??? wrote: > >> I got stuck at FormJacobian stage. >> >> - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) >> --> J & pJ are same with A matrix-free matrix (input argument) >> >> >> >> with these kind of messages.. >> >> [0]PETSC ERROR: No support for this operation for this object type >> [0]PETSC ERROR: Mat type mffd >> > > 1) Always give the ENTIRE error message > > 2) As I said in the last two messages, you can only use this without a > preconditioner, so you need > > -pc_type none > > Matt > > >> >> Guess it's because I used A matrix-free matrix (which is mffd type) into >> pJ position. >> >> Is there any solution for this kind of situation? >> >> >> 2016-08-19 2:05 GMT+09:00 Matthew Knepley : >> >>> On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: >>> >>>> Then in order not to use preconditioner, >>>> >>>> is it ok if I just put A matrix-free matrix (made from >>>> MatCreateSNESMF()) into the place where preA should be? >>>> >>> >>> Yes, but again the solve will likely perform very poorly. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> The flow goes like this >>>> - call SNESCreate >>>> - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >>>> - call MatCreateSNESMF(snes, A, ier) >>>> - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >>>> - call SNESSetFromOptions() >>>> >>>> - call SNESGetKSP(snes, ksp, ier) >>>> - call KSPSetType(ksp, KSPGMRES, ier) >>>> - call KSPGetPC(ksp, pc, ier) >>>> - call PCSetType(pc, PCNONE, ier) >>>> - call KSPGMRESSetRestart(ksp, 30, ier) >>>> >>>> - call SNESSolve() >>>> . >>>> . >>>> >>>> >>>> and inside the FormJacobian routine >>>> - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ >>>> must be pointed with A and A. >>>> >>>> >>>> >>>> Thank you again, >>>> >>>> Kyungjun. >>>> >>>> 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >>>> >>>>> On Thu, Aug 18, 2016 at 11:42 AM, ??? >>>>> wrote: >>>>> >>>>>> Thanks for your helpful answers. >>>>>> >>>>>> Here's another question... >>>>>> >>>>>> As I read some example PETSc codes, I noticed that there should be a >>>>>> preconditioning matrix (e.g. approx. 
jacobian matrix) when using >>>>>> MatCreateSNESMF(). >>>>>> >>>>>> I mean, >>>>>> after calling MatCreateSNESMF(snes, A, ier), >>>>>> there should be another matrix preA(preconditioning matrix) to use >>>>>> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >>>>>> >>>>>> >>>>>> 1) Is there any way that I can use matrix-free method without making >>>>>> preconditioning matrix? >>>>>> >>>>> >>>>> Don't use a preconditioner. As you might expect, this does not often >>>>> work out well. >>>>> >>>>> >>>>>> 2) I have a reference code, and the code adopts >>>>>> >>>>>> MatFDColoringCreate() >>>>>> and finally uses >>>>>> SNESComputeJacobianDefaultColor() at FormJacobian stage. >>>>>> >>>>>> But I can't see the inside of the fdcolor and I'm curious of this >>>>>> mechanism. Can you explain this very briefly or tell me an example code >>>>>> that I can refer to. ( I think none of PETSc example code is using >>>>>> fdcolor..) >>>>>> >>>>> >>>>> This is the default, so there is no need for all that code. We use >>>>> naive graph 2-coloring. I think there might be a review article by Alex >>>>> Pothen about that. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> >>>>>> Best, >>>>>> >>>>>> Kyungjun. >>>>>> >>>>>> 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >>>>>> >>>>>>> On Thu, Aug 18, 2016 at 10:39 AM, ??? >>>>>>> wrote: >>>>>>> >>>>>>>> 1) I wanna know the difference between applying option with command >>>>>>>> line and within source code. >>>>>>>> From my experience, command line option helps set other default >>>>>>>> settings that I didn't applied, I guess. >>>>>>>> >>>>>>> >>>>>>> The command line arguments are applied to an object when >>>>>>> *SetFromOptions() is called, so in this case >>>>>>> you want SNESSetFromOptions() on the solver. There should be no >>>>>>> difference from using the API. >>>>>>> >>>>>>> >>>>>>>> 2) I made a matrix-free matrix with MatCreateSNESMF function, and >>>>>>>> every time I check my snes context with SNESView, >>>>>>>> >>>>>>>> Mat Object: 1 MPI processes >>>>>>>> type: mffd >>>>>>>> rows=11616, cols=11616 >>>>>>>> Matrix-free approximation: >>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>> The compute h routine has not yet been set >>>>>>>> >>>>>>>> at the end of line shows there's no routine for computing h value. >>>>>>>> I used MatMFFDWPSetComputeNormU function, but it didn't work I >>>>>>>> think. >>>>>>>> Is it ok if I leave the h value that way? Or should I have to set h >>>>>>>> computing routine? >>>>>>>> >>>>>>> >>>>>>> I am guessing you are calling the function on a different object >>>>>>> from the one that is viewed here. >>>>>>> However, there will always be a default function for computing h. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> >>>>>>>> Kyungjun. >>>>>>>> >>>>>>>> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>>>>>>> >>>>>>>>> On Thu, Aug 18, 2016 at 8:35 AM, ??? >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice >>>>>>>>>> way of computing h value. >>>>>>>>>> >>>>>>>>>> I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>>>>>>>> compiler couldn't fine any reference of that command. >>>>>>>>>> >>>>>>>>>> I checked Petsc changes log, but there weren't any mentions about >>>>>>>>>> that command. >>>>>>>>>> >>>>>>>>>> Should I have to include another specific header file? 
>>>>>>>>>> >>>>>>>>> >>>>>>>>> We have this function >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>>>>>>>> s/Mat/MatMFFDWPSetComputeNormU.html >>>>>>>>> >>>>>>>>> but I would recommend using the command line option >>>>>>>>> >>>>>>>>> *-mat_mffd_compute_normu* >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Matt >>>>>>>>> >>>>>>>>> Thank you always. >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>> experiments lead. >>>>>>>>> -- Norbert Wiener >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their >>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>> experiments lead. >>>>>>> -- Norbert Wiener >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 18 14:23:52 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 14:23:52 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: On Thu, Aug 18, 2016 at 2:14 PM, ??? wrote: > I called SNESView() at right above call SNESSolve() to check options > that I applied. > Okay, I am not understanding what you want to do. Below, you have setup SNES to calculate the entire Jacobian using a finite-difference (FD) approximation. This is the same as using -snes_fd I thought you wanted to use a matrix-free action for the Jacobian -snes_mf and no preconditioner -pc_type none although I have no idea what kind of a problem this would work for. Matt > SNES Object: 1 MPI processes > type: newtonls > SNES has not been set up so information may be incomplete > maximum iterations=1, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-32, solution=1e-08 > total number of linear solver iterations=0 > total number of function evaluations=0 > norm schedule ALWAYS > SNESLineSearch Object: 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=0.001, absolute=1e-50, divergence=10000. 
> left preconditioning > using nonzero initial guess > using DEFAULT norm type for convergence test > PC Object: 1 MPI processes > type: none > PC has not been set up so information may be incomplete > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mffd > rows=64, cols=64 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > The compute h routine has not yet been set > > ------------------------------------------------------------ > ------------------------- > > And this is the whole error message > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Mat type mffd > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./flus > > > ? ??? on a arch-linux2-c-debug > named ckj-System by ckj Fri Aug 19 04:09:56 2016 > [0]PETSC ERROR: Configure options --with-cc=icc --with-cxx=icpc > --with-fc=ifort --with-mpi-include=/opt/intel/impi/ > 5.0.1.035/intel64/include --with-mpi-lib="-L/opt/intel//impi/ > 5.0.1.035/intel64/release -L/opt/intel//impi/5.0.1.035/intel64/lib > -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread" --with-blas-lapack-dir=/opt/intel/mkl > --prefix=/opt/petsc/3.7.3 > [0]PETSC ERROR: #1 MatZeroEntries() line 5511 in > /home/ckj/Repository/petsc-3.7.3/src/mat/interface/matrix.c > [0]PETSC ERROR: #2 SNESComputeJacobianDefault() line 65 in > /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snesj.c > [0]PETSC ERROR: #3 oursnesjacobian() line 105 in > /home/ckj/Repository/petsc-3.7.3/src/snes/interface/ftn-custom/zsnesf.c > [0]PETSC ERROR: #4 SNESComputeJacobian() line 2312 in > /home/ckj/Repository/petsc-3.7.3/src/snes/interface/snes.c > [0]PETSC ERROR: #5 SNESSolve_NEWTONLS() line 228 in > /home/ckj/Repository/petsc-3.7.3/src/snes/impls/ls/ls.c > [0]PETSC ERROR: #6 SNESSolve() line 4005 in /home/ckj/Repository/petsc-3. > 7.3/src/snes/interface/snes.c > > ------------------------------------------------------------ > --------------------------------- > > I checked that the program got out of calling SNESSolve() after this error > message. > > But I can't figure out the reason for this error. > > > Best, > > Kyungjun > > > 2016-08-19 4:05 GMT+09:00 Matthew Knepley : > >> On Thu, Aug 18, 2016 at 2:03 PM, ??? wrote: >> >>> I got stuck at FormJacobian stage. >>> >>> - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) >>> --> J & pJ are same with A matrix-free matrix (input argument) >>> >>> >>> >>> with these kind of messages.. >>> >>> [0]PETSC ERROR: No support for this operation for this object type >>> [0]PETSC ERROR: Mat type mffd >>> >> >> 1) Always give the ENTIRE error message >> >> 2) As I said in the last two messages, you can only use this without a >> preconditioner, so you need >> >> -pc_type none >> >> Matt >> >> >>> >>> Guess it's because I used A matrix-free matrix (which is mffd type) into >>> pJ position. >>> >>> Is there any solution for this kind of situation? >>> >>> >>> 2016-08-19 2:05 GMT+09:00 Matthew Knepley : >>> >>>> On Thu, Aug 18, 2016 at 12:04 PM, ??? >>>> wrote: >>>> >>>>> Then in order not to use preconditioner, >>>>> >>>>> is it ok if I just put A matrix-free matrix (made from >>>>> MatCreateSNESMF()) into the place where preA should be? 
>>>>> >>>> >>>> Yes, but again the solve will likely perform very poorly. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> The flow goes like this >>>>> - call SNESCreate >>>>> - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >>>>> - call MatCreateSNESMF(snes, A, ier) >>>>> - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >>>>> - call SNESSetFromOptions() >>>>> >>>>> - call SNESGetKSP(snes, ksp, ier) >>>>> - call KSPSetType(ksp, KSPGMRES, ier) >>>>> - call KSPGetPC(ksp, pc, ier) >>>>> - call PCSetType(pc, PCNONE, ier) >>>>> - call KSPGMRESSetRestart(ksp, 30, ier) >>>>> >>>>> - call SNESSolve() >>>>> . >>>>> . >>>>> >>>>> >>>>> and inside the FormJacobian routine >>>>> - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ >>>>> must be pointed with A and A. >>>>> >>>>> >>>>> >>>>> Thank you again, >>>>> >>>>> Kyungjun. >>>>> >>>>> 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >>>>> >>>>>> On Thu, Aug 18, 2016 at 11:42 AM, ??? >>>>>> wrote: >>>>>> >>>>>>> Thanks for your helpful answers. >>>>>>> >>>>>>> Here's another question... >>>>>>> >>>>>>> As I read some example PETSc codes, I noticed that there should be a >>>>>>> preconditioning matrix (e.g. approx. jacobian matrix) when using >>>>>>> MatCreateSNESMF(). >>>>>>> >>>>>>> I mean, >>>>>>> after calling MatCreateSNESMF(snes, A, ier), >>>>>>> there should be another matrix preA(preconditioning matrix) to use >>>>>>> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >>>>>>> >>>>>>> >>>>>>> 1) Is there any way that I can use matrix-free method without making >>>>>>> preconditioning matrix? >>>>>>> >>>>>> >>>>>> Don't use a preconditioner. As you might expect, this does not often >>>>>> work out well. >>>>>> >>>>>> >>>>>>> 2) I have a reference code, and the code adopts >>>>>>> >>>>>>> MatFDColoringCreate() >>>>>>> and finally uses >>>>>>> SNESComputeJacobianDefaultColor() at FormJacobian stage. >>>>>>> >>>>>>> But I can't see the inside of the fdcolor and I'm curious of this >>>>>>> mechanism. Can you explain this very briefly or tell me an example code >>>>>>> that I can refer to. ( I think none of PETSc example code is using >>>>>>> fdcolor..) >>>>>>> >>>>>> >>>>>> This is the default, so there is no need for all that code. We use >>>>>> naive graph 2-coloring. I think there might be a review article by Alex >>>>>> Pothen about that. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Kyungjun. >>>>>>> >>>>>>> 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >>>>>>> >>>>>>>> On Thu, Aug 18, 2016 at 10:39 AM, ??? >>>>>>>> wrote: >>>>>>>> >>>>>>>>> 1) I wanna know the difference between applying option with >>>>>>>>> command line and within source code. >>>>>>>>> From my experience, command line option helps set other default >>>>>>>>> settings that I didn't applied, I guess. >>>>>>>>> >>>>>>>> >>>>>>>> The command line arguments are applied to an object when >>>>>>>> *SetFromOptions() is called, so in this case >>>>>>>> you want SNESSetFromOptions() on the solver. There should be no >>>>>>>> difference from using the API. 
>>>>>>>> >>>>>>>> >>>>>>>>> 2) I made a matrix-free matrix with MatCreateSNESMF function, and >>>>>>>>> every time I check my snes context with SNESView, >>>>>>>>> >>>>>>>>> Mat Object: 1 MPI processes >>>>>>>>> type: mffd >>>>>>>>> rows=11616, cols=11616 >>>>>>>>> Matrix-free approximation: >>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>> The compute h routine has not yet been set >>>>>>>>> >>>>>>>>> at the end of line shows there's no routine for computing h value. >>>>>>>>> I used MatMFFDWPSetComputeNormU function, but it didn't work I >>>>>>>>> think. >>>>>>>>> Is it ok if I leave the h value that way? Or should I have to set >>>>>>>>> h computing routine? >>>>>>>>> >>>>>>>> >>>>>>>> I am guessing you are calling the function on a different object >>>>>>>> from the one that is viewed here. >>>>>>>> However, there will always be a default function for computing h. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> Kyungjun. >>>>>>>>> >>>>>>>>> 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>>>>>>>> >>>>>>>>>> On Thu, Aug 18, 2016 at 8:35 AM, ??? >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, I'm trying to set my SNES matrix-free with Walker & Pernice >>>>>>>>>>> way of computing h value. >>>>>>>>>>> >>>>>>>>>>> I found above command (MatSNESMFWPSetComputeNormU) but my >>>>>>>>>>> fortran compiler couldn't fine any reference of that command. >>>>>>>>>>> >>>>>>>>>>> I checked Petsc changes log, but there weren't any mentions >>>>>>>>>>> about that command. >>>>>>>>>>> >>>>>>>>>>> Should I have to include another specific header file? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> We have this function >>>>>>>>>> >>>>>>>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpage >>>>>>>>>> s/Mat/MatMFFDWPSetComputeNormU.html >>>>>>>>>> >>>>>>>>>> but I would recommend using the command line option >>>>>>>>>> >>>>>>>>>> *-mat_mffd_compute_normu* >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> Matt >>>>>>>>>> >>>>>>>>>> Thank you always. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> What most experimenters take for granted before they begin their >>>>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>>>> experiments lead. >>>>>>>>>> -- Norbert Wiener >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> What most experimenters take for granted before they begin their >>>>>>>> experiments is infinitely more interesting than any results to which their >>>>>>>> experiments lead. >>>>>>>> -- Norbert Wiener >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
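For reference, the two command-line routes distinguished in the message above look roughly like this in practice. This is only a sketch: "app", FormResidual, r, x and userctx are placeholder names, and it assumes the driver calls SNESSetFromOptions() so the options actually reach the solver.

   ! Fortran driver fragment (PETSc 3.7-era calling sequence)
   call SNESCreate(PETSC_COMM_WORLD, snes, ier)
   call SNESSetFunction(snes, r, FormResidual, userctx, ier)
   call SNESSetFromOptions(snes, ier)   ! picks up -snes_mf / -snes_fd, -pc_type, ...
   call SNESSolve(snes, PETSC_NULL_OBJECT, x, ier)

   # matrix-free Jacobian action with no preconditioner
   ./app -snes_mf -pc_type none -snes_monitor -snes_view
   # finite-difference approximation of the whole Jacobian instead
   ./app -snes_fd -pc_type none -snes_monitor

With -snes_mf there is no assembled Jacobian at all, so anything other than -pc_type none (or a user-supplied preconditioner) will fail, which is the point made repeatedly in this thread.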
URL: From bsmith at mcs.anl.gov Thu Aug 18 14:27:38 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 18 Aug 2016 14:27:38 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: Message-ID: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> You can't use that Jacobian function SNESComputeJacobianDefault with matrix free, it tries to compute the matrix entries and stick them into the matrix. You can use MatMFFDComputeJacobian > On Aug 18, 2016, at 2:03 PM, ??? wrote: > > I got stuck at FormJacobian stage. > > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) --> J & pJ are same with A matrix-free matrix (input argument) > > > > with these kind of messages.. > > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Mat type mffd > > > > Guess it's because I used A matrix-free matrix (which is mffd type) into pJ position. > > Is there any solution for this kind of situation? > > > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: > Then in order not to use preconditioner, > > is it ok if I just put A matrix-free matrix (made from MatCreateSNESMF()) into the place where preA should be? > > Yes, but again the solve will likely perform very poorly. > > Thanks, > > Matt > > The flow goes like this > - call SNESCreate > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) > - call MatCreateSNESMF(snes, A, ier) > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) > - call SNESSetFromOptions() > > - call SNESGetKSP(snes, ksp, ier) > - call KSPSetType(ksp, KSPGMRES, ier) > - call KSPGetPC(ksp, pc, ier) > - call PCSetType(pc, PCNONE, ier) > - call KSPGMRESSetRestart(ksp, 30, ier) > > - call SNESSolve() > . > . > > > and inside the FormJacobian routine > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ must be pointed with A and A. > > > > Thank you again, > > Kyungjun. > > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: > Thanks for your helpful answers. > > Here's another question... > > As I read some example PETSc codes, I noticed that there should be a preconditioning matrix (e.g. approx. jacobian matrix) when using MatCreateSNESMF(). > > I mean, > after calling MatCreateSNESMF(snes, A, ier), > there should be another matrix preA(preconditioning matrix) to use SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). > > > 1) Is there any way that I can use matrix-free method without making preconditioning matrix? > > Don't use a preconditioner. As you might expect, this does not often work out well. > > 2) I have a reference code, and the code adopts > > MatFDColoringCreate() > and finally uses > SNESComputeJacobianDefaultColor() at FormJacobian stage. > > But I can't see the inside of the fdcolor and I'm curious of this mechanism. Can you explain this very briefly or tell me an example code that I can refer to. ( I think none of PETSc example code is using fdcolor..) > > This is the default, so there is no need for all that code. We use naive graph 2-coloring. I think there might be a review article by Alex Pothen about that. > > Thanks, > > Matt > > > Best, > > Kyungjun. > > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 10:39 AM, ??? wrote: > 1) I wanna know the difference between applying option with command line and within source code. > From my experience, command line option helps set other default settings that I didn't applied, I guess. 
> > The command line arguments are applied to an object when *SetFromOptions() is called, so in this case > you want SNESSetFromOptions() on the solver. There should be no difference from using the API. > > 2) I made a matrix-free matrix with MatCreateSNESMF function, and every time I check my snes context with SNESView, > > Mat Object: 1 MPI processes > type: mffd > rows=11616, cols=11616 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > The compute h routine has not yet been set > > at the end of line shows there's no routine for computing h value. > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. > Is it ok if I leave the h value that way? Or should I have to set h computing routine? > > I am guessing you are calling the function on a different object from the one that is viewed here. > However, there will always be a default function for computing h. > > Thanks, > > Matt > > Kyungjun. > > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of computing h value. > > I found above command (MatSNESMFWPSetComputeNormU) but my fortran compiler couldn't fine any reference of that command. > > I checked Petsc changes log, but there weren't any mentions about that command. > > Should I have to include another specific header file? > > We have this function > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMFFDWPSetComputeNormU.html > > but I would recommend using the command line option > > -mat_mffd_compute_normu > > Thanks, > > Matt > > Thank you always. > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From kyungjun.choi92 at gmail.com Thu Aug 18 14:44:49 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 04:44:49 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> Message-ID: Is there a part that you considered this as finite-difference approximation? I thought I used matrix-free method with MatCreateSNESMF() function Also I used - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at the log I didn't use any of command line options. Kyungjun 2016-08-19 4:27 GMT+09:00 Barry Smith : > > You can't use that Jacobian function SNESComputeJacobianDefault with > matrix free, it tries to compute the matrix entries and stick them into the > matrix. You can use MatMFFDComputeJacobian > > > On Aug 18, 2016, at 2:03 PM, ??? wrote: > > > > I got stuck at FormJacobian stage. 
> > > > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) > --> J & pJ are same with A matrix-free matrix (input argument) > > > > > > > > with these kind of messages.. > > > > [0]PETSC ERROR: No support for this operation for this object type > > [0]PETSC ERROR: Mat type mffd > > > > > > > > Guess it's because I used A matrix-free matrix (which is mffd type) into > pJ position. > > > > Is there any solution for this kind of situation? > > > > > > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: > > Then in order not to use preconditioner, > > > > is it ok if I just put A matrix-free matrix (made from > MatCreateSNESMF()) into the place where preA should be? > > > > Yes, but again the solve will likely perform very poorly. > > > > Thanks, > > > > Matt > > > > The flow goes like this > > - call SNESCreate > > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) > > - call MatCreateSNESMF(snes, A, ier) > > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) > > - call SNESSetFromOptions() > > > > - call SNESGetKSP(snes, ksp, ier) > > - call KSPSetType(ksp, KSPGMRES, ier) > > - call KSPGetPC(ksp, pc, ier) > > - call PCSetType(pc, PCNONE, ier) > > - call KSPGMRESSetRestart(ksp, 30, ier) > > > > - call SNESSolve() > > . > > . > > > > > > and inside the FormJacobian routine > > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ > must be pointed with A and A. > > > > > > > > Thank you again, > > > > Kyungjun. > > > > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: > > Thanks for your helpful answers. > > > > Here's another question... > > > > As I read some example PETSc codes, I noticed that there should be a > preconditioning matrix (e.g. approx. jacobian matrix) when using > MatCreateSNESMF(). > > > > I mean, > > after calling MatCreateSNESMF(snes, A, ier), > > there should be another matrix preA(preconditioning matrix) to use > SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). > > > > > > 1) Is there any way that I can use matrix-free method without making > preconditioning matrix? > > > > Don't use a preconditioner. As you might expect, this does not often > work out well. > > > > 2) I have a reference code, and the code adopts > > > > MatFDColoringCreate() > > and finally uses > > SNESComputeJacobianDefaultColor() at FormJacobian stage. > > > > But I can't see the inside of the fdcolor and I'm curious of this > mechanism. Can you explain this very briefly or tell me an example code > that I can refer to. ( I think none of PETSc example code is using > fdcolor..) > > > > This is the default, so there is no need for all that code. We use naive > graph 2-coloring. I think there might be a review article by Alex Pothen > about that. > > > > Thanks, > > > > Matt > > > > > > Best, > > > > Kyungjun. > > > > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 10:39 AM, ??? wrote: > > 1) I wanna know the difference between applying option with command line > and within source code. > > From my experience, command line option helps set other default settings > that I didn't applied, I guess. > > > > The command line arguments are applied to an object when > *SetFromOptions() is called, so in this case > > you want SNESSetFromOptions() on the solver. There should be no > difference from using the API. 
> > > > 2) I made a matrix-free matrix with MatCreateSNESMF function, and every > time I check my snes context with SNESView, > > > > Mat Object: 1 MPI processes > > type: mffd > > rows=11616, cols=11616 > > Matrix-free approximation: > > err=1.49012e-08 (relative error in function evaluation) > > The compute h routine has not yet been set > > > > at the end of line shows there's no routine for computing h value. > > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. > > Is it ok if I leave the h value that way? Or should I have to set h > computing routine? > > > > I am guessing you are calling the function on a different object from > the one that is viewed here. > > However, there will always be a default function for computing h. > > > > Thanks, > > > > Matt > > > > Kyungjun. > > > > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: > > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of > computing h value. > > > > I found above command (MatSNESMFWPSetComputeNormU) but my fortran > compiler couldn't fine any reference of that command. > > > > I checked Petsc changes log, but there weren't any mentions about that > command. > > > > Should I have to include another specific header file? > > > > We have this function > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatMFFDWPSetComputeNormU.html > > > > but I would recommend using the command line option > > > > -mat_mffd_compute_normu > > > > Thanks, > > > > Matt > > > > Thank you always. > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 18 14:47:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 14:47:05 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> Message-ID: On Thu, Aug 18, 2016 at 2:44 PM, ??? wrote: > Is there a part that you considered this as finite-difference > approximation? > I thought I used matrix-free method with MatCreateSNESMF() function > You did not tell the SNES to use a MF Jacobian, you just made a Mat object. This is why we encourage people to use the command line. Everything is setup correctly and in order. Why would you choose not to. This creates long rounds of email. Matt > Also I used > - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at the > log > > > I didn't use any of command line options. 
> > > Kyungjun > > 2016-08-19 4:27 GMT+09:00 Barry Smith : > >> >> You can't use that Jacobian function SNESComputeJacobianDefault with >> matrix free, it tries to compute the matrix entries and stick them into the >> matrix. You can use MatMFFDComputeJacobian >> >> > On Aug 18, 2016, at 2:03 PM, ??? wrote: >> > >> > I got stuck at FormJacobian stage. >> > >> > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) >> --> J & pJ are same with A matrix-free matrix (input argument) >> > >> > >> > >> > with these kind of messages.. >> > >> > [0]PETSC ERROR: No support for this operation for this object type >> > [0]PETSC ERROR: Mat type mffd >> > >> > >> > >> > Guess it's because I used A matrix-free matrix (which is mffd type) >> into pJ position. >> > >> > Is there any solution for this kind of situation? >> > >> > >> > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : >> > On Thu, Aug 18, 2016 at 12:04 PM, ??? >> wrote: >> > Then in order not to use preconditioner, >> > >> > is it ok if I just put A matrix-free matrix (made from >> MatCreateSNESMF()) into the place where preA should be? >> > >> > Yes, but again the solve will likely perform very poorly. >> > >> > Thanks, >> > >> > Matt >> > >> > The flow goes like this >> > - call SNESCreate >> > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >> > - call MatCreateSNESMF(snes, A, ier) >> > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >> > - call SNESSetFromOptions() >> > >> > - call SNESGetKSP(snes, ksp, ier) >> > - call KSPSetType(ksp, KSPGMRES, ier) >> > - call KSPGetPC(ksp, pc, ier) >> > - call PCSetType(pc, PCNONE, ier) >> > - call KSPGMRESSetRestart(ksp, 30, ier) >> > >> > - call SNESSolve() >> > . >> > . >> > >> > >> > and inside the FormJacobian routine >> > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ >> must be pointed with A and A. >> > >> > >> > >> > Thank you again, >> > >> > Kyungjun. >> > >> > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >> > On Thu, Aug 18, 2016 at 11:42 AM, ??? >> wrote: >> > Thanks for your helpful answers. >> > >> > Here's another question... >> > >> > As I read some example PETSc codes, I noticed that there should be a >> preconditioning matrix (e.g. approx. jacobian matrix) when using >> MatCreateSNESMF(). >> > >> > I mean, >> > after calling MatCreateSNESMF(snes, A, ier), >> > there should be another matrix preA(preconditioning matrix) to use >> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >> > >> > >> > 1) Is there any way that I can use matrix-free method without making >> preconditioning matrix? >> > >> > Don't use a preconditioner. As you might expect, this does not often >> work out well. >> > >> > 2) I have a reference code, and the code adopts >> > >> > MatFDColoringCreate() >> > and finally uses >> > SNESComputeJacobianDefaultColor() at FormJacobian stage. >> > >> > But I can't see the inside of the fdcolor and I'm curious of this >> mechanism. Can you explain this very briefly or tell me an example code >> that I can refer to. ( I think none of PETSc example code is using >> fdcolor..) >> > >> > This is the default, so there is no need for all that code. We use >> naive graph 2-coloring. I think there might be a review article by Alex >> Pothen about that. >> > >> > Thanks, >> > >> > Matt >> > >> > >> > Best, >> > >> > Kyungjun. >> > >> > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >> > On Thu, Aug 18, 2016 at 10:39 AM, ??? 
>> wrote: >> > 1) I wanna know the difference between applying option with command >> line and within source code. >> > From my experience, command line option helps set other default >> settings that I didn't applied, I guess. >> > >> > The command line arguments are applied to an object when >> *SetFromOptions() is called, so in this case >> > you want SNESSetFromOptions() on the solver. There should be no >> difference from using the API. >> > >> > 2) I made a matrix-free matrix with MatCreateSNESMF function, and every >> time I check my snes context with SNESView, >> > >> > Mat Object: 1 MPI processes >> > type: mffd >> > rows=11616, cols=11616 >> > Matrix-free approximation: >> > err=1.49012e-08 (relative error in function evaluation) >> > The compute h routine has not yet been set >> > >> > at the end of line shows there's no routine for computing h value. >> > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >> > Is it ok if I leave the h value that way? Or should I have to set h >> computing routine? >> > >> > I am guessing you are calling the function on a different object from >> the one that is viewed here. >> > However, there will always be a default function for computing h. >> > >> > Thanks, >> > >> > Matt >> > >> > Kyungjun. >> > >> > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >> > On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: >> > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of >> computing h value. >> > >> > I found above command (MatSNESMFWPSetComputeNormU) but my fortran >> compiler couldn't fine any reference of that command. >> > >> > I checked Petsc changes log, but there weren't any mentions about that >> command. >> > >> > Should I have to include another specific header file? >> > >> > We have this function >> > >> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages >> /Mat/MatMFFDWPSetComputeNormU.html >> > >> > but I would recommend using the command line option >> > >> > -mat_mffd_compute_normu >> > >> > Thanks, >> > >> > Matt >> > >> > Thank you always. >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkg2140 at gmail.com Thu Aug 18 17:15:16 2016 From: dkg2140 at gmail.com (Krzysztof Gawarecki) Date: Fri, 19 Aug 2016 00:15:16 +0200 Subject: [petsc-users] MatCreateFFT in Fortran Message-ID: Dear All, I'm trying to implement Fast Fourier Transform (via FFTW interface in PETSC) to my program. 
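To connect Barry's point to the calling sequence quoted in this thread: if a FormJacobian routine is kept at all, the call inside it would be MatMFFDComputeJacobian rather than SNESComputeJacobianDefault, because the latter tries to assemble entries into the mffd matrix. The fragment below is only a sketch; it assumes a Fortran stub for MatMFFDComputeJacobian is available in this PETSc build, and the argument names follow the thread.

   subroutine FormJacobian(snes, v, J, pJ, userctx, ier)
     ...
     ! J and pJ are both the mffd matrix from MatCreateSNESMF(); this call
     ! only records v as the base point for the differencing, it does not
     ! try to assemble matrix entries
     call MatMFFDComputeJacobian(snes, v, J, pJ, userctx, ier)
   end subroutine

With -snes_mf given on the command line (and SNESSetFromOptions called), no FormJacobian routine is needed at all, which is the simpler route recommended above.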
Unfortunately, I have not found any example/tutorial for fortran. When I use MatCreateFFT it leads to a crash. My test code is: #include "petsc/finclude/petscsysdef.h" #include "petsc/finclude/petscmatdef.h" program test use petscsys use petscmat implicit none Mat :: A integer :: ndim,vdim(3) integer :: ierr ndim = 3 vdim(1) = 30 vdim(2) = 30 vdim(3) = 30 call PetscInitialize(PETSC_NULL_CHARACTER,ierr) call MatCreateFFT(PETSC_COMM_WORLD,ndim,vdim,MATFFTW,A,ierr) call PetscFinalize(ierr) end program When MatCreateFFT is executed program crushes and gives: "[Minas-Thirith:17525] *** An error occurred in MPI_Comm_size [Minas-Thirith:17525] *** reported by process [3784507393,2] [Minas-Thirith:17525] *** on communicator MPI_COMM_WORLD [Minas-Thirith:17525] *** MPI_ERR_COMM: invalid communicator [Minas-Thirith:17525] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort, [Minas-Thirith:17525] *** and potentially your MPI job) [Minas-Thirith:17520] 5 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal [Minas-Thirith:17520] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages" What am I doing wrong? Best regards, K.G. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aks084000 at utdallas.edu Thu Aug 18 17:41:39 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Thu, 18 Aug 2016 22:41:39 +0000 Subject: [petsc-users] Meaning of error message (gamg & fieldsplit related) In-Reply-To: <6513A82A-D072-459D-8FB4-5D60750ADFDC@mcs.anl.gov> References: <7050CA61-07F5-4599-9CB1-219472487A20@utdallas.edu> <5889439212f64929a139951a1b68b523@utdallas.edu>, <6513A82A-D072-459D-8FB4-5D60750ADFDC@mcs.anl.gov> Message-ID: <68ac26cfe0484640a224a86e0bbdc6d5@utdallas.edu> Barry, Yep, it works now. Thank you, Artur From jshen25 at jhu.edu Thu Aug 18 21:22:54 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Thu, 18 Aug 2016 22:22:54 -0400 Subject: [petsc-users] Store and reuse the factor of matrix Message-ID: ?Hi, I'm trying to implement modified newton method to solve the nonlinear finite element using petsc. As well known, the advantage of modified newton is the Jacobian matrix is always same during the iteration, which means once the J is factorized at the first iteration, we can store the factors and avoid the factorization for next iteration if we use direct solver, e.g. super_lu. Therefore, the option FACTORED in SUPER_LU is quite useful. However, it looks like the option FACTORED is not available in SUPER_LU_DIST in petsc. I tried, and it shows 'unknown option'. Is there alternative way to use the same idea of FACTORED in petsc for super_lu? Also, I'm wondering whether iterative solver in PETSC is also able to apply the same strategy. In other words, in the problem where Jacobian is constant, only residue and solution vectors need to be updated, is there any way to take advantage of such same Jacobian pattern to expedite the computation using iterative solver? Thank you BTW, though using modified newton will increase the iteration number, however, in the case which is much more expensive to factorize the jacobian, more iterations will probably be worthwhile. Bests, Jinlei -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Aug 18 21:28:22 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 21:28:22 -0500 Subject: [petsc-users] Store and reuse the factor of matrix In-Reply-To: References: Message-ID: On Thu, Aug 18, 2016 at 9:22 PM, Jinlei Shen wrote: > ?Hi, > > I'm trying to implement modified newton method to solve the nonlinear > finite element using petsc. > > As well known, the advantage of modified newton is the Jacobian matrix is > always same during the iteration, which means once the J is factorized at > the first iteration, we can store the factors and avoid the factorization > for next iteration if we use direct solver, e.g. super_lu. Therefore, the > option FACTORED in SUPER_LU is quite useful. > > However, it looks like the option FACTORED is not available in > SUPER_LU_DIST in petsc. I tried, and it shows 'unknown option'. > > Is there alternative way to use the same idea of FACTORED in petsc for > super_lu? > > Also, I'm wondering whether iterative solver in PETSC is also able to > apply the same strategy. > > In other words, in the problem where Jacobian is constant, only residue > and solution vectors need to be updated, is there any way to take advantage > of such same Jacobian pattern to expedite the computation using iterative > solver? > > Thank you > > BTW, though using modified newton will increase the iteration number, > however, in the case which is much more expensive to factorize the > jacobian, more iterations will probably be worthwhile. > You can use http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagJacobian.html http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESSetLagPreconditioner.html#SNESSetLagPreconditioner to get fine-grained control over this without writing any code. Matt Bests, > Jinlei > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Thu Aug 18 22:28:44 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 12:28:44 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> Message-ID: Dear Matt. I didn't use the command line options because it looked not working. I called SNESSetFromOptions(snes, ier) in my source code, but options like -snes_mf or -snes_monitor doesn't look working. Is there anything that I should consider more? 2016-08-19 4:47 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 2:44 PM, ??? wrote: > >> Is there a part that you considered this as finite-difference >> approximation? >> I thought I used matrix-free method with MatCreateSNESMF() function >> > > You did not tell the SNES to use a MF Jacobian, you just made a Mat > object. This is why > we encourage people to use the command line. Everything is setup > correctly and in order. > Why would you choose not to. This creates long rounds of email. > > Matt > > >> Also I used >> - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at the >> log >> >> >> I didn't use any of command line options. 
>> >> >> Kyungjun >> >> 2016-08-19 4:27 GMT+09:00 Barry Smith : >> >>> >>> You can't use that Jacobian function SNESComputeJacobianDefault with >>> matrix free, it tries to compute the matrix entries and stick them into the >>> matrix. You can use MatMFFDComputeJacobian >>> >>> > On Aug 18, 2016, at 2:03 PM, ??? wrote: >>> > >>> > I got stuck at FormJacobian stage. >>> > >>> > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) >>> --> J & pJ are same with A matrix-free matrix (input argument) >>> > >>> > >>> > >>> > with these kind of messages.. >>> > >>> > [0]PETSC ERROR: No support for this operation for this object type >>> > [0]PETSC ERROR: Mat type mffd >>> > >>> > >>> > >>> > Guess it's because I used A matrix-free matrix (which is mffd type) >>> into pJ position. >>> > >>> > Is there any solution for this kind of situation? >>> > >>> > >>> > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : >>> > On Thu, Aug 18, 2016 at 12:04 PM, ??? >>> wrote: >>> > Then in order not to use preconditioner, >>> > >>> > is it ok if I just put A matrix-free matrix (made from >>> MatCreateSNESMF()) into the place where preA should be? >>> > >>> > Yes, but again the solve will likely perform very poorly. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > The flow goes like this >>> > - call SNESCreate >>> > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >>> > - call MatCreateSNESMF(snes, A, ier) >>> > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >>> > - call SNESSetFromOptions() >>> > >>> > - call SNESGetKSP(snes, ksp, ier) >>> > - call KSPSetType(ksp, KSPGMRES, ier) >>> > - call KSPGetPC(ksp, pc, ier) >>> > - call PCSetType(pc, PCNONE, ier) >>> > - call KSPGMRESSetRestart(ksp, 30, ier) >>> > >>> > - call SNESSolve() >>> > . >>> > . >>> > >>> > >>> > and inside the FormJacobian routine >>> > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ >>> must be pointed with A and A. >>> > >>> > >>> > >>> > Thank you again, >>> > >>> > Kyungjun. >>> > >>> > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >>> > On Thu, Aug 18, 2016 at 11:42 AM, ??? >>> wrote: >>> > Thanks for your helpful answers. >>> > >>> > Here's another question... >>> > >>> > As I read some example PETSc codes, I noticed that there should be a >>> preconditioning matrix (e.g. approx. jacobian matrix) when using >>> MatCreateSNESMF(). >>> > >>> > I mean, >>> > after calling MatCreateSNESMF(snes, A, ier), >>> > there should be another matrix preA(preconditioning matrix) to use >>> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >>> > >>> > >>> > 1) Is there any way that I can use matrix-free method without making >>> preconditioning matrix? >>> > >>> > Don't use a preconditioner. As you might expect, this does not often >>> work out well. >>> > >>> > 2) I have a reference code, and the code adopts >>> > >>> > MatFDColoringCreate() >>> > and finally uses >>> > SNESComputeJacobianDefaultColor() at FormJacobian stage. >>> > >>> > But I can't see the inside of the fdcolor and I'm curious of this >>> mechanism. Can you explain this very briefly or tell me an example code >>> that I can refer to. ( I think none of PETSc example code is using >>> fdcolor..) >>> > >>> > This is the default, so there is no need for all that code. We use >>> naive graph 2-coloring. I think there might be a review article by Alex >>> Pothen about that. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > >>> > Best, >>> > >>> > Kyungjun. 
>>> > >>> > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >>> > On Thu, Aug 18, 2016 at 10:39 AM, ??? >>> wrote: >>> > 1) I wanna know the difference between applying option with command >>> line and within source code. >>> > From my experience, command line option helps set other default >>> settings that I didn't applied, I guess. >>> > >>> > The command line arguments are applied to an object when >>> *SetFromOptions() is called, so in this case >>> > you want SNESSetFromOptions() on the solver. There should be no >>> difference from using the API. >>> > >>> > 2) I made a matrix-free matrix with MatCreateSNESMF function, and >>> every time I check my snes context with SNESView, >>> > >>> > Mat Object: 1 MPI processes >>> > type: mffd >>> > rows=11616, cols=11616 >>> > Matrix-free approximation: >>> > err=1.49012e-08 (relative error in function evaluation) >>> > The compute h routine has not yet been set >>> > >>> > at the end of line shows there's no routine for computing h value. >>> > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >>> > Is it ok if I leave the h value that way? Or should I have to set h >>> computing routine? >>> > >>> > I am guessing you are calling the function on a different object from >>> the one that is viewed here. >>> > However, there will always be a default function for computing h. >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > Kyungjun. >>> > >>> > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>> > On Thu, Aug 18, 2016 at 8:35 AM, ??? >>> wrote: >>> > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of >>> computing h value. >>> > >>> > I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>> compiler couldn't fine any reference of that command. >>> > >>> > I checked Petsc changes log, but there weren't any mentions about that >>> command. >>> > >>> > Should I have to include another specific header file? >>> > >>> > We have this function >>> > >>> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages >>> /Mat/MatMFFDWPSetComputeNormU.html >>> > >>> > but I would recommend using the command line option >>> > >>> > -mat_mffd_compute_normu >>> > >>> > Thanks, >>> > >>> > Matt >>> > >>> > Thank you always. >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Aug 18 22:31:47 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 18 Aug 2016 22:31:47 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> Message-ID: On Thu, Aug 18, 2016 at 10:28 PM, ??? wrote: > Dear Matt. > > I didn't use the command line options because it looked not working. > > I called SNESSetFromOptions(snes, ier) in my source code, > > but options like -snes_mf or -snes_monitor doesn't look working. > I would recommend starting with an example, where these options clearly work (I use SNES ex5 with these options in my class). Then slowly change that example until you have what you want. Matt > Is there anything that I should consider more? > > > 2016-08-19 4:47 GMT+09:00 Matthew Knepley : > >> On Thu, Aug 18, 2016 at 2:44 PM, ??? wrote: >> >>> Is there a part that you considered this as finite-difference >>> approximation? >>> I thought I used matrix-free method with MatCreateSNESMF() function >>> >> >> You did not tell the SNES to use a MF Jacobian, you just made a Mat >> object. This is why >> we encourage people to use the command line. Everything is setup >> correctly and in order. >> Why would you choose not to. This creates long rounds of email. >> >> Matt >> >> >>> Also I used >>> - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at >>> the log >>> >>> >>> I didn't use any of command line options. >>> >>> >>> Kyungjun >>> >>> 2016-08-19 4:27 GMT+09:00 Barry Smith : >>> >>>> >>>> You can't use that Jacobian function SNESComputeJacobianDefault >>>> with matrix free, it tries to compute the matrix entries and stick them >>>> into the matrix. You can use MatMFFDComputeJacobian >>>> >>>> > On Aug 18, 2016, at 2:03 PM, ??? wrote: >>>> > >>>> > I got stuck at FormJacobian stage. >>>> > >>>> > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, >>>> ier) --> J & pJ are same with A matrix-free matrix (input argument) >>>> > >>>> > >>>> > >>>> > with these kind of messages.. >>>> > >>>> > [0]PETSC ERROR: No support for this operation for this object type >>>> > [0]PETSC ERROR: Mat type mffd >>>> > >>>> > >>>> > >>>> > Guess it's because I used A matrix-free matrix (which is mffd type) >>>> into pJ position. >>>> > >>>> > Is there any solution for this kind of situation? >>>> > >>>> > >>>> > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : >>>> > On Thu, Aug 18, 2016 at 12:04 PM, ??? >>>> wrote: >>>> > Then in order not to use preconditioner, >>>> > >>>> > is it ok if I just put A matrix-free matrix (made from >>>> MatCreateSNESMF()) into the place where preA should be? >>>> > >>>> > Yes, but again the solve will likely perform very poorly. >>>> > >>>> > Thanks, >>>> > >>>> > Matt >>>> > >>>> > The flow goes like this >>>> > - call SNESCreate >>>> > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >>>> > - call MatCreateSNESMF(snes, A, ier) >>>> > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >>>> > - call SNESSetFromOptions() >>>> > >>>> > - call SNESGetKSP(snes, ksp, ier) >>>> > - call KSPSetType(ksp, KSPGMRES, ier) >>>> > - call KSPGetPC(ksp, pc, ier) >>>> > - call PCSetType(pc, PCNONE, ier) >>>> > - call KSPGMRESSetRestart(ksp, 30, ier) >>>> > >>>> > - call SNESSolve() >>>> > . >>>> > . >>>> > >>>> > >>>> > and inside the FormJacobian routine >>>> > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and >>>> pJ must be pointed with A and A. 
>>>> > >>>> > >>>> > >>>> > Thank you again, >>>> > >>>> > Kyungjun. >>>> > >>>> > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >>>> > On Thu, Aug 18, 2016 at 11:42 AM, ??? >>>> wrote: >>>> > Thanks for your helpful answers. >>>> > >>>> > Here's another question... >>>> > >>>> > As I read some example PETSc codes, I noticed that there should be a >>>> preconditioning matrix (e.g. approx. jacobian matrix) when using >>>> MatCreateSNESMF(). >>>> > >>>> > I mean, >>>> > after calling MatCreateSNESMF(snes, A, ier), >>>> > there should be another matrix preA(preconditioning matrix) to use >>>> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >>>> > >>>> > >>>> > 1) Is there any way that I can use matrix-free method without making >>>> preconditioning matrix? >>>> > >>>> > Don't use a preconditioner. As you might expect, this does not often >>>> work out well. >>>> > >>>> > 2) I have a reference code, and the code adopts >>>> > >>>> > MatFDColoringCreate() >>>> > and finally uses >>>> > SNESComputeJacobianDefaultColor() at FormJacobian stage. >>>> > >>>> > But I can't see the inside of the fdcolor and I'm curious of this >>>> mechanism. Can you explain this very briefly or tell me an example code >>>> that I can refer to. ( I think none of PETSc example code is using >>>> fdcolor..) >>>> > >>>> > This is the default, so there is no need for all that code. We use >>>> naive graph 2-coloring. I think there might be a review article by Alex >>>> Pothen about that. >>>> > >>>> > Thanks, >>>> > >>>> > Matt >>>> > >>>> > >>>> > Best, >>>> > >>>> > Kyungjun. >>>> > >>>> > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >>>> > On Thu, Aug 18, 2016 at 10:39 AM, ??? >>>> wrote: >>>> > 1) I wanna know the difference between applying option with command >>>> line and within source code. >>>> > From my experience, command line option helps set other default >>>> settings that I didn't applied, I guess. >>>> > >>>> > The command line arguments are applied to an object when >>>> *SetFromOptions() is called, so in this case >>>> > you want SNESSetFromOptions() on the solver. There should be no >>>> difference from using the API. >>>> > >>>> > 2) I made a matrix-free matrix with MatCreateSNESMF function, and >>>> every time I check my snes context with SNESView, >>>> > >>>> > Mat Object: 1 MPI processes >>>> > type: mffd >>>> > rows=11616, cols=11616 >>>> > Matrix-free approximation: >>>> > err=1.49012e-08 (relative error in function evaluation) >>>> > The compute h routine has not yet been set >>>> > >>>> > at the end of line shows there's no routine for computing h value. >>>> > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >>>> > Is it ok if I leave the h value that way? Or should I have to set h >>>> computing routine? >>>> > >>>> > I am guessing you are calling the function on a different object from >>>> the one that is viewed here. >>>> > However, there will always be a default function for computing h. >>>> > >>>> > Thanks, >>>> > >>>> > Matt >>>> > >>>> > Kyungjun. >>>> > >>>> > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >>>> > On Thu, Aug 18, 2016 at 8:35 AM, ??? >>>> wrote: >>>> > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way >>>> of computing h value. >>>> > >>>> > I found above command (MatSNESMFWPSetComputeNormU) but my fortran >>>> compiler couldn't fine any reference of that command. >>>> > >>>> > I checked Petsc changes log, but there weren't any mentions about >>>> that command. 
>>>> > >>>> > Should I have to include another specific header file? >>>> > >>>> > We have this function >>>> > >>>> > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages >>>> /Mat/MatMFFDWPSetComputeNormU.html >>>> > >>>> > but I would recommend using the command line option >>>> > >>>> > -mat_mffd_compute_normu >>>> > >>>> > Thanks, >>>> > >>>> > Matt >>>> > >>>> > Thank you always. >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> > >>>> > >>>> > >>>> > -- >>>> > What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> > -- Norbert Wiener >>>> > >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jshen25 at jhu.edu Thu Aug 18 22:57:09 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Thu, 18 Aug 2016 23:57:09 -0400 Subject: [petsc-users] Store and reuse the factor of matrix In-Reply-To: References: Message-ID: Hi Matt, Thanks for speedy reply. It seems effective in SNES. I'm curious about how it works in iterative solver. Let's say I'm using CG with BJACOBI for modified newton, if I Set lag as 5, does that mean the ilu decomposition for pc is stored and reused for the next 4 iterations? Will this setting help to reduce the iteration number of ksp solver? Also, I'm wondering how to set the same option for just linear KSP solver since I have coded the modified newton framework manually. Thanks Jinlei On Thu, Aug 18, 2016 at 10:28 PM, Matthew Knepley wrote: > On Thu, Aug 18, 2016 at 9:22 PM, Jinlei Shen wrote: > >> ?Hi, >> >> I'm trying to implement modified newton method to solve the nonlinear >> finite element using petsc. >> >> As well known, the advantage of modified newton is the Jacobian matrix >> is always same during the iteration, which means once the J is factorized >> at the first iteration, we can store the factors and avoid the >> factorization for next iteration if we use direct solver, e.g. super_lu. >> Therefore, the option FACTORED in SUPER_LU is quite useful. >> >> However, it looks like the option FACTORED is not available in >> SUPER_LU_DIST in petsc. I tried, and it shows 'unknown option'. >> >> Is there alternative way to use the same idea of FACTORED in petsc for >> super_lu? >> >> Also, I'm wondering whether iterative solver in PETSC is also able to >> apply the same strategy. 
>> >> In other words, in the problem where Jacobian is constant, only residue >> and solution vectors need to be updated, is there any way to take advantage >> of such same Jacobian pattern to expedite the computation using iterative >> solver? >> >> Thank you >> >> BTW, though using modified newton will increase the iteration number, >> however, in the case which is much more expensive to factorize the >> jacobian, more iterations will probably be worthwhile. >> > > You can use > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/ > SNESSetLagJacobian.html > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/ > SNESSetLagPreconditioner.html#SNESSetLagPreconditioner > > to get fine-grained control over this without writing any code. > > Matt > > Bests, >> Jinlei >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 18 23:00:07 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 18 Aug 2016 23:00:07 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> Message-ID: <88CE2F93-938B-46D8-BEC5-7287E2430353@mcs.anl.gov> > On Aug 18, 2016, at 10:28 PM, ??? wrote: > > Dear Matt. > > I didn't use the command line options because it looked not working. > > I called SNESSetFromOptions(snes, ier) in my source code, > > but options like -snes_mf or -snes_monitor doesn't look working. "doesn't work" is not useful to help us figure out what has gone wrong. You need to show us EXACTLY what you did by sending the code you compiled and the command line options you ran and all the output include full error messages. Without the information we simply do not have enough information to even begin to guess why it "doesn't work". Barry > > > Is there anything that I should consider more? > > > 2016-08-19 4:47 GMT+09:00 Matthew Knepley : > On Thu, Aug 18, 2016 at 2:44 PM, ??? wrote: > Is there a part that you considered this as finite-difference approximation? > I thought I used matrix-free method with MatCreateSNESMF() function > > You did not tell the SNES to use a MF Jacobian, you just made a Mat object. This is why > we encourage people to use the command line. Everything is setup correctly and in order. > Why would you choose not to. This creates long rounds of email. > > Matt > > Also I used > - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at the log > > > I didn't use any of command line options. > > > Kyungjun > > 2016-08-19 4:27 GMT+09:00 Barry Smith : > > You can't use that Jacobian function SNESComputeJacobianDefault with matrix free, it tries to compute the matrix entries and stick them into the matrix. You can use MatMFFDComputeJacobian > > > On Aug 18, 2016, at 2:03 PM, ??? wrote: > > > > I got stuck at FormJacobian stage. > > > > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) --> J & pJ are same with A matrix-free matrix (input argument) > > > > > > > > with these kind of messages.. > > > > [0]PETSC ERROR: No support for this operation for this object type > > [0]PETSC ERROR: Mat type mffd > > > > > > > > Guess it's because I used A matrix-free matrix (which is mffd type) into pJ position. > > > > Is there any solution for this kind of situation? 
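For reference, the Jacobian/preconditioner lagging that Matt links to above, and the KSP-level reuse that Jed describes further down in this thread, can be sketched in a few lines of Fortran. This is only an illustrative sketch: snes, ksp, A, b, dx and the lag value are placeholder names, not taken from any posted code.

   PetscInt :: lag
   lag = 5                                                     ! illustrative lag
   ! rebuild the Jacobian and the preconditioner only every lag-th Newton step
   call SNESSetLagJacobian(snes,lag,ierr); CHKERRQ(ierr)
   call SNESSetLagPreconditioner(snes,lag,ierr); CHKERRQ(ierr)
   ! same effect from the options database: -snes_lag_jacobian 5 -snes_lag_preconditioner 5

   ! for a hand-coded modified Newton loop around a bare KSP: set the frozen
   ! operator once, keep the preconditioner, and call KSPSolve repeatedly
   call KSPSetOperators(ksp,A,A,ierr); CHKERRQ(ierr)
   call KSPSetReusePreconditioner(ksp,PETSC_TRUE,ierr); CHKERRQ(ierr)
   call KSPSolve(ksp,b,dx,ierr); CHKERRQ(ierr)                 ! repeat with an updated b; no new factorization

Lagging only changes how often the (preconditioner for the) Jacobian is rebuilt; solving with a stale operator usually costs extra Krylov iterations and can stall the outer Newton iteration, which is the trade-off discussed in the replies below.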
> > > > > > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: > > Then in order not to use preconditioner, > > > > is it ok if I just put A matrix-free matrix (made from MatCreateSNESMF()) into the place where preA should be? > > > > Yes, but again the solve will likely perform very poorly. > > > > Thanks, > > > > Matt > > > > The flow goes like this > > - call SNESCreate > > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) > > - call MatCreateSNESMF(snes, A, ier) > > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) > > - call SNESSetFromOptions() > > > > - call SNESGetKSP(snes, ksp, ier) > > - call KSPSetType(ksp, KSPGMRES, ier) > > - call KSPGetPC(ksp, pc, ier) > > - call PCSetType(pc, PCNONE, ier) > > - call KSPGMRESSetRestart(ksp, 30, ier) > > > > - call SNESSolve() > > . > > . > > > > > > and inside the FormJacobian routine > > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ must be pointed with A and A. > > > > > > > > Thank you again, > > > > Kyungjun. > > > > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: > > Thanks for your helpful answers. > > > > Here's another question... > > > > As I read some example PETSc codes, I noticed that there should be a preconditioning matrix (e.g. approx. jacobian matrix) when using MatCreateSNESMF(). > > > > I mean, > > after calling MatCreateSNESMF(snes, A, ier), > > there should be another matrix preA(preconditioning matrix) to use SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). > > > > > > 1) Is there any way that I can use matrix-free method without making preconditioning matrix? > > > > Don't use a preconditioner. As you might expect, this does not often work out well. > > > > 2) I have a reference code, and the code adopts > > > > MatFDColoringCreate() > > and finally uses > > SNESComputeJacobianDefaultColor() at FormJacobian stage. > > > > But I can't see the inside of the fdcolor and I'm curious of this mechanism. Can you explain this very briefly or tell me an example code that I can refer to. ( I think none of PETSc example code is using fdcolor..) > > > > This is the default, so there is no need for all that code. We use naive graph 2-coloring. I think there might be a review article by Alex Pothen about that. > > > > Thanks, > > > > Matt > > > > > > Best, > > > > Kyungjun. > > > > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 10:39 AM, ??? wrote: > > 1) I wanna know the difference between applying option with command line and within source code. > > From my experience, command line option helps set other default settings that I didn't applied, I guess. > > > > The command line arguments are applied to an object when *SetFromOptions() is called, so in this case > > you want SNESSetFromOptions() on the solver. There should be no difference from using the API. > > > > 2) I made a matrix-free matrix with MatCreateSNESMF function, and every time I check my snes context with SNESView, > > > > Mat Object: 1 MPI processes > > type: mffd > > rows=11616, cols=11616 > > Matrix-free approximation: > > err=1.49012e-08 (relative error in function evaluation) > > The compute h routine has not yet been set > > > > at the end of line shows there's no routine for computing h value. > > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. > > Is it ok if I leave the h value that way? Or should I have to set h computing routine? 
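To make the matrix-free pieces above concrete, here is a minimal Fortran sketch that follows the suggestion of MatMFFDComputeJacobian as the Jacobian callback and selects the Walker-Pernice way of computing h through MatMFFDSetType (mentioned later in this thread). The names snes, A, r, FormResidual and ctx are placeholders, and the sketch assumes the Fortran binding of MatMFFDSetType is available in your PETSc build.

   call SNESCreate(PETSC_COMM_WORLD,snes,ierr); CHKERRQ(ierr)
   call SNESSetFunction(snes,r,FormResidual,ctx,ierr); CHKERRQ(ierr)
   call MatCreateSNESMF(snes,A,ierr); CHKERRQ(ierr)
   call MatMFFDSetType(A,'wp',ierr); CHKERRQ(ierr)            ! Walker-Pernice differencing parameter
   call SNESSetJacobian(snes,A,A,MatMFFDComputeJacobian,ctx,ierr); CHKERRQ(ierr)
   call SNESSetFromOptions(snes,ierr); CHKERRQ(ierr)          ! command-line options take effect here

SNESView only reports the chosen "compute h" routine after an actual solve has set it up, which is also what is explained further down in the thread.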
> > > > I am guessing you are calling the function on a different object from the one that is viewed here. > > However, there will always be a default function for computing h. > > > > Thanks, > > > > Matt > > > > Kyungjun. > > > > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: > > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of computing h value. > > > > I found above command (MatSNESMFWPSetComputeNormU) but my fortran compiler couldn't fine any reference of that command. > > > > I checked Petsc changes log, but there weren't any mentions about that command. > > > > Should I have to include another specific header file? > > > > We have this function > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMFFDWPSetComputeNormU.html > > > > but I would recommend using the command line option > > > > -mat_mffd_compute_normu > > > > Thanks, > > > > Matt > > > > Thank you always. > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From jed at jedbrown.org Thu Aug 18 23:27:36 2016 From: jed at jedbrown.org (Jed Brown) Date: Thu, 18 Aug 2016 22:27:36 -0600 Subject: [petsc-users] Store and reuse the factor of matrix In-Reply-To: References: Message-ID: <87wpjd758n.fsf@jedbrown.org> Jinlei Shen writes: > Hi Matt, > > Thanks for speedy reply. > > It seems effective in SNES. > > I'm curious about how it works in iterative solver. > Let's say I'm using CG with BJACOBI for modified newton, if I Set lag as 5, > does that mean the ilu decomposition for pc is stored and reused for the > next 4 iterations? Will this setting help to reduce the iteration number of > ksp solver? Reusing the preconditioner with a new operator will generally converge more slowly (or sometimes not at all). Solving the stale linear system may cause modified Newton to stagnate/fail, e.g., when it chooses a search direction that is not a descent direction. > Also, I'm wondering how to set the same option for just linear KSP solver > since I have coded the modified newton framework manually. You can call KSPSolve() repeatedly without KSPSetOperators. You can also use KSPSetReusePreconditioner to reuse the preconditioner that was set up in a previous solve. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From kyungjun.choi92 at gmail.com Fri Aug 19 00:04:44 2016 From: kyungjun.choi92 at gmail.com (=?UTF-8?B?7LWc6rK97KSA?=) Date: Fri, 19 Aug 2016 14:04:44 +0900 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: <88CE2F93-938B-46D8-BEC5-7287E2430353@mcs.anl.gov> References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> <88CE2F93-938B-46D8-BEC5-7287E2430353@mcs.anl.gov> Message-ID: Dear Barry and Matt. Thank you very much for helping me up all night. (in my time) And sorry for not asking with sufficient source code condition or my circumstances. (also with poor English.) I just want to make sure that the options of my code is well applied. I'm trying to use GMRES with matrix-free method. I'd like to solve 2-D euler equation without preconditioning matrix, for now. 1) I'm still curious whether my snes context is using MF jacobian. ( just like -snes_mf command line option) 2) And mind if I ask you that whether I applied petsc functions properly? I'll check out ex5 for applying command line options. I'll attach my petsc flow code and option log by SNESView() below. -------------------------------------------------------------------------------------------------------------------- *- petsc flow code* -------------------------------------------------------------------------------------------------------------------- ndof = Mixt%nCVar * Grid%nCell *call VecCreateMPIWIthArray(PETSC_COMM_WORLD, Mixt%nCVar, ndof, PETSC_DECIDE, Mixt%cv, Mixt%x, ier)* *call VecDuplicate(Mixt%x, Mixt%r, ier)* *call VecSet(Mixt%r, zero, ier)* *call SNESCreate(PETSC_COMM_WORLD, Mixt%snes, ier)* *call SNESSetFunction(Mixt%snes, Mixt%r, FormPetscResidual, Collect, ier)* *call MatCreateSNESMF(Mixt%snes, Mixt%A, ier)* *call SNESSetJacobian(Mixt%snes, Mixt%A, Mixt%A, MatMFFDComputeJacobian, Collect, ier)* *call SNESSetFromOptions(Mixt%snes, ier)* *call SNESGetKSP(Mixt%snes, ksp, ier)* *call KSPSetType(ksp, KSPGMRES, ier)* *call KSPGetPC(ksp, pc, ier)* *call PCSetType(pc, PCNONE, ier)* *call KSPSetInitialGuessNonzero(ksp, PETSC_TRUE, ier)* *call KSPGMRESSetRestart(ksp, 30, ier)* *call KSPGMRESSetPreAllocation(ksp, ier)* *call SNESSetFunction(Mixt%snes, Mixt%r, FormPetscResidual, Collect, ier)* *call SNESSetJacobian(Mixt%snes, Mixt%A, Mixt%A, MatMFFDComputeJacobian, Collect, ier)* *call SNESSolve(Mixt%snes, PETSC_NULL_OBJECT, Mixt%x, ier)* *stop* ( for temporary ) -------------------------------------------------------------------------------------------------------------------- *subroutine FormPetscResidual(snes, x, f, Collect, ier)* type(t_Collect), intent(inout) :: Collect SNES :: snes Vec :: x, f integer :: ier, counter, iCell, iVar, temp integer :: ndof real(8), allocatable :: CVar(:,:) real(8), allocatable :: PVar(:,:) PetscScalar, pointer :: xx_v(:) PetscScalar, pointer :: ff_v(:) ! Set degree of freedom of this system. ndof = Collect%pMixt%nCVar * Collect%pGrid%nCell ! Backup the original values for cv to local array CVar allocate( CVar(0:Collect%pMixt%nCVar-1, Collect%pGrid%nCell) ) allocate( PVar(0:Collect%pMixt%nPVar-1, Collect%pGrid%nCell) ) allocate( xx_v(1:ndof) ) allocate( ff_v(1:ndof) ) xx_v(:) = 0d0 ff_v(:) = 0d0 ! Backup the original values for cv and pv do iCell = 1, Collect%pGrid%nCell do iVar = 0, Collect%pMixt%nCVar-1 CVar(iVar,iCell) = Collect%pMixt%cv(iVar,iCell) PVar(iVar,iCell) = Collect%pMixt%pv(iVar,iCell) end do end do ! 
Copy the input argument vector x to array value xx_v call VecGetArrayReadF90(x, xx_v, ier) call VecGetArrayF90(f, ff_v, ier) ! Compute copy the given vector into Mixt%cv and check for validity counter = 0 do iCell = 1, Collect%pGrid%nCell do iVar = 0, Collect%pMixt%nCVar-1 counter = counter + 1 Collect%pMixt%cv(iVar,iCell) = xx_v(counter) end do end do ! Update primitive variables with input x vector to compute residual call PostProcessing(Collect%pMixt,Collect%pGrid,Collect%pConf) ! Compute the residual call ComputeResidual(Collect%pMixt,Collect%pGrid,Collect%pConf) --> where update residual of cell ! Copy the residual array into the PETSc vector counter = 0 do iCell = 1, Collect%pGrid%nCell do iVar = 0, Collect%pMixt%nCVar-1 counter = counter + 1 ff_v(counter) = Collect%pMixt%Residual(iVar,iCell) + Collect%pGrid%vol(iCell)/Collect%pMixt%TimeStep(iCell)*( Collect%pMixt%cv(iVar,iCell) - CVar(iVar,iCell) ) end do end do ! Restore conservative variables do iCell = 1, Collect%pGrid%nCell do iVar = 0, Collect%pMixt%nCVar-1 Collect%pMixt%cv(iVar,iCell) = CVar(iVar,iCell) Collect%pMixt%pv(iVar,iCell) = PVar(iVar,iCell) end do end do call VecRestoreArrayReadF90(x, xx_v, ier) call VecRestoreArrayF90(f, ff_v, ier) deallocate(CVar) deallocate(PVar) -------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------- *- option log* -------------------------------------------------------------------------------------------------------------------- SNES Object: 1 MPI processes type: newtonls SNES has not been set up so information may be incomplete maximum iterations=1, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-32, solution=1e-08 total number of linear solver iterations=0 total number of function evaluations=0 norm schedule ALWAYS SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=0.001, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using DEFAULT norm type for convergence test PC Object: 1 MPI processes type: none PC has not been set up so information may be incomplete linear system matrix = precond matrix: Mat Object: 1 MPI processes type: mffd rows=11616, cols=11616 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) The compute h routine has not yet been set Sincerely, Kyungjun 2016-08-19 13:00 GMT+09:00 Barry Smith : > > > On Aug 18, 2016, at 10:28 PM, ??? wrote: > > > > Dear Matt. > > > > I didn't use the command line options because it looked not working. > > > > I called SNESSetFromOptions(snes, ier) in my source code, > > > > but options like -snes_mf or -snes_monitor doesn't look working. > > "doesn't work" is not useful to help us figure out what has gone wrong. > You need to show us EXACTLY what you did by sending the code you compiled > and the command line options you ran and all the output include full error > messages. 
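A small aside on the FormPetscResidual listing above: xx_v and ff_v are declared as Fortran pointers, so they should not be allocated before the VecGetArray calls; VecGetArrayReadF90() and VecGetArrayF90() associate the pointers directly with the vector storage, and an earlier allocate() is simply leaked. A minimal sketch of the intended access pattern, with x, f and ierr as in the routine above:

   PetscScalar, pointer :: xx_v(:), ff_v(:)
   ! no allocate(xx_v) / allocate(ff_v) here
   call VecGetArrayReadF90(x,xx_v,ierr); CHKERRQ(ierr)
   call VecGetArrayF90(f,ff_v,ierr); CHKERRQ(ierr)
   ! ... fill ff_v using xx_v ...
   call VecRestoreArrayReadF90(x,xx_v,ierr); CHKERRQ(ierr)
   call VecRestoreArrayF90(f,ff_v,ierr); CHKERRQ(ierr)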
Without the information we simply do not have enough information > to even begin to guess why it "doesn't work". > > Barry > > > > > > > > Is there anything that I should consider more? > > > > > > 2016-08-19 4:47 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 2:44 PM, ??? wrote: > > Is there a part that you considered this as finite-difference > approximation? > > I thought I used matrix-free method with MatCreateSNESMF() function > > > > You did not tell the SNES to use a MF Jacobian, you just made a Mat > object. This is why > > we encourage people to use the command line. Everything is setup > correctly and in order. > > Why would you choose not to. This creates long rounds of email. > > > > Matt > > > > Also I used > > - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at > the log > > > > > > I didn't use any of command line options. > > > > > > Kyungjun > > > > 2016-08-19 4:27 GMT+09:00 Barry Smith : > > > > You can't use that Jacobian function SNESComputeJacobianDefault with > matrix free, it tries to compute the matrix entries and stick them into the > matrix. You can use MatMFFDComputeJacobian > > > > > On Aug 18, 2016, at 2:03 PM, ??? wrote: > > > > > > I got stuck at FormJacobian stage. > > > > > > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) > --> J & pJ are same with A matrix-free matrix (input argument) > > > > > > > > > > > > with these kind of messages.. > > > > > > [0]PETSC ERROR: No support for this operation for this object type > > > [0]PETSC ERROR: Mat type mffd > > > > > > > > > > > > Guess it's because I used A matrix-free matrix (which is mffd type) > into pJ position. > > > > > > Is there any solution for this kind of situation? > > > > > > > > > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 12:04 PM, ??? > wrote: > > > Then in order not to use preconditioner, > > > > > > is it ok if I just put A matrix-free matrix (made from > MatCreateSNESMF()) into the place where preA should be? > > > > > > Yes, but again the solve will likely perform very poorly. > > > > > > Thanks, > > > > > > Matt > > > > > > The flow goes like this > > > - call SNESCreate > > > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) > > > - call MatCreateSNESMF(snes, A, ier) > > > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) > > > - call SNESSetFromOptions() > > > > > > - call SNESGetKSP(snes, ksp, ier) > > > - call KSPSetType(ksp, KSPGMRES, ier) > > > - call KSPGetPC(ksp, pc, ier) > > > - call PCSetType(pc, PCNONE, ier) > > > - call KSPGMRESSetRestart(ksp, 30, ier) > > > > > > - call SNESSolve() > > > . > > > . > > > > > > > > > and inside the FormJacobian routine > > > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ > must be pointed with A and A. > > > > > > > > > > > > Thank you again, > > > > > > Kyungjun. > > > > > > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 11:42 AM, ??? > wrote: > > > Thanks for your helpful answers. > > > > > > Here's another question... > > > > > > As I read some example PETSc codes, I noticed that there should be a > preconditioning matrix (e.g. approx. jacobian matrix) when using > MatCreateSNESMF(). > > > > > > I mean, > > > after calling MatCreateSNESMF(snes, A, ier), > > > there should be another matrix preA(preconditioning matrix) to use > SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). > > > > > > > > > 1) Is there any way that I can use matrix-free method without making > preconditioning matrix? 
> > > > > > Don't use a preconditioner. As you might expect, this does not often > work out well. > > > > > > 2) I have a reference code, and the code adopts > > > > > > MatFDColoringCreate() > > > and finally uses > > > SNESComputeJacobianDefaultColor() at FormJacobian stage. > > > > > > But I can't see the inside of the fdcolor and I'm curious of this > mechanism. Can you explain this very briefly or tell me an example code > that I can refer to. ( I think none of PETSc example code is using > fdcolor..) > > > > > > This is the default, so there is no need for all that code. We use > naive graph 2-coloring. I think there might be a review article by Alex > Pothen about that. > > > > > > Thanks, > > > > > > Matt > > > > > > > > > Best, > > > > > > Kyungjun. > > > > > > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 10:39 AM, ??? > wrote: > > > 1) I wanna know the difference between applying option with command > line and within source code. > > > From my experience, command line option helps set other default > settings that I didn't applied, I guess. > > > > > > The command line arguments are applied to an object when > *SetFromOptions() is called, so in this case > > > you want SNESSetFromOptions() on the solver. There should be no > difference from using the API. > > > > > > 2) I made a matrix-free matrix with MatCreateSNESMF function, and > every time I check my snes context with SNESView, > > > > > > Mat Object: 1 MPI processes > > > type: mffd > > > rows=11616, cols=11616 > > > Matrix-free approximation: > > > err=1.49012e-08 (relative error in function evaluation) > > > The compute h routine has not yet been set > > > > > > at the end of line shows there's no routine for computing h value. > > > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. > > > Is it ok if I leave the h value that way? Or should I have to set h > computing routine? > > > > > > I am guessing you are calling the function on a different object from > the one that is viewed here. > > > However, there will always be a default function for computing h. > > > > > > Thanks, > > > > > > Matt > > > > > > Kyungjun. > > > > > > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 8:35 AM, ??? > wrote: > > > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of > computing h value. > > > > > > I found above command (MatSNESMFWPSetComputeNormU) but my fortran > compiler couldn't fine any reference of that command. > > > > > > I checked Petsc changes log, but there weren't any mentions about that > command. > > > > > > Should I have to include another specific header file? > > > > > > We have this function > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/ > MatMFFDWPSetComputeNormU.html > > > > > > but I would recommend using the command line option > > > > > > -mat_mffd_compute_normu > > > > > > Thanks, > > > > > > Matt > > > > > > Thank you always. > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 19 06:47:56 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 19 Aug 2016 06:47:56 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> <88CE2F93-938B-46D8-BEC5-7287E2430353@mcs.anl.gov> Message-ID: On Fri, Aug 19, 2016 at 12:04 AM, ??? wrote: > Dear Barry and Matt. > > Thank you very much for helping me up all night. (in my time) > > And sorry for not asking with sufficient source code condition or my > circumstances. (also with poor English.) > > > I just want to make sure that the options of my code is well applied. > > I'm trying to use GMRES with matrix-free method. I'd like to solve 2-D > euler equation without preconditioning matrix, for now. > > > 1) I'm still curious whether my snes context is using MF jacobian. ( just > like -snes_mf command line option) > > 2) And mind if I ask you that whether I applied petsc functions properly? > > I'll check out ex5 for applying command line options. > > > I'll attach my petsc flow code and option log by SNESView() below. > ------------------------------------------------------------ > -------------------------------------------------------- > *- petsc flow code* > ------------------------------------------------------------ > -------------------------------------------------------- > > ndof = Mixt%nCVar * Grid%nCell > > *call VecCreateMPIWIthArray(PETSC_COMM_WORLD, Mixt%nCVar, ndof, > PETSC_DECIDE, Mixt%cv, Mixt%x, ier)* > *call VecDuplicate(Mixt%x, Mixt%r, ier)* > *call VecSet(Mixt%r, zero, ier)* > > *call SNESCreate(PETSC_COMM_WORLD, Mixt%snes, ier)* > *call SNESSetFunction(Mixt%snes, Mixt%r, FormPetscResidual, Collect, ier)* > Remove these two lines > *call MatCreateSNESMF(Mixt%snes, Mixt%A, ier)* > > *call SNESSetJacobian(Mixt%snes, Mixt%A, Mixt%A, MatMFFDComputeJacobian, > Collect, ier)* > > *call SNESSetFromOptions(Mixt%snes, ier)* > And also all this stuff > *call SNESGetKSP(Mixt%snes, ksp, ier)* > *call KSPSetType(ksp, KSPGMRES, ier)* > *call KSPGetPC(ksp, pc, ier)* > *call PCSetType(pc, PCNONE, ier)* > *call KSPSetInitialGuessNonzero(ksp, PETSC_TRUE, ier)* > *call KSPGMRESSetRestart(ksp, 30, ier)* > *call KSPGMRESSetPreAllocation(ksp, ier)* > > > *call SNESSetFunction(Mixt%snes, Mixt%r, FormPetscResidual, Collect, ier)* > *call SNESSetJacobian(Mixt%snes, Mixt%A, Mixt%A, MatMFFDComputeJacobian, > Collect, ier)* > until here. Then give -snes_mf -pc_type none -snes_view -snes_monitor -ksp_monitor_true_residual -ksp_converged_reason -snes_convergred_reason and you have what you want. 
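Spelled out, the reduced driver described here would look roughly like the sketch below. Mixt%snes, Mixt%r, Mixt%x, FormPetscResidual and Collect are the names from the code being discussed; everything else is chosen at run time through the listed options (the last one is spelled -snes_converged_reason).

   call SNESCreate(PETSC_COMM_WORLD,Mixt%snes,ierr); CHKERRQ(ierr)
   call SNESSetFunction(Mixt%snes,Mixt%r,FormPetscResidual,Collect,ierr); CHKERRQ(ierr)
   call SNESSetFromOptions(Mixt%snes,ierr); CHKERRQ(ierr)
   call SNESSolve(Mixt%snes,PETSC_NULL_OBJECT,Mixt%x,ierr); CHKERRQ(ierr)
   ! run with:
   !   -snes_mf -pc_type none -snes_view -snes_monitor
   !   -ksp_monitor_true_residual -ksp_converged_reason -snes_converged_reason

With -snes_mf the matrix-free Jacobian is created and attached automatically, so the explicit MatCreateSNESMF/SNESSetJacobian and KSP/PC calls above are not needed.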
Matt > *call SNESSolve(Mixt%snes, PETSC_NULL_OBJECT, Mixt%x, ier)* > > *stop* ( for temporary ) > > > ------------------------------------------------------------ > -------------------------------------------------------- > *subroutine FormPetscResidual(snes, x, f, Collect, ier)* > type(t_Collect), intent(inout) :: Collect > > SNES :: snes > Vec :: x, f > integer :: ier, counter, iCell, iVar, temp > integer :: ndof > real(8), allocatable :: CVar(:,:) > real(8), allocatable :: PVar(:,:) > PetscScalar, pointer :: xx_v(:) > PetscScalar, pointer :: ff_v(:) > > ! Set degree of freedom of this system. > ndof = Collect%pMixt%nCVar * Collect%pGrid%nCell > > ! Backup the original values for cv to local array CVar > allocate( CVar(0:Collect%pMixt%nCVar-1, Collect%pGrid%nCell) ) > allocate( PVar(0:Collect%pMixt%nPVar-1, Collect%pGrid%nCell) ) > allocate( xx_v(1:ndof) ) > allocate( ff_v(1:ndof) ) > xx_v(:) = 0d0 > ff_v(:) = 0d0 > > ! Backup the original values for cv and pv > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > CVar(iVar,iCell) = Collect%pMixt%cv(iVar,iCell) > PVar(iVar,iCell) = Collect%pMixt%pv(iVar,iCell) > end do > end do > > ! Copy the input argument vector x to array value xx_v > call VecGetArrayReadF90(x, xx_v, ier) > call VecGetArrayF90(f, ff_v, ier) > > ! Compute copy the given vector into Mixt%cv and check for validity > counter = 0 > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > counter = counter + 1 > Collect%pMixt%cv(iVar,iCell) = xx_v(counter) > end do > end do > > ! Update primitive variables with input x vector to compute residual > call PostProcessing(Collect%pMixt,Collect%pGrid,Collect%pConf) > > > ! Compute the residual > call ComputeResidual(Collect%pMixt,Collect%pGrid,Collect%pConf) --> > where update residual of cell > > ! Copy the residual array into the PETSc vector > counter = 0 > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > counter = counter + 1 > > ff_v(counter) = Collect%pMixt%Residual(iVar,iCell) + > Collect%pGrid%vol(iCell)/Collect%pMixt%TimeStep(iCell)*( > Collect%pMixt%cv(iVar,iCell) - CVar(iVar,iCell) ) > end do > end do > > ! 
Restore conservative variables > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > Collect%pMixt%cv(iVar,iCell) = CVar(iVar,iCell) > Collect%pMixt%pv(iVar,iCell) = PVar(iVar,iCell) > end do > end do > > call VecRestoreArrayReadF90(x, xx_v, ier) > call VecRestoreArrayF90(f, ff_v, ier) > > deallocate(CVar) > deallocate(PVar) > ------------------------------------------------------------ > -------------------------------------------------------- > > > ------------------------------------------------------------ > -------------------------------------------------------- > *- option log* > ------------------------------------------------------------ > -------------------------------------------------------- > SNES Object: 1 MPI processes > type: newtonls > SNES has not been set up so information may be incomplete > maximum iterations=1, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-32, solution=1e-08 > total number of linear solver iterations=0 > total number of function evaluations=0 > norm schedule ALWAYS > SNESLineSearch Object: 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=0.001, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using DEFAULT norm type for convergence test > PC Object: 1 MPI processes > type: none > PC has not been set up so information may be incomplete > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mffd > rows=11616, cols=11616 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > The compute h routine has not yet been set > > > Sincerely, > > Kyungjun > > > 2016-08-19 13:00 GMT+09:00 Barry Smith : > >> >> > On Aug 18, 2016, at 10:28 PM, ??? wrote: >> > >> > Dear Matt. >> > >> > I didn't use the command line options because it looked not working. >> > >> > I called SNESSetFromOptions(snes, ier) in my source code, >> > >> > but options like -snes_mf or -snes_monitor doesn't look working. >> >> "doesn't work" is not useful to help us figure out what has gone wrong. >> You need to show us EXACTLY what you did by sending the code you compiled >> and the command line options you ran and all the output include full error >> messages. Without the information we simply do not have enough information >> to even begin to guess why it "doesn't work". >> >> Barry >> >> >> > >> > >> > Is there anything that I should consider more? >> > >> > >> > 2016-08-19 4:47 GMT+09:00 Matthew Knepley : >> > On Thu, Aug 18, 2016 at 2:44 PM, ??? wrote: >> > Is there a part that you considered this as finite-difference >> approximation? >> > I thought I used matrix-free method with MatCreateSNESMF() function >> > >> > You did not tell the SNES to use a MF Jacobian, you just made a Mat >> object. This is why >> > we encourage people to use the command line. Everything is setup >> correctly and in order. >> > Why would you choose not to. This creates long rounds of email. 
>> > >> > Matt >> > >> > Also I used >> > - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at >> the log >> > >> > >> > I didn't use any of command line options. >> > >> > >> > Kyungjun >> > >> > 2016-08-19 4:27 GMT+09:00 Barry Smith : >> > >> > You can't use that Jacobian function SNESComputeJacobianDefault >> with matrix free, it tries to compute the matrix entries and stick them >> into the matrix. You can use MatMFFDComputeJacobian >> > >> > > On Aug 18, 2016, at 2:03 PM, ??? wrote: >> > > >> > > I got stuck at FormJacobian stage. >> > > >> > > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, >> ier) --> J & pJ are same with A matrix-free matrix (input argument) >> > > >> > > >> > > >> > > with these kind of messages.. >> > > >> > > [0]PETSC ERROR: No support for this operation for this object type >> > > [0]PETSC ERROR: Mat type mffd >> > > >> > > >> > > >> > > Guess it's because I used A matrix-free matrix (which is mffd type) >> into pJ position. >> > > >> > > Is there any solution for this kind of situation? >> > > >> > > >> > > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : >> > > On Thu, Aug 18, 2016 at 12:04 PM, ??? >> wrote: >> > > Then in order not to use preconditioner, >> > > >> > > is it ok if I just put A matrix-free matrix (made from >> MatCreateSNESMF()) into the place where preA should be? >> > > >> > > Yes, but again the solve will likely perform very poorly. >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > The flow goes like this >> > > - call SNESCreate >> > > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) >> > > - call MatCreateSNESMF(snes, A, ier) >> > > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) >> > > - call SNESSetFromOptions() >> > > >> > > - call SNESGetKSP(snes, ksp, ier) >> > > - call KSPSetType(ksp, KSPGMRES, ier) >> > > - call KSPGetPC(ksp, pc, ier) >> > > - call PCSetType(pc, PCNONE, ier) >> > > - call KSPGMRESSetRestart(ksp, 30, ier) >> > > >> > > - call SNESSolve() >> > > . >> > > . >> > > >> > > >> > > and inside the FormJacobian routine >> > > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and >> pJ must be pointed with A and A. >> > > >> > > >> > > >> > > Thank you again, >> > > >> > > Kyungjun. >> > > >> > > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : >> > > On Thu, Aug 18, 2016 at 11:42 AM, ??? >> wrote: >> > > Thanks for your helpful answers. >> > > >> > > Here's another question... >> > > >> > > As I read some example PETSc codes, I noticed that there should be a >> preconditioning matrix (e.g. approx. jacobian matrix) when using >> MatCreateSNESMF(). >> > > >> > > I mean, >> > > after calling MatCreateSNESMF(snes, A, ier), >> > > there should be another matrix preA(preconditioning matrix) to use >> SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). >> > > >> > > >> > > 1) Is there any way that I can use matrix-free method without making >> preconditioning matrix? >> > > >> > > Don't use a preconditioner. As you might expect, this does not often >> work out well. >> > > >> > > 2) I have a reference code, and the code adopts >> > > >> > > MatFDColoringCreate() >> > > and finally uses >> > > SNESComputeJacobianDefaultColor() at FormJacobian stage. >> > > >> > > But I can't see the inside of the fdcolor and I'm curious of this >> mechanism. Can you explain this very briefly or tell me an example code >> that I can refer to. ( I think none of PETSc example code is using >> fdcolor..) >> > > >> > > This is the default, so there is no need for all that code. 
We use >> naive graph 2-coloring. I think there might be a review article by Alex >> Pothen about that. >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > >> > > Best, >> > > >> > > Kyungjun. >> > > >> > > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : >> > > On Thu, Aug 18, 2016 at 10:39 AM, ??? >> wrote: >> > > 1) I wanna know the difference between applying option with command >> line and within source code. >> > > From my experience, command line option helps set other default >> settings that I didn't applied, I guess. >> > > >> > > The command line arguments are applied to an object when >> *SetFromOptions() is called, so in this case >> > > you want SNESSetFromOptions() on the solver. There should be no >> difference from using the API. >> > > >> > > 2) I made a matrix-free matrix with MatCreateSNESMF function, and >> every time I check my snes context with SNESView, >> > > >> > > Mat Object: 1 MPI processes >> > > type: mffd >> > > rows=11616, cols=11616 >> > > Matrix-free approximation: >> > > err=1.49012e-08 (relative error in function evaluation) >> > > The compute h routine has not yet been set >> > > >> > > at the end of line shows there's no routine for computing h value. >> > > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. >> > > Is it ok if I leave the h value that way? Or should I have to set h >> computing routine? >> > > >> > > I am guessing you are calling the function on a different object from >> the one that is viewed here. >> > > However, there will always be a default function for computing h. >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > Kyungjun. >> > > >> > > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : >> > > On Thu, Aug 18, 2016 at 8:35 AM, ??? >> wrote: >> > > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way >> of computing h value. >> > > >> > > I found above command (MatSNESMFWPSetComputeNormU) but my fortran >> compiler couldn't fine any reference of that command. >> > > >> > > I checked Petsc changes log, but there weren't any mentions about >> that command. >> > > >> > > Should I have to include another specific header file? >> > > >> > > We have this function >> > > >> > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages >> /Mat/MatMFFDWPSetComputeNormU.html >> > > >> > > but I would recommend using the command line option >> > > >> > > -mat_mffd_compute_normu >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > Thank you always. >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > > >> > > >> > > >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> > > -- Norbert Wiener >> > > >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 19 07:14:37 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 19 Aug 2016 07:14:37 -0500 Subject: [petsc-users] Question about using MatSNESMFWPSetComputeNormU In-Reply-To: References: <04A37A4A-66DA-461A-983A-B031EFD0183D@mcs.anl.gov> <88CE2F93-938B-46D8-BEC5-7287E2430353@mcs.anl.gov> Message-ID: <64FDE88F-897A-48C0-89E4-214DB4619DAF@mcs.anl.gov> It looks like the SNESView() you have below was called before you ever did a solve, hence it prints the message "information may be incomplete". Note also zero function evaluations have been done in the SNESSolve, if the solve had been called it should be great than 0. SNES Object: 1 MPI processes type: newtonls SNES has not been set up so information may be incomplete This is also why it prints The compute h routine has not yet been set The information about the h routine won't be printed until after an actual solve is done and the "compute h" function is set. Barry Note you can call MatMFFDSetType() to control the "compute h" function that is used. > On Aug 19, 2016, at 12:04 AM, ??? wrote: > > Dear Barry and Matt. > > Thank you very much for helping me up all night. (in my time) > > And sorry for not asking with sufficient source code condition or my circumstances. (also with poor English.) > > > I just want to make sure that the options of my code is well applied. > > I'm trying to use GMRES with matrix-free method. I'd like to solve 2-D euler equation without preconditioning matrix, for now. > > > 1) I'm still curious whether my snes context is using MF jacobian. ( just like -snes_mf command line option) > > 2) And mind if I ask you that whether I applied petsc functions properly? > > I'll check out ex5 for applying command line options. > > > I'll attach my petsc flow code and option log by SNESView() below. 
> -------------------------------------------------------------------------------------------------------------------- > - petsc flow code > -------------------------------------------------------------------------------------------------------------------- > > ndof = Mixt%nCVar * Grid%nCell > > call VecCreateMPIWIthArray(PETSC_COMM_WORLD, Mixt%nCVar, ndof, PETSC_DECIDE, Mixt%cv, Mixt%x, ier) > call VecDuplicate(Mixt%x, Mixt%r, ier) > call VecSet(Mixt%r, zero, ier) > > call SNESCreate(PETSC_COMM_WORLD, Mixt%snes, ier) > call SNESSetFunction(Mixt%snes, Mixt%r, FormPetscResidual, Collect, ier) > call MatCreateSNESMF(Mixt%snes, Mixt%A, ier) > > call SNESSetJacobian(Mixt%snes, Mixt%A, Mixt%A, MatMFFDComputeJacobian, Collect, ier) > call SNESSetFromOptions(Mixt%snes, ier) > > call SNESGetKSP(Mixt%snes, ksp, ier) > call KSPSetType(ksp, KSPGMRES, ier) > call KSPGetPC(ksp, pc, ier) > call PCSetType(pc, PCNONE, ier) > call KSPSetInitialGuessNonzero(ksp, PETSC_TRUE, ier) > call KSPGMRESSetRestart(ksp, 30, ier) > call KSPGMRESSetPreAllocation(ksp, ier) > > > call SNESSetFunction(Mixt%snes, Mixt%r, FormPetscResidual, Collect, ier) > call SNESSetJacobian(Mixt%snes, Mixt%A, Mixt%A, MatMFFDComputeJacobian, Collect, ier) > > call SNESSolve(Mixt%snes, PETSC_NULL_OBJECT, Mixt%x, ier) > > stop ( for temporary ) > > > -------------------------------------------------------------------------------------------------------------------- > subroutine FormPetscResidual(snes, x, f, Collect, ier) > type(t_Collect), intent(inout) :: Collect > > SNES :: snes > Vec :: x, f > integer :: ier, counter, iCell, iVar, temp > integer :: ndof > real(8), allocatable :: CVar(:,:) > real(8), allocatable :: PVar(:,:) > PetscScalar, pointer :: xx_v(:) > PetscScalar, pointer :: ff_v(:) > > ! Set degree of freedom of this system. > ndof = Collect%pMixt%nCVar * Collect%pGrid%nCell > > ! Backup the original values for cv to local array CVar > allocate( CVar(0:Collect%pMixt%nCVar-1, Collect%pGrid%nCell) ) > allocate( PVar(0:Collect%pMixt%nPVar-1, Collect%pGrid%nCell) ) > allocate( xx_v(1:ndof) ) > allocate( ff_v(1:ndof) ) > xx_v(:) = 0d0 > ff_v(:) = 0d0 > > ! Backup the original values for cv and pv > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > CVar(iVar,iCell) = Collect%pMixt%cv(iVar,iCell) > PVar(iVar,iCell) = Collect%pMixt%pv(iVar,iCell) > end do > end do > > ! Copy the input argument vector x to array value xx_v > call VecGetArrayReadF90(x, xx_v, ier) > call VecGetArrayF90(f, ff_v, ier) > > ! Compute copy the given vector into Mixt%cv and check for validity > counter = 0 > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > counter = counter + 1 > Collect%pMixt%cv(iVar,iCell) = xx_v(counter) > end do > end do > > ! Update primitive variables with input x vector to compute residual > call PostProcessing(Collect%pMixt,Collect%pGrid,Collect%pConf) > > > ! Compute the residual > call ComputeResidual(Collect%pMixt,Collect%pGrid,Collect%pConf) --> where update residual of cell > > ! Copy the residual array into the PETSc vector > counter = 0 > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > counter = counter + 1 > > ff_v(counter) = Collect%pMixt%Residual(iVar,iCell) + Collect%pGrid%vol(iCell)/Collect%pMixt%TimeStep(iCell)*( Collect%pMixt%cv(iVar,iCell) - CVar(iVar,iCell) ) > end do > end do > > ! 
Restore conservative variables > do iCell = 1, Collect%pGrid%nCell > do iVar = 0, Collect%pMixt%nCVar-1 > Collect%pMixt%cv(iVar,iCell) = CVar(iVar,iCell) > Collect%pMixt%pv(iVar,iCell) = PVar(iVar,iCell) > end do > end do > > call VecRestoreArrayReadF90(x, xx_v, ier) > call VecRestoreArrayF90(f, ff_v, ier) > > deallocate(CVar) > deallocate(PVar) > -------------------------------------------------------------------------------------------------------------------- > > > -------------------------------------------------------------------------------------------------------------------- > - option log > -------------------------------------------------------------------------------------------------------------------- > SNES Object: 1 MPI processes > type: newtonls > SNES has not been set up so information may be incomplete > maximum iterations=1, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-32, solution=1e-08 > total number of linear solver iterations=0 > total number of function evaluations=0 > norm schedule ALWAYS > SNESLineSearch Object: 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=0.001, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using DEFAULT norm type for convergence test > PC Object: 1 MPI processes > type: none > PC has not been set up so information may be incomplete > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: mffd > rows=11616, cols=11616 > Matrix-free approximation: > err=1.49012e-08 (relative error in function evaluation) > The compute h routine has not yet been set > > > Sincerely, > > Kyungjun > > > 2016-08-19 13:00 GMT+09:00 Barry Smith : > > > On Aug 18, 2016, at 10:28 PM, ??? wrote: > > > > Dear Matt. > > > > I didn't use the command line options because it looked not working. > > > > I called SNESSetFromOptions(snes, ier) in my source code, > > > > but options like -snes_mf or -snes_monitor doesn't look working. > > "doesn't work" is not useful to help us figure out what has gone wrong. You need to show us EXACTLY what you did by sending the code you compiled and the command line options you ran and all the output include full error messages. Without the information we simply do not have enough information to even begin to guess why it "doesn't work". > > Barry > > > > > > > > Is there anything that I should consider more? > > > > > > 2016-08-19 4:47 GMT+09:00 Matthew Knepley : > > On Thu, Aug 18, 2016 at 2:44 PM, ??? wrote: > > Is there a part that you considered this as finite-difference approximation? > > I thought I used matrix-free method with MatCreateSNESMF() function > > > > You did not tell the SNES to use a MF Jacobian, you just made a Mat object. This is why > > we encourage people to use the command line. Everything is setup correctly and in order. > > Why would you choose not to. This creates long rounds of email. > > > > Matt > > > > Also I used > > - call PCSetType(pc, PCNONE, ier) --> so the pc type shows 'none' at the log > > > > > > I didn't use any of command line options. 
> > > > > > Kyungjun > > > > 2016-08-19 4:27 GMT+09:00 Barry Smith : > > > > You can't use that Jacobian function SNESComputeJacobianDefault with matrix free, it tries to compute the matrix entries and stick them into the matrix. You can use MatMFFDComputeJacobian > > > > > On Aug 18, 2016, at 2:03 PM, ??? wrote: > > > > > > I got stuck at FormJacobian stage. > > > > > > - call SNESComputeJacobianDefault(snes, v, J, pJ, FormResidual, ier) --> J & pJ are same with A matrix-free matrix (input argument) > > > > > > > > > > > > with these kind of messages.. > > > > > > [0]PETSC ERROR: No support for this operation for this object type > > > [0]PETSC ERROR: Mat type mffd > > > > > > > > > > > > Guess it's because I used A matrix-free matrix (which is mffd type) into pJ position. > > > > > > Is there any solution for this kind of situation? > > > > > > > > > 2016-08-19 2:05 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 12:04 PM, ??? wrote: > > > Then in order not to use preconditioner, > > > > > > is it ok if I just put A matrix-free matrix (made from MatCreateSNESMF()) into the place where preA should be? > > > > > > Yes, but again the solve will likely perform very poorly. > > > > > > Thanks, > > > > > > Matt > > > > > > The flow goes like this > > > - call SNESCreate > > > - call SNESSetFunction(snes, r, FormResidual, userctx, ier) > > > - call MatCreateSNESMF(snes, A, ier) > > > - call SNESSetJacobian(snes, A, A, FormJacobian, userctx, ier) > > > - call SNESSetFromOptions() > > > > > > - call SNESGetKSP(snes, ksp, ier) > > > - call KSPSetType(ksp, KSPGMRES, ier) > > > - call KSPGetPC(ksp, pc, ier) > > > - call PCSetType(pc, PCNONE, ier) > > > - call KSPGMRESSetRestart(ksp, 30, ier) > > > > > > - call SNESSolve() > > > . > > > . > > > > > > > > > and inside the FormJacobian routine > > > - call SNESComputeJacobian(snes, v, J, pJ, userctx, ier) --> J and pJ must be pointed with A and A. > > > > > > > > > > > > Thank you again, > > > > > > Kyungjun. > > > > > > 2016-08-19 1:44 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 11:42 AM, ??? wrote: > > > Thanks for your helpful answers. > > > > > > Here's another question... > > > > > > As I read some example PETSc codes, I noticed that there should be a preconditioning matrix (e.g. approx. jacobian matrix) when using MatCreateSNESMF(). > > > > > > I mean, > > > after calling MatCreateSNESMF(snes, A, ier), > > > there should be another matrix preA(preconditioning matrix) to use SNESSetJacobian(snes, A, preA, FormJacobian, ctx, ier). > > > > > > > > > 1) Is there any way that I can use matrix-free method without making preconditioning matrix? > > > > > > Don't use a preconditioner. As you might expect, this does not often work out well. > > > > > > 2) I have a reference code, and the code adopts > > > > > > MatFDColoringCreate() > > > and finally uses > > > SNESComputeJacobianDefaultColor() at FormJacobian stage. > > > > > > But I can't see the inside of the fdcolor and I'm curious of this mechanism. Can you explain this very briefly or tell me an example code that I can refer to. ( I think none of PETSc example code is using fdcolor..) > > > > > > This is the default, so there is no need for all that code. We use naive graph 2-coloring. I think there might be a review article by Alex Pothen about that. > > > > > > Thanks, > > > > > > Matt > > > > > > > > > Best, > > > > > > Kyungjun. > > > > > > 2016-08-19 0:54 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 10:39 AM, ??? 
wrote: > > > 1) I wanna know the difference between applying option with command line and within source code. > > > From my experience, command line option helps set other default settings that I didn't applied, I guess. > > > > > > The command line arguments are applied to an object when *SetFromOptions() is called, so in this case > > > you want SNESSetFromOptions() on the solver. There should be no difference from using the API. > > > > > > 2) I made a matrix-free matrix with MatCreateSNESMF function, and every time I check my snes context with SNESView, > > > > > > Mat Object: 1 MPI processes > > > type: mffd > > > rows=11616, cols=11616 > > > Matrix-free approximation: > > > err=1.49012e-08 (relative error in function evaluation) > > > The compute h routine has not yet been set > > > > > > at the end of line shows there's no routine for computing h value. > > > I used MatMFFDWPSetComputeNormU function, but it didn't work I think. > > > Is it ok if I leave the h value that way? Or should I have to set h computing routine? > > > > > > I am guessing you are calling the function on a different object from the one that is viewed here. > > > However, there will always be a default function for computing h. > > > > > > Thanks, > > > > > > Matt > > > > > > Kyungjun. > > > > > > 2016-08-18 23:18 GMT+09:00 Matthew Knepley : > > > On Thu, Aug 18, 2016 at 8:35 AM, ??? wrote: > > > Hi, I'm trying to set my SNES matrix-free with Walker & Pernice way of computing h value. > > > > > > I found above command (MatSNESMFWPSetComputeNormU) but my fortran compiler couldn't fine any reference of that command. > > > > > > I checked Petsc changes log, but there weren't any mentions about that command. > > > > > > Should I have to include another specific header file? > > > > > > We have this function > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMFFDWPSetComputeNormU.html > > > > > > but I would recommend using the command line option > > > > > > -mat_mffd_compute_normu > > > > > > Thanks, > > > > > > Matt > > > > > > Thank you always. > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> > -- Norbert Wiener > > > > From bsmith at mcs.anl.gov Fri Aug 19 14:40:42 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 19 Aug 2016 14:40:42 -0500 Subject: [petsc-users] Segmentation faults: Derived types In-Reply-To: References: Message-ID: You code has a bug right at the top: CALL PetscInitialize(PETSC_COMM_WORLD,ierr) should be CALL PetscInitialize(PETSC_NULL_CHARACTER,ierr) you were just lucky previously that the stack frame was different enough that it did not previously crash. Once I corrected the code the new version ran without crashing. I found the bug very easily by simply running the new version directly in the debugger lldb ./ANSIFLOW and seeing it crashed with a crazy stack in petscinitialize_ Barry > On Aug 5, 2016, at 11:34 PM, Santiago Ospina De Los Rios wrote: > > Dear Barry, > > I tried to build a simple code with the same things I mentioned to you on last e-mail but it worked, which is more strange to me. So I built two branches on my git code to show you the problem: > > git code: https://github.com/SoilRos/ANISOFLOWPACK > > Branch: PETSc_debug_boolean_0 > The first one is a simple code which is working for what was designed. Forget sample problems, just compile and run the ANISOFLOW executable on src folder, if there are some verbose messages then the program is working. > > Branch: PETSc_debug_boolean_1 > The second one is just a modification of the first one adding the two booleans mentioned above in 01_Types.F90 file. I tried it in mac El Capitan and Ubuntu (with Valgrind) and PETSc 3.7.3 and 3.7.2 respectively, both with the same segmentation fault. > > PD: Although I already fixed it compressing the three booleans into one integer, I think is better if we try to figure out why there is a segmentation fault because I had similar problems before. > PD2: Please obviate the variable description because are pretty out of date. I'm trying to change it, so it can be confusing. > > Best wishes, > Santiago Ospina > > 2016-08-05 15:54 GMT-05:00 Barry Smith : > > > On Aug 1, 2016, at 4:41 PM, Santiago Ospina De Los Rios wrote: > > > > Hello there, > > > > I'm having problems defining some variables into derived types in Fortran. Before, I had a similar problems with an allocatable array "PetsInt" but I solved it just doing a non-collective Petsc Vec. Today I'm having troubles with "PetscBool" or "Logical": > > > > In a module which define the variables, I have the following: > > > > MODULE ANISOFLOW_Types > > > > IMPLICIT NONE > > > > #include > > #include > > > > ... > > > > TYPE ConductivityField > > PetscBool :: DefinedByCvtZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: DefinedByPptZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: DefinedByCell=.FALSE. > > ! Conductivity defined by zones (Local): > > Vec :: ZoneID > > TYPE(Tensor),ALLOCATABLE :: Zone(:) > > ! Conductivity defined on every cell (Local): > > Vec :: Cell > > END TYPE ConductivityField > > > > > > TYPE SpecificStorageField > > PetscBool :: DefinedByStoZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: DefinedByPptZones=.FALSE. ! It produces the segmentation fault. > > PetscBool :: DefinedByCell=.FALSE. > > ! Specific Storage defined by zones (Local): > > Vec :: ZoneID > > Vec :: Zone > > ! Specific Storage defined on every cell (Global).: > > Vec :: Cell > > END TYPE SpecificStorageField > > > > TYPE PropertiesField > > TYPE(ConductivityField) :: Cvt > > TYPE(SpecificStorageField) :: Sto > > ! 
Property defined by zones (Local): > > PetscBool :: DefinedByPptZones=.FALSE. > > Vec :: ZoneID > > END TYPE PropertiesField > > > > ... > > > > CONTAINS > > > > ... > > > > END MODULE ANISOFLOW_Types > > > > > > Later I use it in the main program, with something like this > > > > PROGRAM ANISOFLOW > > > > USE ANISOFLOW_Types, ONLY : ... ,PropertiesField, ... > > ... > > > > IMPLICIT NONE > > > > #include > > > > ... > > TYPE(PropertiesField) :: PptFld > > ... > > > > CALL PetscInitialize(PETSC_COMM_WORLD,ierr) > > ... > > CALL PetscFinalize(ierr) > > > > END PROGRAM > > > > > > When I run the program appears a Segmentation Fault, which disappears when I comment the booleans marked in the code. Because I need them, I used Valgrind to figure out what is happening but it is yet a mistery to me. > > > > Valgrind message: > > ==5160== > > ==5160== Invalid read of size 1 > > It is curious that it says "of size 1" when we declare PetscBool to be a logical*4 I don't see anything obviously wrong. > > Please send a simple code we can compile and run that reproduces the problem. > > Barry > > ==5160== at 0x4FB2156: petscinitialize_ (zstart.c:433) > > ==5160== by 0x4030EA: MAIN__ (ANISOFLOW.F90:29) # line of petsc inizalitation > > ==5160== by 0x404380: main (ANISOFLOW.F90:3) # line of "USE ANISOFLOW_Types, ONLY : ... ,PropertiesField, ..." > > ==5160== Address 0xc54fff is not stack'd, malloc'd or (recently) free'd > > ==5160== > > > > Program received signal SIGSEGV: Segmentation fault - invalid memory reference. > > > > Backtrace for this error: > > #0 0x699E777 > > #1 0x699ED7E > > #2 0x6F0BCAF > > #3 0x4FB2156 > > #4 0x4030EA in anisoflow at ANISOFLOW.F90:29 > > > > I think it is maybe related with petsc because the error popped out just in its initialization, so if you know what's going on, I would appreciate to tell me. > > > > Santiago Ospina > > -- > > > > -- > > Att: > > > > Santiago Ospina De Los R?os > > National University of Colombia > > > > > -- > > -- > Att: > > Santiago Ospina De Los R?os > National University of Colombia From dkg2140 at gmail.com Sat Aug 20 17:04:04 2016 From: dkg2140 at gmail.com (Krzysztof Gawarecki) Date: Sun, 21 Aug 2016 00:04:04 +0200 Subject: [petsc-users] MatCreateFFT in Fortran Message-ID: The problem was related to fortran wrapper: src/mat/impls/fft/ftn-custom/zfftf.c After changing: *ierr = MatCreateFFT(*comm,*ndim,dim,mattype,A); to *ierr = MatCreateFFT(MPI_Comm_f2c(*(MPI_Fint*)&*comm),*ndim,dim,mattype,A); FFTW works fine in fortran. Best regards, K.G. -------------- next part -------------- An HTML attachment was scrubbed... URL: From francesco.caimmi at polimi.it Sun Aug 21 05:01:29 2016 From: francesco.caimmi at polimi.it (Francesco Caimmi) Date: Sun, 21 Aug 2016 12:01:29 +0200 Subject: [petsc-users] [petsc4py] a problem with computeRHSFunctionLinear interface? In-Reply-To: References: <4270161.snsPm0L6UZ@wotan> Message-ID: <5537563.Y5rDWPrv14@wotan> Dear Lisandro, many thanks for your answer On Thursday 18 August 2016 10:20:59 Lisandro Dalcin wrote: > Dear Francesco, sorry for the late answer, I missed your email. > > You have to do it this way: > > ts.setRHSFunction(PETSc.TS.computeRHSFunctionLinear) > ts.setRHSJacobian(PETSc.TS.computeRHSJacobianConstant, J=A, P=A) Ok, now that works! > > I.e, you have to set unbound methods, not instance methods as you did > in your original code. Additionally, do not pass "args" nor "kargs". This makes sense to me, although they are passed in the original source code I was translating. 
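For anyone else translating ex4, a minimal sketch of the wiring Lisandro describes (the unbound class methods, no args/kargs) might look like the following; the operator, sizes and time step are invented placeholders rather than the actual ex4 data, and the sketch is serial:

from petsc4py import PETSc

n = 10
A = PETSc.Mat().createAIJ([n, n], comm=PETSc.COMM_SELF)  # stands in for the real RHS operator
A.setUp()
for i in range(n):
    A.setValue(i, i, -1.0)                               # simple decay system du/dt = -u
A.assemble()

ts = PETSc.TS().create(comm=PETSc.COMM_SELF)
ts.setProblemType(PETSc.TS.ProblemType.LINEAR)
ts.setType(PETSc.TS.Type.BEULER)
ts.setRHSFunction(PETSc.TS.computeRHSFunctionLinear)              # unbound method, no args/kargs
ts.setRHSJacobian(PETSc.TS.computeRHSJacobianConstant, J=A, P=A)  # constant Jacobian = A
ts.setTimeStep(0.01)
ts.setFromOptions()                                               # -ts_max_steps etc. picked up here

u = PETSc.Vec().createSeq(n)
u.set(1.0)
ts.solve(u)

Because the problem is marked LINEAR and the Jacobian is constant, TS evaluates the right-hand side internally as A*u, so no Python callback body is needed.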
I will also take the chance to ask a related question: in the original C code, lines 213 -214 read ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr); ierr=SNESSetJacobian(snes,NULL,NULL,SNESComputeJacobianDefault,NULL);CHKERRQ(ierr); How do I translate the last line in Python? I was not able to find the equivalent of SNESComputeJacobianDefault. Thanks, Francesco > > On 11 August 2016 at 10:36, Francesco Caimmi wrote: > > Dear all, > > > > I was trying to reproduce /ts/examples/tutorials/ex4.c in python to learn > > how to use TS solvers; the example uses the function > > TSComputeRHSFunctionLinear. However I get an error when running my code > > (attached in case you want to look at it), when I call ts.solve. > > > > Here is the trace: > > [fcaimmi at Wotan 2645] > ./ts_ex4.py > > Solving a linear TS problem, number of processors = 1 > > Timestep 0 : time = 0.0 2-norm error = 1.14956855594e-08 max norm error > > = 0> > > Traceback (most recent call last): > > File "./ts_ex4.py", line 473, in > > > > main(m = m, debug = debug) > > > > File "./ts_ex4.py", line 340, in main > > > > ts.solve(u) > > > > File "PETSc/TS.pyx", line 568, in petsc4py.PETSc.TS.solve > > > > (src/petsc4py.PETSc.c:188927) > > > > File "PETSc/petscts.pxi", line 221, in petsc4py.PETSc.TS_RHSFunction > > > > (src/petsc4py.PETSc.c:35490) > > > > File "PETSc/TS.pyx", line 189, in > > petsc4py.PETSc.TS.computeRHSFunctionLinear> > > (src/petsc4py.PETSc.c:181611) > > TypeError: computeRHSFunctionLinear() takes exactly 3 positional arguments > > (5 given) > > > > I cannot understand if there is a problem with my code or if the problem > > is in computeRHSFunctionLinear interface. > > > > I checked https://bitbucket.org/petsc/petsc4py/ and the interface to > > > > computeRHSFunctionLinear has three arguments, however I am not that much > > into petsc4py to tell how it gets called. > > > > I am on Petsc Release Version 3.7.3 > > > > Thank you for your time. > > > > Best, > > -- > > Francesco Caimmi > > > > Laboratorio di Ingegneria dei Polimeri > > http://www.chem.polimi.it/polyenglab/ > > > > Politecnico di Milano - Dipartimento di Chimica, > > Materiali e Ingegneria Chimica ?Giulio Natta? > > > > P.zza Leonardo da Vinci, 32 > > I-20133 Milano > > Tel. +39.02.2399.4711 > > Fax +39.02.7063.8173 > > > > francesco.caimmi at polimi.it > > Skype: fmglcaimmi (please arrange meetings by e-mail) From sospinar at unal.edu.co Sun Aug 21 05:09:46 2016 From: sospinar at unal.edu.co (Santiago Ospina De Los Rios) Date: Sun, 21 Aug 2016 05:09:46 -0500 Subject: [petsc-users] Segmentation faults: Derived types In-Reply-To: References: Message-ID: wow, what a dumb mistake. thank you very much! Since you've noticed me such error, I am taking the time to learn all of your advice about management and debugging code . Then I'm learning emacs and so... the matter is that you recommend to use PETSc tags which it's supposed to be in ${PETSC_DIR}/TAGS, but it seems to be changed or something because I wasn't able to find it. So, how can I use the tags you are referring to? 2016-08-19 14:40 GMT-05:00 Barry Smith : > > You code has a bug right at the top: > > CALL PetscInitialize(PETSC_COMM_WORLD,ierr) > > should be > > CALL PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > you were just lucky previously that the stack frame was different enough > that it did not previously crash. Once I corrected the code the new version > ran without crashing. 
I found the bug very easily by simply running the new > version directly in the debugger > > lldb ./ANSIFLOW > > and seeing it crashed with a crazy stack in petscinitialize_ > > Barry > > > > On Aug 5, 2016, at 11:34 PM, Santiago Ospina De Los Rios < > sospinar at unal.edu.co> wrote: > > > > Dear Barry, > > > > I tried to build a simple code with the same things I mentioned to you > on last e-mail but it worked, which is more strange to me. So I built two > branches on my git code to show you the problem: > > > > git code: https://github.com/SoilRos/ANISOFLOWPACK > > > > Branch: PETSc_debug_boolean_0 > > The first one is a simple code which is working for what was designed. > Forget sample problems, just compile and run the ANISOFLOW executable on > src folder, if there are some verbose messages then the program is working. > > > > Branch: PETSc_debug_boolean_1 > > The second one is just a modification of the first one adding the two > booleans mentioned above in 01_Types.F90 file. I tried it in mac El Capitan > and Ubuntu (with Valgrind) and PETSc 3.7.3 and 3.7.2 respectively, both > with the same segmentation fault. > > > > PD: Although I already fixed it compressing the three booleans into one > integer, I think is better if we try to figure out why there is a > segmentation fault because I had similar problems before. > > PD2: Please obviate the variable description because are pretty out of > date. I'm trying to change it, so it can be confusing. > > > > Best wishes, > > Santiago Ospina > > > > 2016-08-05 15:54 GMT-05:00 Barry Smith : > > > > > On Aug 1, 2016, at 4:41 PM, Santiago Ospina De Los Rios < > sospinar at unal.edu.co> wrote: > > > > > > Hello there, > > > > > > I'm having problems defining some variables into derived types in > Fortran. Before, I had a similar problems with an allocatable array > "PetsInt" but I solved it just doing a non-collective Petsc Vec. Today I'm > having troubles with "PetscBool" or "Logical": > > > > > > In a module which define the variables, I have the following: > > > > > > MODULE ANISOFLOW_Types > > > > > > IMPLICIT NONE > > > > > > #include > > > #include > > > > > > ... > > > > > > TYPE ConductivityField > > > PetscBool :: > DefinedByCvtZones=.FALSE. ! It produces the segmentation fault. > > > PetscBool :: > DefinedByPptZones=.FALSE. ! It produces the segmentation fault. > > > PetscBool :: > DefinedByCell=.FALSE. > > > ! Conductivity defined by zones (Local): > > > Vec :: ZoneID > > > TYPE(Tensor),ALLOCATABLE :: Zone(:) > > > ! Conductivity defined on every cell (Local): > > > Vec :: Cell > > > END TYPE ConductivityField > > > > > > > > > TYPE SpecificStorageField > > > PetscBool :: > DefinedByStoZones=.FALSE. ! It produces the segmentation fault. > > > PetscBool :: > DefinedByPptZones=.FALSE. ! It produces the segmentation fault. > > > PetscBool :: > DefinedByCell=.FALSE. > > > ! Specific Storage defined by zones (Local): > > > Vec :: ZoneID > > > Vec :: Zone > > > ! Specific Storage defined on every cell (Global).: > > > Vec :: Cell > > > END TYPE SpecificStorageField > > > > > > TYPE PropertiesField > > > TYPE(ConductivityField) :: Cvt > > > TYPE(SpecificStorageField) :: Sto > > > ! Property defined by zones (Local): > > > PetscBool :: DefinedByPptZones=.FALSE. > > > Vec :: ZoneID > > > END TYPE PropertiesField > > > > > > ... > > > > > > CONTAINS > > > > > > ... 
> > > > > > END MODULE ANISOFLOW_Types > > > > > > > > > Later I use it in the main program, with something like this > > > > > > PROGRAM ANISOFLOW > > > > > > USE ANISOFLOW_Types, ONLY : ... ,PropertiesField, > ... > > > ... > > > > > > IMPLICIT NONE > > > > > > #include > > > > > > ... > > > TYPE(PropertiesField) :: PptFld > > > ... > > > > > > CALL PetscInitialize(PETSC_COMM_WORLD,ierr) > > > ... > > > CALL PetscFinalize(ierr) > > > > > > END PROGRAM > > > > > > > > > When I run the program appears a Segmentation Fault, which disappears > when I comment the booleans marked in the code. Because I need them, I used > Valgrind to figure out what is happening but it is yet a mistery to me. > > > > > > Valgrind message: > > > ==5160== > > > ==5160== Invalid read of size 1 > > > > It is curious that it says "of size 1" when we declare PetscBool to > be a logical*4 I don't see anything obviously wrong. > > > > Please send a simple code we can compile and run that reproduces the > problem. > > > > Barry > > > ==5160== at 0x4FB2156: petscinitialize_ (zstart.c:433) > > > ==5160== by 0x4030EA: MAIN__ (ANISOFLOW.F90:29) # line of petsc > inizalitation > > > ==5160== by 0x404380: main (ANISOFLOW.F90:3) # line of "USE > ANISOFLOW_Types, ONLY : ... ,PropertiesField, ..." > > > ==5160== Address 0xc54fff is not stack'd, malloc'd or (recently) > free'd > > > ==5160== > > > > > > Program received signal SIGSEGV: Segmentation fault - invalid memory > reference. > > > > > > Backtrace for this error: > > > #0 0x699E777 > > > #1 0x699ED7E > > > #2 0x6F0BCAF > > > #3 0x4FB2156 > > > #4 0x4030EA in anisoflow at ANISOFLOW.F90:29 > > > > > > I think it is maybe related with petsc because the error popped out > just in its initialization, so if you know what's going on, I would > appreciate to tell me. > > > > > > Santiago Ospina > > > -- > > > > > > -- > > > Att: > > > > > > Santiago Ospina De Los R?os > > > National University of Colombia > > > > > > > > > > -- > > > > -- > > Att: > > > > Santiago Ospina De Los R?os > > National University of Colombia > > -- -- Att: Santiago Ospina De Los R?os National University of Colombia -------------- next part -------------- An HTML attachment was scrubbed... URL: From juan at tf.uni-kiel.de Sun Aug 21 09:00:33 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Sun, 21 Aug 2016 16:00:33 +0200 Subject: [petsc-users] Segmentation faults: Derived types In-Reply-To: References: Message-ID: You need to generate the TAGS file first using make alletags in ${PETSC_DIR}. Then you need to visit the tags table using emacs. On Sun, Aug 21, 2016 at 12:09 PM, Santiago Ospina De Los Rios wrote: > wow, what a dumb mistake. thank you very much! > > Since you've noticed me such error, I am taking the time to learn all of > your advice about management and debugging code. Then I'm learning emacs and > so... the matter is that you recommend to use PETSc tags which it's supposed > to be in ${PETSC_DIR}/TAGS, but it seems to be changed or something because > I wasn't able to find it. > > So, how can I use the tags you are referring to? > > > > 2016-08-19 14:40 GMT-05:00 Barry Smith : >> >> >> You code has a bug right at the top: >> >> CALL PetscInitialize(PETSC_COMM_WORLD,ierr) >> >> should be >> >> CALL PetscInitialize(PETSC_NULL_CHARACTER,ierr) >> >> you were just lucky previously that the stack frame was different enough >> that it did not previously crash. Once I corrected the code the new version >> ran without crashing. 
I found the bug very easily by simply running the new >> version directly in the debugger >> >> lldb ./ANSIFLOW >> >> and seeing it crashed with a crazy stack in petscinitialize_ >> >> Barry >> >> >> > On Aug 5, 2016, at 11:34 PM, Santiago Ospina De Los Rios >> > wrote: >> > >> > Dear Barry, >> > >> > I tried to build a simple code with the same things I mentioned to you >> > on last e-mail but it worked, which is more strange to me. So I built two >> > branches on my git code to show you the problem: >> > >> > git code: https://github.com/SoilRos/ANISOFLOWPACK >> > >> > Branch: PETSc_debug_boolean_0 >> > The first one is a simple code which is working for what was designed. >> > Forget sample problems, just compile and run the ANISOFLOW executable on src >> > folder, if there are some verbose messages then the program is working. >> > >> > Branch: PETSc_debug_boolean_1 >> > The second one is just a modification of the first one adding the two >> > booleans mentioned above in 01_Types.F90 file. I tried it in mac El Capitan >> > and Ubuntu (with Valgrind) and PETSc 3.7.3 and 3.7.2 respectively, both with >> > the same segmentation fault. >> > >> > PD: Although I already fixed it compressing the three booleans into one >> > integer, I think is better if we try to figure out why there is a >> > segmentation fault because I had similar problems before. >> > PD2: Please obviate the variable description because are pretty out of >> > date. I'm trying to change it, so it can be confusing. >> > >> > Best wishes, >> > Santiago Ospina >> > >> > 2016-08-05 15:54 GMT-05:00 Barry Smith : >> > >> > > On Aug 1, 2016, at 4:41 PM, Santiago Ospina De Los Rios >> > > wrote: >> > > >> > > Hello there, >> > > >> > > I'm having problems defining some variables into derived types in >> > > Fortran. Before, I had a similar problems with an allocatable array >> > > "PetsInt" but I solved it just doing a non-collective Petsc Vec. Today I'm >> > > having troubles with "PetscBool" or "Logical": >> > > >> > > In a module which define the variables, I have the following: >> > > >> > > MODULE ANISOFLOW_Types >> > > >> > > IMPLICIT NONE >> > > >> > > #include >> > > #include >> > > >> > > ... >> > > >> > > TYPE ConductivityField >> > > PetscBool :: >> > > DefinedByCvtZones=.FALSE. ! It produces the segmentation fault. >> > > PetscBool :: >> > > DefinedByPptZones=.FALSE. ! It produces the segmentation fault. >> > > PetscBool :: >> > > DefinedByCell=.FALSE. >> > > ! Conductivity defined by zones (Local): >> > > Vec :: ZoneID >> > > TYPE(Tensor),ALLOCATABLE :: Zone(:) >> > > ! Conductivity defined on every cell (Local): >> > > Vec :: Cell >> > > END TYPE ConductivityField >> > > >> > > >> > > TYPE SpecificStorageField >> > > PetscBool :: >> > > DefinedByStoZones=.FALSE. ! It produces the segmentation fault. >> > > PetscBool :: >> > > DefinedByPptZones=.FALSE. ! It produces the segmentation fault. >> > > PetscBool :: >> > > DefinedByCell=.FALSE. >> > > ! Specific Storage defined by zones (Local): >> > > Vec :: ZoneID >> > > Vec :: Zone >> > > ! Specific Storage defined on every cell (Global).: >> > > Vec :: Cell >> > > END TYPE SpecificStorageField >> > > >> > > TYPE PropertiesField >> > > TYPE(ConductivityField) :: Cvt >> > > TYPE(SpecificStorageField) :: Sto >> > > ! Property defined by zones (Local): >> > > PetscBool :: DefinedByPptZones=.FALSE. >> > > Vec :: ZoneID >> > > END TYPE PropertiesField >> > > >> > > ... >> > > >> > > CONTAINS >> > > >> > > ... 
>> > > >> > > END MODULE ANISOFLOW_Types >> > > >> > > >> > > Later I use it in the main program, with something like this >> > > >> > > PROGRAM ANISOFLOW >> > > >> > > USE ANISOFLOW_Types, ONLY : ... ,PropertiesField, >> > > ... >> > > ... >> > > >> > > IMPLICIT NONE >> > > >> > > #include >> > > >> > > ... >> > > TYPE(PropertiesField) :: PptFld >> > > ... >> > > >> > > CALL PetscInitialize(PETSC_COMM_WORLD,ierr) >> > > ... >> > > CALL PetscFinalize(ierr) >> > > >> > > END PROGRAM >> > > >> > > >> > > When I run the program appears a Segmentation Fault, which disappears >> > > when I comment the booleans marked in the code. Because I need them, I used >> > > Valgrind to figure out what is happening but it is yet a mistery to me. >> > > >> > > Valgrind message: >> > > ==5160== >> > > ==5160== Invalid read of size 1 >> > >> > It is curious that it says "of size 1" when we declare PetscBool to >> > be a logical*4 I don't see anything obviously wrong. >> > >> > Please send a simple code we can compile and run that reproduces the >> > problem. >> > >> > Barry >> > > ==5160== at 0x4FB2156: petscinitialize_ (zstart.c:433) >> > > ==5160== by 0x4030EA: MAIN__ (ANISOFLOW.F90:29) # line of petsc >> > > inizalitation >> > > ==5160== by 0x404380: main (ANISOFLOW.F90:3) # line of "USE >> > > ANISOFLOW_Types, ONLY : ... ,PropertiesField, ..." >> > > ==5160== Address 0xc54fff is not stack'd, malloc'd or (recently) >> > > free'd >> > > ==5160== >> > > >> > > Program received signal SIGSEGV: Segmentation fault - invalid memory >> > > reference. >> > > >> > > Backtrace for this error: >> > > #0 0x699E777 >> > > #1 0x699ED7E >> > > #2 0x6F0BCAF >> > > #3 0x4FB2156 >> > > #4 0x4030EA in anisoflow at ANISOFLOW.F90:29 >> > > >> > > I think it is maybe related with petsc because the error popped out >> > > just in its initialization, so if you know what's going on, I would >> > > appreciate to tell me. >> > > >> > > Santiago Ospina >> > > -- >> > > >> > > -- >> > > Att: >> > > >> > > Santiago Ospina De Los R?os >> > > National University of Colombia >> > >> > >> > >> > >> > -- >> > >> > -- >> > Att: >> > >> > Santiago Ospina De Los R?os >> > National University of Colombia >> > > > > -- > > -- > Att: > > Santiago Ospina De Los R?os > National University of Colombia From jrekier at gmail.com Sun Aug 21 10:04:19 2016 From: jrekier at gmail.com (=?utf-8?Q?J=C3=A9r=C3=A9my_REKIER?=) Date: Sun, 21 Aug 2016 17:04:19 +0200 Subject: [petsc-users] petsc installs okay, cannot install petsc4py Message-ID: Dear support team, I am trying to install petsc4py on top of my miniconda3 python environment. I recently filed an issue on bitbucket about the trouble that I was having with installing the development version of petsc4py. I have changed the status of this issue as invalid as this no longer corresponds to the trouble that I am having. Since I think that the problem I?m probably having comes from my?possibly broken?environment, I reckoned the best was to ask for support via e-mail. I had previously succeeded in installing the development version of both petsc and petsc4py on my office computer running on MacOSX. But, when I try to do so on my laptop (using the same OS) using exactly the same command inputs, I keep on failing. Now, I seem to compile PETSc without trouble. 
Configuration step produces the following output: jrek at MacJerem:petsc$ python2.7 ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex --download-scalapack --download-mumps =============================================================================== Configuring PETSc to compile on your system =============================================================================== Compilers: C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g Fortran Compiler: mpif90 -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g Linkers: Shared linker: mpicc -dynamiclib -single_module -undefined dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 Dynamic linker: mpicc -dynamiclib -single_module -undefined dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 make: BLAS/LAPACK: -llapack -lblas MPI: Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include cmake: Arch: hwloc: Includes: -I/usr/local/include Library: -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lhwloc scalapack: Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lscalapack MUMPS: Includes: -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord X: Includes: -I/opt/X11/include Library: -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 pthread: sowing: ssl: Includes: -I/usr/local/opt/openssl/include Library: -Wl,-rpath,/usr/local/opt/openssl/lib -L/usr/local/opt/openssl/lib -lssl -lcrypto PETSc: PETSC_ARCH: arch-darwin-c-opt PETSC_DIR: /Users/jrek/softs/petsc Scalar type: complex Precision: double Clanguage: C Integer size: 32 shared libraries: enabled Memory alignment: 16 xxx=========================================================================xxx Configure stage complete. Now build PETSc libraries with (gnumake build): make PETSC_DIR=/Users/jrek/softs/petsc PETSC_ARCH=arch-darwin-c-opt all xxx=========================================================================xxx Then, everything works smoothly with building and testing until I get to the point of actually installing petsc4py. 
Then, the error I am having is this one: jrek at MacJerem:petsc4py$ python setup.py install running install running build running build_src running build_py running build_ext PETSC_DIR: /Users/jrek/softs/petsc PETSC_ARCH: arch-darwin-c-opt version: 3.7.3 development integer-size: 32-bit scalar-type: complex precision: double language: CONLY compiler: mpicc linker: mpicc building 'PETSc' extension mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include -I/Users/jrek/softs/petsc/include -Isrc/include -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o In file included from src/PETSc.c:3: In file included from src/petsc4py.PETSc.c:271: In file included from /usr/local/include/petsc.h:5: In file included from /usr/local/include/petscbag.h:4: /usr/local/include/petscsys.h:152:6: error: "PETSc was configured with one OpenMPI mpi.h version but now appears to be compiling using a different OpenMPI mpi.h version" # error "PETSc was configured with one OpenMPI mpi.h version but now appears to be compiling using a different OpenMPI mpi.h version" Plus other errors which I think have good chances of being due to this one. I guess I must probably have conflicting OpenMPI installs but I do not understand why PETSc is cannot compile just as fine as it did a moment ago. How can be sure of the ?mpi.h? that I am using and would specify one solve my problem ? Any help would be greatly appreciated as this is slowly driving me insane :) Thanks very much in advance :D Cheers, Jerem -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Aug 21 10:35:02 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 21 Aug 2016 10:35:02 -0500 Subject: [petsc-users] petsc installs okay, cannot install petsc4py In-Reply-To: References: Message-ID: On Sun, Aug 21, 2016 at 10:04 AM, J?r?my REKIER wrote: > > Dear support team, > > I am trying to install petsc4py on top of my miniconda3 python > environment. > > I recently filed an issue > on > bitbucket about the trouble that I was having with installing the > development version of petsc4py. I have changed the status of this issue as > invalid as this no longer corresponds to the trouble that I am having. > > Since I think that the problem I?m probably having comes from my?possibly > broken?environment, I reckoned the best was to ask for support via e-mail. > > I had previously succeeded in installing the development version of both > petsc and petsc4py on my office computer running on MacOSX. But, when I try > to do so on my laptop (using the same OS) using exactly the same command > inputs, I keep on failing. > > Now, I seem to compile PETSc without trouble. 
Configuration step produces > the following output: > > > jrek at MacJerem:petsc$ python2.7 ./configure --with-cc=mpicc > --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex > --download-scalapack --download-mumps > ============================================================ > =================== > Configuring PETSc to compile on your system > > ============================================================ > =================== > Compilers: > > > C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 > C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -fvisibility=hidden -g > Fortran Compiler: mpif90 -Wall -ffree-line-length-0 > -Wno-unused-dummy-argument -g > Linkers: > Shared linker: mpicc -dynamiclib -single_module -undefined > dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings > -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments > -fvisibility=hidden -g3 > Dynamic linker: mpicc -dynamiclib -single_module -undefined > dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings > -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments > -fvisibility=hidden -g3 > make: > BLAS/LAPACK: -llapack -lblas > MPI: > Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include > cmake: > Arch: > hwloc: > Includes: -I/usr/local/include > Library: -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lhwloc > scalapack: > Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib > -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lscalapack > MUMPS: > Includes: -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include > Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib > -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lcmumps -ldmumps > -lsmumps -lzmumps -lmumps_common -lpord > X: > Includes: -I/opt/X11/include > Library: -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 > pthread: > sowing: > ssl: > Includes: -I/usr/local/opt/openssl/include > Library: -Wl,-rpath,/usr/local/opt/openssl/lib > -L/usr/local/opt/openssl/lib -lssl -lcrypto > PETSc: > PETSC_ARCH: arch-darwin-c-opt > PETSC_DIR: /Users/jrek/softs/petsc > Scalar type: complex > Precision: double > Clanguage: C > Integer size: 32 > shared libraries: enabled > Memory alignment: 16 > xxx========================================================= > ================xxx > Configure stage complete. Now build PETSc libraries with (gnumake build): > make PETSC_DIR=/Users/jrek/softs/petsc PETSC_ARCH=arch-darwin-c-opt all > xxx========================================================= > ================xxx > > Then, everything works smoothly with building and testing until I get to > the point of actually installing petsc4py. 
Then, the error I am having is > this one: > > jrek at MacJerem:petsc4py$ python setup.py install > running install > running build > running build_src > running build_py > running build_ext > PETSC_DIR: /Users/jrek/softs/petsc > PETSC_ARCH: arch-darwin-c-opt > version: 3.7.3 development > integer-size: 32-bit > scalar-type: complex > precision: double > language: CONLY > compiler: mpicc > linker: mpicc > building 'PETSc' extension > mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code > -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch > x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include > -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include > -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include > -I/Users/jrek/softs/petsc/include -Isrc/include > -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include > -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o > build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o > In file included from src/PETSc.c:3: > In file included from src/petsc4py.PETSc.c:271: > In file included from /usr/local/include/petsc.h:5: > In file included from /usr/local/include/petscbag.h:4: > /usr/local/include/petscsys.h:152:6: error: "PETSc was configured with > one OpenMPI mpi.h version but now appears to be compiling using a different > OpenMPI mpi.h version" > # error "PETSc was configured with one OpenMPI mpi.h version but now > appears to be compiling using a different OpenMPI mpi.h version" > > > Plus other errors which I think have good chances of being due to this > one. > > I guess I must probably have conflicting OpenMPI installs but I do not > understand why PETSc is cannot compile just as fine as it did a moment ago. > How can be sure of the ?mpi.h? that I am using and would specify one solve > my problem ? > I believe this can happen because Python is not always as careful about letting default include directories sneak in since they believe you should always be using them. So the MPI that PETSc used was mpicc in your path Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include but I bet that you have /use/include/mpi.h as well. I guess its also possible that you have a slightly different path when you installed petsc4py than when you installed PETSc, and this causes two different 'mpicc' to be picked up. I would say that having multiple installs of MPI is always the road to disaster. Matt > Any help would be greatly appreciated as this is slowly driving me insane > :) > Thanks very much in advance :D > > Cheers, > Jerem > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jrekier at gmail.com Sun Aug 21 10:56:53 2016 From: jrekier at gmail.com (=?utf-8?Q?J=C3=A9r=C3=A9my_REKIER?=) Date: Sun, 21 Aug 2016 17:56:53 +0200 Subject: [petsc-users] petsc installs okay, cannot install petsc4py In-Reply-To: References: Message-ID: <3C556EB1-A17C-4D1C-888D-8B0BD38F3294@gmail.com> Hi Matthew, Thanks for your reply. I went down hunting for other possible mpi installs and might have identified the culprit as being the mpi4py installed via anaconda package management. 
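As a quick sanity check, a small diagnostic sketch along the lines of Matt's suggestion can help confirm which MPI each layer actually sees (this assumes mpi4py is importable from the same conda environment; the printed paths and strings will of course differ per machine):

import shutil
import mpi4py
from mpi4py import MPI

print("mpicc first on PATH:  ", shutil.which("mpicc"))              # what a fresh PETSc/petsc4py build would pick up
print("mpi4py was built with:", mpi4py.get_config())                # compiler recorded when mpi4py was installed
print("MPI linked at runtime:", MPI.Get_library_version().strip())  # e.g. an "Open MPI v1.10.3 ..." string

If those three point at different OpenMPI installations (say, the Homebrew Cellar one versus a conda-provided one), that mismatch is the likely source of the mpi.h error above.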
But now, after recompiling PETSc, I have another problem which I can?t really identify: jrek at MacJerem:petsc4py$ python setup.py build running build running build_src running build_py running build_ext PETSC_DIR: /Users/jrek/softs/petsc PETSC_ARCH: arch-darwin-c-opt version: 3.7.3 development integer-size: 32-bit scalar-type: complex precision: double language: CONLY compiler: mpicc linker: mpicc building 'PETSc' extension mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include -I/Users/jrek/softs/petsc/include -Isrc/include -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o In file included from src/PETSc.c:3: In file included from src/petsc4py.PETSc.c:273: In file included from src/include/custom.h:8: In file included from /Users/jrek/softs/petsc/include/petsc/private/matimpl.h:6: /Users/jrek/softs/petsc/include/petscmatcoarsen.h:33:16: error: redefinition of '_PetscCDIntNd' typedef struct _PetscCDIntNd{ ^ /usr/local/include/petscmat.h:1322:16: note: previous definition is here typedef struct _PetscCDIntNd{ ^ In file included from src/PETSc.c:3: In file included from src/petsc4py.PETSc.c:273: In file included from src/include/custom.h:8: In file included from /Users/jrek/softs/petsc/include/petsc/private/matimpl.h:6: /Users/jrek/softs/petsc/include/petscmatcoarsen.h:36:2: error: typedef redefinition with different types ('struct (anonymous struct at /Users/jrek/softs/petsc/include/petscmatcoarsen.h:33:16)' vs 'struct _PetscCDIntNd') }PetscCDIntNd; And many subsequent errors. I have never had that one before and I have no clue of what to do to solve it. Any thought. Thanks very much. Cheers, > On 21 Aug 2016, at 5:35 PM, Matthew Knepley wrote: > > On Sun, Aug 21, 2016 at 10:04 AM, J?r?my REKIER > wrote: > > Dear support team, > > I am trying to install petsc4py on top of my miniconda3 python environment. > > I recently filed an issue on bitbucket about the trouble that I was having with installing the development version of petsc4py. I have changed the status of this issue as invalid as this no longer corresponds to the trouble that I am having. > > Since I think that the problem I?m probably having comes from my?possibly broken?environment, I reckoned the best was to ask for support via e-mail. > > I had previously succeeded in installing the development version of both petsc and petsc4py on my office computer running on MacOSX. But, when I try to do so on my laptop (using the same OS) using exactly the same command inputs, I keep on failing. > > Now, I seem to compile PETSc without trouble. 
Configuration step produces the following output: > > > jrek at MacJerem:petsc$ python2.7 ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex --download-scalapack --download-mumps > =============================================================================== > Configuring PETSc to compile on your system > =============================================================================== > Compilers: > C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 > C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g > Fortran Compiler: mpif90 -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g > Linkers: > Shared linker: mpicc -dynamiclib -single_module -undefined dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 > Dynamic linker: mpicc -dynamiclib -single_module -undefined dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 > make: > BLAS/LAPACK: -llapack -lblas > MPI: > Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include > cmake: > Arch: > hwloc: > Includes: -I/usr/local/include > Library: -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lhwloc > scalapack: > Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lscalapack > MUMPS: > Includes: -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include > Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord > X: > Includes: -I/opt/X11/include > Library: -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 > pthread: > sowing: > ssl: > Includes: -I/usr/local/opt/openssl/include > Library: -Wl,-rpath,/usr/local/opt/openssl/lib -L/usr/local/opt/openssl/lib -lssl -lcrypto > PETSc: > PETSC_ARCH: arch-darwin-c-opt > PETSC_DIR: /Users/jrek/softs/petsc > Scalar type: complex > Precision: double > Clanguage: C > Integer size: 32 > shared libraries: enabled > Memory alignment: 16 > xxx=========================================================================xxx > Configure stage complete. Now build PETSc libraries with (gnumake build): > make PETSC_DIR=/Users/jrek/softs/petsc PETSC_ARCH=arch-darwin-c-opt all > xxx=========================================================================xxx > > Then, everything works smoothly with building and testing until I get to the point of actually installing petsc4py. 
Then, the error I am having is this one: > > jrek at MacJerem:petsc4py$ python setup.py install > running install > running build > running build_src > running build_py > running build_ext > PETSC_DIR: /Users/jrek/softs/petsc > PETSC_ARCH: arch-darwin-c-opt > version: 3.7.3 development > integer-size: 32-bit > scalar-type: complex > precision: double > language: CONLY > compiler: mpicc > linker: mpicc > building 'PETSc' extension > mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include -I/Users/jrek/softs/petsc/include -Isrc/include -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o > In file included from src/PETSc.c:3: > In file included from src/petsc4py.PETSc.c:271: > In file included from /usr/local/include/petsc.h:5: > In file included from /usr/local/include/petscbag.h:4: > /usr/local/include/petscsys.h:152:6: error: "PETSc was configured with one OpenMPI mpi.h version but now appears to be compiling using a different OpenMPI mpi.h version" > # error "PETSc was configured with one OpenMPI mpi.h version but now appears to be compiling using a different OpenMPI mpi.h version" > > Plus other errors which I think have good chances of being due to this one. > > I guess I must probably have conflicting OpenMPI installs but I do not understand why PETSc is cannot compile just as fine as it did a moment ago. > How can be sure of the ?mpi.h? that I am using and would specify one solve my problem ? > > I believe this can happen because Python is not always as careful about letting default include directories sneak in since they believe > you should always be using them. > > So the MPI that PETSc used was > > mpicc in your path > Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include > > but I bet that you have > > /use/include/mpi.h > > as well. I guess its also possible that you have a slightly different path when you installed petsc4py > than when you installed PETSc, and this causes two different 'mpicc' to be picked up. > > I would say that having multiple installs of MPI is always the road to disaster. > > Matt > > Any help would be greatly appreciated as this is slowly driving me insane :) > Thanks very much in advance :D > > Cheers, > Jerem > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sun Aug 21 11:09:40 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 21 Aug 2016 11:09:40 -0500 Subject: [petsc-users] petsc installs okay, cannot install petsc4py In-Reply-To: <3C556EB1-A17C-4D1C-888D-8B0BD38F3294@gmail.com> References: <3C556EB1-A17C-4D1C-888D-8B0BD38F3294@gmail.com> Message-ID: On Sun, Aug 21, 2016 at 10:56 AM, J?r?my REKIER wrote: > Hi Matthew, > > Thanks for your reply. 
> I went down hunting for other possible mpi installs and might have > identified the culprit as being the mpi4py installed via anaconda package > management. > But now, after recompiling PETSc, I have another problem which I can?t > really identify: > > jrek at MacJerem:petsc4py$ python setup.py build > running build > running build_src > running build_py > running build_ext > PETSC_DIR: /Users/jrek/softs/petsc > PETSC_ARCH: arch-darwin-c-opt > version: 3.7.3 development > integer-size: 32-bit > scalar-type: complex > precision: double > language: CONLY > compiler: mpicc > linker: mpicc > building 'PETSc' extension > mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code > -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch > x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include > -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include > -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include > -I/Users/jrek/softs/petsc/include -Isrc/include > -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include > -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o > build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o > In file included from src/PETSc.c:3: > In file included from src/petsc4py.PETSc.c:273: > In file included from src/include/custom.h:8: > In file included from /Users/jrek/softs/petsc/ > include/petsc/private/matimpl.h:6: > /Users/jrek/softs/petsc/include/petscmatcoarsen.h:33:16: error: redefinition > of '_PetscCDIntNd' > typedef struct _PetscCDIntNd{ > ^ > /usr/local/include/petscmat.h:1322:16: note: previous definition is here > typedef struct _PetscCDIntNd{ > ^ > In file included from src/PETSc.c:3: > In file included from src/petsc4py.PETSc.c:273: > In file included from src/include/custom.h:8: > In file included from /Users/jrek/softs/petsc/ > include/petsc/private/matimpl.h:6: > /Users/jrek/softs/petsc/include/petscmatcoarsen.h:36:2: error: typedef > redefinition with different types ('struct (anonymous struct at > /Users/jrek/softs/petsc/include/petscmatcoarsen.h:33:16)' vs > 'struct _PetscCDIntNd') > }PetscCDIntNd; > > And many subsequent errors. I have never had that one before and I have no > clue of what to do to solve it. > Any thought. > This is very similar. You have two inconsistent PETSc installs. One in /Users/jrek/softs/petsc/ and one in /usr/local/ Get rid of the use/local one. Thanks, Matt Thanks very much. > Cheers, > > On 21 Aug 2016, at 5:35 PM, Matthew Knepley wrote: > > On Sun, Aug 21, 2016 at 10:04 AM, J?r?my REKIER wrote: > >> >> Dear support team, >> >> I am trying to install petsc4py on top of my miniconda3 python >> environment. >> >> I recently filed an issue >> on >> bitbucket about the trouble that I was having with installing the >> development version of petsc4py. I have changed the status of this issue as >> invalid as this no longer corresponds to the trouble that I am having. >> >> Since I think that the problem I?m probably having comes from my?possibly >> broken?environment, I reckoned the best was to ask for support via e-mail. >> >> I had previously succeeded in installing the development version of both >> petsc and petsc4py on my office computer running on MacOSX. But, when I try >> to do so on my laptop (using the same OS) using exactly the same command >> inputs, I keep on failing. >> >> Now, I seem to compile PETSc without trouble. 
Configuration step produces >> the following output: >> >> >> jrek at MacJerem:petsc$ python2.7 ./configure --with-cc=mpicc >> --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex >> --download-scalapack --download-mumps >> ============================================================ >> =================== >> Configuring PETSc to compile on your system >> >> ============================================================ >> =================== >> Compilers: >> >> >> C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 >> C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -fvisibility=hidden -g >> Fortran Compiler: mpif90 -Wall -ffree-line-length-0 >> -Wno-unused-dummy-argument -g >> Linkers: >> Shared linker: mpicc -dynamiclib -single_module -undefined >> dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings >> -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments >> -fvisibility=hidden -g3 >> Dynamic linker: mpicc -dynamiclib -single_module -undefined >> dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings >> -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments >> -fvisibility=hidden -g3 >> make: >> BLAS/LAPACK: -llapack -lblas >> MPI: >> Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include >> cmake: >> Arch: >> hwloc: >> Includes: -I/usr/local/include >> Library: -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lhwloc >> scalapack: >> Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib >> -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lscalapack >> MUMPS: >> Includes: -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include >> Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib >> -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lcmumps -ldmumps >> -lsmumps -lzmumps -lmumps_common -lpord >> X: >> Includes: -I/opt/X11/include >> Library: -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 >> pthread: >> sowing: >> ssl: >> Includes: -I/usr/local/opt/openssl/include >> Library: -Wl,-rpath,/usr/local/opt/openssl/lib >> -L/usr/local/opt/openssl/lib -lssl -lcrypto >> PETSc: >> PETSC_ARCH: arch-darwin-c-opt >> PETSC_DIR: /Users/jrek/softs/petsc >> Scalar type: complex >> Precision: double >> Clanguage: C >> Integer size: 32 >> shared libraries: enabled >> Memory alignment: 16 >> xxx========================================================= >> ================xxx >> Configure stage complete. Now build PETSc libraries with (gnumake build): >> make PETSC_DIR=/Users/jrek/softs/petsc PETSC_ARCH=arch-darwin-c-opt >> all >> xxx========================================================= >> ================xxx >> >> Then, everything works smoothly with building and testing until I get to >> the point of actually installing petsc4py. 
Then, the error I am having is >> this one: >> >> jrek at MacJerem:petsc4py$ python setup.py install >> running install >> running build >> running build_src >> running build_py >> running build_ext >> PETSC_DIR: /Users/jrek/softs/petsc >> PETSC_ARCH: arch-darwin-c-opt >> version: 3.7.3 development >> integer-size: 32-bit >> scalar-type: complex >> precision: double >> language: CONLY >> compiler: mpicc >> linker: mpicc >> building 'PETSc' extension >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas >> -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code >> -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch >> x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include >> -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include >> -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include >> -I/Users/jrek/softs/petsc/include -Isrc/include >> -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include >> -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o >> build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o >> In file included from src/PETSc.c:3: >> In file included from src/petsc4py.PETSc.c:271: >> In file included from /usr/local/include/petsc.h:5: >> In file included from /usr/local/include/petscbag.h:4: >> /usr/local/include/petscsys.h:152:6: error: "PETSc was configured with >> one OpenMPI mpi.h version but now appears to be compiling using a different >> OpenMPI mpi.h version" >> # error "PETSc was configured with one OpenMPI mpi.h version but now >> appears to be compiling using a different OpenMPI mpi.h version" >> >> >> Plus other errors which I think have good chances of being due to this >> one. >> >> I guess I must probably have conflicting OpenMPI installs but I do not >> understand why PETSc is cannot compile just as fine as it did a moment ago. >> How can be sure of the ?mpi.h? that I am using and would specify one >> solve my problem ? >> > > I believe this can happen because Python is not always as careful about > letting default include directories sneak in since they believe > you should always be using them. > > So the MPI that PETSc used was > > mpicc in your path > Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include > > but I bet that you have > > /use/include/mpi.h > > as well. I guess its also possible that you have a slightly different path > when you installed petsc4py > than when you installed PETSc, and this causes two different 'mpicc' to be > picked up. > > I would say that having multiple installs of MPI is always the road to > disaster. > > Matt > > >> Any help would be greatly appreciated as this is slowly driving me insane >> :) >> Thanks very much in advance :D >> >> Cheers, >> Jerem >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jrekier at gmail.com Sun Aug 21 11:34:23 2016 From: jrekier at gmail.com (=?utf-8?Q?J=C3=A9r=C3=A9my_REKIER?=) Date: Sun, 21 Aug 2016 18:34:23 +0200 Subject: [petsc-users] petsc installs okay, cannot install petsc4py In-Reply-To: References: <3C556EB1-A17C-4D1C-888D-8B0BD38F3294@gmail.com> Message-ID: Tanks very much !! It now works like a charm ! Last time I use one of these silly ?homebrew? or 'conda' package managers for anything other than trivialities. Thanks again. Cheers, Jerem > On 21 Aug 2016, at 6:09 PM, Matthew Knepley wrote: > > On Sun, Aug 21, 2016 at 10:56 AM, J?r?my REKIER > wrote: > Hi Matthew, > > Thanks for your reply. > I went down hunting for other possible mpi installs and might have identified the culprit as being the mpi4py installed via anaconda package management. > But now, after recompiling PETSc, I have another problem which I can?t really identify: > > jrek at MacJerem:petsc4py$ python setup.py build > running build > running build_src > running build_py > running build_ext > PETSC_DIR: /Users/jrek/softs/petsc > PETSC_ARCH: arch-darwin-c-opt > version: 3.7.3 development > integer-size: 32-bit > scalar-type: complex > precision: double > language: CONLY > compiler: mpicc > linker: mpicc > building 'PETSc' extension > mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include -I/Users/jrek/softs/petsc/include -Isrc/include -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o > In file included from src/PETSc.c:3: > In file included from src/petsc4py.PETSc.c:273: > In file included from src/include/custom.h:8: > In file included from /Users/jrek/softs/petsc/include/petsc/private/matimpl.h:6: > /Users/jrek/softs/petsc/include/petscmatcoarsen.h:33:16: error: redefinition of '_PetscCDIntNd' > typedef struct _PetscCDIntNd{ > ^ > /usr/local/include/petscmat.h:1322:16: note: previous definition is here > typedef struct _PetscCDIntNd{ > ^ > In file included from src/PETSc.c:3: > In file included from src/petsc4py.PETSc.c:273: > In file included from src/include/custom.h:8: > In file included from /Users/jrek/softs/petsc/include/petsc/private/matimpl.h:6: > /Users/jrek/softs/petsc/include/petscmatcoarsen.h:36:2: error: typedef redefinition with different types ('struct (anonymous struct at > /Users/jrek/softs/petsc/include/petscmatcoarsen.h:33:16)' vs 'struct _PetscCDIntNd') > }PetscCDIntNd; > > And many subsequent errors. I have never had that one before and I have no clue of what to do to solve it. > Any thought. > > This is very similar. You have two inconsistent PETSc installs. One in > > /Users/jrek/softs/petsc/ > > and one in > > /usr/local/ > > Get rid of the use/local one. > > Thanks, > > Matt > > Thanks very much. > Cheers, > >> On 21 Aug 2016, at 5:35 PM, Matthew Knepley > wrote: >> >> On Sun, Aug 21, 2016 at 10:04 AM, J?r?my REKIER > wrote: >> >> Dear support team, >> >> I am trying to install petsc4py on top of my miniconda3 python environment. 
>> >> I recently filed an issue on bitbucket about the trouble that I was having with installing the development version of petsc4py. I have changed the status of this issue as invalid as this no longer corresponds to the trouble that I am having. >> >> Since I think that the problem I?m probably having comes from my?possibly broken?environment, I reckoned the best was to ask for support via e-mail. >> >> I had previously succeeded in installing the development version of both petsc and petsc4py on my office computer running on MacOSX. But, when I try to do so on my laptop (using the same OS) using exactly the same command inputs, I keep on failing. >> >> Now, I seem to compile PETSc without trouble. Configuration step produces the following output: >> >> >> jrek at MacJerem:petsc$ python2.7 ./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-scalar-type=complex --download-scalapack --download-mumps >> =============================================================================== >> Configuring PETSc to compile on your system >> =============================================================================== >> Compilers: >> C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 >> C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g >> Fortran Compiler: mpif90 -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g >> Linkers: >> Shared linker: mpicc -dynamiclib -single_module -undefined dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 >> Dynamic linker: mpicc -dynamiclib -single_module -undefined dynamic_lookup -multiply_defined suppress -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -fvisibility=hidden -g3 >> make: >> BLAS/LAPACK: -llapack -lblas >> MPI: >> Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include >> cmake: >> Arch: >> hwloc: >> Includes: -I/usr/local/include >> Library: -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lhwloc >> scalapack: >> Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lscalapack >> MUMPS: >> Includes: -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include >> Library: -Wl,-rpath,/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -L/Users/jrek/softs/petsc/arch-darwin-c-opt/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord >> X: >> Includes: -I/opt/X11/include >> Library: -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 >> pthread: >> sowing: >> ssl: >> Includes: -I/usr/local/opt/openssl/include >> Library: -Wl,-rpath,/usr/local/opt/openssl/lib -L/usr/local/opt/openssl/lib -lssl -lcrypto >> PETSc: >> PETSC_ARCH: arch-darwin-c-opt >> PETSC_DIR: /Users/jrek/softs/petsc >> Scalar type: complex >> Precision: double >> Clanguage: C >> Integer size: 32 >> shared libraries: enabled >> Memory alignment: 16 >> xxx=========================================================================xxx >> Configure stage complete. Now build PETSc libraries with (gnumake build): >> make PETSC_DIR=/Users/jrek/softs/petsc PETSC_ARCH=arch-darwin-c-opt all >> xxx=========================================================================xxx >> >> Then, everything works smoothly with building and testing until I get to the point of actually installing petsc4py. 
Then, the error I am having is this one: >> >> jrek at MacJerem:petsc4py$ python setup.py install >> running install >> running build >> running build_src >> running build_py >> running build_ext >> PETSC_DIR: /Users/jrek/softs/petsc >> PETSC_ARCH: arch-darwin-c-opt >> version: 3.7.3 development >> integer-size: 32-bit >> scalar-type: complex >> precision: double >> language: CONLY >> compiler: mpicc >> linker: mpicc >> building 'PETSc' extension >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -g3 -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/jrek/miniconda3/include -arch x86_64 -DPETSC_DIR=/Users/jrek/softs/petsc -I/usr/local/Cellar/open-mpi/1.10.3/include -I/usr/local/opt/openssl/include -I/opt/X11/include -I/usr/local/include -I/Users/jrek/softs/petsc/arch-darwin-c-opt/include -I/Users/jrek/softs/petsc/include -Isrc/include -I/Users/jrek/miniconda3/lib/python3.5/site-packages/numpy/core/include -I/Users/jrek/miniconda3/include/python3.5m -c src/PETSc.c -o build/temp.macosx-10.6-x86_64-3.5/arch-darwin-c-opt/src/PETSc.o >> In file included from src/PETSc.c:3: >> In file included from src/petsc4py.PETSc.c:271: >> In file included from /usr/local/include/petsc.h:5: >> In file included from /usr/local/include/petscbag.h:4: >> /usr/local/include/petscsys.h:152:6: error: "PETSc was configured with one OpenMPI mpi.h version but now appears to be compiling using a different OpenMPI mpi.h version" >> # error "PETSc was configured with one OpenMPI mpi.h version but now appears to be compiling using a different OpenMPI mpi.h version" >> >> Plus other errors which I think have good chances of being due to this one. >> >> I guess I must probably have conflicting OpenMPI installs but I do not understand why PETSc is cannot compile just as fine as it did a moment ago. >> How can be sure of the ?mpi.h? that I am using and would specify one solve my problem ? >> >> I believe this can happen because Python is not always as careful about letting default include directories sneak in since they believe >> you should always be using them. >> >> So the MPI that PETSc used was >> >> mpicc in your path >> Includes: -I/usr/local/Cellar/open-mpi/1.10.3/include >> >> but I bet that you have >> >> /use/include/mpi.h >> >> as well. I guess its also possible that you have a slightly different path when you installed petsc4py >> than when you installed PETSc, and this causes two different 'mpicc' to be picked up. >> >> I would say that having multiple installs of MPI is always the road to disaster. >> >> Matt >> >> Any help would be greatly appreciated as this is slowly driving me insane :) >> Thanks very much in advance :D >> >> Cheers, >> Jerem >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Sun Aug 21 12:11:00 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 21 Aug 2016 20:11:00 +0300 Subject: [petsc-users] [petsc4py] a problem with computeRHSFunctionLinear interface? 
In-Reply-To: <5537563.Y5rDWPrv14@wotan> References: <4270161.snsPm0L6UZ@wotan> <5537563.Y5rDWPrv14@wotan> Message-ID: On 21 August 2016 at 13:01, Francesco Caimmi wrote: > I will also take the chance to ask a related question: in the original C > code, > lines 213 -214 read > ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr); > ierr=SNESSetJacobian(snes,NULL,NULL,SNESComputeJacobianDefault, > NULL);CHKERRQ(ierr); > > How do I translate the last line in Python? I was not able to find the > equivalent of SNESComputeJacobianDefault. > That one is not directly available, however you can pass -snes_fd in the command line, or alternatively opts = PETSc.Options() opts['snes_fd'] = 1 snes.setFromOptions() -- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jshen25 at jhu.edu Sun Aug 21 21:12:19 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Sun, 21 Aug 2016 22:12:19 -0400 Subject: [petsc-users] Store and reuse the factor of matrix In-Reply-To: <87wpjd758n.fsf@jedbrown.org> References: <87wpjd758n.fsf@jedbrown.org> Message-ID: Hi, Jed, I agree with you that it's not wise to reuse pc with new operator. Basically, I just keep pc and matrix updating concurrently. Thanks for telling me that option in ksp. It works. Bests, Jinlei On Fri, Aug 19, 2016 at 12:27 AM, Jed Brown wrote: > Jinlei Shen writes: > > > Hi Matt, > > > > Thanks for speedy reply. > > > > It seems effective in SNES. > > > > I'm curious about how it works in iterative solver. > > Let's say I'm using CG with BJACOBI for modified newton, if I Set lag as > 5, > > does that mean the ilu decomposition for pc is stored and reused for the > > next 4 iterations? Will this setting help to reduce the iteration number > of > > ksp solver? > > Reusing the preconditioner with a new operator will generally converge > more slowly (or sometimes not at all). Solving the stale linear system > may cause modified Newton to stagnate/fail, e.g., when it chooses a > search direction that is not a descent direction. > > > Also, I'm wondering how to set the same option for just linear KSP solver > > since I have coded the modified newton framework manually. > > You can call KSPSolve() repeatedly without KSPSetOperators. You can > also use KSPSetReusePreconditioner to reuse the preconditioner that was > set up in a previous solve. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Sun Aug 21 21:38:37 2016 From: jychang48 at gmail.com (Justin Chang) Date: Sun, 21 Aug 2016 21:38:37 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling Message-ID: Hi all, This may or may not be a PETSc specific question but... I have seen some people claim that strong-scaling is harder to achieve than weak scaling (e.g., https://www.sharcnet.ca/help/index.php/Measuring_Parallel_Scaling_Performance) and generally speaking it makes sense - communication overhead increases with concurrency. However, we know that most PETSc solvers/applications are not only memory-bandwidth bound, but may not scale as well w.r.t. 
problem size as other solvers (e.g., ILU(0) may beat out GAMG for small elliptic problems but GAMG will eventually beat out ILU(0) for larger problems), so wouldn't weak-scaling not only be the more interesting but more difficult performance metric to achieve? Strong-scaling issues arise mostly from communication overhead but weak-scaling issues may come from that and also solver/algorithmic scalability w.r.t problem size (e.g., problem size N takes 10*T seconds to compute but problem size 2*N takes 50*T seconds to compute). In other words, if one were to propose or design a new algorithm/solver capable of handling large-scale problems, would it be equally if not more important to show the weak-scaling potential? Because if you really think about it, a "truly efficient" algorithm will be less likely to scale in the strong sense but computation time will be close to linearly proportional to problem size (hence better scaling in the weak-sense). It seems if I am trying to convince someone that a proposed computational framework is "high performing" without getting too deep into performance modeling, a poor parallel efficiency (arising due to good sequential efficiency) in the strong sense may not look promising. Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From cpraveen at gmail.com Mon Aug 22 04:05:42 2016 From: cpraveen at gmail.com (Praveen C) Date: Mon, 22 Aug 2016 14:35:42 +0530 Subject: [petsc-users] Example for unstructured grid, metis, petsc Message-ID: Dear all We are developing a 3d unstructured grid finite volume code for compressible turbulent flows. Our approach is to use metis/parmetis to partition the mesh. Then read these partitioned mesh files in the MPI-based cfd code for computation. Are there any examples available in PETSc which are similar to this. Thanks praveen -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 22 05:42:28 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Aug 2016 05:42:28 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: On Sun, Aug 21, 2016 at 9:38 PM, Justin Chang wrote: > Hi all, > > This may or may not be a PETSc specific question but... > > I have seen some people claim that strong-scaling is harder to achieve > than weak scaling (e.g., https://www.sharcnet.ca/help/index.php/Measuring_ > Parallel_Scaling_Performance) and generally speaking it makes sense - > communication overhead increases with concurrency. > > However, we know that most PETSc solvers/applications are not only > memory-bandwidth bound, but may not scale as well w.r.t. problem size as > other solvers (e.g., ILU(0) may beat out GAMG for small elliptic problems > but GAMG will eventually beat out ILU(0) for larger problems), so wouldn't > weak-scaling not only be the more interesting but more difficult > performance metric to achieve? Strong-scaling issues arise mostly from > communication overhead but weak-scaling issues may come from that and also > solver/algorithmic scalability w.r.t problem size (e.g., problem size N > takes 10*T seconds to compute but problem size 2*N takes 50*T seconds to > compute). > > In other words, if one were to propose or design a new algorithm/solver > capable of handling large-scale problems, would it be equally if not more > important to show the weak-scaling potential? 
Because if you really think > about it, a "truly efficient" algorithm will be less likely to scale in the > strong sense but computation time will be close to linearly proportional to > problem size (hence better scaling in the weak-sense). It seems if I am > trying to convince someone that a proposed computational framework is "high > performing" without getting too deep into performance modeling, a poor > parallel efficiency (arising due to good sequential efficiency) in the > strong sense may not look promising. > It definitely depends on your point of view. However, I believe that the point being made by people is twofold: 1) Weak scaling is relatively easy to game by making a serially inefficient code. 2) Strong scaling involves reducing the serial fraction of your code to get around the Amdahl limit. For most codes, I don't think communication overhead even makes it on the table. Engineering for this level of concurrency is not easy. There is a related point: 3) Scientists often want to solve a particular problem rather than the biggest problem a machine can fit in memory For sophisticated algorithms that can handle real world problems, I don't think weak scaling is easy either. However, for toy problems it is, and since most CS types only deal with the Laplace equation on a square, it does not look hard. Thanks, Matt Thanks, > Justin > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 22 05:45:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Aug 2016 05:45:32 -0500 Subject: [petsc-users] Example for unstructured grid, metis, petsc In-Reply-To: References: Message-ID: On Mon, Aug 22, 2016 at 4:05 AM, Praveen C wrote: > Dear all > > We are developing a 3d unstructured grid finite volume code for > compressible turbulent flows. > > Our approach is to use metis/parmetis to partition the mesh. > Then read these partitioned mesh files in the MPI-based cfd code for > computation. > > Are there any examples available in PETSc which are similar to this. > That is a pretty difficult problem. The closest example we have is probably TS ex11, which can solve things like the shallow water equation and Euler. Currently it reads in a mesh, partitions in memory, and distributes the mesh using MPI. After that it can regularly refine. In the next release, we will introduce the ability to - read a mesh in parallel - adaptively refine in parallel using the Pragmatic package - load balance after adaptive refinement - use adaptive quadtree meshes from p4est Thanks, Matt > Thanks > praveen > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhofer at itt.uni-stuttgart.de Mon Aug 22 07:36:58 2016 From: mhofer at itt.uni-stuttgart.de (Jonas Mairhofer) Date: Mon, 22 Aug 2016 14:36:58 +0200 Subject: [petsc-users] Implementation of Anderson mixing Message-ID: <57BAF1EA.3030804@itt.uni-stuttgart.de> Dear PETSc-Team, I am having a bit of trouble understanding the way the Anderson mixing method is implemented inPETSc. 
On the corresponding website (http://www.mcs.anl.gov/petsc/petsc-3.5/docs/manualpages/SNES/SNESAnderson.html) it says in the "Option database" section X_{update} = X + \beta F which looks to me as if the new solution is calculated from the old solution plus something which depends on the old residual F. However, in the "Notes" section it says that the new solution is found by combining "m previous solutions" which I interpret as the last m values of X. In most other sources on Anderson mixing (e.g. the original one referd to on the website mentioned above) the update procedure looks something like X^{k+1} = (1-\beta) \sum_{i=0}^{m} \alpha_i^k X^{k-m+1} + \beta \sum_{i=0}^m \alpha_i^k G(X^{k-m+i}) where G = F + X. Therefore, the new value of X is calculated from the last m values of X and F. Unfortunately, I was not able to understand and follow the actual source code in src/snes/impls/ngmres/anderson.c Could you please tell me whether the last values of X, F or both are used in the update scheme of PETSc's Anderson mixing implementation? And is \beta used in a "regular line search way" where X^{new} = X^{old} + \beta* "undamped update" or as in the equation above X^{new} = (1-\beta) \sum "function of last m X" + \beta \sum "function of last m G" which would explain to me why beta is not set via -snes_linesearch_damping but using the extra option -snes_anderson_beta. Thank you very much for your help! Jonas -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Mon Aug 22 08:02:37 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Mon, 22 Aug 2016 15:02:37 +0200 Subject: [petsc-users] Example for unstructured grid, metis, petsc In-Reply-To: References: Message-ID: Dear Matt The last option is in fact very interesting. Do you know if we can already use it a priori from the master branch? Giang On Mon, Aug 22, 2016 at 12:45 PM, Matthew Knepley wrote: > On Mon, Aug 22, 2016 at 4:05 AM, Praveen C wrote: > >> Dear all >> >> We are developing a 3d unstructured grid finite volume code for >> compressible turbulent flows. >> >> Our approach is to use metis/parmetis to partition the mesh. >> Then read these partitioned mesh files in the MPI-based cfd code for >> computation. >> >> Are there any examples available in PETSc which are similar to this. >> > > That is a pretty difficult problem. The closest example we have is > probably TS ex11, which can solve > things like the shallow water equation and Euler. Currently it reads in a > mesh, partitions in memory, > and distributes the mesh using MPI. After that it can regularly refine. In > the next release, we will > introduce the ability to > > - read a mesh in parallel > > - adaptively refine in parallel using the Pragmatic package > > - load balance after adaptive refinement > > - use adaptive quadtree meshes from p4est > > Thanks, > > Matt > > >> Thanks >> praveen >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 22 08:10:59 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Aug 2016 08:10:59 -0500 Subject: [petsc-users] Example for unstructured grid, metis, petsc In-Reply-To: References: Message-ID: On Mon, Aug 22, 2016 at 8:02 AM, Hoang Giang Bui wrote: > Dear Matt > > The last option is in fact very interesting. 
Do you know if we can already > use it a priori from the master branch? > Yes. It is working on SNES ex12, ex62, and TS ex11. We have some more work to do testing adaptivity, but simple gradient indicators are working right I think. Thanks, Matt > Giang > > On Mon, Aug 22, 2016 at 12:45 PM, Matthew Knepley > wrote: > >> On Mon, Aug 22, 2016 at 4:05 AM, Praveen C wrote: >> >>> Dear all >>> >>> We are developing a 3d unstructured grid finite volume code for >>> compressible turbulent flows. >>> >>> Our approach is to use metis/parmetis to partition the mesh. >>> Then read these partitioned mesh files in the MPI-based cfd code for >>> computation. >>> >>> Are there any examples available in PETSc which are similar to this. >>> >> >> That is a pretty difficult problem. The closest example we have is >> probably TS ex11, which can solve >> things like the shallow water equation and Euler. Currently it reads in a >> mesh, partitions in memory, >> and distributes the mesh using MPI. After that it can regularly refine. >> In the next release, we will >> introduce the ability to >> >> - read a mesh in parallel >> >> - adaptively refine in parallel using the Pragmatic package >> >> - load balance after adaptive refinement >> >> - use adaptive quadtree meshes from p4est >> >> Thanks, >> >> Matt >> >> >>> Thanks >>> praveen >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mirzadeh at gmail.com Mon Aug 22 09:19:05 2016 From: mirzadeh at gmail.com (Mohammad Mirzadeh) Date: Mon, 22 Aug 2016 10:19:05 -0400 Subject: [petsc-users] Example for unstructured grid, metis, petsc In-Reply-To: References: Message-ID: Matt, > Yes. It is working on SNES ex12, ex62, and TS ex11. We have some more work > to do testing adaptivity, but simple > gradient indicators are working right I think. > > Thanks, > > Matt > Is there a document/todo-list sort of thing that outlines the roadmap with p4est integration? What things are done/remains to do sort of draft? Best, Mohammad > > >> Giang >> >> On Mon, Aug 22, 2016 at 12:45 PM, Matthew Knepley >> wrote: >> >>> On Mon, Aug 22, 2016 at 4:05 AM, Praveen C wrote: >>> >>>> Dear all >>>> >>>> We are developing a 3d unstructured grid finite volume code for >>>> compressible turbulent flows. >>>> >>>> Our approach is to use metis/parmetis to partition the mesh. >>>> Then read these partitioned mesh files in the MPI-based cfd code for >>>> computation. >>>> >>>> Are there any examples available in PETSc which are similar to this. >>>> >>> >>> That is a pretty difficult problem. The closest example we have is >>> probably TS ex11, which can solve >>> things like the shallow water equation and Euler. Currently it reads in >>> a mesh, partitions in memory, >>> and distributes the mesh using MPI. After that it can regularly refine. 
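[Editor's note] For readers who land on this thread looking for the basic pattern Matt describes (read an unstructured mesh serially, then let PETSc partition and distribute it), a minimal sketch follows. It is not taken from TS ex11; the file name "mesh.exo", the overlap of 1, and the viewer call are placeholders chosen here for illustration, and DMPlexDistribute uses whatever partitioner PETSc was configured with (e.g. ParMetis or Chaco).

  #include <petscdmplex.h>

  int main(int argc, char **argv)
  {
    DM             dm, dmDist = NULL;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
    /* Read the mesh on rank 0; PETSC_TRUE asks Plex to interpolate
       (build faces/edges), which finite-volume codes usually need. */
    ierr = DMPlexCreateFromFile(PETSC_COMM_WORLD,"mesh.exo",PETSC_TRUE,&dm);CHKERRQ(ierr);
    /* Partition and distribute with one layer of overlap cells for
       ghost exchange; the original serial DM is then discarded. */
    ierr = DMPlexDistribute(dm,1,NULL,&dmDist);CHKERRQ(ierr);
    if (dmDist) {ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist;}
    ierr = DMView(dm,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    ierr = DMDestroy(&dm);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return 0;
  }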
>>> In the next release, we will >>> introduce the ability to >>> >>> - read a mesh in parallel >>> >>> - adaptively refine in parallel using the Pragmatic package >>> >>> - load balance after adaptive refinement >>> >>> - use adaptive quadtree meshes from p4est >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks >>>> praveen >>>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 22 09:45:06 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 22 Aug 2016 09:45:06 -0500 Subject: [petsc-users] Example for unstructured grid, metis, petsc In-Reply-To: References: Message-ID: On Mon, Aug 22, 2016 at 9:19 AM, Mohammad Mirzadeh wrote: > Matt, > > >> Yes. It is working on SNES ex12, ex62, and TS ex11. We have some more >> work to do testing adaptivity, but simple >> gradient indicators are working right I think. >> >> Thanks, >> >> Matt >> > > Is there a document/todo-list sort of thing that outlines the roadmap with > p4est integration? What things are done/remains to do sort of draft? > Toby may have a more extensive one. Shortly, if all you want is for PETSc to manage the grid, the p4est integration is done. It works in parallel and can do most things Plex does. For anything else, it just falls back to Plex. For very large problems, we might want to write more of the Plex functionality into p4est, but that would wait for users getting that large. Most of our work now is concerned with - Interfacing with the rudimentary discretization support in PETSc - Designing better adaptivity measures - Fixing inconsistencies in the interface - Supporting mixed grid-particle methods Thanks, Matt > Best, > Mohammad > > >> >> >>> Giang >>> >>> On Mon, Aug 22, 2016 at 12:45 PM, Matthew Knepley >>> wrote: >>> >>>> On Mon, Aug 22, 2016 at 4:05 AM, Praveen C wrote: >>>> >>>>> Dear all >>>>> >>>>> We are developing a 3d unstructured grid finite volume code for >>>>> compressible turbulent flows. >>>>> >>>>> Our approach is to use metis/parmetis to partition the mesh. >>>>> Then read these partitioned mesh files in the MPI-based cfd code for >>>>> computation. >>>>> >>>>> Are there any examples available in PETSc which are similar to this. >>>>> >>>> >>>> That is a pretty difficult problem. The closest example we have is >>>> probably TS ex11, which can solve >>>> things like the shallow water equation and Euler. Currently it reads in >>>> a mesh, partitions in memory, >>>> and distributes the mesh using MPI. After that it can regularly refine. >>>> In the next release, we will >>>> introduce the ability to >>>> >>>> - read a mesh in parallel >>>> >>>> - adaptively refine in parallel using the Pragmatic package >>>> >>>> - load balance after adaptive refinement >>>> >>>> - use adaptive quadtree meshes from p4est >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks >>>>> praveen >>>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. 
>>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From overholt at capesim.com Mon Aug 22 10:49:17 2016 From: overholt at capesim.com (Matthew Overholt) Date: Mon, 22 Aug 2016 11:49:17 -0400 Subject: [petsc-users] MatMkl_CPardisoSetCntl Message-ID: <004601d1fc8c$bdea1270$39be3750$@capesim.com> Hi All, I am using the Intel MKL CPardiso library as a PC direct solver, and I am trying to figure out how to properly set options (the Pardiso and CPardiso "iparm" parameter values in the Intel docs). Q1: To set the Pivoting perturbation, for example, is the correct call: ierr = MatMkl_CPardisoSetCntl( K, 10, 13 ); where the KSP matrix to be inverted is "K" and the value that I want is "13"? This is the "-mat_mkl_pardiso_10" option listed on the MATSOLVERMKL_PARDISO page, and the "iparm[9]" option in the Intel docs. Q2: Where do I make this call, before "MatSetUp( K )", at some point during the creation and setup of KSP, or after calling "PCFactorSetMatSolverPackage( pc, MATSOLVERMKL_CPARDISO )"? I've tried many different combinations and none of them seem to work. Since the effect of changing the pivoting perturbation may not be obvious, I also tried setting the number of OpenMP threads to use within Pardiso (icntl = 65, without the environmental variable MKL_NUM_THREADS present) and my setting was ignored (Pardiso defaulted to using the maximum number of cores present). Thanks in advance, Matt Overholt --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 22 11:07:57 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Aug 2016 11:07:57 -0500 Subject: [petsc-users] Implementation of Anderson mixing In-Reply-To: <57BAF1EA.3030804@itt.uni-stuttgart.de> References: <57BAF1EA.3030804@itt.uni-stuttgart.de> Message-ID: <093D35E8-E48B-4552-AE7E-343B6C864EE0@mcs.anl.gov> > On Aug 22, 2016, at 7:36 AM, Jonas Mairhofer wrote: > > Dear PETSc-Team, > > I am having a bit of trouble understanding the way the Anderson mixing method is implemented inPETSc. > > On the corresponding website (http://www.mcs.anl.gov/petsc/petsc-3.5/docs/manualpages/SNES/SNESAnderson.html) it says in the > "Option database" section > > X_{update} = X + \beta F This is wrong, it is the "mixing" parameter. I will fix it. > > which looks to me as if the new solution is calculated from the old solution plus something which depends on the old residual F. > > However, in the "Notes" section it says that the new solution is found by combining "m previous solutions" which I interpret as the last m values of X. > > In most other sources on Anderson mixing (e.g. the original one referd to on the website mentioned above) the update procedure looks something like > > X^{k+1} = (1-\beta) \sum_{i=0}^{m} \alpha_i^k X^{k-m+1} + \beta \sum_{i=0}^m \alpha_i^k G(X^{k-m+i}) where G = F + X. > > Therefore, the new value of X is calculated from the last m values of X and F. 
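[Editor's note] The update formula quoted just above appears to carry a typo in the first sum (X^{k-m+1} where X^{k-m+i} is presumably meant, matching the second sum). Written out cleanly, the classical Anderson mixing step being asked about is

  x^{k+1} = (1 - \beta) \sum_{i=0}^{m} \alpha_i^k x^{k-m+i}
            + \beta \sum_{i=0}^{m} \alpha_i^k G(x^{k-m+i}),   with G(x) = x + F(x),

where the \alpha_i^k come from a small least-squares problem over the stored residuals and \beta is the mixing parameter set by -snes_anderson_beta, as Barry confirms below.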
> > Unfortunately, I was not able to understand and follow the actual source code in src/snes/impls/ngmres/anderson.c The computation of the \sum_{i=0}^{m} \alpha_i^k X^{k-m+1} and \sum_{i=0}^m \alpha_i^k G(X^{k-m+i} take place in 146: SNESNGMRESFormCombinedSolution_Private(snes,ivec,l,XM,FM,fMnorm,X,XA,FA); in particular the ierr = VecMAXPY(XA,l,beta,Xdot);CHKERRQ(ierr); (note this is NOT the same \beta it is more like the \alpha above). ierr = VecMAXPY(FA,l,beta,Fdot);CHKERRQ(ierr); > > Could you please tell me whether the last values of X, F or both are used in the update scheme of PETSc's Anderson mixing implementation? Both are, see above. > > And is \beta used in a "regular line search way" where X^{new} = X^{old} + \beta* "undamped update" No > > or as in the equation above > > X^{new} = (1-\beta) \sum "function of last m X" + \beta \sum "function of last m G" Yes. 142: VecAXPY(XM,-ngmres->andersonBeta,FM); > > which would explain to me why beta is not set via -snes_linesearch_damping but using the extra option -snes_anderson_beta. > > > Thank you very much for your help! > Jonas From bsmith at mcs.anl.gov Mon Aug 22 11:32:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Aug 2016 11:32:14 -0500 Subject: [petsc-users] MatMkl_CPardisoSetCntl In-Reply-To: <004601d1fc8c$bdea1270$39be3750$@capesim.com> References: <004601d1fc8c$bdea1270$39be3750$@capesim.com> Message-ID: > On Aug 22, 2016, at 10:49 AM, Matthew Overholt wrote: > > Hi All, > > I am using the Intel MKL CPardiso library as a PC direct solver, and I am trying to figure out how to properly set options (the Pardiso and CPardiso ?iparm? parameter values in the Intel docs). > > Q1: To set the Pivoting perturbation, for example, is the correct call: > ierr = MatMkl_CPardisoSetCntl( K, 10, 13 ); > where the KSP matrix to be inverted is ?K? and the value that I want is ?13?? This is the ?-mat_mkl_pardiso_10? option listed on the MATSOLVERMKL_PARDISO page, and the ?iparm[9]? option in the Intel docs. From the documentation: Input Parameters: + F - the factored matrix obtained by calling MatGetFactor() So it is not the matrix you pass to KSP, it is the factored matrix created inside the KSP; the problem is that there is no way to access this factored matrix when using KSP before the factorization takes place to change these values. > > Q2: Where do I make this call, before ?MatSetUp( K )?, at some point during the creation and setup of KSP, or after calling ?PCFactorSetMatSolverPackage( pc, MATSOLVERMKL_CPARDISO )?? I?ve tried many different combinations and none of them seem to work. > > Since the effect of changing the pivoting perturbation may not be obvious, I also tried setting the number of OpenMP threads to use within Pardiso (icntl = 65, without the environmental variable MKL_NUM_THREADS present) and my setting was ignored (Pardiso defaulted to using the maximum number of cores present). The easy way to set these is via the options database so from the command line you can use -mat_mkl_cpardiso_ for example -mat_mkl_cpardiso_10 13 or in the code you can write PetscOptionsSetValue(NULL,"-mat_mkl_cpardiso_10","13"); make sure that you call the PetscOptionsSetValue() BEFORE you call KSPSetFromOptions(). I am adding a note to manual page for this routine to make this clear. Barry The MatMkl_CPardisoSetCntl() routines are only usable if you don't use KSP but use MatGetFactor() and MatLUFactorSymbolic() directly. I don't recommend this. > > Thanks in advance, > Matt Overholt > > Virus-free. 
www.avast.com From mfadams at lbl.gov Mon Aug 22 12:44:17 2016 From: mfadams at lbl.gov (Mark Adams) Date: Mon, 22 Aug 2016 13:44:17 -0400 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: > > >> I have seen some people claim that strong-scaling is harder to achieve >> than weak scaling (e.g., https://www.sharcnet.ca >> /help/index.php/Measuring_Parallel_Scaling_Performance) and generally >> speaking it makes sense - communication overhead increases with concurrency. >> > I would back up and avoid this sort of competitive thing. Weak and strong scaling are different, but valid metrics. They each tell you different things. Are oranges better than apples? Depends but they are both useful and of the same basic class. Just to add, I've started to like Jed's metric of "dynamic range" (for lack of a better word). This is like strong scaling except instead of fixing the problem and increasing the parallelism you fix the parallelism and decrease the problem size. Then plot this with Time vs. N/Time (rate). This has the nice property, like weak scaling, a horizontal line is perfect. The rollover point is the lower bound on turnaround time (I call these turnaround time plots sometimes), and both axis are interesting (size, P or N, is not interesting, but rate and time are). But, this is just different. Because it keeps parallelism the same, your latencies are a constant (ignoring algorithmic effects of problem size), so in a sense it is between strong and weak scaling. -------------- next part -------------- An HTML attachment was scrubbed... URL: From overholt at capesim.com Mon Aug 22 13:10:00 2016 From: overholt at capesim.com (Matthew Overholt) Date: Mon, 22 Aug 2016 14:10:00 -0400 Subject: [petsc-users] MatMkl_CPardisoSetCntl In-Reply-To: References: <004601d1fc8c$bdea1270$39be3750$@capesim.com> Message-ID: <006001d1fca0$66dc5e20$34951a60$@capesim.com> >> On Aug 22, 2016, at 10:49 AM, Matthew Overholt wrote: >> >> I am using the Intel MKL CPardiso library as a PC direct solver, and I am trying to >> figure out how to properly set options (the Pardiso and CPardiso ?iparm? parameter >> values in the Intel docs). > On Aug 22, 2016, at 12:32 PM, Barry Smith wrote: > The easy way to set these is via the options database so from the command line you > can use -mat_mkl_cpardiso_ for example -mat_mkl_cpardiso_10 13 or in > the code you can write PetscOptionsSetValue(NULL,"-mat_mkl_cpardiso_10","13"); make > sure that you call the PetscOptionsSetValue() BEFORE you call KSPSetFromOptions(). Thanks, Barry, that is very helpful. I can successfully set some of the CPardiso parameters now from the command line (or otherwise, using the options database), but according to "-help" the only MKL_CPARDISO Options available are -mat_mkl_cpardiso_1 and _65 through _69. If I specify option 10 as you suggested I get a warning that the option was not used. I would like to get access to several other parameters to try and fix my zero pivot problem. How should I do that? Does the MKL interface make use of some -pc_factor* options (such as -pc_factor_zeropivot) when it calls CPardiso? Thanks again, Matt Overholt --- This email has been checked for viruses by Avast antivirus software. 
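[Editor's note] A minimal C sketch of the call ordering in the quoted advice above; the matrix A and vectors b, x (and ierr) are assumed to have been created and assembled elsewhere, and the option values are just the examples used in this thread. Barry's follow-up below adds that -mat_mkl_cpardiso_1 1 is needed before the other entries take effect.

  KSP ksp;
  PC  pc;

  /* Push the MKL CPardiso iparm overrides into the options database
     BEFORE KSPSetFromOptions() is called. */
  ierr = PetscOptionsSetValue(NULL,"-mat_mkl_cpardiso_1","1");CHKERRQ(ierr);   /* use non-default iparm values */
  ierr = PetscOptionsSetValue(NULL,"-mat_mkl_cpardiso_10","13");CHKERRQ(ierr); /* pivoting perturbation, iparm[9] in the Intel docs */

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);              /* direct solve only */
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMKL_CPARDISO);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);                  /* the options set above are read here */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);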
https://www.avast.com/antivirus From bsmith at mcs.anl.gov Mon Aug 22 13:39:51 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 22 Aug 2016 13:39:51 -0500 Subject: [petsc-users] MatMkl_CPardisoSetCntl In-Reply-To: <006001d1fca0$66dc5e20$34951a60$@capesim.com> References: <004601d1fc8c$bdea1270$39be3750$@capesim.com> <006001d1fca0$66dc5e20$34951a60$@capesim.com> Message-ID: <27A52EA3-BFA3-4AEB-B983-5FA454B229B6@mcs.anl.gov> > On Aug 22, 2016, at 1:10 PM, Matthew Overholt wrote: > >>> On Aug 22, 2016, at 10:49 AM, Matthew Overholt wrote: >>> >>> I am using the Intel MKL CPardiso library as a PC direct solver, and I am trying to >>> figure out how to properly set options (the Pardiso and CPardiso ?iparm? parameter >>> values in the Intel docs). > >> On Aug 22, 2016, at 12:32 PM, Barry Smith wrote: >> The easy way to set these is via the options database so from the command line you >> can use -mat_mkl_cpardiso_ for example -mat_mkl_cpardiso_10 13 or in >> the code you can write PetscOptionsSetValue(NULL,"-mat_mkl_cpardiso_10","13"); make >> sure that you call the PetscOptionsSetValue() BEFORE you call KSPSetFromOptions(). > > Thanks, Barry, that is very helpful. I can successfully set some of the CPardiso parameters now from the command line (or otherwise, using the options database), but according to "-help" the only MKL_CPARDISO Options available are -mat_mkl_cpardiso_1 and _65 through _69. If I specify option 10 as you suggested I get a warning that the option was not used. Hmm, looks like you need to run with -mat_mkl_cpardiso_1 1 in order to get the other options like 10 to work. So for example -mat_mkl_cpardiso_1 1 -mat_mkl_cpardiso_10 13 > I would like to get access to several other parameters to try and fix my zero pivot problem. How should I do that? Does the MKL interface make use of some -pc_factor* options (such as -pc_factor_zeropivot) when it calls CPardiso? No it does not use these options. > > Thanks again, > Matt Overholt > > > --- > This email has been checked for viruses by Avast antivirus software. > https://www.avast.com/antivirus > From rupp at iue.tuwien.ac.at Mon Aug 22 16:14:45 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Mon, 22 Aug 2016 23:14:45 +0200 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: Hi Justin, > I have seen some people claim that strong-scaling is harder to achieve > than weak scaling > (e.g., https://www.sharcnet.ca/help/index.php/Measuring_Parallel_Scaling_Performance) > and generally speaking it makes sense - communication overhead increases > with concurrency. > > However, we know that most PETSc solvers/applications are not only > memory-bandwidth bound, but may not scale as well w.r.t. problem size as > other solvers (e.g., ILU(0) may beat out GAMG for small elliptic > problems but GAMG will eventually beat out ILU(0) for larger problems), > so wouldn't weak-scaling not only be the more interesting but more > difficult performance metric to achieve? Strong-scaling issues arise > mostly from communication overhead but weak-scaling issues may come from > that and also solver/algorithmic scalability w.r.t problem size (e.g., > problem size N takes 10*T seconds to compute but problem size 2*N takes > 50*T seconds to compute). > > In other words, if one were to propose or design a new algorithm/solver > capable of handling large-scale problems, would it be equally if not > more important to show the weak-scaling potential? 
Because if you really > think about it, a "truly efficient" algorithm will be less likely to > scale in the strong sense but computation time will be close to linearly > proportional to problem size (hence better scaling in the weak-sense). > It seems if I am trying to convince someone that a proposed > computational framework is "high performing" without getting too deep > into performance modeling, a poor parallel efficiency (arising due to > good sequential efficiency) in the strong sense may not look promising. These are all valid thoughts. Let me add another perspective: If you are only interested in the machine aspects of scaling, you could run for a fixed number of solver iterations. That allows you to focus on the actual computational work done and your results will exclusively reflect the machine's performance. Thus, even though fixing solver iterations and thus not running solvers to convergence is a bad shortcut from the solver point of view, it can be a handy way of eliminating algorithmic fluctuations. (Clearly, this simplistic approach has not only been used, but also abused...) Best regards, Karli From jychang48 at gmail.com Mon Aug 22 21:03:35 2016 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 22 Aug 2016 21:03:35 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: Thanks all. So this issue was one of our ATPESC2015 exam questions, and turned some friends into foes. Most eventually fell into the strong-scale is harder camp, but some of these "friends" also believed PETSc is *not* capable of handling dense matrices and is not portable. Just wanted to hear some expert opinions on this :) Anyway, in one of my applications, I am comparing the performance of some VI solvers (i.e., with variable bounds) with that of just standard linear solves (i.e., no variable bounds) for 3D advection-diffusion equations in highly heterogeneous and anisotropic porous media. The parallel efficiency in the strong-sense is roughly the same but the parallel efficiency in the weak-sense is significantly worse for VI solvers. I suppose one inference that can be made is that VI solvers take longer to solver as the problem size grows. And yes solver iteration counts do grow so that has some to do with it. As for these "dynamic range" plots, I tried something like this across 1 and 8 MPI processes with the following problem sizes for a 3D anisotropic diffusion problem with CG/BoomerAMG: 1,331 9,261 29,791 68,921 132,651 226,981 357,911 531,441 753,571 1,030,301 Using a single Intel Xeon E5-2670 compute node for this. Attached is the plot, but instead of flat or incline lines, i get concave down curves. If my problem size gets too big, the N/time rate decreases, whereas for very small problems it increases. I am guessing bandwidth limitation have something to do with the decrease in performance. In that HPGMG presentation you attached the other day, it seems the rate should decrease as problem size decreases. Perhaps this study should be done with more MPI processes? On Mon, Aug 22, 2016 at 4:14 PM, Karl Rupp wrote: > Hi Justin, > > > I have seen some people claim that strong-scaling is harder to achieve >> than weak scaling >> (e.g., https://www.sharcnet.ca/help/index.php/Measuring_Parallel_Sc >> aling_Performance) >> and generally speaking it makes sense - communication overhead increases >> with concurrency. >> >> However, we know that most PETSc solvers/applications are not only >> memory-bandwidth bound, but may not scale as well w.r.t. 
problem size as >> other solvers (e.g., ILU(0) may beat out GAMG for small elliptic >> problems but GAMG will eventually beat out ILU(0) for larger problems), >> so wouldn't weak-scaling not only be the more interesting but more >> difficult performance metric to achieve? Strong-scaling issues arise >> mostly from communication overhead but weak-scaling issues may come from >> that and also solver/algorithmic scalability w.r.t problem size (e.g., >> problem size N takes 10*T seconds to compute but problem size 2*N takes >> 50*T seconds to compute). >> >> In other words, if one were to propose or design a new algorithm/solver >> capable of handling large-scale problems, would it be equally if not >> more important to show the weak-scaling potential? Because if you really >> think about it, a "truly efficient" algorithm will be less likely to >> scale in the strong sense but computation time will be close to linearly >> proportional to problem size (hence better scaling in the weak-sense). >> It seems if I am trying to convince someone that a proposed >> computational framework is "high performing" without getting too deep >> into performance modeling, a poor parallel efficiency (arising due to >> good sequential efficiency) in the strong sense may not look promising. >> > > These are all valid thoughts. Let me add another perspective: If you are > only interested in the machine aspects of scaling, you could run for a > fixed number of solver iterations. That allows you to focus on the actual > computational work done and your results will exclusively reflect the > machine's performance. Thus, even though fixing solver iterations and thus > not running solvers to convergence is a bad shortcut from the solver point > of view, it can be a handy way of eliminating algorithmic fluctuations. > (Clearly, this simplistic approach has not only been used, but also > abused...) > > Best regards, > Karli > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: example_dynamic_range.eps Type: application/postscript Size: 87684 bytes Desc: not available URL: From jychang48 at gmail.com Tue Aug 23 03:33:10 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 23 Aug 2016 03:33:10 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: Redid some of those experiments for 8 and 20 cores and scaled it up to even larger problems. Attached is the plot. Looking at this "dynamic plot" (if you ask me, I honestly think there could be a better word for this out there), the lines curve up for the smaller problems, have a "flat line" in the middle, then slowly tail down as the problem gets bigger. I am guessing these downward curves have to do with either memory bandwidth effects or simply the solver requiring more effort to handle larger problems (or a combination of both). I currently only have access to a small 80 node (20 cores per node) HPC cluster so obviously I am unable to experiment with 10k cores or more. If our goal is to see how close flat the lines get, we can easily game the system by scaling the problem until we find the "sweet spot(s)". In the weak-scaling and strong-scaling studies there are perfect lines we can compare to, but there does not seem to be such lines for this type of study even in the seemingly flat regions. Seems these plots are useful if we simply compare different solvers/preconditioners/etc or different HPC platforms. 
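[Editor's note] For anyone reproducing this kind of plot, here is a sketch of how the raw numbers can be collected inside the application, assuming a ksp, b, x, and ierr already set up and an outer loop over mesh sizes around it; the same timings can of course be pulled from -log_view instead.

  /* needs #include <petsctime.h> for PetscTime() */
  PetscLogDouble t0,t1;
  PetscInt       N,its;

  ierr = VecGetSize(x,&N);CHKERRQ(ierr);
  ierr = PetscTime(&t0);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = PetscTime(&t1);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp,&its);CHKERRQ(ierr);
  /* rate = unknowns solved per second; plotted against time at a fixed
     core count, a flat rate is the analogue of ideal scaling. */
  ierr = PetscPrintf(PETSC_COMM_WORLD,"N %D  iterations %D  time %g s  rate %g dof/s\n",
                     N,its,(double)(t1-t0),(double)N/(t1-t0));CHKERRQ(ierr);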
Also, the solver count iteration increases with problem size - it went from 9 KSP iterations for 1,331 dofs to 48 KSP iterations for 3,442,951 dofs. Algorithmic time-to-solution is not linearly proportional to problem size so the RHS of the graph is obviously going to have lower N/time rates at some point - similar to what we observe from weak-scaling. Also, the N/time rate seems very similar to the floating-point rate, although I can see why it's more informative. Any thoughts on anything I said or did thus far? Just wanting to make sure I understand these correctly :) On Mon, Aug 22, 2016 at 9:03 PM, Justin Chang wrote: > Thanks all. So this issue was one of our ATPESC2015 exam questions, and > turned some friends into foes. Most eventually fell into the strong-scale > is harder camp, but some of these "friends" also believed PETSc is *not* > capable of handling dense matrices and is not portable. Just wanted to hear > some expert opinions on this :) > > Anyway, in one of my applications, I am comparing the performance of some > VI solvers (i.e., with variable bounds) with that of just standard linear > solves (i.e., no variable bounds) for 3D advection-diffusion equations in > highly heterogeneous and anisotropic porous media. The parallel efficiency > in the strong-sense is roughly the same but the parallel efficiency in the > weak-sense is significantly worse for VI solvers. I suppose one inference > that can be made is that VI solvers take longer to solver as the problem > size grows. And yes solver iteration counts do grow so that has some to do > with it. > > As for these "dynamic range" plots, I tried something like this across 1 > and 8 MPI processes with the following problem sizes for a 3D anisotropic > diffusion problem with CG/BoomerAMG: > > 1,331 > 9,261 > 29,791 > 68,921 > 132,651 > 226,981 > 357,911 > 531,441 > 753,571 > 1,030,301 > > Using a single Intel Xeon E5-2670 compute node for this. Attached is the > plot, but instead of flat or incline lines, i get concave down curves. If > my problem size gets too big, the N/time rate decreases, whereas for very > small problems it increases. I am guessing bandwidth limitation have > something to do with the decrease in performance. In that HPGMG > presentation you attached the other day, it seems the rate should decrease > as problem size decreases. Perhaps this study should be done with more MPI > processes? > > > On Mon, Aug 22, 2016 at 4:14 PM, Karl Rupp wrote: > >> Hi Justin, >> >> >> I have seen some people claim that strong-scaling is harder to achieve >>> than weak scaling >>> (e.g., https://www.sharcnet.ca/help/index.php/Measuring_Parallel_Sc >>> aling_Performance) >>> and generally speaking it makes sense - communication overhead increases >>> with concurrency. >>> >>> However, we know that most PETSc solvers/applications are not only >>> memory-bandwidth bound, but may not scale as well w.r.t. problem size as >>> other solvers (e.g., ILU(0) may beat out GAMG for small elliptic >>> problems but GAMG will eventually beat out ILU(0) for larger problems), >>> so wouldn't weak-scaling not only be the more interesting but more >>> difficult performance metric to achieve? Strong-scaling issues arise >>> mostly from communication overhead but weak-scaling issues may come from >>> that and also solver/algorithmic scalability w.r.t problem size (e.g., >>> problem size N takes 10*T seconds to compute but problem size 2*N takes >>> 50*T seconds to compute). 
>>> >>> In other words, if one were to propose or design a new algorithm/solver >>> capable of handling large-scale problems, would it be equally if not >>> more important to show the weak-scaling potential? Because if you really >>> think about it, a "truly efficient" algorithm will be less likely to >>> scale in the strong sense but computation time will be close to linearly >>> proportional to problem size (hence better scaling in the weak-sense). >>> It seems if I am trying to convince someone that a proposed >>> computational framework is "high performing" without getting too deep >>> into performance modeling, a poor parallel efficiency (arising due to >>> good sequential efficiency) in the strong sense may not look promising. >>> >> >> These are all valid thoughts. Let me add another perspective: If you are >> only interested in the machine aspects of scaling, you could run for a >> fixed number of solver iterations. That allows you to focus on the actual >> computational work done and your results will exclusively reflect the >> machine's performance. Thus, even though fixing solver iterations and thus >> not running solvers to convergence is a bad shortcut from the solver point >> of view, it can be a handy way of eliminating algorithmic fluctuations. >> (Clearly, this simplistic approach has not only been used, but also >> abused...) >> >> Best regards, >> Karli >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: example_dynamic_range.eps Type: application/postscript Size: 87076 bytes Desc: not available URL: From cyrill.von.planta at usi.ch Tue Aug 23 03:51:04 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Tue, 23 Aug 2016 08:51:04 +0000 Subject: [petsc-users] MatSOR and GaussSeidel Message-ID: <67876D44-A778-487C-9761-3DAA10356F6A@usi.ch> Dear PETSc-Users, I am debugging a smoother of ours and i was wondering what settings of MatSOR exactly form one ordinary Gauss Seidel smoothing step. Currently I use: ierr = MatSOR(A,b,1.0,(MatSORType)(SOR_ZERO_INITIAL_GUESS | SOR_FORWARD_SWEEP),0,1,1,x); CHKERRV(ierr); I expect this to be the same as this na?ve Gauss-Seidel step: for (int i=0;i References: <57BAF1EA.3030804@itt.uni-stuttgart.de> <093D35E8-E48B-4552-AE7E-343B6C864EE0@mcs.anl.gov> Message-ID: <57BC141A.8010900@itt.uni-stuttgart.de> Thank you for this fast response! Am 22.08.2016 18:07, schrieb Barry Smith: >> On Aug 22, 2016, at 7:36 AM, Jonas Mairhofer wrote: >> >> Dear PETSc-Team, >> >> I am having a bit of trouble understanding the way the Anderson mixing method is implemented inPETSc. >> >> On the corresponding website (http://www.mcs.anl.gov/petsc/petsc-3.5/docs/manualpages/SNES/SNESAnderson.html) it says in the >> "Option database" section >> >> X_{update} = X + \beta F > This is wrong, it is the "mixing" parameter. I will fix it. >> which looks to me as if the new solution is calculated from the old solution plus something which depends on the old residual F. >> >> However, in the "Notes" section it says that the new solution is found by combining "m previous solutions" which I interpret as the last m values of X. >> >> In most other sources on Anderson mixing (e.g. the original one referd to on the website mentioned above) the update procedure looks something like >> >> X^{k+1} = (1-\beta) \sum_{i=0}^{m} \alpha_i^k X^{k-m+1} + \beta \sum_{i=0}^m \alpha_i^k G(X^{k-m+i}) where G = F + X. 
>> >> Therefore, the new value of X is calculated from the last m values of X and F. >> >> Unfortunately, I was not able to understand and follow the actual source code in src/snes/impls/ngmres/anderson.c > The computation of the \sum_{i=0}^{m} \alpha_i^k X^{k-m+1} and \sum_{i=0}^m \alpha_i^k G(X^{k-m+i} take place in > > 146: SNESNGMRESFormCombinedSolution_Private(snes,ivec,l,XM,FM,fMnorm,X,XA,FA); > > in particular the > > ierr = VecMAXPY(XA,l,beta,Xdot);CHKERRQ(ierr); > > (note this is NOT the same \beta it is more like the \alpha above). > > ierr = VecMAXPY(FA,l,beta,Fdot);CHKERRQ(ierr); > >> Could you please tell me whether the last values of X, F or both are used in the update scheme of PETSc's Anderson mixing implementation? > Both are, see above. >> And is \beta used in a "regular line search way" where X^{new} = X^{old} + \beta* "undamped update" > No >> or as in the equation above >> >> X^{new} = (1-\beta) \sum "function of last m X" + \beta \sum "function of last m G" > Yes. > > 142: VecAXPY(XM,-ngmres->andersonBeta,FM); > >> which would explain to me why beta is not set via -snes_linesearch_damping but using the extra option -snes_anderson_beta. >> >> >> Thank you very much for your help! >> Jonas From jed at jedbrown.org Tue Aug 23 09:54:20 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 23 Aug 2016 08:54:20 -0600 Subject: [petsc-users] MatSOR and GaussSeidel In-Reply-To: <67876D44-A778-487C-9761-3DAA10356F6A@usi.ch> References: <67876D44-A778-487C-9761-3DAA10356F6A@usi.ch> Message-ID: <87k2f7tu1v.fsf@jedbrown.org> Cyrill Vonplanta writes: > Dear PETSc-Users, > > I am debugging a smoother of ours and i was wondering what settings of MatSOR exactly form one ordinary Gauss Seidel smoothing step. Currently I use: > > ierr = MatSOR(A,b,1.0,(MatSORType)(SOR_ZERO_INITIAL_GUESS | SOR_FORWARD_SWEEP),0,1,1,x); CHKERRV(ierr); Yes, this is a standard forward sweep of GS. Note that your code below computes half zeros (because the vector starts as 0), but that it handles the diagonal incorrectly if you were to use a nonzero initial guess. > I expect this to be the same as this na?ve Gauss-Seidel step: > > > for (int i=0;i > sum_i = 0; > > sum_i += ps_b_values[i]; > > for (int j=0;j > sum_i -= ps_A_values[i+j*m]*ps_x_values[j]; > > } > > ps_x_values[i] += sum_i/ps_A_values[i*m +i]; > > } > > The ps_* refer to the data parts of PETSc types (everything is serial and dense in my toy example. Initial x is zero.m is dimension of A). However the convergence history looks different. Am I missing something here? > > Best Cyrill -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From niko.karin at gmail.com Tue Aug 23 10:54:59 2016 From: niko.karin at gmail.com (Karin&NiKo) Date: Tue, 23 Aug 2016 17:54:59 +0200 Subject: [petsc-users] Command lines to reproduce the tests of "Composing scalable nonlinear algebraic solvers" Message-ID: Dear PETSc team, I have read with high interest the paper of Peter Brune et al. entitled "Composing scalable nonlinear algebraic solvers". Nevertheless I would like to be able to reproduce the tests that are presented within (mainly the elasticity problem, ex16). Could you please provide us with the command lines of these tests? Best regards, Nicolas -------------- next part -------------- An HTML attachment was scrubbed... 
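[Editor's note] For comparison with the snippet above, here is the textbook form of one forward Gauss-Seidel sweep for the same dense, column-major storage and the same variable names. It solves for x_i directly instead of accumulating an update, and it uses the current x as the initial guess (with SOR_ZERO_INITIAL_GUESS, MatSOR instead treats the incoming x as zero). This is only an illustration, not a drop-in replacement for MatSOR.

  /* One forward Gauss-Seidel sweep: x_i <- (b_i - sum_{j!=i} a_ij x_j) / a_ii,
     where entries with j < i have already been updated in this sweep. */
  for (int i=0; i<m; i++) {
    double sum_i = ps_b_values[i];
    for (int j=0; j<m; j++) {
      if (j == i) continue;
      sum_i -= ps_A_values[i + j*m] * ps_x_values[j];
    }
    ps_x_values[i] = sum_i / ps_A_values[i + i*m];
  }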
URL: From knepley at gmail.com Tue Aug 23 13:25:17 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 23 Aug 2016 13:25:17 -0500 Subject: [petsc-users] Command lines to reproduce the tests of "Composing scalable nonlinear algebraic solvers" In-Reply-To: References: Message-ID: On Tue, Aug 23, 2016 at 10:54 AM, Karin&NiKo wrote: > Dear PETSc team, > > I have read with high interest the paper of Peter Brune et al. entitled > "Composing scalable nonlinear algebraic solvers". > Nevertheless I would like to be able to reproduce the tests that are > presented within (mainly the elasticity problem, ex16). > > Could you please provide us with the command lines of these tests? > I believe Peter used the attached script. Thanks, Matt > Best regards, > Nicolas > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: experiments.py Type: text/x-python-script Size: 32598 bytes Desc: not available URL: From andrewh0 at uw.edu Wed Aug 24 00:57:05 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Tue, 23 Aug 2016 22:57:05 -0700 Subject: [petsc-users] DMPlex higher order elements Message-ID: I created an unstructured tri6 mesh in Cubit and am trying to read it into PETSc using DMPlexCreateFromFile. However, when I do PETSc gives me an error that it doesn't support this type of element. I know my PETSc install has Exodus II support because by giving a different Exodus mesh file with Tri3 elements works just fine. I've attached the Exodus mesh generated by Cubit. I can open up the mesh file in Paraview just fine and it does appear to be valid (it has 2 elements approximating a quarter circle wedge). Here's the error message: [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Cone size 6 not supported for dimension 2 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1257-ge04de0c GIT > Date: 2016-08-24 00:14:37 -0500 > > [0]PETSC ERROR: bin/mesh_testing on a arch-linux2-c-opt named ahome by > andrew Tue Aug 23 22:48:57 2016 > > [0]PETSC ERROR: Configure options --with-debugging=0 --COPTFLAGS="-O3 > -march=native" --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 > -march=native" --download-exodusii=yes --download-hdf5=yes > --download-netcdf=yes > [0]PETSC ERROR: #8 DMPlexGetRawFaces_Internal() line 93 in > petsc/src/dm/impls/plex/plexinterpolate.c > > [0]PETSC ERROR: #9 DMPlexGetFaces_Internal() line 20 in > petsc/src/dm/impls/plex/plexinterpolate.c > > [0]PETSC ERROR: #10 DMPlexInterpolateFaces_Internal() line 172 in > petsc/src/dm/impls/plex/plexinterpolate.c > > [0]PETSC ERROR: #11 DMPlexInterpolate() line 532 in > petsc/src/dm/impls/plex/plexinterpolate.c > > [0]PETSC ERROR: #12 DMPlexCreateExodus() line 168 in > petsc/src/dm/impls/plex/plexexodusii.c > > [0]PETSC ERROR: #13 DMPlexCreateExodusFromFile() line 46 in > petsc/src/dm/impls/plex/plexexodusii.c > > [0]PETSC ERROR: #14 DMPlexCreateFromFile() line 1967 in > petsc/src/dm/impls/plex/plexcreate.c > > [0]PETSC ERROR: #15 main() line 11 in mesh_testing.cpp > > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tri6_wedge.exo Type: application/octet-stream Size: 1360 bytes Desc: not available URL: From knepley at gmail.com Wed Aug 24 10:02:56 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Aug 2016 10:02:56 -0500 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: Message-ID: On Wed, Aug 24, 2016 at 12:57 AM, Andrew Ho wrote: > I created an unstructured tri6 mesh in Cubit and am trying to read it into > PETSc using DMPlexCreateFromFile. However, when I do PETSc gives me an > error that it doesn't support this type of element. > Yes, I do not support that since I think its a crazy way to talk about things. All the topological information is in the Tri3 mesh, and Cubit has no business telling me about the function space. Matt > I know my PETSc install has Exodus II support because by giving a > different Exodus mesh file with Tri3 elements works just fine. I've > attached the Exodus mesh generated by Cubit. I can open up the mesh file in > Paraview just fine and it does appear to be valid (it has 2 elements > approximating a quarter circle wedge). > > Here's the error message: > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Cone size 6 not supported for dimension 2 >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1257-ge04de0c GIT >> Date: 2016-08-24 00:14:37 -0500 >> >> [0]PETSC ERROR: bin/mesh_testing on a arch-linux2-c-opt named ahome by >> andrew Tue Aug 23 22:48:57 2016 >> >> [0]PETSC ERROR: Configure options --with-debugging=0 --COPTFLAGS="-O3 >> -march=native" --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 >> -march=native" --download-exodusii=yes --download-hdf5=yes >> --download-netcdf=yes >> [0]PETSC ERROR: #8 DMPlexGetRawFaces_Internal() line 93 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #9 DMPlexGetFaces_Internal() line 20 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #10 DMPlexInterpolateFaces_Internal() line 172 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #11 DMPlexInterpolate() line 532 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #12 DMPlexCreateExodus() line 168 in >> petsc/src/dm/impls/plex/plexexodusii.c >> >> [0]PETSC ERROR: #13 DMPlexCreateExodusFromFile() line 46 in >> petsc/src/dm/impls/plex/plexexodusii.c >> >> [0]PETSC ERROR: #14 DMPlexCreateFromFile() line 1967 in >> petsc/src/dm/impls/plex/plexcreate.c >> >> [0]PETSC ERROR: #15 main() line 11 in mesh_testing.cpp >> >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- >> > > > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From griffith at cims.nyu.edu Wed Aug 24 10:45:03 2016 From: griffith at cims.nyu.edu (Boyce Griffith) Date: Wed, 24 Aug 2016 11:45:03 -0400 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: Message-ID: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> > On Aug 24, 2016, at 11:02 AM, Matthew Knepley wrote: > >> On Wed, Aug 24, 2016 at 12:57 AM, Andrew Ho wrote: >> I created an unstructured tri6 mesh in Cubit and am trying to read it into PETSc using DMPlexCreateFromFile. However, when I do PETSc gives me an error that it doesn't support this type of element. > > Yes, I do not support that since I think its a crazy way to talk about things. All the topological information is in the Tri3 mesh, and > Cubit has no business telling me about the function space. Do you support / plan to support curved elements? > Matt > >> I know my PETSc install has Exodus II support because by giving a different Exodus mesh file with Tri3 elements works just fine. I've attached the Exodus mesh generated by Cubit. I can open up the mesh file in Paraview just fine and it does appear to be valid (it has 2 elements approximating a quarter circle wedge). >> >> Here's the error message: >> >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: Cone size 6 not supported for dimension 2 >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1257-ge04de0c GIT Date: 2016-08-24 00:14:37 -0500 >>> [0]PETSC ERROR: bin/mesh_testing on a arch-linux2-c-opt named ahome by andrew Tue Aug 23 22:48:57 2016 >>> [0]PETSC ERROR: Configure options --with-debugging=0 --COPTFLAGS="-O3 -march=native" --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native" --download-exodusii=yes --download-hdf5=yes --download-netcdf=yes >>> [0]PETSC ERROR: #8 DMPlexGetRawFaces_Internal() line 93 in petsc/src/dm/impls/plex/plexinterpolate.c >>> [0]PETSC ERROR: #9 DMPlexGetFaces_Internal() line 20 in petsc/src/dm/impls/plex/plexinterpolate.c >>> [0]PETSC ERROR: #10 DMPlexInterpolateFaces_Internal() line 172 in petsc/src/dm/impls/plex/plexinterpolate.c >>> [0]PETSC ERROR: #11 DMPlexInterpolate() line 532 in petsc/src/dm/impls/plex/plexinterpolate.c >>> [0]PETSC ERROR: #12 DMPlexCreateExodus() line 168 in petsc/src/dm/impls/plex/plexexodusii.c >>> [0]PETSC ERROR: #13 DMPlexCreateExodusFromFile() line 46 in petsc/src/dm/impls/plex/plexexodusii.c >>> [0]PETSC ERROR: #14 DMPlexCreateFromFile() line 1967 in petsc/src/dm/impls/plex/plexcreate.c >>> [0]PETSC ERROR: #15 main() line 11 in mesh_testing.cpp >>> [0]PETSC ERROR: No PETSc Option Table entries >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- >> >> >> -- >> Andrew Ho > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 24 12:03:58 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Aug 2016 12:03:58 -0500 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> Message-ID: On Wed, Aug 24, 2016 at 10:45 AM, Boyce Griffith wrote: > > > On Aug 24, 2016, at 11:02 AM, Matthew Knepley wrote: > > On Wed, Aug 24, 2016 at 12:57 AM, Andrew Ho wrote: > >> I created an unstructured tri6 mesh in Cubit and am trying to read it >> into PETSc using DMPlexCreateFromFile. However, when I do PETSc gives me an >> error that it doesn't support this type of element. >> > > Yes, I do not support that since I think its a crazy way to talk about > things. All the topological information is in the Tri3 mesh, and > Cubit has no business telling me about the function space. > > > Do you support / plan to support curved elements? > I had "support" in there, but there were bugs. Toby and Mark discovered these, and Toby has fixed them. I think all of the fixes are in master now. We currently support isoparametric elements, but it will not be hard (I think) to support super/sub-parametrics by attaching an independent FEM space to the coordinate DM. Right now it just defaults to the one in the master DM. I don't think there should be a problem with that, but of course we need to try. Matt > Matt > > >> I know my PETSc install has Exodus II support because by giving a >> different Exodus mesh file with Tri3 elements works just fine. I've >> attached the Exodus mesh generated by Cubit. I can open up the mesh file in >> Paraview just fine and it does appear to be valid (it has 2 elements >> approximating a quarter circle wedge). 
>> >> Here's the error message: >> >> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: Cone size 6 not supported for dimension 2 >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. >>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1257-ge04de0c >>> GIT Date: 2016-08-24 00:14:37 -0500 >>> >>> [0]PETSC ERROR: bin/mesh_testing on a arch-linux2-c-opt named ahome by >>> andrew Tue Aug 23 22:48:57 2016 >>> >>> [0]PETSC ERROR: Configure options --with-debugging=0 --COPTFLAGS="-O3 >>> -march=native" --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 >>> -march=native" --download-exodusii=yes --download-hdf5=yes >>> --download-netcdf=yes >>> [0]PETSC ERROR: #8 DMPlexGetRawFaces_Internal() line 93 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #9 DMPlexGetFaces_Internal() line 20 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #10 DMPlexInterpolateFaces_Internal() line 172 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #11 DMPlexInterpolate() line 532 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #12 DMPlexCreateExodus() line 168 in >>> petsc/src/dm/impls/plex/plexexodusii.c >>> >>> [0]PETSC ERROR: #13 DMPlexCreateExodusFromFile() line 46 in >>> petsc/src/dm/impls/plex/plexexodusii.c >>> >>> [0]PETSC ERROR: #14 DMPlexCreateFromFile() line 1967 in >>> petsc/src/dm/impls/plex/plexcreate.c >>> >>> [0]PETSC ERROR: #15 main() line 11 in mesh_testing.cpp >>> >>> [0]PETSC ERROR: No PETSc Option Table entries >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire >>> error message to petsc-maint at mcs.anl.gov---------- >>> >> >> >> -- >> Andrew Ho >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 24 12:12:13 2016 From: jed at jedbrown.org (Jed Brown) Date: Wed, 24 Aug 2016 11:12:13 -0600 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> Message-ID: <87a8g2qefm.fsf@jedbrown.org> Matthew Knepley writes: >> Yes, I do not support that since I think its a crazy way to talk about >> things. All the topological information is in the Tri3 mesh, and >> Cubit has no business telling me about the function space. >> >> >> Do you support / plan to support curved elements? >> > > I had "support" in there, but there were bugs. Toby and Mark discovered > these, and Toby has fixed them. I think > all of the fixes are in master now. The context is clearly that the mesh generator needs to express the curved elements. DMPlex doesn't have a geometric model available, so it doesn't know how to make the Tri3 elements curve to conform more accurately to the boundary. The mesh generator has no business telling you what function space to use for your solution, but it'd be a shame to prevent it from expressing element geometry. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From knepley at gmail.com Wed Aug 24 12:17:29 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Aug 2016 12:17:29 -0500 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: <87a8g2qefm.fsf@jedbrown.org> References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> Message-ID: On Wed, Aug 24, 2016 at 12:12 PM, Jed Brown wrote: > Matthew Knepley writes: > >> Yes, I do not support that since I think its a crazy way to talk about > >> things. All the topological information is in the Tri3 mesh, and > >> Cubit has no business telling me about the function space. > >> > >> > >> Do you support / plan to support curved elements? > >> > > > > I had "support" in there, but there were bugs. Toby and Mark discovered > > these, and Toby has fixed them. I think > > all of the fixes are in master now. > > The context is clearly that the mesh generator needs to express the > curved elements. DMPlex doesn't have a geometric model available, so it > doesn't know how to make the Tri3 elements curve to conform more > accurately to the boundary. The mesh generator has no business telling > you what function space to use for your solution, but it'd be a shame to > prevent it from expressing element geometry. > Oh, so you want me to accept this Tri6 format for specifying quadratic triangle geometry. I guess I could stomach that, but I really hate that you would mix the topological information with the function space information, so that I have to peel them apart. They should just have a separate thing that indicates a function space and values for the geometry instead of screwing up a perfectly good topological definition. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 24 12:22:38 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Aug 2016 12:22:38 -0500 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: Message-ID: On Wed, Aug 24, 2016 at 12:57 AM, Andrew Ho wrote: > I created an unstructured tri6 mesh in Cubit and am trying to read it into > PETSc using DMPlexCreateFromFile. However, when I do PETSc gives me an > error that it doesn't support this type of element. > > I know my PETSc install has Exodus II support because by giving a > different Exodus mesh file with Tri3 elements works just fine. I've > attached the Exodus mesh generated by Cubit. I can open up the mesh file in > Paraview just fine and it does appear to be valid (it has 2 elements > approximating a quarter circle wedge). > Okay, I will look at getting this to read in alright. Notice that Paraview does exactly the wrong thing here in that it has straight lines connecting the midpoints and corners of the triangles. They should be smooth quadratic curves, and this misunderstanding comes from the horrible format choice which pretends that there are "extra" vertices. 
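For concreteness, the geometry that the six Tri6 nodes encode is the standard isoparametric quadratic (P2) map sketched below; each edge becomes a parabolic arc through its two corners and its midside node rather than two straight segments, and when the midside nodes sit exactly at the edge midpoints the map reduces to the ordinary affine, straight-sided one. The sketch assumes the usual Exodus II TRI6 node ordering (corners 1-3, then midside nodes on edges 1-2, 2-3, 3-1); the coordinate arrays are hypothetical.

/* Map reference coordinates (xi,eta) on the unit triangle to physical (x,y)
   through quadratic Lagrange shape functions and the 6 Tri6 node coordinates. */
void tri6_map(const double xn[6], const double yn[6],
              double xi, double eta, double *x, double *y)
{
  const double l1 = 1.0 - xi - eta, l2 = xi, l3 = eta;  /* barycentric coordinates */
  const double N[6] = {
    l1*(2.0*l1 - 1.0),  /* corner 1 */
    l2*(2.0*l2 - 1.0),  /* corner 2 */
    l3*(2.0*l3 - 1.0),  /* corner 3 */
    4.0*l1*l2,          /* midside node on edge 1-2 */
    4.0*l2*l3,          /* midside node on edge 2-3 */
    4.0*l3*l1           /* midside node on edge 3-1 */
  };
  *x = 0.0; *y = 0.0;
  for (int k = 0; k < 6; k++) { *x += N[k]*xn[k]; *y += N[k]*yn[k]; }
}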
Matt > Here's the error message: > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Cone size 6 not supported for dimension 2 >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1257-ge04de0c GIT >> Date: 2016-08-24 00:14:37 -0500 >> >> [0]PETSC ERROR: bin/mesh_testing on a arch-linux2-c-opt named ahome by >> andrew Tue Aug 23 22:48:57 2016 >> >> [0]PETSC ERROR: Configure options --with-debugging=0 --COPTFLAGS="-O3 >> -march=native" --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 >> -march=native" --download-exodusii=yes --download-hdf5=yes >> --download-netcdf=yes >> [0]PETSC ERROR: #8 DMPlexGetRawFaces_Internal() line 93 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #9 DMPlexGetFaces_Internal() line 20 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #10 DMPlexInterpolateFaces_Internal() line 172 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #11 DMPlexInterpolate() line 532 in >> petsc/src/dm/impls/plex/plexinterpolate.c >> >> [0]PETSC ERROR: #12 DMPlexCreateExodus() line 168 in >> petsc/src/dm/impls/plex/plexexodusii.c >> >> [0]PETSC ERROR: #13 DMPlexCreateExodusFromFile() line 46 in >> petsc/src/dm/impls/plex/plexexodusii.c >> >> [0]PETSC ERROR: #14 DMPlexCreateFromFile() line 1967 in >> petsc/src/dm/impls/plex/plexcreate.c >> >> [0]PETSC ERROR: #15 main() line 11 in mesh_testing.cpp >> >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- >> > > > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ptbauman at gmail.com Wed Aug 24 12:24:16 2016 From: ptbauman at gmail.com (Paul T. Bauman) Date: Wed, 24 Aug 2016 13:24:16 -0400 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: Message-ID: On Wed, Aug 24, 2016 at 1:22 PM, Matthew Knepley wrote: > Notice that Paraview does exactly the wrong thing here in that it has > straight lines connecting the midpoints and corners of the triangles. > Everytime I meet someone from Kitware, I complain about this and their representation of quadratic functions, especially with AMR grids. -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Wed Aug 24 12:27:32 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Wed, 24 Aug 2016 10:27:32 -0700 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: <87a8g2qefm.fsf@jedbrown.org> References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> Message-ID: Good geometric accuracy is very import for achieving appropriate convergence rates in complex geometry, not just using higher order polynomials on flat elements. If you look at Hesthaven's book Nodal Discontinuous Galerkin Methods, Table 9.1 shows that without support for curved elements, higher order DG element on flat elements converges at sub optimal rates due to inaccuracies produced by the boundary conditions. 
There's no way to re-construct this curved information correctly after the fact; it must be generated by the meshing software. On Wed, Aug 24, 2016 at 10:12 AM, Jed Brown wrote: > Matthew Knepley writes: > >> Yes, I do not support that since I think its a crazy way to talk about > >> things. All the topological information is in the Tri3 mesh, and > >> Cubit has no business telling me about the function space. > >> > >> > >> Do you support / plan to support curved elements? > >> > > > > I had "support" in there, but there were bugs. Toby and Mark discovered > > these, and Toby has fixed them. I think > > all of the fixes are in master now. > > The context is clearly that the mesh generator needs to express the > curved elements. DMPlex doesn't have a geometric model available, so it > doesn't know how to make the Tri3 elements curve to conform more > accurately to the boundary. The mesh generator has no business telling > you what function space to use for your solution, but it'd be a shame to > prevent it from expressing element geometry. > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Wed Aug 24 12:33:32 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Wed, 24 Aug 2016 10:33:32 -0700 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: Message-ID: True, Paraview does draw them in as "flat" edges, but this is less of a concern for me; the important part is if I want to do a simulation on a curved geometric model, I need to be able to achieve higher order numerical accuracy because it reduces my simulation times by orders of magnitude. On Wed, Aug 24, 2016 at 10:22 AM, Matthew Knepley wrote: > On Wed, Aug 24, 2016 at 12:57 AM, Andrew Ho wrote: > >> I created an unstructured tri6 mesh in Cubit and am trying to read it >> into PETSc using DMPlexCreateFromFile. However, when I do PETSc gives me an >> error that it doesn't support this type of element. >> >> I know my PETSc install has Exodus II support because by giving a >> different Exodus mesh file with Tri3 elements works just fine. I've >> attached the Exodus mesh generated by Cubit. I can open up the mesh file in >> Paraview just fine and it does appear to be valid (it has 2 elements >> approximating a quarter circle wedge). >> > > Okay, I will look at getting this to read in alright. Notice that Paraview > does exactly the wrong thing here in that it has > straight lines connecting the midpoints and corners of the triangles. They > should be smooth quadratic curves, and this > misunderstanding comes from the horrible format choice which pretends that > there are "extra" vertices. > > Matt > > >> Here's the error message: >> >> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: Cone size 6 not supported for dimension 2 >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1257-ge04de0c >>> GIT Date: 2016-08-24 00:14:37 -0500 >>> >>> [0]PETSC ERROR: bin/mesh_testing on a arch-linux2-c-opt named ahome by >>> andrew Tue Aug 23 22:48:57 2016 >>> >>> [0]PETSC ERROR: Configure options --with-debugging=0 --COPTFLAGS="-O3 >>> -march=native" --CXXOPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 >>> -march=native" --download-exodusii=yes --download-hdf5=yes >>> --download-netcdf=yes >>> [0]PETSC ERROR: #8 DMPlexGetRawFaces_Internal() line 93 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #9 DMPlexGetFaces_Internal() line 20 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #10 DMPlexInterpolateFaces_Internal() line 172 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #11 DMPlexInterpolate() line 532 in >>> petsc/src/dm/impls/plex/plexinterpolate.c >>> >>> [0]PETSC ERROR: #12 DMPlexCreateExodus() line 168 in >>> petsc/src/dm/impls/plex/plexexodusii.c >>> >>> [0]PETSC ERROR: #13 DMPlexCreateExodusFromFile() line 46 in >>> petsc/src/dm/impls/plex/plexexodusii.c >>> >>> [0]PETSC ERROR: #14 DMPlexCreateFromFile() line 1967 in >>> petsc/src/dm/impls/plex/plexcreate.c >>> >>> [0]PETSC ERROR: #15 main() line 11 in mesh_testing.cpp >>> >>> [0]PETSC ERROR: No PETSc Option Table entries >>> [0]PETSC ERROR: ----------------End of Error Message -------send entire >>> error message to petsc-maint at mcs.anl.gov---------- >>> >> >> >> -- >> Andrew Ho >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at princeton.edu Wed Aug 24 12:40:08 2016 From: mlohry at princeton.edu (Mark Lohry) Date: Wed, 24 Aug 2016 11:40:08 -0600 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> Message-ID: <57BDDBF8.2070104@princeton.edu> On 08/24/2016 11:27 AM, Andrew Ho wrote: > Good geometric accuracy is very import for achieving appropriate > convergence rates in complex geometry, not just using higher order > polynomials on flat elements. > > If you look at Hesthaven's book Nodal Discontinuous Galerkin Methods, > Table 9.1 shows that without support for curved elements, higher order > DG element on flat elements converges at sub optimal rates due to > inaccuracies produced by the boundary conditions. > > There's no way to re-construct this curved information correctly after > the fact; it must be generated by the meshing software. > I don't *entirely* agree with the suggestion the mesh generator has to provide that information. Some people reconstruct splines through the nodes to create higher order meshes after the fact (which also requires some detection or specification of sharp corners to break the spline). I personally take the given mesh from the generator in addition to an IGES/NURBS type CAD file for complex geometries, and then do a kind of projection for sub-cell curved boundary faces. > On Wed, Aug 24, 2016 at 10:12 AM, Jed Brown > wrote: > > Matthew Knepley > writes: > >> Yes, I do not support that since I think its a crazy way to > talk about > >> things. All the topological information is in the Tri3 mesh, and > >> Cubit has no business telling me about the function space. 
> >> > >> > >> Do you support / plan to support curved elements? > >> > > > > I had "support" in there, but there were bugs. Toby and Mark > discovered > > these, and Toby has fixed them. I think > > all of the fixes are in master now. > > The context is clearly that the mesh generator needs to express the > curved elements. DMPlex doesn't have a geometric model available, > so it > doesn't know how to make the Tri3 elements curve to conform more > accurately to the boundary. The mesh generator has no business > telling > you what function space to use for your solution, but it'd be a > shame to > prevent it from expressing element geometry. > > > > > -- > Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Wed Aug 24 12:48:07 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Wed, 24 Aug 2016 10:48:07 -0700 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: <57BDDBF8.2070104@princeton.edu> References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> <57BDDBF8.2070104@princeton.edu> Message-ID: That sounds like taking part of the job of the mesh generator and putting it into my code. I was simply stating that without access to the original geometric representation, I don't know if my simulation domain really has flat walls, or even how curved are the walls if they are curved, so there's no general way for me to accurately determine the shape of the element. On Wed, Aug 24, 2016 at 10:40 AM, Mark Lohry wrote: > On 08/24/2016 11:27 AM, Andrew Ho wrote: > > Good geometric accuracy is very import for achieving appropriate > convergence rates in complex geometry, not just using higher order > polynomials on flat elements. > > If you look at Hesthaven's book Nodal Discontinuous Galerkin Methods, > Table 9.1 shows that without support for curved elements, higher order DG > element on flat elements converges at sub optimal rates due to inaccuracies > produced by the boundary conditions. > > There's no way to re-construct this curved information correctly after the > fact; it must be generated by the meshing software. > > > I don't *entirely* agree with the suggestion the mesh generator has to > provide that information. Some people reconstruct splines through the nodes > to create higher order meshes after the fact (which also requires some > detection or specification of sharp corners to break the spline). I > personally take the given mesh from the generator in addition to an > IGES/NURBS type CAD file for complex geometries, and then do a kind of > projection for sub-cell curved boundary faces. > > > On Wed, Aug 24, 2016 at 10:12 AM, Jed Brown wrote: > >> Matthew Knepley writes: >> >> Yes, I do not support that since I think its a crazy way to talk about >> >> things. All the topological information is in the Tri3 mesh, and >> >> Cubit has no business telling me about the function space. >> >> >> >> >> >> Do you support / plan to support curved elements? >> >> >> > >> > I had "support" in there, but there were bugs. Toby and Mark discovered >> > these, and Toby has fixed them. I think >> > all of the fixes are in master now. >> >> The context is clearly that the mesh generator needs to express the >> curved elements. DMPlex doesn't have a geometric model available, so it >> doesn't know how to make the Tri3 elements curve to conform more >> accurately to the boundary. 
The mesh generator has no business telling >> you what function space to use for your solution, but it'd be a shame to >> prevent it from expressing element geometry. >> > > > > -- > Andrew Ho > > > > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlohry at princeton.edu Wed Aug 24 12:56:47 2016 From: mlohry at princeton.edu (Mark Lohry) Date: Wed, 24 Aug 2016 11:56:47 -0600 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> <57BDDBF8.2070104@princeton.edu> Message-ID: <57BDDFDF.10900@princeton.edu> Philosophically it's certainly part of the job of the mesh generator, but realistically I don't think there are any really viable high-order mesh generators or even file formats right now (correct me if I'm wrong). And as Matthew put, the mesh generator has no business telling us what our function basis should be, and by extension has no basis telling us where we should place our face nodes. > I was simply stating that without access to the original geometric representation, I don't know if my simulation domain really has flat walls, > or even how curved are the walls if they are curved, so there's no general way for me to accurately determine the shape of the element. This comes up a lot in high order mesh discussions, but I always wonder when do you *not* have access to the original geometric representation? If you've generated the mesh yourself you must have the geometry. On 08/24/2016 11:48 AM, Andrew Ho wrote: > That sounds like taking part of the job of the mesh generator and > putting it into my code. I was simply stating that without access to > the original geometric representation, I don't know if my simulation > domain really has flat walls, or even how curved are the walls if they > are curved, so there's no general way for me to accurately determine > the shape of the element. > > On Wed, Aug 24, 2016 at 10:40 AM, Mark Lohry > wrote: > > On 08/24/2016 11:27 AM, Andrew Ho wrote: >> Good geometric accuracy is very import for achieving appropriate >> convergence rates in complex geometry, not just using higher >> order polynomials on flat elements. >> >> If you look at Hesthaven's book Nodal Discontinuous Galerkin >> Methods, Table 9.1 shows that without support for curved >> elements, higher order DG element on flat elements converges at >> sub optimal rates due to inaccuracies produced by the boundary >> conditions. >> >> There's no way to re-construct this curved information correctly >> after the fact; it must be generated by the meshing software. >> > > I don't *entirely* agree with the suggestion the mesh generator > has to provide that information. Some people reconstruct splines > through the nodes to create higher order meshes after the fact > (which also requires some detection or specification of sharp > corners to break the spline). I personally take the given mesh > from the generator in addition to an IGES/NURBS type CAD file for > complex geometries, and then do a kind of projection for sub-cell > curved boundary faces. > > >> On Wed, Aug 24, 2016 at 10:12 AM, Jed Brown > > wrote: >> >> Matthew Knepley > > writes: >> >> Yes, I do not support that since I think its a crazy way >> to talk about >> >> things. All the topological information is in the Tri3 >> mesh, and >> >> Cubit has no business telling me about the function space. >> >> >> >> >> >> Do you support / plan to support curved elements? 
>> >> >> > >> > I had "support" in there, but there were bugs. Toby and >> Mark discovered >> > these, and Toby has fixed them. I think >> > all of the fixes are in master now. >> >> The context is clearly that the mesh generator needs to >> express the >> curved elements. DMPlex doesn't have a geometric model >> available, so it >> doesn't know how to make the Tri3 elements curve to conform more >> accurately to the boundary. The mesh generator has no >> business telling >> you what function space to use for your solution, but it'd be >> a shame to >> prevent it from expressing element geometry. >> >> >> >> >> -- >> Andrew Ho > > > > > > -- > Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Wed Aug 24 13:18:38 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Wed, 24 Aug 2016 11:18:38 -0700 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: <57BDDFDF.10900@princeton.edu> References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> <57BDDBF8.2070104@princeton.edu> <57BDDFDF.10900@princeton.edu> Message-ID: > > Philosophically it's certainly part of the job of the mesh generator, but > realistically I don't think there are any really viable high-order mesh > generators or even file formats right now (correct me if I'm wrong). And as > Matthew put, the mesh generator has no business telling us what our > function basis should be, and by extension has no basis telling us where we > should place our face nodes. > The mesh surface nodes have no correspondence to the basis set I use for expanding my solution on; i.e. I could use sub/super parametric mappings. I think this is why Exodus II doesn't support any elements higher than 6-node triangles; I believe these are sufficient for representing conics sections (not entirely sure about this). This comes up a lot in high order mesh discussions, but I always wonder > when do you *not* have access to the original geometric representation? If > you've generated the mesh yourself you must have the geometry. Yes, I have the original CAD files, but there are a few issues: 1. I don't have any code for reading in the CAD format (not entirely sure how difficult this is) 2. I don't know of any easy way to correspond a given element in the mesh with a surface in the CAD file, as in what part of the original surface is my element on. I could do some "geometric tolerance" based approach, but I don't know how robust this is especially near where two surfaces join together. 3. Why do I need to add this complexity to all of my simulation codes? The mesh generator already knows how to understand the geometry, and output meshes which conform to the curved surface. I simply want to use this existing feature and have my simulation code deal entirely with performing the simulation, not deal with having to handle NURBS surfaces, conics section, surface mapping, etc. -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Aug 24 13:47:33 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Aug 2016 13:47:33 -0500 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> <57BDDBF8.2070104@princeton.edu> <57BDDFDF.10900@princeton.edu> Message-ID: On Wed, Aug 24, 2016 at 1:18 PM, Andrew Ho wrote: > Philosophically it's certainly part of the job of the mesh generator, but >> realistically I don't think there are any really viable high-order mesh >> generators or even file formats right now (correct me if I'm wrong). And as >> Matthew put, the mesh generator has no business telling us what our >> function basis should be, and by extension has no basis telling us where we >> should place our face nodes. >> > > The mesh surface nodes have no correspondence to the basis set I use for > expanding my solution on; i.e. I could use sub/super parametric mappings. I > think this is why Exodus II doesn't support any elements higher than 6-node > triangles; I believe these are sufficient for representing conics sections > (not entirely sure about this). > > This comes up a lot in high order mesh discussions, but I always wonder >> when do you *not* have access to the original geometric representation? If >> you've generated the mesh yourself you must have the geometry. > > > Yes, I have the original CAD files, but there are a few issues: > > 1. I don't have any code for reading in the CAD format (not entirely sure > how difficult this is) > 2. I don't know of any easy way to correspond a given element in the mesh > with a surface in the CAD file, as in what part of the original surface is > my element on. I could do some "geometric tolerance" based approach, but I > don't know how robust this is especially near where two surfaces join > together. > 3. Why do I need to add this complexity to all of my simulation codes? The > mesh generator already knows how to understand the geometry, and output > meshes which conform to the curved surface. I simply want to use this > existing feature and have my simulation code deal entirely with performing > the simulation, not deal with having to handle NURBS surfaces, conics > section, surface mapping, etc. > Correspondence should not be hard since you can mark the mesh however you like. If you are happy with quadratic surface approximations, then you are in a great spot here. However, I think its easy to push to a place where they are insufficient (needing exact normals for conservation or balance, needing accurate volume conservation, ...) and you must interact with the CAD model. However, we have tried this before. Jed had a hard time with the CAD file reader, and when Jed has a hard time I usually give up immediately. Matt > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Wed Aug 24 16:27:10 2016 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 24 Aug 2016 14:27:10 -0700 Subject: [petsc-users] MatSetValues dropping non-local entries Message-ID: I?ve run into a situation where MatSetValues seems to be dropping non-local entries. Most of the entries that are set are local, but a few are possibly non-local, and are only maximum a few grid points off the local part of the grid. 
Specifically, I get the local to global mapping, and the indices like so:

call DMGetLocalToGlobalMapping(da,ltogm,ierr)
call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr)

then set the row using ltog(idltog + row +1) etc

When run on 1 process, everything worked fine, but for > 1 I was not getting the right result (I know what the right answer should be for a simple case).

I found that when I increased the stencil width on the DA (in the call to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large enough that the non-local points would be in the ghost region, everything was fine even for > 1 process.

So, in conclusion, it seems like if I use a local to global mapping from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local entries that are not included in the ghost region.

Is this the correct behavior and if so, is there another way to set these non-local values so they don't get dropped?

Thanks,

Randy

From bsmith at mcs.anl.gov Wed Aug 24 16:39:58 2016
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 24 Aug 2016 16:39:58 -0500
Subject: [petsc-users] MatSetValues dropping non-local entries
In-Reply-To: References: Message-ID: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov>

> On Aug 24, 2016, at 4:27 PM, Randall Mackie wrote:
>
> I've run into a situation where MatSetValues seems to be dropping non-local entries. Most of the entries that are set are local, but a few are possibly non-local, and are only maximum a few grid points off the local part of the grid.
>
> Specifically, I get the local to global mapping, and the indices like so:
>
> call DMGetLocalToGlobalMapping(da,ltogm,ierr)
> call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr)
>
> then set the row using ltog(idltog + row +1) etc
>
> When run on 1 process, everything worked fine, but for > 1 I was not getting the right result (I know what the right answer should be for a simple case).
>
> I found that when I increased the stencil width on the DA (in the call to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large enough that the non-local points would be in the ghost region, everything was fine even for > 1 process.
>
> So, in conclusion, it seems like if I use a local to global mapping from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local entries that are not included in the ghost region.
>
> Is this the correct behavior and if so, is there another way to set these non-local values so they don't get dropped?

Yes, this is the expected behavior. Note also that DMDA only allocates space in the matrix for these locations and if it did stick in your "extra" locations it would be very very slow because it would have to reallocate the matrix data structures.

Why would you want to put in matrix entries that are not represented in the ghosting? The whole point of the ghosting is to indicate what values need to be communicated so you putting additional values in that do not fit the ghosting does not match the paradigm.
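To make the ghost-region constraint concrete: for a DMDA with stencil width s, the local-to-global mapping only covers the owned box plus s layers of ghost points, and a code can query exactly which grid points that is. A small C fragment for illustration only (it assumes an existing 3D DMDA named da and a declared PetscErrorCode ierr; it is not taken from the poster's code):

PetscInt xs,ys,zs,xm,ym,zm;       /* start and width of the owned box */
PetscInt gxs,gys,gzs,gxm,gym,gzm; /* owned box plus the ghost layers  */
ierr = DMDAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr);
ierr = DMDAGetGhostCorners(da,&gxs,&gys,&gzs,&gxm,&gym,&gzm);CHKERRQ(ierr);
/* Only grid points (i,j,k) with gxs <= i < gxs+gxm, gys <= j < gys+gym and
   gzs <= k < gzs+gzm have local indices, so only they appear in the
   local-to-global mapping and can be addressed through it or through
   MatSetValuesLocal(). */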
Barry > > > Thanks, > > Randy From rlmackie862 at gmail.com Wed Aug 24 16:45:35 2016 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 24 Aug 2016 14:45:35 -0700 Subject: [petsc-users] MatSetValues dropping non-local entries In-Reply-To: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> References: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> Message-ID: <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> Well, I only need this particular matrix to multiply a vector (ordering based on the DMDA grid), so I don?t need to do any ghost communication (like residual calculations). I just need to be able to set a few non-local entries. Is there no way to do that without increasing the stencil width of the DMDA? Randy > On Aug 24, 2016, at 2:39 PM, Barry Smith wrote: > > >> On Aug 24, 2016, at 4:27 PM, Randall Mackie wrote: >> >> I?ve run into a situation where MatSetValues seems to be dropping non-local entries. Most of the entries that are set are local, but a few are possibly non-local, and are only maximum a few grid points off the local part of the grid. >> >> Specifically, I get the local to global mapping, and the indices like so: >> >> call DMGetLocalToGlobalMapping(da,ltogm,ierr) >> call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) >> >> then set the row using ltog(idltog + row +1) etc >> >> When run on 1 process, everything worked fine, but for > 1 I was not getting the right result (I know what the right answer should be for a simple case). >> >> I found that when I increased the stencil width on the DA (in the call to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large enough that the non-local points would be in the ghost region, everything was fine even for > 1 process. >> >> >> So, in conclusion, it seems like if I use a local to global mapping from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local entries that are not included in the ghost region. >> >> >> Is this the correct behavior and if so, is there another way to set these non-local values so they don?t get dropped? > > Yes, this is the expected behavior. Note also that DMDA only allocates space in the matrix for these locations and if it did stick in your "extra" locations it would be very very slow because it would have to reallocate the matrix data structures. > > Why would you want to put in matrix entries that are not represented in the ghosting? The whole point of the ghosting is to indicate what values need to be communicated so you putting additional values in that do not fit the ghosting does not match the paradigm. > > Barry > >> >> >> Thanks, >> >> Randy > From knepley at gmail.com Wed Aug 24 16:52:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Aug 2016 16:52:01 -0500 Subject: [petsc-users] MatSetValues dropping non-local entries In-Reply-To: <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> References: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> Message-ID: On Wed, Aug 24, 2016 at 4:45 PM, Randall Mackie wrote: > Well, I only need this particular matrix to multiply a vector (ordering > based on the DMDA grid), so I don?t need to do any ghost communication > (like residual calculations). I just need to be able to set a few non-local > entries. Is there no way to do that without increasing the stencil width of > the DMDA? > You could fall back to MatSetValuesStencil() instead, which does not use the mapping. 
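For reference, the MatSetValuesStencil() route mentioned above has roughly this shape (a sketch for a scalar 3D DMDA matrix with made-up grid indices i, j, k; as the follow-up further down the thread points out, the row and column stencils must still lie within the ghost region):

MatStencil  row, col;
PetscScalar v = 1.0;
row.i = i;   row.j = j; row.k = k; row.c = 0;  /* global grid indices, not local ones */
col.i = i+1; col.j = j; col.k = k; col.c = 0;
ierr = MatSetValuesStencil(A,1,&row,1,&col,&v,INSERT_VALUES);CHKERRQ(ierr);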
Matt > Randy > > > > On Aug 24, 2016, at 2:39 PM, Barry Smith wrote: > > > > > >> On Aug 24, 2016, at 4:27 PM, Randall Mackie > wrote: > >> > >> I?ve run into a situation where MatSetValues seems to be dropping > non-local entries. Most of the entries that are set are local, but a few > are possibly non-local, and are only maximum a few grid points off the > local part of the grid. > >> > >> Specifically, I get the local to global mapping, and the indices like > so: > >> > >> call DMGetLocalToGlobalMapping(da,ltogm,ierr) > >> call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) > >> > >> then set the row using ltog(idltog + row +1) etc > >> > >> When run on 1 process, everything worked fine, but for > 1 I was not > getting the right result (I know what the right answer should be for a > simple case). > >> > >> I found that when I increased the stencil width on the DA (in the call > to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large > enough that the non-local points would be in the ghost region, everything > was fine even for > 1 process. > >> > >> > >> So, in conclusion, it seems like if I use a local to global mapping > from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local > entries that are not included in the ghost region. > >> > >> > >> Is this the correct behavior and if so, is there another way to set > these non-local values so they don?t get dropped? > > > > Yes, this is the expected behavior. Note also that DMDA only allocates > space in the matrix for these locations and if it did stick in your "extra" > locations it would be very very slow because it would have to reallocate > the matrix data structures. > > > > Why would you want to put in matrix entries that are not represented in > the ghosting? The whole point of the ghosting is to indicate what values > need to be communicated so you putting additional values in that do not fit > the ghosting does not match the paradigm. > > > > Barry > > > >> > >> > >> Thanks, > >> > >> Randy > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 24 16:52:38 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 Aug 2016 16:52:38 -0500 Subject: [petsc-users] MatSetValues dropping non-local entries In-Reply-To: <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> References: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> Message-ID: > On Aug 24, 2016, at 4:45 PM, Randall Mackie wrote: > > Well, I only need this particular matrix to multiply a vector (ordering based on the DMDA grid), so I don?t need to do any ghost communication (like residual calculations). I just need to be able to set a few non-local entries. Is there no way to do that without increasing the stencil width of the DMDA? Create your own matrix of the appropriate size and layout to match the DMDA vector and then put your values in it with MatSetValues(); don't use the matrix from DMCreateMatrix() Barry > > Randy > > >> On Aug 24, 2016, at 2:39 PM, Barry Smith wrote: >> >> >>> On Aug 24, 2016, at 4:27 PM, Randall Mackie wrote: >>> >>> I?ve run into a situation where MatSetValues seems to be dropping non-local entries. 
Most of the entries that are set are local, but a few are possibly non-local, and are only maximum a few grid points off the local part of the grid. >>> >>> Specifically, I get the local to global mapping, and the indices like so: >>> >>> call DMGetLocalToGlobalMapping(da,ltogm,ierr) >>> call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) >>> >>> then set the row using ltog(idltog + row +1) etc >>> >>> When run on 1 process, everything worked fine, but for > 1 I was not getting the right result (I know what the right answer should be for a simple case). >>> >>> I found that when I increased the stencil width on the DA (in the call to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large enough that the non-local points would be in the ghost region, everything was fine even for > 1 process. >>> >>> >>> So, in conclusion, it seems like if I use a local to global mapping from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local entries that are not included in the ghost region. >>> >>> >>> Is this the correct behavior and if so, is there another way to set these non-local values so they don?t get dropped? >> >> Yes, this is the expected behavior. Note also that DMDA only allocates space in the matrix for these locations and if it did stick in your "extra" locations it would be very very slow because it would have to reallocate the matrix data structures. >> >> Why would you want to put in matrix entries that are not represented in the ghosting? The whole point of the ghosting is to indicate what values need to be communicated so you putting additional values in that do not fit the ghosting does not match the paradigm. >> >> Barry >> >>> >>> >>> Thanks, >>> >>> Randy >> > From rlmackie862 at gmail.com Wed Aug 24 17:01:19 2016 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 24 Aug 2016 15:01:19 -0700 Subject: [petsc-users] MatSetValues dropping non-local entries In-Reply-To: References: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> Message-ID: I already create my own matrix with the appropriate size and layout. The problem seems to be the local to global mapping from DMGetLocalToGlobalMapping, which I suspect does not allow for these non-local entries outside the stencil width. How is one suppose to determine the local to global mapping without a call to this? @Matthew: I had tried MatSetValuesStencil with the same result, and in fact the web page says this: The columns and rows in the stencil passed in MUST be contained within the ghost region of the given process as set with DMDACreateXXX() or MatSetStencil (). Randy > On Aug 24, 2016, at 2:52 PM, Barry Smith wrote: > > >> On Aug 24, 2016, at 4:45 PM, Randall Mackie wrote: >> >> Well, I only need this particular matrix to multiply a vector (ordering based on the DMDA grid), so I don?t need to do any ghost communication (like residual calculations). I just need to be able to set a few non-local entries. Is there no way to do that without increasing the stencil width of the DMDA? > > Create your own matrix of the appropriate size and layout to match the DMDA vector and then put your values in it with MatSetValues(); don't use the matrix from DMCreateMatrix() > > Barry > >> >> Randy >> >> >>> On Aug 24, 2016, at 2:39 PM, Barry Smith wrote: >>> >>> >>>> On Aug 24, 2016, at 4:27 PM, Randall Mackie wrote: >>>> >>>> I?ve run into a situation where MatSetValues seems to be dropping non-local entries. 
Most of the entries that are set are local, but a few are possibly non-local, and are only maximum a few grid points off the local part of the grid. >>>> >>>> Specifically, I get the local to global mapping, and the indices like so: >>>> >>>> call DMGetLocalToGlobalMapping(da,ltogm,ierr) >>>> call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) >>>> >>>> then set the row using ltog(idltog + row +1) etc >>>> >>>> When run on 1 process, everything worked fine, but for > 1 I was not getting the right result (I know what the right answer should be for a simple case). >>>> >>>> I found that when I increased the stencil width on the DA (in the call to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large enough that the non-local points would be in the ghost region, everything was fine even for > 1 process. >>>> >>>> >>>> So, in conclusion, it seems like if I use a local to global mapping from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local entries that are not included in the ghost region. >>>> >>>> >>>> Is this the correct behavior and if so, is there another way to set these non-local values so they don?t get dropped? >>> >>> Yes, this is the expected behavior. Note also that DMDA only allocates space in the matrix for these locations and if it did stick in your "extra" locations it would be very very slow because it would have to reallocate the matrix data structures. >>> >>> Why would you want to put in matrix entries that are not represented in the ghosting? The whole point of the ghosting is to indicate what values need to be communicated so you putting additional values in that do not fit the ghosting does not match the paradigm. >>> >>> Barry >>> >>>> >>>> >>>> Thanks, >>>> >>>> Randy >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Wed Aug 24 17:22:40 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Wed, 24 Aug 2016 15:22:40 -0700 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> <57BDDBF8.2070104@princeton.edu> <57BDDFDF.10900@princeton.edu> Message-ID: > > Correspondence should not be hard since you can mark the mesh however you > like. True, I guess I need to do this anyways in order to apply boundary conditions. If you are happy with quadratic surface approximations, then you are in a > great spot here. However, I think its easy to push to a place where they > are insufficient (needing exact normals for conservation or balance, > needing accurate volume conservation, ...) and you must interact with the > CAD model. However, we have tried this before. Jed had a hard time with the > CAD file reader, and when Jed has a hard time I usually give up immediately. I suppose that's true. The more I look at what Cubit/other meshing software generate at the surface, I don't think it can exactly represent even simple conics sections such as circular arcs and ellipses exactly, which is a shame. Still, I'd take 3rd order accuracy with quadratic surfaces over 2nd order accuracy, and I'm assuming that whatever is needed to support Tri6 elements in DMPlex generalizes easily to Tri10/Tri15/etc. whenever there exists a meshing tool capable of generating these meshes. -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... 
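On the question of how closely a quadratic edge can track a circle: placing the two end nodes and the midside node of a single P2 edge exactly on a unit circular arc still leaves a small radial error, which the self-contained program below samples for a few arc sizes. This is purely an illustration of the geometric error under discussion and does not use any PETSc API.

#include <math.h>
#include <stdio.h>

/* Radial error of a quadratic (P2) Lagrange interpolant of a unit circular arc.
   The nodes at parameter t = -1, 0, +1 are placed on the circle at angles -a, 0, +a. */
int main(void)
{
  const double pi = acos(-1.0);
  for (double a = pi/4.0; a > pi/40.0; a /= 2.0) {  /* half-angles 45, 22.5, 11.25, 5.6 deg */
    double maxerr = 0.0;
    for (int k = 0; k <= 1000; k++) {
      double t   = -1.0 + 2.0*k/1000.0;
      double N0  = 0.5*t*(t - 1.0);      /* shape function of the node at angle -a */
      double N1  = (1.0 - t)*(1.0 + t);  /* shape function of the node at angle  0 */
      double N2  = 0.5*t*(t + 1.0);      /* shape function of the node at angle +a */
      double x   = N0*cos(-a) + N1*1.0 + N2*cos(a);
      double y   = N0*sin(-a) + N1*0.0 + N2*sin(a);
      double err = fabs(sqrt(x*x + y*y) - 1.0);
      if (err > maxerr) maxerr = err;
    }
    printf("half-angle %7.4f rad: max radial error %.2e\n", a, maxerr);
  }
  return 0;
}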
URL: From knepley at gmail.com Wed Aug 24 17:23:17 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 24 Aug 2016 17:23:17 -0500 Subject: [petsc-users] MatSetValues dropping non-local entries In-Reply-To: References: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> Message-ID: On Wed, Aug 24, 2016 at 5:01 PM, Randall Mackie wrote: > I already create my own matrix with the appropriate size and layout. The > problem seems to be the local to global mapping from > DMGetLocalToGlobalMapping, which I suspect does not allow for these > non-local entries outside the stencil width. > > How is one suppose to determine the local to global mapping without a call > to this? > > @Matthew: I had tried MatSetValuesStencil with the same result, and in > fact the web page says this: The columns and rows in the stencil passed > in MUST be contained within the ghost region of the given process as set > with DMDACreateXXX() or MatSetStencil > > (). > You are right. In the deep past, we determined this directly using the dimensions. Now we use the map. You will have to calculate the global indices you want by hand in the PETSc ordering, which means knowing what process you want owns your (i,j,k). This is of course tedious, which is why we prefer to use the maps, but its not possible to store every global index in the local map. Matt > Randy > > On Aug 24, 2016, at 2:52 PM, Barry Smith wrote: > > > On Aug 24, 2016, at 4:45 PM, Randall Mackie wrote: > > Well, I only need this particular matrix to multiply a vector (ordering > based on the DMDA grid), so I don?t need to do any ghost communication > (like residual calculations). I just need to be able to set a few non-local > entries. Is there no way to do that without increasing the stencil width of > the DMDA? > > > Create your own matrix of the appropriate size and layout to match the > DMDA vector and then put your values in it with MatSetValues(); don't use > the matrix from DMCreateMatrix() > > Barry > > > Randy > > > On Aug 24, 2016, at 2:39 PM, Barry Smith wrote: > > > On Aug 24, 2016, at 4:27 PM, Randall Mackie wrote: > > I?ve run into a situation where MatSetValues seems to be dropping > non-local entries. Most of the entries that are set are local, but a few > are possibly non-local, and are only maximum a few grid points off the > local part of the grid. > > Specifically, I get the local to global mapping, and the indices like so: > > call DMGetLocalToGlobalMapping(da,ltogm,ierr) > call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) > > then set the row using ltog(idltog + row +1) etc > > When run on 1 process, everything worked fine, but for > 1 I was not > getting the right result (I know what the right answer should be for a > simple case). > > I found that when I increased the stencil width on the DA (in the call to > DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large > enough that the non-local points would be in the ghost region, everything > was fine even for > 1 process. > > > So, in conclusion, it seems like if I use a local to global mapping from > DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local > entries that are not included in the ghost region. > > > Is this the correct behavior and if so, is there another way to set these > non-local values so they don?t get dropped? > > > Yes, this is the expected behavior. 
Note also that DMDA only allocates > space in the matrix for these locations and if it did stick in your "extra" > locations it would be very very slow because it would have to reallocate > the matrix data structures. > > Why would you want to put in matrix entries that are not represented in > the ghosting? The whole point of the ghosting is to indicate what values > need to be communicated so you putting additional values in that do not fit > the ghosting does not match the paradigm. > > Barry > > > > Thanks, > > Randy > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 24 18:46:12 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 Aug 2016 18:46:12 -0500 Subject: [petsc-users] MatSetValues dropping non-local entries In-Reply-To: References: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> Message-ID: How important is this matrix for performance? Is it just a side operation you do rarely compared to linear solves and other stuff or is it used many times in a tight loop? If you do it rarely you can use a DMDAGlobalToNaturalBegin/End then do the matrix product (and build the matrix) using the normal natural ordering on a grid and then call DMDANaturalToGlobalBegin/End after. Not efficient but maybe good enough. Barry > On Aug 24, 2016, at 5:23 PM, Matthew Knepley wrote: > > On Wed, Aug 24, 2016 at 5:01 PM, Randall Mackie wrote: > I already create my own matrix with the appropriate size and layout. The problem seems to be the local to global mapping from DMGetLocalToGlobalMapping, which I suspect does not allow for these non-local entries outside the stencil width. > > How is one suppose to determine the local to global mapping without a call to this? > > @Matthew: I had tried MatSetValuesStencil with the same result, and in fact the web page says this: The columns and rows in the stencil passed in MUST be contained within the ghost region of the given process as set with DMDACreateXXX() or MatSetStencil(). > > You are right. In the deep past, we determined this directly using the dimensions. Now we use the map. > > You will have to calculate the global indices you want by hand in the PETSc ordering, which means > knowing what process you want owns your (i,j,k). This is of course tedious, which is why we prefer to > use the maps, but its not possible to store every global index in the local map. > > Matt > > Randy > >> On Aug 24, 2016, at 2:52 PM, Barry Smith wrote: >> >> >>> On Aug 24, 2016, at 4:45 PM, Randall Mackie wrote: >>> >>> Well, I only need this particular matrix to multiply a vector (ordering based on the DMDA grid), so I don?t need to do any ghost communication (like residual calculations). I just need to be able to set a few non-local entries. Is there no way to do that without increasing the stencil width of the DMDA? >> >> Create your own matrix of the appropriate size and layout to match the DMDA vector and then put your values in it with MatSetValues(); don't use the matrix from DMCreateMatrix() >> >> Barry >> >>> >>> Randy >>> >>> >>>> On Aug 24, 2016, at 2:39 PM, Barry Smith wrote: >>>> >>>> >>>>> On Aug 24, 2016, at 4:27 PM, Randall Mackie wrote: >>>>> >>>>> I?ve run into a situation where MatSetValues seems to be dropping non-local entries. 
Most of the entries that are set are local, but a few are possibly non-local, and are only maximum a few grid points off the local part of the grid. >>>>> >>>>> Specifically, I get the local to global mapping, and the indices like so: >>>>> >>>>> call DMGetLocalToGlobalMapping(da,ltogm,ierr) >>>>> call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) >>>>> >>>>> then set the row using ltog(idltog + row +1) etc >>>>> >>>>> When run on 1 process, everything worked fine, but for > 1 I was not getting the right result (I know what the right answer should be for a simple case). >>>>> >>>>> I found that when I increased the stencil width on the DA (in the call to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large enough that the non-local points would be in the ghost region, everything was fine even for > 1 process. >>>>> >>>>> >>>>> So, in conclusion, it seems like if I use a local to global mapping from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local entries that are not included in the ghost region. >>>>> >>>>> >>>>> Is this the correct behavior and if so, is there another way to set these non-local values so they don?t get dropped? >>>> >>>> Yes, this is the expected behavior. Note also that DMDA only allocates space in the matrix for these locations and if it did stick in your "extra" locations it would be very very slow because it would have to reallocate the matrix data structures. >>>> >>>> Why would you want to put in matrix entries that are not represented in the ghosting? The whole point of the ghosting is to indicate what values need to be communicated so you putting additional values in that do not fit the ghosting does not match the paradigm. >>>> >>>> Barry >>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Randy >>>> >>> >> > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From rlmackie862 at gmail.com Wed Aug 24 19:50:27 2016 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 24 Aug 2016 17:50:27 -0700 Subject: [petsc-users] MatSetValues dropping non-local entries In-Reply-To: References: <97F444A2-CA2F-4B62-B640-B700836D16FE@mcs.anl.gov> <20A94C07-E35D-4A9B-AB5F-45D24B921EA3@gmail.com> Message-ID: Thanks for the help. I now understand why the DMDA local to global mapping doesn?t work. I was able to do what I wanted by computing the row/col numbers using natural ordering and then converting to PETSc ordering using AOApplicationToPetsc and the AO from DMDAGetAO, and that works fine. Randy > On Aug 24, 2016, at 4:46 PM, Barry Smith wrote: > > > How important is this matrix for performance? Is it just a side operation you do rarely compared to linear solves and other stuff or is it used many times in a tight loop? > > If you do it rarely you can use a DMDAGlobalToNaturalBegin/End then do the matrix product (and build the matrix) using the normal natural ordering on a grid and then call DMDANaturalToGlobalBegin/End after. Not efficient but maybe good enough. > > Barry > >> On Aug 24, 2016, at 5:23 PM, Matthew Knepley wrote: >> >> On Wed, Aug 24, 2016 at 5:01 PM, Randall Mackie wrote: >> I already create my own matrix with the appropriate size and layout. The problem seems to be the local to global mapping from DMGetLocalToGlobalMapping, which I suspect does not allow for these non-local entries outside the stencil width. 
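For reference, a minimal C sketch of the AO-based fix Randy describes above: compute row/column indices in the DMDA natural ordering, convert them to the PETSc ordering with the AO obtained from DMDAGetAO(), and insert into a user-created matrix whose layout matches the DMDA vectors. The function name, the assumption of one degree of freedom per grid point, and the variable names are illustrative only, not taken from the thread.

#include <petscdmda.h>

/* Sketch: set the entry coupling grid point (i,j,k) to (ii,jj,kk) in a
   user-created matrix B laid out like the DMDA global vectors, even when
   (ii,jj,kk) lies outside the DMDA stencil width. Assumes dof = 1 and
   global grid sizes mx,my (k varies slowest in the natural ordering).   */
PetscErrorCode SetNonStencilEntry(DM da,Mat B,PetscInt mx,PetscInt my,
                                  PetscInt i,PetscInt j,PetscInt k,
                                  PetscInt ii,PetscInt jj,PetscInt kk,
                                  PetscScalar v)
{
  AO             ao;
  PetscInt       row,col;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = DMDAGetAO(da,&ao);CHKERRQ(ierr);
  /* indices in the natural (application) ordering */
  row  = i  + j*mx  + k*mx*my;
  col  = ii + jj*mx + kk*mx*my;
  /* convert in place to the PETSc parallel ordering used by B */
  ierr = AOApplicationToPetsc(ao,1,&row);CHKERRQ(ierr);
  ierr = AOApplicationToPetsc(ao,1,&col);CHKERRQ(ierr);
  ierr = MatSetValues(B,1,&row,1,&col,&v,INSERT_VALUES);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

As usual, MatAssemblyBegin/End must be called once all entries have been set, and B should be preallocated generously enough to hold the extra couplings.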
>> >> How is one suppose to determine the local to global mapping without a call to this? >> >> @Matthew: I had tried MatSetValuesStencil with the same result, and in fact the web page says this: The columns and rows in the stencil passed in MUST be contained within the ghost region of the given process as set with DMDACreateXXX() or MatSetStencil(). >> >> You are right. In the deep past, we determined this directly using the dimensions. Now we use the map. >> >> You will have to calculate the global indices you want by hand in the PETSc ordering, which means >> knowing what process you want owns your (i,j,k). This is of course tedious, which is why we prefer to >> use the maps, but its not possible to store every global index in the local map. >> >> Matt >> >> Randy >> >>> On Aug 24, 2016, at 2:52 PM, Barry Smith wrote: >>> >>> >>>> On Aug 24, 2016, at 4:45 PM, Randall Mackie wrote: >>>> >>>> Well, I only need this particular matrix to multiply a vector (ordering based on the DMDA grid), so I don?t need to do any ghost communication (like residual calculations). I just need to be able to set a few non-local entries. Is there no way to do that without increasing the stencil width of the DMDA? >>> >>> Create your own matrix of the appropriate size and layout to match the DMDA vector and then put your values in it with MatSetValues(); don't use the matrix from DMCreateMatrix() >>> >>> Barry >>> >>>> >>>> Randy >>>> >>>> >>>>> On Aug 24, 2016, at 2:39 PM, Barry Smith wrote: >>>>> >>>>> >>>>>> On Aug 24, 2016, at 4:27 PM, Randall Mackie wrote: >>>>>> >>>>>> I?ve run into a situation where MatSetValues seems to be dropping non-local entries. Most of the entries that are set are local, but a few are possibly non-local, and are only maximum a few grid points off the local part of the grid. >>>>>> >>>>>> Specifically, I get the local to global mapping, and the indices like so: >>>>>> >>>>>> call DMGetLocalToGlobalMapping(da,ltogm,ierr) >>>>>> call ISLocalToGlobalMappingGetIndices(ltogm,ltog,idltog,ierr) >>>>>> >>>>>> then set the row using ltog(idltog + row +1) etc >>>>>> >>>>>> When run on 1 process, everything worked fine, but for > 1 I was not getting the right result (I know what the right answer should be for a simple case). >>>>>> >>>>>> I found that when I increased the stencil width on the DA (in the call to DACreate3d) that was used in the DMGetLocalToGlobalMapping to be large enough that the non-local points would be in the ghost region, everything was fine even for > 1 process. >>>>>> >>>>>> >>>>>> So, in conclusion, it seems like if I use a local to global mapping from DMGetLocalToGlobalMapping, then MatSetValues will drop any non-local entries that are not included in the ghost region. >>>>>> >>>>>> >>>>>> Is this the correct behavior and if so, is there another way to set these non-local values so they don?t get dropped? >>>>> >>>>> Yes, this is the expected behavior. Note also that DMDA only allocates space in the matrix for these locations and if it did stick in your "extra" locations it would be very very slow because it would have to reallocate the matrix data structures. >>>>> >>>>> Why would you want to put in matrix entries that are not represented in the ghosting? The whole point of the ghosting is to indicate what values need to be communicated so you putting additional values in that do not fit the ghosting does not match the paradigm. 
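The scatter-based alternative Barry mentions above (keep the matrix in the natural grid ordering and translate the vectors instead of the indices) might look roughly like the sketch below. The names Anat, x, y and the function name are placeholders, not objects from the thread; it assumes Anat was assembled with rows and columns in the DMDA natural ordering.

#include <petscdmda.h>

/* Sketch: y = Anat*x, where Anat lives in the DMDA natural ordering and
   x,y are ordinary DMDA global vectors in the PETSc ordering. Not the
   most efficient route, but no index translation is needed to fill Anat. */
PetscErrorCode MultInNaturalOrdering(DM da,Mat Anat,Vec x,Vec y)
{
  Vec            xnat,ynat;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = DMDACreateNaturalVector(da,&xnat);CHKERRQ(ierr);
  ierr = VecDuplicate(xnat,&ynat);CHKERRQ(ierr);
  /* PETSc ordering -> natural ordering */
  ierr = DMDAGlobalToNaturalBegin(da,x,INSERT_VALUES,xnat);CHKERRQ(ierr);
  ierr = DMDAGlobalToNaturalEnd(da,x,INSERT_VALUES,xnat);CHKERRQ(ierr);
  ierr = MatMult(Anat,xnat,ynat);CHKERRQ(ierr);
  /* natural ordering -> PETSc ordering */
  ierr = DMDANaturalToGlobalBegin(da,ynat,INSERT_VALUES,y);CHKERRQ(ierr);
  ierr = DMDANaturalToGlobalEnd(da,ynat,INSERT_VALUES,y);CHKERRQ(ierr);
  ierr = VecDestroy(&xnat);CHKERRQ(ierr);
  ierr = VecDestroy(&ynat);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}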
>>>>> >>>>> Barry >>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Randy >>>>> >>>> >>> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From harshadranadive at gmail.com Wed Aug 24 20:07:37 2016 From: harshadranadive at gmail.com (Harshad Ranadive) Date: Thu, 25 Aug 2016 11:07:37 +1000 Subject: [petsc-users] Code performance for solving multiple RHS In-Reply-To: References: <6699AE6B-520C-4220-94D0-0389FD5941E6@mcs.anl.gov> <4F22DE27-9E50-47D9-A2EA-8EDA762B6D02@mcs.anl.gov> Message-ID: Thanks a lot Barry for your help. I have resolved the issue of slow linear solver for excessively large RHS vectors to some degree. *To summarize:* 1) In my case, I wanted to solve the *same linear system a large number of times for different RHS vectors*. Previously I was using the iterative solver KSPGMRES with preconditioner PCJACOBI 2) I had a huge number of RHS to be solved (~1M/time step) .... this had a computational cost *9.6 times* greater than my explicit code (which did not need matrix inversion - only reqd. RHS eval) 3) Based on Barry's suggestions to use a direct solver I changed the KSP type to KSPPREONLY and used the preconditioner - PCLU. 4) This approach has a small drawback that the linear system needs to be solved sequentially. Although different systems are now solved by different processors in parallel. 5) The current computational cost is only *1.16 times* the explicit code. Thanks and Regards, Harshad On Fri, Aug 12, 2016 at 1:27 PM, Barry Smith wrote: > > > On Aug 11, 2016, at 10:14 PM, Harshad Ranadive < > harshadranadive at gmail.com> wrote: > > > > Hi Barry, > > > > Thanks for this recommendation. > > > > As you mention, the matrix factorization should be on a single processor. > > If the factored matrix A is available on all processors can I then use > MatMatSolve(A,B,X) in parallel? That is could the RHS block matrix 'B' and > solution matrix 'X' be distributed in different processors as is done while > using MatCreateDense(...) ? > > Note sure what you mean. > > You can have different processes handle different right hand sides. So > give the full linear system matrix to each process; each process factors it > and then each process solves a different set of right hand sides. > Embarrassingly parallel except for any communication you need to do to get > the matrix and right hand sides to the right processes. > > If the linear system involves say millions of unknowns this is the way > to go. If the linear system is over say 1 billion unknowns then it might be > worth each linear system in parallel. > > Barry > > > > > Thanks, > > Harshad > > > > > > > > On Fri, Aug 12, 2016 at 2:09 AM, Barry Smith wrote: > > > > If it is sequential, which it probably should be, then you can you > MatLUFactorSymbolic(), MatLUFactorNumeric() and MatMatSolve() where you put > a bunch of your right hand side vectors into a dense array; not all million > of them but maybe 10 to 100 at a time. > > > > Barry > > > > > On Aug 10, 2016, at 10:18 PM, Harshad Ranadive < > harshadranadive at gmail.com> wrote: > > > > > > Hi Barry > > > > > > The matrix A is mostly tridiagonal > > > > > > 1 ? 0 ......... 0 > > > > > > ? 1 ? 0 .......0 > > > > > > > > > 0 ? 1 ? 0 ....0 > > > > > > > > > .................... > > > 0..............? 1 > > > > > > In some cases (periodic boundaries) there would be an '?' in > right-top-corner and left-bottom corner. 
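Since a usage example of the factorization routines is asked for in this thread, here is a hedged sketch of the recipe Barry quotes above: factor A once with MatLUFactorSymbolic()/MatLUFactorNumeric(), then hand MatMatSolve() a dense block of, say, 10 to 100 right-hand sides at a time. The function name and the nested-dissection ordering are assumptions; B and X must be sequential dense matrices with one column per right-hand side.

#include <petscmat.h>

/* Sketch: factor a sequential AIJ matrix A once, then solve all columns of
   the dense right-hand-side matrix B into X with a single MatMatSolve().  */
PetscErrorCode FactorAndBatchSolve(Mat A,Mat B,Mat X)
{
  Mat            F;
  IS             rowperm,colperm;
  MatFactorInfo  info;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatGetOrdering(A,MATORDERINGND,&rowperm,&colperm);CHKERRQ(ierr);
  ierr = MatGetFactor(A,MATSOLVERPETSC,MAT_FACTOR_LU,&F);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F,A,rowperm,colperm,&info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F,A,&info);CHKERRQ(ierr);  /* F can be reused for every batch */
  ierr = MatMatSolve(F,B,X);CHKERRQ(ierr);             /* all right-hand sides at once */
  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}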
> > > > > > I am not using multigrid approach. I just implemented an implicit > filtering approach (instead of an explicit existing one) which requires the > solution of the above system. > > > > > > Thanks > > > Harshad > > > > > > On Thu, Aug 11, 2016 at 1:07 PM, Barry Smith > wrote: > > > > > > Effectively utilizing multiple right hand sides with the same system > can result in roughly 2 or at absolute most 3 times improvement in solve > time. A great improvement but when you have a million right hand sides not > a giant improvement. > > > > > > The first step is to get the best (most efficient) preconditioner > for you problem. Since you have many right hand sides it obviously pays to > spend more time building the preconditioner so that each solve is faster. > If you provide more information on your linear system we might have > suggestions. CFD so is your linear system a Poisson problem? Are you using > geometric or algebraic multigrid with PETSc? It not a Poisson problem how > can you describe the linear system? > > > > > > Barry > > > > > > > > > > > > > On Aug 10, 2016, at 9:54 PM, Harshad Ranadive < > harshadranadive at gmail.com> wrote: > > > > > > > > Hi All, > > > > > > > > I have currently added the PETSc library with our CFD solver. > > > > > > > > In this I need to use KSPSolve(...) multiple time for the same > matrix A. I have read that PETSc does not support passing multiple RHS > vectors in the form of a matrix and the only solution to this is calling > KSPSolve multiple times as in example given here: > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/ > examples/tutorials/ex16.c.html > > > > > > > > I have followed this technique, but I find that the performance of > the code is very slow now. I basically have a mesh size of 8-10 Million and > I need to solve the matrix A very large number of times. I have checked > that the statement KSPSolve(..) is taking close to 90% of my computation > time. > > > > > > > > I am setting up the matrix A, KSPCreate, KSPSetup etc just once at > the start. Only the following statements are executed in a repeated loop > > > > > > > > Loop begin: (say million times !!) > > > > > > > > loop over vector length > > > > VecSetValues( ....) > > > > end > > > > > > > > VecAssemblyBegin( ... ) > > > > VecAssemblyEnd (...) > > > > > > > > KSPSolve (...) > > > > > > > > VecGetValues > > > > > > > > Loop end. > > > > > > > > Is there an efficient way of doing this rather than using KSPSolve > multiple times? > > > > > > > > Please note my matrix A never changes during the time steps or > across the mesh ... So essentially if I can get the inverse once would it > be good enough? It has been recommended in the FAQ that matrix inverse > should be avoided but would it be okay to use in my case? > > > > > > > > Also could someone please provide an example of how to use > MatLUFactor and MatCholeskyFactor() to find the matrix inverse... the > arguments below were not clear to me. > > > > IS row > > > > IS col > > > > const MatFactorInfo *info > > > > > > > > Apologies for a long email and thanks to anyone for help. > > > > > > > > Regards > > > > Harshad > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Wed Aug 24 20:14:51 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 24 Aug 2016 20:14:51 -0500 Subject: [petsc-users] Code performance for solving multiple RHS In-Reply-To: References: <6699AE6B-520C-4220-94D0-0389FD5941E6@mcs.anl.gov> <4F22DE27-9E50-47D9-A2EA-8EDA762B6D02@mcs.anl.gov> Message-ID: > On Aug 24, 2016, at 8:07 PM, Harshad Ranadive wrote: > > Thanks a lot Barry for your help. I have resolved the issue of slow linear solver for excessively large RHS vectors to some degree. > > To summarize: > > 1) In my case, I wanted to solve the same linear system a large number of times for different RHS vectors. Previously I was using the iterative solver KSPGMRES with preconditioner PCJACOBI > > 2) I had a huge number of RHS to be solved (~1M/time step) .... this had a computational cost 9.6 times greater than my explicit code (which did not need matrix inversion - only reqd. RHS eval) > > 3) Based on Barry's suggestions to use a direct solver I changed the KSP type to KSPPREONLY and used the preconditioner - PCLU. > > 4) This approach has a small drawback that the linear system needs to be solved sequentially. Although different systems are now solved by different processors in parallel. Note that if you ./configure PETSc with --download-superlu_dist then you can actually solve them in parallel. > > 5) The current computational cost is only 1.16 times the explicit code. > > Thanks and Regards, > Harshad > > > On Fri, Aug 12, 2016 at 1:27 PM, Barry Smith wrote: > > > On Aug 11, 2016, at 10:14 PM, Harshad Ranadive wrote: > > > > Hi Barry, > > > > Thanks for this recommendation. > > > > As you mention, the matrix factorization should be on a single processor. > > If the factored matrix A is available on all processors can I then use MatMatSolve(A,B,X) in parallel? That is could the RHS block matrix 'B' and solution matrix 'X' be distributed in different processors as is done while using MatCreateDense(...) ? > > Note sure what you mean. > > You can have different processes handle different right hand sides. So give the full linear system matrix to each process; each process factors it and then each process solves a different set of right hand sides. Embarrassingly parallel except for any communication you need to do to get the matrix and right hand sides to the right processes. > > If the linear system involves say millions of unknowns this is the way to go. If the linear system is over say 1 billion unknowns then it might be worth each linear system in parallel. > > Barry > > > > > Thanks, > > Harshad > > > > > > > > On Fri, Aug 12, 2016 at 2:09 AM, Barry Smith wrote: > > > > If it is sequential, which it probably should be, then you can you MatLUFactorSymbolic(), MatLUFactorNumeric() and MatMatSolve() where you put a bunch of your right hand side vectors into a dense array; not all million of them but maybe 10 to 100 at a time. > > > > Barry > > > > > On Aug 10, 2016, at 10:18 PM, Harshad Ranadive wrote: > > > > > > Hi Barry > > > > > > The matrix A is mostly tridiagonal > > > > > > 1 ? 0 ......... 0 > > > > > > ? 1 ? 0 .......0 > > > > > > > > > 0 ? 1 ? 0 ....0 > > > > > > > > > .................... > > > 0..............? 1 > > > > > > In some cases (periodic boundaries) there would be an '?' in right-top-corner and left-bottom corner. > > > > > > I am not using multigrid approach. I just implemented an implicit filtering approach (instead of an explicit existing one) which requires the solution of the above system. 
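A minimal sketch of the KSPPREONLY plus PCLU setup summarized above, so the cost of factoring A is paid once and every later KSPSolve() reduces to a triangular solve. The communicator argument and the function name are assumptions; the superlu_dist switch Barry mentions is shown only as a commented run-time option and requires the corresponding --download-superlu_dist configure flag.

#include <petscksp.h>

/* Sketch: direct-solve setup so repeated KSPSolve() calls with the same A
   reuse one LU factorization.                                            */
PetscErrorCode SetupDirectSolve(MPI_Comm comm,Mat A,KSP *ksp)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = KSPCreate(comm,ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(*ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(*ksp,KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(*ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  /* possible run-time alternative (assumption, needs superlu_dist installed):
       -pc_factor_mat_solver_package superlu_dist                            */
  ierr = KSPSetFromOptions(*ksp);CHKERRQ(ierr);
  ierr = KSPSetUp(*ksp);CHKERRQ(ierr);  /* factors A up front */
  PetscFunctionReturn(0);
}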
> > > > > > Thanks > > > Harshad > > > > > > On Thu, Aug 11, 2016 at 1:07 PM, Barry Smith wrote: > > > > > > Effectively utilizing multiple right hand sides with the same system can result in roughly 2 or at absolute most 3 times improvement in solve time. A great improvement but when you have a million right hand sides not a giant improvement. > > > > > > The first step is to get the best (most efficient) preconditioner for you problem. Since you have many right hand sides it obviously pays to spend more time building the preconditioner so that each solve is faster. If you provide more information on your linear system we might have suggestions. CFD so is your linear system a Poisson problem? Are you using geometric or algebraic multigrid with PETSc? It not a Poisson problem how can you describe the linear system? > > > > > > Barry > > > > > > > > > > > > > On Aug 10, 2016, at 9:54 PM, Harshad Ranadive wrote: > > > > > > > > Hi All, > > > > > > > > I have currently added the PETSc library with our CFD solver. > > > > > > > > In this I need to use KSPSolve(...) multiple time for the same matrix A. I have read that PETSc does not support passing multiple RHS vectors in the form of a matrix and the only solution to this is calling KSPSolve multiple times as in example given here: > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex16.c.html > > > > > > > > I have followed this technique, but I find that the performance of the code is very slow now. I basically have a mesh size of 8-10 Million and I need to solve the matrix A very large number of times. I have checked that the statement KSPSolve(..) is taking close to 90% of my computation time. > > > > > > > > I am setting up the matrix A, KSPCreate, KSPSetup etc just once at the start. Only the following statements are executed in a repeated loop > > > > > > > > Loop begin: (say million times !!) > > > > > > > > loop over vector length > > > > VecSetValues( ....) > > > > end > > > > > > > > VecAssemblyBegin( ... ) > > > > VecAssemblyEnd (...) > > > > > > > > KSPSolve (...) > > > > > > > > VecGetValues > > > > > > > > Loop end. > > > > > > > > Is there an efficient way of doing this rather than using KSPSolve multiple times? > > > > > > > > Please note my matrix A never changes during the time steps or across the mesh ... So essentially if I can get the inverse once would it be good enough? It has been recommended in the FAQ that matrix inverse should be avoided but would it be okay to use in my case? > > > > > > > > Also could someone please provide an example of how to use MatLUFactor and MatCholeskyFactor() to find the matrix inverse... the arguments below were not clear to me. > > > > IS row > > > > IS col > > > > const MatFactorInfo *info > > > > > > > > Apologies for a long email and thanks to anyone for help. > > > > > > > > Regards > > > > Harshad > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From jed at jedbrown.org Thu Aug 25 01:11:52 2016 From: jed at jedbrown.org (Jed Brown) Date: Thu, 25 Aug 2016 00:11:52 -0600 Subject: [petsc-users] DMPlex higher order elements In-Reply-To: References: <4C3A7B70-8049-47D4-A890-D1CD240F2960@cims.nyu.edu> <87a8g2qefm.fsf@jedbrown.org> <57BDDBF8.2070104@princeton.edu> <57BDDFDF.10900@princeton.edu> Message-ID: <877fb5pec7.fsf@jedbrown.org> Andrew Ho writes: > I suppose that's true. 
The more I look at what Cubit/other meshing software > generate at the surface, I don't think it can exactly represent even simple > conics sections such as circular arcs and ellipses exactly, which is a > shame. True > Still, I'd take 3rd order accuracy with quadratic surfaces over 2nd > order accuracy, Yes, it's what we have now. > and I'm assuming that whatever is needed to support Tri6 elements in > DMPlex generalizes easily to Tri10/Tri15/etc. whenever there exists a > meshing tool capable of generating these meshes. I actually think it's a poor model for representing this information, but it's what exists today and retooling the mesh generation stack to do it differently is a monumental and thankless task that will likely fail. Unfortunately, getting access to the CAD model from the solver is really a mess if you want portable code or don't have lots of money. I think it's actually a barrier to a lot of cool things, but I think you'd need to be in a very special position to invest time in building a better solution. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From niko.karin at gmail.com Thu Aug 25 08:46:07 2016 From: niko.karin at gmail.com (Karin&NiKo) Date: Thu, 25 Aug 2016 15:46:07 +0200 Subject: [petsc-users] Command lines to reproduce the tests of "Composing scalable nonlinear algebraic solvers" In-Reply-To: References: Message-ID: Dear PETSc gurus, Thanks to the help of Matthew, I have been able to reproduce in PETSc some tests of the paper of Peter Brune et al. entitled "Composing scalable nonlinear algebraic solvers", with special attention to the elasticity test. I have also tried to reproduce it in a widely used mechanics finite element solver and I cannot obtain the same results, mainly because of my lack of undestanding of the boundary conditions and of the loading. According to the paper, I undestand that the edges in red in the attached image are fully clamped. If I do that, I do observe a rotation of the grey face (see attached image). If I clamp the over-mentioned edges plus I forbid the normal displacement of the grey face, I get a quite similar warped shape but the details of the deformation near the clamped faces are very different In the paper, the loading is defined as a volume force applied to the whole structure whereas in Wriggers' book, it is defined as a nodal force. Could you please give me some details on these points? Best regards, Nicolas 2016-08-23 20:25 GMT+02:00 Matthew Knepley : > On Tue, Aug 23, 2016 at 10:54 AM, Karin&NiKo wrote: > >> Dear PETSc team, >> >> I have read with high interest the paper of Peter Brune et al. entitled >> "Composing scalable nonlinear algebraic solvers". >> Nevertheless I would like to be able to reproduce the tests that are >> presented within (mainly the elasticity problem, ex16). >> >> Could you please provide us with the command lines of these tests? >> > > I believe Peter used the attached script. > > Thanks, > > Matt > > >> Best regards, >> Nicolas >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Capture.PNG Type: image/png Size: 35719 bytes Desc: not available URL: From ztdepyahoo at 163.com Fri Aug 26 02:42:53 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Fri, 26 Aug 2016 15:42:53 +0800 (CST) Subject: [petsc-users] What caused this problem."PetscInitialize() must be called before PetscFinalize()" Message-ID: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> Dear friends: My code run without any problem yesterday, but it gives me the following error. I do not know what is the reason? Regards [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c [0]PETSC ERROR: #3 PetscInitialize() line 828 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c PetscInitialize() must be called before PetscFinalize() Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Fri Aug 26 02:46:37 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 26 Aug 2016 09:46:37 +0200 Subject: [petsc-users] What caused this problem."PetscInitialize() must be called before PetscFinalize()" In-Reply-To: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> References: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> Message-ID: Could you send the entire error message? On Fri, Aug 26, 2016 at 9:42 AM, ??? wrote: > Dear friends: > My code run without any problem yesterday, but it gives me the following > error. I do not know what is the reason? > Regards > > [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #3 PetscInitialize() line 828 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c > > PetscInitialize() must be called before PetscFinalize() > > Regards > > > > > > > > > > > > > > > > > > From rongliang.chan at gmail.com Fri Aug 26 05:52:27 2016 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Fri, 26 Aug 2016 18:52:27 +0800 Subject: [petsc-users] dmplex with block size error In-Reply-To: References: <57B578E2.9020201@gmail.com> <57B58855.7060702@gmail.com> Message-ID: <57C01F6B.5050509@gmail.com> Hi Matt, I think there is a typo in the following code (line 633 and 636 in plexpreallocate.c). ================== } else { /* Only loop over blocks of rows */ for (r = rStart/bs; r < rEnd/bs; ++r) { const PetscInt row = r*bs; PetscInt numCols, cStart, c; ierr = PetscSectionGetDof(sectionAdj, row, &numCols);CHKERRQ(ierr); ierr = PetscSectionGetOffset(sectionAdj, row, &cStart);CHKERRQ(ierr); for (c = cStart; c < cStart+numCols; ++c) { if ((cols[c] >= rStart*bs) && (cols[c] < rEnd*bs)) { ++dnz[r-rStart]; if (cols[c] >= row) ++dnzu[r-rStart]; // I think "dnzu[r-rStart]" should be "dnzu[r-rStart/bs]", right? } else { ++onz[r-rStart]; if (cols[c] >= row) ++onzu[r-rStart]; // I think "onzu[r-rStart]" should be "onzu[r-rStart/bs]" , right? } } } for (r = 0; r < (rEnd - rStart)/bs; ++r) { dnz[r] /= bs; onz[r] /= bs; dnzu[r] /= bs; onzu[r] /= bs; } } ================== Best, Rongliang On 08/18/2016 06:08 PM, Matthew Knepley wrote: > On Thu, Aug 18, 2016 at 5:05 AM, Rongliang Chen > > wrote: > > Hi Matt, > > The log of the valgrind is attached. When I run with valgrind, the > following error message comes up. 
> > > The valgrind log says your code is writing over memory. Fix that first. > > Matt > > ------------------------- > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Arguments are incompatible > [1]PETSC ERROR: Cannot change block size 3670016 to 7 > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble > shooting. > [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 > [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu > Aug 18 17:59:43 2016 > [1]PETSC ERROR: Configure options --download-blacs > --download-scalapack --download-metis --download-parmetis > --download-exodusii --download-netcdf --download-hdf5 > --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared > --with-debugging=1 --download-fblaslapack --with-64-bit-indices > [1]PETSC ERROR: #1 PetscLayoutSetBlockSize() line 424 in > /home/rlchen/soft/petsc-3.6.3/src/vec/is/utils/pmap.c > [1]PETSC ERROR: #2 MatSetBlockSize() line 6920 in > /home/rlchen/soft/petsc-3.6.3/src/mat/interface/matrix.c > [1]PETSC ERROR: #3 MatXAIJSetPreallocation() line 282 in > /home/rlchen/soft/3D_fluid/FSI/Spmcs-v1.5/Fluid-petsc-3.6/src/application/Fluid/gcreate.c > ---------------------- > > Best regards, > Rongliang > > On 08/18/2016 05:52 PM, Matthew Knepley wrote: >> Run with valgrind and send the log. >> >> Thanks, >> >> Matt >> >> On Thu, Aug 18, 2016 at 3:59 AM, Rongliang Chen >> > wrote: >> >> Dear All, >> >> I try to use the block matrix (BAIJ) for the dmplex data >> structure with the option "-dm_mat_type baij" (the block size >> is 7). The code works fine when np = 1 but the following >> error comes up when np>1. And the code also works fine for >> np>1 if I set the block size to be 1. Any suggestions are >> highly appreciated. >> >> ---------------------------------------------------- >> [1]PETSC ERROR: PetscMallocValidate: error detected at >> VecAXPY_Seq() line 89 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >> [1]PETSC ERROR: Memory at address 0x1332571 is corrupted >> [1]PETSC ERROR: Probably write past beginning or end of array >> [1]PETSC ERROR: Last intact block allocated in >> PetscStrallocpy() line 188 in >> /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Memory corruption: >> http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind >> >> [1]PETSC ERROR: >> [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html >> for >> trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >> [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen >> Thu Aug 18 16:42:34 2016 >> [1]PETSC ERROR: Configure options --download-blacs >> --download-scalapack --download-metis --download-parmetis >> --download-exodusii --download-netcdf --download-hdf5 >> --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared >> --with-debugging=1 --download-fblaslapack --with-64-bit-indices >> [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in >> /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c >> [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >> [1]PETSC ERROR: #3 VecAXPY() line 640 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/interface/rvector.c >> --------------------------------------------------- >> >> Best regards, >> Rongliang >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 26 06:47:34 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Aug 2016 06:47:34 -0500 Subject: [petsc-users] dmplex with block size error In-Reply-To: <57C01F6B.5050509@gmail.com> References: <57B578E2.9020201@gmail.com> <57B58855.7060702@gmail.com> <57C01F6B.5050509@gmail.com> Message-ID: On Fri, Aug 26, 2016 at 5:52 AM, Rongliang Chen wrote: > Hi Matt, > > I think there is a typo in the following code (line 633 and 636 in > plexpreallocate.c). > Those are fixed in the current 'master'. Which branch are you using? Thanks, Matt > ================== > } else { > /* Only loop over blocks of rows */ > for (r = rStart/bs; r < rEnd/bs; ++r) { > const PetscInt row = r*bs; > PetscInt numCols, cStart, c; > > ierr = PetscSectionGetDof(sectionAdj, row, &numCols);CHKERRQ(ierr); > ierr = PetscSectionGetOffset(sectionAdj, row, > &cStart);CHKERRQ(ierr); > for (c = cStart; c < cStart+numCols; ++c) { > if ((cols[c] >= rStart*bs) && (cols[c] < rEnd*bs)) { > ++dnz[r-rStart]; > if (cols[c] >= row) ++dnzu[r-rStart]; // I think > "dnzu[r-rStart]" should be "dnzu[r-rStart/bs]", right? > } else { > ++onz[r-rStart]; > if (cols[c] >= row) ++onzu[r-rStart]; // I think > "onzu[r-rStart]" should be "onzu[r-rStart/bs]" , right? > } > } > } > for (r = 0; r < (rEnd - rStart)/bs; ++r) { > dnz[r] /= bs; > onz[r] /= bs; > dnzu[r] /= bs; > onzu[r] /= bs; > } > } > ================== > > Best, > Rongliang > > On 08/18/2016 06:08 PM, Matthew Knepley wrote: > > On Thu, Aug 18, 2016 at 5:05 AM, Rongliang Chen > wrote: > >> Hi Matt, >> >> The log of the valgrind is attached. When I run with valgrind, the >> following error message comes up. >> > > The valgrind log says your code is writing over memory. Fix that first. > > Matt > > >> ------------------------- >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Arguments are incompatible >> [1]PETSC ERROR: Cannot change block size 3670016 to 7 >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >> [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu Aug 18 >> 17:59:43 2016 >> [1]PETSC ERROR: Configure options --download-blacs --download-scalapack >> --download-metis --download-parmetis --download-exodusii --download-netcdf >> --download-hdf5 --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared >> --with-debugging=1 --download-fblaslapack --with-64-bit-indices >> [1]PETSC ERROR: #1 PetscLayoutSetBlockSize() line 424 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/is/utils/pmap.c >> [1]PETSC ERROR: #2 MatSetBlockSize() line 6920 in >> /home/rlchen/soft/petsc-3.6.3/src/mat/interface/matrix.c >> [1]PETSC ERROR: #3 MatXAIJSetPreallocation() line 282 in >> /home/rlchen/soft/3D_fluid/FSI/Spmcs-v1.5/Fluid-petsc-3.6/ >> src/application/Fluid/gcreate.c >> ---------------------- >> >> Best regards, >> Rongliang >> >> On 08/18/2016 05:52 PM, Matthew Knepley wrote: >> >> Run with valgrind and send the log. >> >> Thanks, >> >> Matt >> >> On Thu, Aug 18, 2016 at 3:59 AM, Rongliang Chen > > wrote: >> >>> Dear All, >>> >>> I try to use the block matrix (BAIJ) for the dmplex data structure with >>> the option "-dm_mat_type baij" (the block size is 7). The code works fine >>> when np = 1 but the following error comes up when np>1. And the code also >>> works fine for np>1 if I set the block size to be 1. Any suggestions are >>> highly appreciated. >>> >>> ---------------------------------------------------- >>> [1]PETSC ERROR: PetscMallocValidate: error detected at VecAXPY_Seq() >>> line 89 in /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >>> [1]PETSC ERROR: Memory at address 0x1332571 is corrupted >>> [1]PETSC ERROR: Probably write past beginning or end of array >>> [1]PETSC ERROR: Last intact block allocated in PetscStrallocpy() line >>> 188 in /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c >>> [1]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [1]PETSC ERROR: Memory corruption: http://www.mcs.anl.gov/petsc/d >>> ocumentation/installation.html#valgrind >>> [1]PETSC ERROR: >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. >>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>> [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen Thu Aug 18 >>> 16:42:34 2016 >>> [1]PETSC ERROR: Configure options --download-blacs --download-scalapack >>> --download-metis --download-parmetis --download-exodusii --download-netcdf >>> --download-hdf5 --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared >>> --with-debugging=1 --download-fblaslapack --with-64-bit-indices >>> [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in >>> /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c >>> [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in >>> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >>> [1]PETSC ERROR: #3 VecAXPY() line 640 in /home/rlchen/soft/petsc-3.6.3/ >>> src/vec/vec/interface/rvector.c >>> --------------------------------------------------- >>> >>> Best regards, >>> Rongliang >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Fri Aug 26 06:53:00 2016 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Fri, 26 Aug 2016 19:53:00 +0800 Subject: [petsc-users] dmplex with block size error In-Reply-To: References: <57B578E2.9020201@gmail.com> <57B58855.7060702@gmail.com> <57C01F6B.5050509@gmail.com> Message-ID: <57C02D9C.8060000@gmail.com> Thanks. I am using a very old version, the petsc-3.6.1. I will try to update my code to the latest version. Best, Rongliang On 08/26/2016 07:47 PM, Matthew Knepley wrote: > On Fri, Aug 26, 2016 at 5:52 AM, Rongliang Chen > > wrote: > > Hi Matt, > > I think there is a typo in the following code (line 633 and 636 in > plexpreallocate.c). > > > Those are fixed in the current 'master'. Which branch are you using? > > Thanks, > > Matt > > ================== > } else { > /* Only loop over blocks of rows */ > for (r = rStart/bs; r < rEnd/bs; ++r) { > const PetscInt row = r*bs; > PetscInt numCols, cStart, c; > > ierr = PetscSectionGetDof(sectionAdj, row, > &numCols);CHKERRQ(ierr); > ierr = PetscSectionGetOffset(sectionAdj, row, > &cStart);CHKERRQ(ierr); > for (c = cStart; c < cStart+numCols; ++c) { > if ((cols[c] >= rStart*bs) && (cols[c] < rEnd*bs)) { > ++dnz[r-rStart]; > if (cols[c] >= row) ++dnzu[r-rStart]; // I think > "dnzu[r-rStart]" should be "dnzu[r-rStart/bs]", right? > } else { > ++onz[r-rStart]; > if (cols[c] >= row) ++onzu[r-rStart]; // I think > "onzu[r-rStart]" should be "onzu[r-rStart/bs]" , right? > } > } > } > for (r = 0; r < (rEnd - rStart)/bs; ++r) { > dnz[r] /= bs; > onz[r] /= bs; > dnzu[r] /= bs; > onzu[r] /= bs; > } > } > ================== > > Best, > Rongliang > > On 08/18/2016 06:08 PM, Matthew Knepley wrote: >> On Thu, Aug 18, 2016 at 5:05 AM, Rongliang Chen >> > wrote: >> >> Hi Matt, >> >> The log of the valgrind is attached. When I run with >> valgrind, the following error message comes up. >> >> >> The valgrind log says your code is writing over memory. Fix that >> first. >> >> Matt >> >> ------------------------- >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> [1]PETSC ERROR: Arguments are incompatible >> [1]PETSC ERROR: Cannot change block size 3670016 to 7 >> [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html >> for >> trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >> [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by rlchen >> Thu Aug 18 17:59:43 2016 >> [1]PETSC ERROR: Configure options --download-blacs >> --download-scalapack --download-metis --download-parmetis >> --download-exodusii --download-netcdf --download-hdf5 >> --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared >> --with-debugging=1 --download-fblaslapack --with-64-bit-indices >> [1]PETSC ERROR: #1 PetscLayoutSetBlockSize() line 424 in >> /home/rlchen/soft/petsc-3.6.3/src/vec/is/utils/pmap.c >> [1]PETSC ERROR: #2 MatSetBlockSize() line 6920 in >> /home/rlchen/soft/petsc-3.6.3/src/mat/interface/matrix.c >> [1]PETSC ERROR: #3 MatXAIJSetPreallocation() line 282 in >> /home/rlchen/soft/3D_fluid/FSI/Spmcs-v1.5/Fluid-petsc-3.6/src/application/Fluid/gcreate.c >> ---------------------- >> >> Best regards, >> Rongliang >> >> On 08/18/2016 05:52 PM, Matthew Knepley wrote: >>> Run with valgrind and send the log. >>> >>> Thanks, >>> >>> Matt >>> >>> On Thu, Aug 18, 2016 at 3:59 AM, Rongliang Chen >>> > >>> wrote: >>> >>> Dear All, >>> >>> I try to use the block matrix (BAIJ) for the dmplex data >>> structure with the option "-dm_mat_type baij" (the block >>> size is 7). The code works fine when np = 1 but the >>> following error comes up when np>1. And the code also >>> works fine for np>1 if I set the block size to be 1. Any >>> suggestions are highly appreciated. >>> >>> ---------------------------------------------------- >>> [1]PETSC ERROR: PetscMallocValidate: error detected at >>> VecAXPY_Seq() line 89 in >>> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >>> [1]PETSC ERROR: Memory at address 0x1332571 is corrupted >>> [1]PETSC ERROR: Probably write past beginning or end of >>> array >>> [1]PETSC ERROR: Last intact block allocated in >>> PetscStrallocpy() line 188 in >>> /home/rlchen/soft/petsc-3.6.3/src/sys/utils/str.c >>> [1]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> [1]PETSC ERROR: Memory corruption: >>> http://www.mcs.anl.gov/petsc/documentation/installation.html#valgrind >>> >>> [1]PETSC ERROR: >>> [1]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>> >>> for trouble shooting. >>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>> [1]PETSC ERROR: ./fsi on a 64bit-debug named rlchen by >>> rlchen Thu Aug 18 16:42:34 2016 >>> [1]PETSC ERROR: Configure options --download-blacs >>> --download-scalapack --download-metis >>> --download-parmetis --download-exodusii >>> --download-netcdf --download-hdf5 >>> --with-mpi-dir=/home/rlchen/soft/Program/mpich2-shared >>> --with-debugging=1 --download-fblaslapack >>> --with-64-bit-indices >>> [1]PETSC ERROR: #1 PetscMallocValidate() line 136 in >>> /home/rlchen/soft/petsc-3.6.3/src/sys/memory/mtr.c >>> [1]PETSC ERROR: #2 VecAXPY_Seq() line 89 in >>> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/impls/seq/bvec1.c >>> [1]PETSC ERROR: #3 VecAXPY() line 640 in >>> /home/rlchen/soft/petsc-3.6.3/src/vec/vec/interface/rvector.c >>> --------------------------------------------------- >>> >>> Best regards, >>> Rongliang >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin >>> their experiments is infinitely more interesting than any >>> results to which their experiments lead. 
>>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Fri Aug 26 07:14:29 2016 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Fri, 26 Aug 2016 13:14:29 +0100 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices Message-ID: Hi all I'm just wondering if there is any plans in the future for MatGetDiagonalBlock to support shell matrices by registering a user-implemented MATOP? MatGetDiagonal supports MATOP, but the block version of this does not. I found a previous query on the user list which touched on this and mentioned that it would be easy to add: http://lists.mcs.anl.gov/pipermail/petsc-users/2011-May/008700.html I have implemented a matrix-free multigrid algorithm using shell operations in PETSc, and it would be very convenient to be able to provide a local shell Mat so I could apply local GMRES (or other matvec-based solvers) as a local block smoother. Thanks! Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Fri Aug 26 07:14:35 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Fri, 26 Aug 2016 14:14:35 +0200 Subject: [petsc-users] What caused this problem."PetscInitialize() must be called before PetscFinalize()" In-Reply-To: <7ca81682.b909.156c6c2dbcb.Coremail.ztdepyahoo@163.com> References: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> <7ca81682.b909.156c6c2dbcb.Coremail.ztdepyahoo@163.com> Message-ID: I mean the entire message between the dotted lines [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- ............ [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Also, make sure that you "reply all". On Fri, Aug 26, 2016 at 2:11 PM, ??? wrote: > > [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in > home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #3 PetscInitialize() line 828 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c > PetscInitialize() must be called before PetscFinalize() > > > > > > > ? 2016-08-26 15:46:37?"Patrick Sanan" ??? >>Could you send the entire error message? >> >>On Fri, Aug 26, 2016 at 9:42 AM, ??? wrote: >>> Dear friends: >>> My code run without any problem yesterday, but it gives me the >>> following >>> error. I do not know what is the reason? 
>>> Regards >>> >>> [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in >>> /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c >>> [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in >>> /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c >>> [0]PETSC ERROR: #3 PetscInitialize() line 828 in >>> /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c >>> >>> PetscInitialize() must be called before PetscFinalize() >>> >>> Regards >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> > > > > From dave.mayhem23 at gmail.com Fri Aug 26 07:34:07 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 26 Aug 2016 14:34:07 +0200 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: References: Message-ID: On 26 August 2016 at 14:14, Steven Dargaville wrote: > Hi all > > I'm just wondering if there is any plans in the future for > MatGetDiagonalBlock to support shell matrices by registering a > user-implemented MATOP? MatGetDiagonal supports MATOP, but the block > version of this does not. > > I found a previous query on the user list which touched on this and > mentioned that it would be easy to add: > > http://lists.mcs.anl.gov/pipermail/petsc-users/2011-May/008700.html > > I have implemented a matrix-free multigrid algorithm using shell > operations in PETSc, and it would be very convenient to be able to provide > a local shell Mat so I could apply local GMRES (or other matvec-based > solvers) as a local block smoother. > It looks like the thing you need to do is use PetscObjectComposeFunction() and not MatShellSetOperation() http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/ PetscObjectComposeFunction.html#PetscObjectComposeFunction with your MatShell object. That is, instead of calling MatShellSetOperation(), call ierr = PetscObjectComposeFunction(myshell,"MatGetDiagonalBlock_C", MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); where MatGetDiagonalBlock_MyShell is a function pointer to your method to get the diagonal block. Important detail is that you don't change the string "MatGetDiagonalBlock_C". The method MatGetDiagonalBlock() does a function pointer query using this string. See http://www.mcs.anl.gov/petsc/petsc-current/src/mat/interface/matrix.c.html#MatGetDiagonalBlock Thanks Dave > > Thanks! > Steven > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 26 07:35:25 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Aug 2016 07:35:25 -0500 Subject: [petsc-users] What caused this problem."PetscInitialize() must be called before PetscFinalize()" In-Reply-To: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> References: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> Message-ID: On Fri, Aug 26, 2016 at 2:42 AM, ??? wrote: > Dear friends: > My code run without any problem yesterday, but it gives me the > following error. I do not know what is the reason? > I agree with Patrick, send the entire message. However, are you using Fortran? 
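For context on the error trace quoted above: the stack shows PetscOptionsInsertFile() failing inside PetscInitialize() (one common cause is an unreadable or malformed options file such as ~/.petscrc), and the "PetscInitialize() must be called before PetscFinalize()" message then follows because the program still reaches PetscFinalize() although initialization never completed. A minimal guard in C is sketched below; in Fortran the equivalent is checking the ierr returned by the PetscInitialize call before continuing. This is a general sketch, not a diagnosis of this particular report.

#include <petscsys.h>

int main(int argc,char **argv)
{
  PetscErrorCode ierr;

  /* If initialization fails (for example while reading an options file),
     return immediately instead of falling through to PetscFinalize().   */
  ierr = PetscInitialize(&argc,&argv,NULL,NULL);
  if (ierr) return (int)ierr;

  /* ... application code ... */

  ierr = PetscFinalize();
  return (int)ierr;
}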
Matt > Regards > > [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #3 PetscInitialize() line 828 in > /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c > > PetscInitialize() must be called before PetscFinalize() > > Regards > > > > > > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri Aug 26 07:35:42 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 26 Aug 2016 14:35:42 +0200 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: References: Message-ID: On 26 August 2016 at 14:34, Dave May wrote: > > > On 26 August 2016 at 14:14, Steven Dargaville > wrote: > >> Hi all >> >> I'm just wondering if there is any plans in the future for >> MatGetDiagonalBlock to support shell matrices by registering a >> user-implemented MATOP? MatGetDiagonal supports MATOP, but the block >> version of this does not. >> >> I found a previous query on the user list which touched on this and >> mentioned that it would be easy to add: >> >> http://lists.mcs.anl.gov/pipermail/petsc-users/2011-May/008700.html >> >> I have implemented a matrix-free multigrid algorithm using shell >> operations in PETSc, and it would be very convenient to be able to provide >> a local shell Mat so I could apply local GMRES (or other matvec-based >> solvers) as a local block smoother. >> > > It looks like the thing you need to do is use PetscObjectComposeFunction() > and not MatShellSetOperation() > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/ > Sys/PetscObjectComposeFunction.html#PetscObjectComposeFunction > > with your MatShell object. > > That is, instead of calling MatShellSetOperation(), call > ierr = PetscObjectComposeFunction(myshell,"MatGetDiagonalBlock_C", > MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); > Oops - I forgot the cast, the above should be ierr = PetscObjectComposeFunction((PetscObject)myshell," MatGetDiagonalBlock_C", MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); > > where MatGetDiagonalBlock_MyShell is a function pointer to your method to > get the diagonal block. > > Important detail is that you don't change the string > "MatGetDiagonalBlock_C". The method MatGetDiagonalBlock() does a function > pointer query using this string. See > http://www.mcs.anl.gov/petsc/petsc-current/src/mat/ > interface/matrix.c.html#MatGetDiagonalBlock > > > Thanks > Dave > > > > > > >> >> Thanks! >> Steven >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Aug 26 08:14:32 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 26 Aug 2016 09:14:32 -0400 Subject: [petsc-users] Command lines to reproduce the tests of "Composing scalable nonlinear algebraic solvers" In-Reply-To: References: Message-ID: On Thu, Aug 25, 2016 at 9:46 AM, Karin&NiKo wrote: > Dear PETSc gurus, > > Thanks to the help of Matthew, I have been able to reproduce in PETSc some > tests of the paper of Peter Brune et al. entitled "Composing scalable > nonlinear algebraic solvers", with special attention to the elasticity test. 
> I have also tried to reproduce it in a widely used mechanics finite > element solver and I cannot obtain the same results, mainly because of my > lack of undestanding of the boundary conditions and of the loading. > > According to the paper, I undestand that the edges in red in the attached > image are fully clamped. If I do that, I do observe a rotation of the grey > face (see attached image). > > This does not sound stable. > If I clamp the over-mentioned edges plus I forbid the normal displacement > of the grey face, I get a quite similar warped shape but the details of the > deformation near the clamped faces are very different > > Your picture looks like this problem. > In the paper, the loading is defined as a volume force applied to the > whole structure whereas in Wriggers' book, it is defined as a nodal force. > > Could you please give me some details on these points? > > Best regards, > Nicolas > > 2016-08-23 20:25 GMT+02:00 Matthew Knepley : > >> On Tue, Aug 23, 2016 at 10:54 AM, Karin&NiKo >> wrote: >> >>> Dear PETSc team, >>> >>> I have read with high interest the paper of Peter Brune et al. entitled >>> "Composing scalable nonlinear algebraic solvers". >>> Nevertheless I would like to be able to reproduce the tests that are >>> presented within (mainly the elasticity problem, ex16). >>> >>> Could you please provide us with the command lines of these tests? >>> >> >> I believe Peter used the attached script. >> >> Thanks, >> >> Matt >> >> >>> Best regards, >>> Nicolas >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Fri Aug 26 08:16:00 2016 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Fri, 26 Aug 2016 14:16:00 +0100 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: References: Message-ID: Hi Dave Thanks for the response. I'm actually using fortran and I wasn't sure that PetscObjectComposeFunction would be accessible, and if so, what sort of fortran magic I might need to call this function (possibly an interface, with or without c_opt). Do you know if it is possible to call that routine directly from fortran? Thanks Steven On Fri, Aug 26, 2016 at 1:35 PM, Dave May wrote: > > > On 26 August 2016 at 14:34, Dave May wrote: > >> >> >> On 26 August 2016 at 14:14, Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >>> Hi all >>> >>> I'm just wondering if there is any plans in the future for >>> MatGetDiagonalBlock to support shell matrices by registering a >>> user-implemented MATOP? MatGetDiagonal supports MATOP, but the block >>> version of this does not. >>> >>> I found a previous query on the user list which touched on this and >>> mentioned that it would be easy to add: >>> >>> http://lists.mcs.anl.gov/pipermail/petsc-users/2011-May/008700.html >>> >>> I have implemented a matrix-free multigrid algorithm using shell >>> operations in PETSc, and it would be very convenient to be able to provide >>> a local shell Mat so I could apply local GMRES (or other matvec-based >>> solvers) as a local block smoother. 
>>> >> >> It looks like the thing you need to do is use >> PetscObjectComposeFunction() and not MatShellSetOperation() >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/ >> Sys/PetscObjectComposeFunction.html#PetscObjectComposeFunction >> >> with your MatShell object. >> >> That is, instead of calling MatShellSetOperation(), call >> ierr = PetscObjectComposeFunction(myshell,"MatGetDiagonalBlock_C", >> MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); >> > > Oops - I forgot the cast, the above should be > > ierr = PetscObjectComposeFunction((PetscObject)myshell,"MatGetDia > gonalBlock_C", MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); > > >> >> where MatGetDiagonalBlock_MyShell is a function pointer to your method to >> get the diagonal block. >> >> Important detail is that you don't change the string >> "MatGetDiagonalBlock_C". The method MatGetDiagonalBlock() does a function >> pointer query using this string. See >> http://www.mcs.anl.gov/petsc/petsc-current/src/mat/interface >> /matrix.c.html#MatGetDiagonalBlock >> >> >> Thanks >> Dave >> >> >> >> >> >> >>> >>> Thanks! >>> Steven >>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Fri Aug 26 08:22:09 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Fri, 26 Aug 2016 15:22:09 +0200 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: References: Message-ID: On Friday, 26 August 2016, Steven Dargaville wrote: > Hi Dave > > Thanks for the response. I'm actually using fortran and I wasn't sure that > PetscObjectComposeFunction would be accessible, and if so, what sort of > fortran magic I might need to call this function (possibly an interface, > with or without c_opt). > > Do you know if it is possible to call that routine directly from fortran? > I don't know. I'll have appeal to the other petsc users for an answer to these questions. Thanks, Dave > Thanks > Steven > > On Fri, Aug 26, 2016 at 1:35 PM, Dave May > wrote: > >> >> >> On 26 August 2016 at 14:34, Dave May > > wrote: >> >>> >>> >>> On 26 August 2016 at 14:14, Steven Dargaville < >>> dargaville.steven at gmail.com >>> > wrote: >>> >>>> Hi all >>>> >>>> I'm just wondering if there is any plans in the future for >>>> MatGetDiagonalBlock to support shell matrices by registering a >>>> user-implemented MATOP? MatGetDiagonal supports MATOP, but the block >>>> version of this does not. >>>> >>>> I found a previous query on the user list which touched on this and >>>> mentioned that it would be easy to add: >>>> >>>> http://lists.mcs.anl.gov/pipermail/petsc-users/2011-May/008700.html >>>> >>>> I have implemented a matrix-free multigrid algorithm using shell >>>> operations in PETSc, and it would be very convenient to be able to provide >>>> a local shell Mat so I could apply local GMRES (or other matvec-based >>>> solvers) as a local block smoother. >>>> >>> >>> It looks like the thing you need to do is use >>> PetscObjectComposeFunction() and not MatShellSetOperation() >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/ >>> Sys/PetscObjectComposeFunction.html#PetscObjectComposeFunction >>> >>> with your MatShell object. 
>>> >>> That is, instead of calling MatShellSetOperation(), call >>> ierr = PetscObjectComposeFunction(myshell,"MatGetDiagonalBlock_C", >>> MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); >>> >> >> Oops - I forgot the cast, the above should be >> >> ierr = PetscObjectComposeFunction((PetscObject)myshell,"MatGetDiago >> nalBlock_C", MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); >> >> >>> >>> where MatGetDiagonalBlock_MyShell is a function pointer to your method >>> to get the diagonal block. >>> >>> Important detail is that you don't change the string >>> "MatGetDiagonalBlock_C". The method MatGetDiagonalBlock() does a function >>> pointer query using this string. See >>> http://www.mcs.anl.gov/petsc/petsc-current/src/mat/interface >>> /matrix.c.html#MatGetDiagonalBlock >>> >>> >>> Thanks >>> Dave >>> >>> >>> >>> >>> >>> >>>> >>>> Thanks! >>>> Steven >>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Aug 26 09:42:44 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 26 Aug 2016 09:42:44 -0500 Subject: [petsc-users] Command lines to reproduce the tests of "Composing scalable nonlinear algebraic solvers" In-Reply-To: References: Message-ID: On Thu, Aug 25, 2016 at 8:46 AM, Karin&NiKo wrote: > Dear PETSc gurus, > > Thanks to the help of Matthew, I have been able to reproduce in PETSc some > tests of the paper of Peter Brune et al. entitled "Composing scalable > nonlinear algebraic solvers", with special attention to the elasticity test. > I have also tried to reproduce it in a widely used mechanics finite > element solver and I cannot obtain the same results, mainly because of my > lack of undestanding of the boundary conditions and of the loading. > > According to the paper, I undestand that the edges in red in the attached > image are fully clamped. If I do that, I do observe a rotation of the grey > face (see attached image). > > If I clamp the over-mentioned edges plus I forbid the normal displacement > of the grey face, I get a quite similar warped shape but the details of the > deformation near the clamped faces are very different > > In the paper, the loading is defined as a volume force applied to the > whole structure whereas in Wriggers' book, it is defined as a nodal force. > > Could you please give me some details on these points? > I am looking at the code: https://bitbucket.org/petsc/petsc/src/bd8b9ddea6dc39522a77cdc0585cbede38eb66bb/src/snes/examples/tutorials/ex16.c?at=master&fileviewer=file-view-default#ex16.c-204 and to me it clearly looks like we are fixing just the top edge on each side. I do not see anything forcing the end faces to stay aligned with the y-z plane, but maybe I am missing something. Can you just compare the residual/Jacobian for the initial guess? Matt > Best regards, > Nicolas > > 2016-08-23 20:25 GMT+02:00 Matthew Knepley : > >> On Tue, Aug 23, 2016 at 10:54 AM, Karin&NiKo >> wrote: >> >>> Dear PETSc team, >>> >>> I have read with high interest the paper of Peter Brune et al. entitled >>> "Composing scalable nonlinear algebraic solvers". >>> Nevertheless I would like to be able to reproduce the tests that are >>> presented within (mainly the elasticity problem, ex16). >>> >>> Could you please provide us with the command lines of these tests? >>> >> >> I believe Peter used the attached script. 
>> >> Thanks, >> >> Matt >> >> >>> Best regards, >>> Nicolas >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 26 10:49:44 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 26 Aug 2016 10:49:44 -0500 Subject: [petsc-users] What caused this problem."PetscInitialize() must be called before PetscFinalize()" In-Reply-To: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> References: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> Message-ID: <21FE81AD-AE44-4A1E-8590-D3CB31228167@mcs.anl.gov> You got this message because you did not check the error code for PetscInitialize() and so your program continued running until it reached PetscFinalize() where it could not handle the state since PETSc was not properly initialized. You should have code something like call PetscInitialize(PETSC_NULL_CHARACTER, ierr) if (ierr .ne. 0) then print*,'Error in PetscInitialize ',ierr endif > On Aug 26, 2016, at 2:42 AM, ??? wrote: > > Dear friends: > My code run without any problem yesterday, but it gives me the following error. I do not know what is the reason? > Regards > > [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > [0]PETSC ERROR: #3 PetscInitialize() line 828 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c > > PetscInitialize() must be called before PetscFinalize() > > Regards > > > > > > > > > > > > > > From jychang48 at gmail.com Fri Aug 26 14:15:19 2016 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 26 Aug 2016 14:15:19 -0500 Subject: [petsc-users] GMRES restart guidelines Message-ID: Hi all, When exactly would one need to modify the default restart value of 30 for GMRES? I have seen papers like this which suggest playing around with values between 1 and 30, but are there cases where people need to use large restart values like 50 or 100? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 26 14:17:27 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 26 Aug 2016 14:17:27 -0500 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: References: Message-ID: I have added support for this operation in the branch barry/add-matshell-matgetdiagonalblock-fortran Barry Because this required changing the handling of the method MatGetDiagonalBlock() the addition cannot be back ported to PETSc 3.7.x > On Aug 26, 2016, at 8:22 AM, Dave May wrote: > > > > On Friday, 26 August 2016, Steven Dargaville wrote: > Hi Dave > > Thanks for the response. I'm actually using fortran and I wasn't sure that PetscObjectComposeFunction would be accessible, and if so, what sort of fortran magic I might need to call this function (possibly an interface, with or without c_opt). > > Do you know if it is possible to call that routine directly from fortran? > > I don't know. I'll have appeal to the other petsc users for an answer to these questions. 
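For reference, the C side of what Dave describes above can be sketched as follows. This is an illustration only: MyShellCtx, its diag_block field and the helper names are made up, and it assumes the shell's context already carries a sequential Mat holding the on-process diagonal block.

#include <petscmat.h>

/* Hypothetical context for the shell matrix; only the piece needed
   here is shown. */
typedef struct {
  Mat diag_block;   /* sequential Mat for this process's diagonal block */
} MyShellCtx;

/* Routine that MatGetDiagonalBlock() will find and call for this shell. */
static PetscErrorCode MatGetDiagonalBlock_MyShell(Mat A, Mat *a)
{
  MyShellCtx    *ctx;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatShellGetContext(A, &ctx);CHKERRQ(ierr);
  *a   = ctx->diag_block;
  PetscFunctionReturn(0);
}

/* Call this once after MatCreateShell(); the string must stay exactly
   "MatGetDiagonalBlock_C" because that is the name the query uses. */
static PetscErrorCode RegisterDiagonalBlock_MyShell(Mat A)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscObjectComposeFunction((PetscObject)A, "MatGetDiagonalBlock_C",
                                    MatGetDiagonalBlock_MyShell);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With that composed, MatGetDiagonalBlock() on the shell matrix returns the user-provided local block, which can then be handed to a local KSP as the block smoother mentioned earlier in the thread.
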
> > Thanks, > Dave > > Thanks > Steven > > On Fri, Aug 26, 2016 at 1:35 PM, Dave May wrote: > > > On 26 August 2016 at 14:34, Dave May wrote: > > > On 26 August 2016 at 14:14, Steven Dargaville wrote: > Hi all > > I'm just wondering if there is any plans in the future for MatGetDiagonalBlock to support shell matrices by registering a user-implemented MATOP? MatGetDiagonal supports MATOP, but the block version of this does not. > > I found a previous query on the user list which touched on this and mentioned that it would be easy to add: > > http://lists.mcs.anl.gov/pipermail/petsc-users/2011-May/008700.html > > I have implemented a matrix-free multigrid algorithm using shell operations in PETSc, and it would be very convenient to be able to provide a local shell Mat so I could apply local GMRES (or other matvec-based solvers) as a local block smoother. > > It looks like the thing you need to do is use PetscObjectComposeFunction() and not MatShellSetOperation() > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscObjectComposeFunction.html#PetscObjectComposeFunction > > with your MatShell object. > > That is, instead of calling MatShellSetOperation(), call > ierr = PetscObjectComposeFunction(myshell,"MatGetDiagonalBlock_C", MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); > > Oops - I forgot the cast, the above should be > > ierr = PetscObjectComposeFunction((PetscObject)myshell,"MatGetDiagonalBlock_C", MatGetDiagonalBlock_MyShell);CHKERRQ(ierr); > > > where MatGetDiagonalBlock_MyShell is a function pointer to your method to get the diagonal block. > > Important detail is that you don't change the string "MatGetDiagonalBlock_C". The method MatGetDiagonalBlock() does a function pointer query using this string. See > http://www.mcs.anl.gov/petsc/petsc-current/src/mat/interface/matrix.c.html#MatGetDiagonalBlock > > > Thanks > Dave > > > > > > > Thanks! > Steven > > > > > From bsmith at mcs.anl.gov Fri Aug 26 14:19:41 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 26 Aug 2016 14:19:41 -0500 Subject: [petsc-users] GMRES restart guidelines In-Reply-To: References: Message-ID: <758D937C-A72C-468F-AC88-C43400ED9BE7@mcs.anl.gov> > On Aug 26, 2016, at 2:15 PM, Justin Chang wrote: > > Hi all, > > When exactly would one need to modify the default restart value of 30 for GMRES? I have seen papers like this which suggest playing around with values between 1 and 30, but are there cases where people need to use large restart values like 50 or 100? Sure. It is not ideal because the work and memory requirements go up but if you have a problem where even after proper preconditioning there are say 50 "bad" eigenvalue/eigenvectors of the preconditioned operator then using a larger value is valid. Barry > > > Thanks, > Justin From bsmith at mcs.anl.gov Fri Aug 26 18:27:20 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 26 Aug 2016 18:27:20 -0500 Subject: [petsc-users] What caused this problem."PetscInitialize() must be called before PetscFinalize()" In-Reply-To: <4cc6328.46d.156c90fc994.Coremail.ztdepyahoo@163.com> References: <528e86fa.83a9.156c5cd2e37.Coremail.ztdepyahoo@163.com> <21FE81AD-AE44-4A1E-8590-D3CB31228167@mcs.anl.gov> <4cc6328.46d.156c90fc994.Coremail.ztdepyahoo@163.com> Message-ID: <2EFFF745-7377-443F-8DE1-81B17B44912E@mcs.anl.gov> You should still check the error code after the call to PetscInitialize() because if the initialization failed the rest of the code will crash badly. Barry > On Aug 26, 2016, at 5:54 PM, ??? 
wrote: > > thanks, but i use c++ language, there is no need to call the error code. > > > > > > > At 2016-08-26 23:49:44, "Barry Smith" wrote: > > > > You got this message because you did not check the error code for PetscInitialize() and so your program continued running until it reached PetscFinalize() where it could not handle the state since PETSc was not properly initialized. > > > > You should have code something like > > > > call PetscInitialize(PETSC_NULL_CHARACTER, ierr) > > > > if (ierr .ne. 0) then > > print*,'Error in PetscInitialize ',ierr > > endif > > > > > >> On Aug 26, 2016, at 2:42 AM, ??? wrote: > >> > >> Dear friends: > >> My code run without any problem yesterday, but it gives me the following error. I do not know what is the reason? > >> Regards > >> > >> [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > >> [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c > >> [0]PETSC ERROR: #3 PetscInitialize() line 828 in /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c > >> > >> PetscInitialize() must be called before PetscFinalize() > >> > >> Regards > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > > > > > > From bsmith at mcs.anl.gov Fri Aug 26 21:54:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 26 Aug 2016 21:54:14 -0500 Subject: [petsc-users] Question regarding updating PETSc Fortran examples to embrace post F77 constructs Message-ID: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> PETSc users, We've always been very conservative in PETSc to keep almost all our Fortran examples in a format that works with classic FORTRAN 77 constructs: fixed line format, (72 character limit) and no use of ; to separate operations on the same line, etc. Is it time to forgo these constructs and use more modern Fortran conventions in all our examples? Any feedback is appreciated Barry Note: it would continue to be possible to use PETSc in the FORTRAN 77 style, this is just a question about updating the examples. From pcarrica at engineering.uiowa.edu Sat Aug 27 08:47:04 2016 From: pcarrica at engineering.uiowa.edu (Pablo M. Carrica) Date: Sat, 27 Aug 2016 08:47:04 -0500 Subject: [petsc-users] [petsc-announce] Question regarding updating PETSc Fortran examples to embrace post F77 constructs In-Reply-To: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> References: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> Message-ID: <03ff01d20069$7ec3c150$7c4b43f0$@engineering.uiowa.edu> Hi Barry, I believe is fine to move forward with more modern Fortran. All compilers support at least Fortran 90. Best, Pablo. -----Original Message----- From: petsc-announce-bounces at mcs.anl.gov [mailto:petsc-announce-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Friday, August 26, 2016 9:54 PM To: petsc-announce at mcs.anl.gov; PETSc users list Subject: [petsc-announce] Question regarding updating PETSc Fortran examples to embrace post F77 constructs PETSc users, We've always been very conservative in PETSc to keep almost all our Fortran examples in a format that works with classic FORTRAN 77 constructs: fixed line format, (72 character limit) and no use of ; to separate operations on the same line, etc. Is it time to forgo these constructs and use more modern Fortran conventions in all our examples? 
Any feedback is appreciated Barry Note: it would continue to be possible to use PETSc in the FORTRAN 77 style, this is just a question about updating the examples. From ztdepyahoo at 163.com Sat Aug 27 08:49:28 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Sat, 27 Aug 2016 21:49:28 +0800 (CST) Subject: [petsc-users] Question regarding updating PETSc Fortran examples to embrace post F77 constructs In-Reply-To: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> References: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> Message-ID: <3724dd44.4c81.156cc4329df.Coremail.ztdepyahoo@163.com> Dear sir: I think we might following the history trend, forgot fortran77. We are facing HPC computation,so HPC fortran is out choice. Regard At 2016-08-27 10:54:14, "Barry Smith" wrote: > > PETSc users, > > We've always been very conservative in PETSc to keep almost all our Fortran examples in a format that works with classic FORTRAN 77 constructs: fixed line format, (72 character limit) and no use of ; to separate operations on the same line, etc. > > Is it time to forgo these constructs and use more modern Fortran conventions in all our examples? > > Any feedback is appreciated > > Barry > >Note: it would continue to be possible to use PETSc in the FORTRAN 77 style, this is just a question about updating the examples. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bourdin at math.lsu.edu Sat Aug 27 09:11:56 2016 From: bourdin at math.lsu.edu (Blaise Bourdin) Date: Sat, 27 Aug 2016 09:11:56 -0500 Subject: [petsc-users] [petsc-announce] Question regarding updating PETSc Fortran examples to embrace post F77 constructs In-Reply-To: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> References: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> Message-ID: <862AE906-6A56-4D62-9C5A-C3C511A1B33D@math.lsu.edu> FINALLY! Let's get rid of fortran77 free form in examples. I can't think of any reason to self inflict such a suffering. Are there ANY compiler around that people use and would not be able to process free form examples?I can see a point in keeping compatibility with fortran77 in petsc. It would make sense to keep a few old style pure f77 examples in the using-fortran section, but for the rest of the examples, using fixed form serves no purpose other than unexplicable bugs caused when a macro expands to more than 72 cols. Going farther, but it is a really un gratifying job that nobody wants to do, it would make sense of having fortran77 bindings through iso_c_binding. That would allow better argument type checking and debugging (I.e. Inspecting the content of a petsc object from the debugger in a fortran program). Would that prevent F77 interoperability? I am not sure. Blaise Sent from a mobile device > On Aug 26, 2016, at 9:54 PM, Barry Smith wrote: > > > PETSc users, > > We've always been very conservative in PETSc to keep almost all our Fortran examples in a format that works with classic FORTRAN 77 constructs: fixed line format, (72 character limit) and no use of ; to separate operations on the same line, etc. > > Is it time to forgo these constructs and use more modern Fortran conventions in all our examples? > > Any feedback is appreciated > > Barry > > Note: it would continue to be possible to use PETSc in the FORTRAN 77 style, this is just a question about updating the examples. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Sat Aug 27 09:41:34 2016 From: balay at mcs.anl.gov (Satish Balay) Date: Sat, 27 Aug 2016 09:41:34 -0500 Subject: [petsc-users] Question regarding updating PETSc Fortran examples to embrace post F77 constructs In-Reply-To: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> References: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> Message-ID: On Fri, 26 Aug 2016, Barry Smith wrote: > > PETSc users, > > We've always been very conservative in PETSc to keep almost all our Fortran examples in a format that works with classic FORTRAN 77 constructs: fixed line format, (72 character limit) and no use of ; to separate operations on the same line, etc. > > Is it time to forgo these constructs and use more modern Fortran conventions in all our examples? > > Any feedback is appreciated > > Barry > > Note: it would continue to be possible to use PETSc in the FORTRAN 77 style, this is just a question about updating the examples. Well - if we don't have examples in the "FORTRAN 77 style" - then that mode won't get tested - and users code [that might use this mode] are likely to break.. [due to changes in includes..] Satish From bsmith at mcs.anl.gov Sat Aug 27 11:25:16 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 27 Aug 2016 11:25:16 -0500 Subject: [petsc-users] Question regarding updating PETSc Fortran examples to embrace post F77 constructs In-Reply-To: References: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> Message-ID: <111F6D4A-D02D-4A10-9EBC-A1203E02F348@mcs.anl.gov> > On Aug 27, 2016, at 9:41 AM, Satish Balay wrote: > > On Fri, 26 Aug 2016, Barry Smith wrote: > >> >> PETSc users, >> >> We've always been very conservative in PETSc to keep almost all our Fortran examples in a format that works with classic FORTRAN 77 constructs: fixed line format, (72 character limit) and no use of ; to separate operations on the same line, etc. >> >> Is it time to forgo these constructs and use more modern Fortran conventions in all our examples? >> >> Any feedback is appreciated >> >> Barry >> >> Note: it would continue to be possible to use PETSc in the FORTRAN 77 style, this is just a question about updating the examples. > > Well - if we don't have examples in the "FORTRAN 77 style" - then that > mode won't get tested - and users code [that might use this mode] are > likely to break.. [due to changes in includes..] Satish, Sure we'd have to keep a couple of F77. BTW: you don't have to approve the pets-announce responses; since they come to us we know what people say, no reason to spam the world with them. Barry > > Satish From s_g at berkeley.edu Sat Aug 27 11:42:54 2016 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Sat, 27 Aug 2016 09:42:54 -0700 Subject: [petsc-users] Question regarding updating PETSc Fortran examples to embrace post F77 constructs In-Reply-To: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> References: <9E9D225A-BDE7-4E3B-A46C-EF38D3693F9D@mcs.anl.gov> Message-ID: Moving forward is fine. One can always retain some fixed line examples along side other non-fixed line examples. -sanjay On 8/26/16 7:54 PM, Barry Smith wrote: > PETSc users, > > We've always been very conservative in PETSc to keep almost all our Fortran examples in a format that works with classic FORTRAN 77 constructs: fixed line format, (72 character limit) and no use of ; to separate operations on the same line, etc. > > Is it time to forgo these constructs and use more modern Fortran conventions in all our examples? 
> > Any feedback is appreciated > > Barry > > Note: it would continue to be possible to use PETSc in the FORTRAN 77 style, this is just a question about updating the examples. > > -- ------------------------------------------------------------------- Sanjay Govindjee, PhD, PE Horace, Dorothy, and Katherine Johnson Endowed Chair in Engineering 779 Davis Hall University of California Berkeley, CA 94720-1710 Voice: +1 510 642 6060 FAX: +1 510 643 5264 s_g at berkeley.edu http://www.ce.berkeley.edu/~sanjay ------------------------------------------------------------------- Books: Engineering Mechanics of Deformable Solids: A Presentation with Exercises http://www.oup.com/us/catalog/general/subject/Physics/MaterialsScience/?view=usa&ci=9780199651641 http://ukcatalogue.oup.com/product/9780199651641.do http://amzn.com/0199651647 Engineering Mechanics 3 (Dynamics) 2nd Edition http://www.springer.com/978-3-642-53711-0 http://amzn.com/3642537111 Engineering Mechanics 3, Supplementary Problems: Dynamics http://www.amzn.com/B00SOXN8JU ----------------------------------------------- From jed at jedbrown.org Sat Aug 27 17:02:49 2016 From: jed at jedbrown.org (Jed Brown) Date: Sat, 27 Aug 2016 16:02:49 -0600 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: References: Message-ID: <874m65na46.fsf@jedbrown.org> Barry Smith writes: > I have added support for this operation in the branch barry/add-matshell-matgetdiagonalblock-fortran > > Barry > > Because this required changing the handling of the method MatGetDiagonalBlock() the addition cannot be back ported to PETSc 3.7.x Huh, it seems okay to me. The change is only matimpl.h (which is private anyway) and doesn't change method numbering so it shouldn't affect the ABI. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From jed at jedbrown.org Mon Aug 29 00:14:49 2016 From: jed at jedbrown.org (Jed Brown) Date: Sun, 28 Aug 2016 23:14:49 -0600 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: <87zinwjgvq.fsf@jedbrown.org> Justin Chang writes: > Redid some of those experiments for 8 and 20 cores and scaled it up to even > larger problems. Attached is the plot. > > Looking at this "dynamic plot" (if you ask me, I honestly think there could > be a better word for this out there), "performance spectrum"? > the lines curve up for the smaller problems, have a "flat line" in the > middle, then slowly tail down as the problem gets bigger. I am > guessing these downward curves have to do with either memory bandwidth > effects or simply the solver requiring more effort to handle larger > problems (or a combination of both). The trailing off to the right typically means suboptimal algorithmic convergence. Steps degrading performance to the right usually means dropping out of cache. > I currently only have access to a small 80 node (20 cores per node) > HPC cluster so obviously I am unable to experiment with 10k cores or > more. > > If our goal is to see how close flat the lines get, we can easily game the > system by scaling the problem until we find the "sweet spot(s)". This performance spectrum plot spans across those sweet spots, by design. I'm not sure how it would be "gamed". > In the weak-scaling and strong-scaling studies there are perfect lines > we can compare to, But those "perfect" lines are all relative to a base case. 
If the base case is inefficient, the scaling looks better. This makes it trivial to show near-perfect scaling despite having a terrible algorithm/implementation -- the scaling gets more perfect as the algorithm gets worse. (This is perhaps the most critical thing to recognize about the classical strong and week scaling plots.) Demonstrating that the base case is "optimal" is not easy, rarely actually the case, and often overlooked by authors and readers alike. Those plots mean nothing without a great deal of context and careful reading of scales. > but there does not seem to be such lines for this type of study even > in the seemingly flat regions. Seems these plots are useful if we > simply compare different solvers/preconditioners/etc or different HPC > platforms. > > Also, the solver count iteration increases with problem size - it went from > 9 KSP iterations for 1,331 dofs to 48 KSP iterations for 3,442,951 dofs. That suboptimal algorithmic scaling is important and one of the most important areas in which you could improve performance. > Algorithmic time-to-solution is not linearly proportional to problem size > so the RHS of the graph is obviously going to have lower N/time rates at > some point - similar to what we observe from weak-scaling. > > Also, the N/time rate seems very similar to the floating-point rate, > although I can see why it's more informative. Not really -- the floating point rate probably doesn't taper off to the right because you did floating point work on every iteration (and you need more iterations for the larger problem sizes that appear to the right). Meanwhile, the N/time metric is relevant to an end user that cares about their science or engineering objective rather than bragging about using flops. > Any thoughts on anything I said or did thus far? Just wanting to make sure > I understand these correctly :) > > On Mon, Aug 22, 2016 at 9:03 PM, Justin Chang wrote: > >> Thanks all. So this issue was one of our ATPESC2015 exam questions, and >> turned some friends into foes. Most eventually fell into the strong-scale >> is harder camp, but some of these "friends" also believed PETSc is *not* >> capable of handling dense matrices and is not portable. Just wanted to hear >> some expert opinions on this :) >> >> Anyway, in one of my applications, I am comparing the performance of some >> VI solvers (i.e., with variable bounds) with that of just standard linear >> solves (i.e., no variable bounds) for 3D advection-diffusion equations in >> highly heterogeneous and anisotropic porous media. The parallel efficiency >> in the strong-sense is roughly the same but the parallel efficiency in the >> weak-sense is significantly worse for VI solvers. I suppose one inference >> that can be made is that VI solvers take longer to solver as the problem >> size grows. And yes solver iteration counts do grow so that has some to do >> with it. >> >> As for these "dynamic range" plots, I tried something like this across 1 >> and 8 MPI processes with the following problem sizes for a 3D anisotropic >> diffusion problem with CG/BoomerAMG: >> >> 1,331 >> 9,261 >> 29,791 >> 68,921 >> 132,651 >> 226,981 >> 357,911 >> 531,441 >> 753,571 >> 1,030,301 >> >> Using a single Intel Xeon E5-2670 compute node for this. Attached is the >> plot, but instead of flat or incline lines, i get concave down curves. If >> my problem size gets too big, the N/time rate decreases, whereas for very >> small problems it increases. 
I am guessing bandwidth limitation have >> something to do with the decrease in performance. In that HPGMG >> presentation you attached the other day, it seems the rate should decrease >> as problem size decreases. Perhaps this study should be done with more MPI >> processes? >> >> >> On Mon, Aug 22, 2016 at 4:14 PM, Karl Rupp wrote: >> >>> Hi Justin, >>> >>> >>> I have seen some people claim that strong-scaling is harder to achieve >>>> than weak scaling >>>> (e.g., https://www.sharcnet.ca/help/index.php/Measuring_Parallel_Sc >>>> aling_Performance) >>>> and generally speaking it makes sense - communication overhead increases >>>> with concurrency. >>>> >>>> However, we know that most PETSc solvers/applications are not only >>>> memory-bandwidth bound, but may not scale as well w.r.t. problem size as >>>> other solvers (e.g., ILU(0) may beat out GAMG for small elliptic >>>> problems but GAMG will eventually beat out ILU(0) for larger problems), >>>> so wouldn't weak-scaling not only be the more interesting but more >>>> difficult performance metric to achieve? Strong-scaling issues arise >>>> mostly from communication overhead but weak-scaling issues may come from >>>> that and also solver/algorithmic scalability w.r.t problem size (e.g., >>>> problem size N takes 10*T seconds to compute but problem size 2*N takes >>>> 50*T seconds to compute). >>>> >>>> In other words, if one were to propose or design a new algorithm/solver >>>> capable of handling large-scale problems, would it be equally if not >>>> more important to show the weak-scaling potential? Because if you really >>>> think about it, a "truly efficient" algorithm will be less likely to >>>> scale in the strong sense but computation time will be close to linearly >>>> proportional to problem size (hence better scaling in the weak-sense). >>>> It seems if I am trying to convince someone that a proposed >>>> computational framework is "high performing" without getting too deep >>>> into performance modeling, a poor parallel efficiency (arising due to >>>> good sequential efficiency) in the strong sense may not look promising. >>>> >>> >>> These are all valid thoughts. Let me add another perspective: If you are >>> only interested in the machine aspects of scaling, you could run for a >>> fixed number of solver iterations. That allows you to focus on the actual >>> computational work done and your results will exclusively reflect the >>> machine's performance. Thus, even though fixing solver iterations and thus >>> not running solvers to convergence is a bad shortcut from the solver point >>> of view, it can be a handy way of eliminating algorithmic fluctuations. >>> (Clearly, this simplistic approach has not only been used, but also >>> abused...) >>> >>> Best regards, >>> Karli >>> >>> >> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From ztdepyahoo at 163.com Mon Aug 29 06:19:37 2016 From: ztdepyahoo at 163.com (=?GBK?B?tqHAz8qm?=) Date: Mon, 29 Aug 2016 19:19:37 +0800 (CST) Subject: [petsc-users] Can i use rank=0 to set matrix value, and solve with all the cpus. Message-ID: <3ae4a7aa.cfd5.156d606af7c.Coremail.ztdepyahoo@163.com> Dear friends: I want to use the root node(MyRank=0) only to set the matrix values, while solve the Ax=b with all the CPU nodes. will it work? Regards -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dave.mayhem23 at gmail.com Mon Aug 29 06:29:05 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 29 Aug 2016 13:29:05 +0200 Subject: [petsc-users] Can i use rank=0 to set matrix value, and solve with all the cpus. In-Reply-To: <3ae4a7aa.cfd5.156d606af7c.Coremail.ztdepyahoo@163.com> References: <3ae4a7aa.cfd5.156d606af7c.Coremail.ztdepyahoo@163.com> Message-ID: On 29 August 2016 at 13:19, ??? wrote: > Dear friends: > I want to use the root node(MyRank=0) only to set the matrix values, > while solve the Ax=b with all the CPU nodes. > will it work? > Yes it will "work", in the sense that the code will run. Note however that this approach, whilst "working", is not the recommended way to assemble a matrix as (i) it is sequential in nature and (ii) will involve a lot of data movement across the network. Thanks, Dave > > Regards > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Mon Aug 29 16:40:15 2016 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Mon, 29 Aug 2016 22:40:15 +0100 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: <874m65na46.fsf@jedbrown.org> References: <874m65na46.fsf@jedbrown.org> Message-ID: HI all Thanks for this Barry, the branch you've made seems to be working perfectly. Do you think this change will make it into a future PETSc release? I can imagine this would be very useful for a number of people. Thanks Steven On Sat, Aug 27, 2016 at 11:02 PM, Jed Brown wrote: > Barry Smith writes: > > > I have added support for this operation in the branch > barry/add-matshell-matgetdiagonalblock-fortran > > > > Barry > > > > Because this required changing the handling of the method > MatGetDiagonalBlock() the addition cannot be back ported to PETSc 3.7.x > > Huh, it seems okay to me. The change is only matimpl.h (which is > private anyway) and doesn't change method numbering so it shouldn't > affect the ABI. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 29 16:45:35 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 29 Aug 2016 16:45:35 -0500 Subject: [petsc-users] MatGetDiagonalBlock and shell matrices In-Reply-To: References: <874m65na46.fsf@jedbrown.org> Message-ID: Yes the branch is now already in master and will be in future releases. Barry > On Aug 29, 2016, at 4:40 PM, Steven Dargaville wrote: > > HI all > > Thanks for this Barry, the branch you've made seems to be working perfectly. Do you think this change will make it into a future PETSc release? I can imagine this would be very useful for a number of people. > > Thanks > Steven > > > On Sat, Aug 27, 2016 at 11:02 PM, Jed Brown wrote: > Barry Smith writes: > > > I have added support for this operation in the branch barry/add-matshell-matgetdiagonalblock-fortran > > > > Barry > > > > Because this required changing the handling of the method MatGetDiagonalBlock() the addition cannot be back ported to PETSc 3.7.x > > Huh, it seems okay to me. The change is only matimpl.h (which is > private anyway) and doesn't change method numbering so it shouldn't > affect the ABI. 
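Returning to the rank-0 assembly question answered above: a minimal sketch of that pattern, using a made-up 1-D Laplacian with no preallocation and made-up helper names. The point to note is that only the MatSetValues() loop is restricted to rank 0; every rank still calls the assembly routines, creates the vectors and participates in the solve.

#include <petscksp.h>

/* Sketch only: an N-by-N 1-D Laplacian assembled entirely from rank 0,
   then solved on all ranks.  This runs, but as noted above every entry
   inserted on rank 0 for a row owned by another process has to be
   communicated during assembly, so it does not scale. */
static PetscErrorCode AssembleFromRankZeroAndSolve(MPI_Comm comm, PetscInt N)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscMPIInt    rank;
  PetscInt       i, ncols, cols[3];
  PetscScalar    vals[3];
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MPI_Comm_rank(comm, &rank);CHKERRQ(ierr);

  ierr = MatCreate(comm, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);   /* no preallocation, fine for a sketch */

  if (!rank) {                        /* only rank 0 inserts values ...      */
    for (i = 0; i < N; i++) {
      ncols = 0;
      if (i > 0)     { cols[ncols] = i - 1; vals[ncols] = -1.0; ncols++; }
      cols[ncols] = i; vals[ncols] = 2.0; ncols++;
      if (i < N - 1) { cols[ncols] = i + 1; vals[ncols] = -1.0; ncols++; }
      ierr = MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  /* ... but every rank takes part in assembly and in the solve */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);
  ierr = KSPCreate(comm, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The recommended alternative remains having each process insert the rows it owns, which avoids the rank-0 bottleneck and the data movement Dave points out.
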
> From mailinglists at xgm.de Tue Aug 30 07:01:46 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 30 Aug 2016 14:01:46 +0200 Subject: [petsc-users] Condition number of matrix Message-ID: Hello, there is a FAQ and a Stackoverflow article about getting the condition number of a petsc matrix: http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber http://scicomp.stackexchange.com/questions/34/how-can-i-estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc Both tell me to add: -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 to my options. I add the line to .petscrc but nothing happens, no additional output at all. I added -ksp_view, so my .petscrc looks like that: -ksp_view -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 The complete output is below, but something I wonder about: GMRES: restart=30, shouldn't that be 1000 And where can I read out the condition number approximation? Thanks, Florian KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-09, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: none linear system matrix = precond matrix: Mat Object: C 1 MPI processes type: seqsbaij rows=14403, cols=14403 total: nonzeros=1044787, allocated nonzeros=1123449 total number of mallocs used during MatSetValues calls =72016 block size is 1 (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | ongoing yes | dt complete no | (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: Iteration #1 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-09, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: none linear system matrix = precond matrix: Mat Object: C 1 MPI processes type: seqsbaij rows=14403, cols=14403 total: nonzeros=1044787, allocated nonzeros=1123449 total number of mallocs used during MatSetValues calls =72016 block size is 1 From bsmith at mcs.anl.gov Tue Aug 30 07:05:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 30 Aug 2016 07:05:14 -0500 Subject: [petsc-users] Condition number of matrix In-Reply-To: References: Message-ID: The format of .petscrc requires each option to be on its own line > -ksp_view > -pc_type none > -ksp_type gmres > -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > On Aug 30, 2016, at 7:01 AM, Florian Lindner wrote: > > Hello, > > there is a FAQ and a Stackoverflow article about getting the condition number of a petsc matrix: > > http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber > http://scicomp.stackexchange.com/questions/34/how-can-i-estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc > > Both tell me to add: > > -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 > > to my options. > > I add the line to .petscrc but nothing happens, no additional output at all. 
I added -ksp_view, so my .petscrc looks > like that: > > -ksp_view > -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 > > The complete output is below, but something I wonder about: > > GMRES: restart=30, shouldn't that be 1000 > > And where can I read out the condition number approximation? > > Thanks, > Florian > > > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: none > linear system matrix = precond matrix: > Mat Object: C 1 MPI processes > type: seqsbaij > rows=14403, cols=14403 > total: nonzeros=1044787, allocated nonzeros=1123449 > total number of mallocs used during MatSetValues calls =72016 > block size is 1 > (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | > ongoing yes | dt complete no | > (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: Iteration #1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: none > linear system matrix = precond matrix: > Mat Object: C 1 MPI processes > type: seqsbaij > rows=14403, cols=14403 > total: nonzeros=1044787, allocated nonzeros=1123449 > total number of mallocs used during MatSetValues calls =72016 > block size is 1 From knepley at gmail.com Tue Aug 30 07:05:36 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 30 Aug 2016 07:05:36 -0500 Subject: [petsc-users] Condition number of matrix In-Reply-To: References: Message-ID: On Tue, Aug 30, 2016 at 7:01 AM, Florian Lindner wrote: > Hello, > > there is a FAQ and a Stackoverflow article about getting the condition > number of a petsc matrix: > > http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber > http://scicomp.stackexchange.com/questions/34/how-can-i- > estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc > > Both tell me to add: > > -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > > to my options. > > I add the line to .petscrc but nothing happens, no additional output at > all. I added -ksp_view, so my .petscrc looks > like that: > Each option must be on its own line .petscrc Thanks, Matt > -ksp_view > -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > > The complete output is below, but something I wonder about: > > GMRES: restart=30, shouldn't that be 1000 > > And where can I read out the condition number approximation? > > Thanks, > Florian > > > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. 
> left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: none > linear system matrix = precond matrix: > Mat Object: C 1 MPI processes > type: seqsbaij > rows=14403, cols=14403 > total: nonzeros=1044787, allocated nonzeros=1123449 > total number of mallocs used during MatSetValues calls =72016 > block size is 1 > (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: it 1 > of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | > ongoing yes | dt complete no | > (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: > Iteration #1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: none > linear system matrix = precond matrix: > Mat Object: C 1 MPI processes > type: seqsbaij > rows=14403, cols=14403 > total: nonzeros=1044787, allocated nonzeros=1123449 > total number of mallocs used during MatSetValues calls =72016 > block size is 1 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mailinglists at xgm.de Tue Aug 30 09:03:23 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 30 Aug 2016 16:03:23 +0200 Subject: [petsc-users] Condition number of matrix In-Reply-To: References: Message-ID: Hi, Am 30.08.2016 um 14:05 schrieb Barry Smith: > > The format of .petscrc requires each option to be on its own line > >> -ksp_view >> -pc_type none >> -ksp_type gmres >> -ksp_monitor_singular_value >> -ksp_gmres_restart 1000 Oh man, didn't know that. Sorry! Is using a hash # ok for comments in .petscrc? I added the option accordingly: -ksp_view -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 petsc outputs a line like: 550 KSP Residual norm 1.374922291162e-07 % max 1.842011038215e+03 min 6.509297234157e-04 max/min 2.829815526858e+06 for each iteration. Sorry about my mathematical illerateness, but where can I see the condition number of the matrix? Thanks, Florian > > > >> On Aug 30, 2016, at 7:01 AM, Florian Lindner wrote: >> >> Hello, >> >> there is a FAQ and a Stackoverflow article about getting the condition number of a petsc matrix: >> >> http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber >> http://scicomp.stackexchange.com/questions/34/how-can-i-estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc >> >> Both tell me to add: >> >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 >> >> to my options. >> >> I add the line to .petscrc but nothing happens, no additional output at all. I added -ksp_view, so my .petscrc looks >> like that: >> >> -ksp_view >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 >> >> The complete output is below, but something I wonder about: >> >> GMRES: restart=30, shouldn't that be 1000 >> >> And where can I read out the condition number approximation? 
>> >> Thanks, >> Florian >> >> >> KSP Object: 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000 >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. >> left preconditioning >> using nonzero initial guess >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: none >> linear system matrix = precond matrix: >> Mat Object: C 1 MPI processes >> type: seqsbaij >> rows=14403, cols=14403 >> total: nonzeros=1044787, allocated nonzeros=1123449 >> total number of mallocs used during MatSetValues calls =72016 >> block size is 1 >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | >> ongoing yes | dt complete no | >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: Iteration #1 >> KSP Object: 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000 >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. >> left preconditioning >> using nonzero initial guess >> using PRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: none >> linear system matrix = precond matrix: >> Mat Object: C 1 MPI processes >> type: seqsbaij >> rows=14403, cols=14403 >> total: nonzeros=1044787, allocated nonzeros=1123449 >> total number of mallocs used during MatSetValues calls =72016 >> block size is 1 > From knepley at gmail.com Tue Aug 30 09:10:39 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 30 Aug 2016 09:10:39 -0500 Subject: [petsc-users] Condition number of matrix In-Reply-To: References: Message-ID: On Tue, Aug 30, 2016 at 9:03 AM, Florian Lindner wrote: > Hi, > > Am 30.08.2016 um 14:05 schrieb Barry Smith: > > > > The format of .petscrc requires each option to be on its own line > > > >> -ksp_view > >> -pc_type none > >> -ksp_type gmres > >> -ksp_monitor_singular_value > >> -ksp_gmres_restart 1000 > > Oh man, didn't know that. Sorry! Is using a hash # ok for comments in > .petscrc? > > I added the option accordingly: > > -ksp_view > -pc_type none > -ksp_type gmres > -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > > petsc outputs a line like: > > 550 KSP Residual norm 1.374922291162e-07 % max 1.842011038215e+03 min > 6.509297234157e-04 max/min 2.829815526858e+06 > > for each iteration. Sorry about my mathematical illerateness, but where > can I see the condition number of the matrix? > Its max/min since this means max singular value/min singular value. Thanks, Matt > Thanks, > Florian > > > > > > > > >> On Aug 30, 2016, at 7:01 AM, Florian Lindner > wrote: > >> > >> Hello, > >> > >> there is a FAQ and a Stackoverflow article about getting the condition > number of a petsc matrix: > >> > >> http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber > >> http://scicomp.stackexchange.com/questions/34/how-can-i- > estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc > >> > >> Both tell me to add: > >> > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > >> > >> to my options. > >> > >> I add the line to .petscrc but nothing happens, no additional output at > all. 
I added -ksp_view, so my .petscrc looks > >> like that: > >> > >> -ksp_view > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > >> > >> The complete output is below, but something I wonder about: > >> > >> GMRES: restart=30, shouldn't that be 1000 > >> > >> And where can I read out the condition number approximation? > >> > >> Thanks, > >> Florian > >> > >> > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: > it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | > >> ongoing yes | dt complete no | > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: > Iteration #1 > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From song.gao2 at mail.mcgill.ca Tue Aug 30 09:10:54 2016 From: song.gao2 at mail.mcgill.ca (Song Gao) Date: Tue, 30 Aug 2016 10:10:54 -0400 Subject: [petsc-users] Condition number of matrix In-Reply-To: References: Message-ID: condition number is max_eigenvalue/min_eigenvalue. so I guess it is max/min 2.829815526858e+06? 2016-08-30 10:03 GMT-04:00 Florian Lindner : > Hi, > > Am 30.08.2016 um 14:05 schrieb Barry Smith: > > > > The format of .petscrc requires each option to be on its own line > > > >> -ksp_view > >> -pc_type none > >> -ksp_type gmres > >> -ksp_monitor_singular_value > >> -ksp_gmres_restart 1000 > > Oh man, didn't know that. Sorry! Is using a hash # ok for comments in > .petscrc? > > I added the option accordingly: > > -ksp_view > -pc_type none > -ksp_type gmres > -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > > petsc outputs a line like: > > 550 KSP Residual norm 1.374922291162e-07 % max 1.842011038215e+03 min > 6.509297234157e-04 max/min 2.829815526858e+06 > > for each iteration. Sorry about my mathematical illerateness, but where > can I see the condition number of the matrix? 
> > Thanks, > Florian > > > > > > > > >> On Aug 30, 2016, at 7:01 AM, Florian Lindner > wrote: > >> > >> Hello, > >> > >> there is a FAQ and a Stackoverflow article about getting the condition > number of a petsc matrix: > >> > >> http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber > >> http://scicomp.stackexchange.com/questions/34/how-can-i- > estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc > >> > >> Both tell me to add: > >> > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > >> > >> to my options. > >> > >> I add the line to .petscrc but nothing happens, no additional output at > all. I added -ksp_view, so my .petscrc looks > >> like that: > >> > >> -ksp_view > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > >> > >> The complete output is below, but something I wonder about: > >> > >> GMRES: restart=30, shouldn't that be 1000 > >> > >> And where can I read out the condition number approximation? > >> > >> Thanks, > >> Florian > >> > >> > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: > it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | > >> ongoing yes | dt complete no | > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: > Iteration #1 > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue Aug 30 09:11:12 2016 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 30 Aug 2016 09:11:12 -0500 Subject: [petsc-users] Condition number of matrix In-Reply-To: References: Message-ID: max/min 2.829815526858e+06 is an estimate to your condition number. Hong On Tue, Aug 30, 2016 at 9:03 AM, Florian Lindner wrote: > Hi, > > Am 30.08.2016 um 14:05 schrieb Barry Smith: > > > > The format of .petscrc requires each option to be on its own line > > > >> -ksp_view > >> -pc_type none > >> -ksp_type gmres > >> -ksp_monitor_singular_value > >> -ksp_gmres_restart 1000 > > Oh man, didn't know that. Sorry! 
Is using a hash # ok for comments in > .petscrc? > > I added the option accordingly: > > -ksp_view > -pc_type none > -ksp_type gmres > -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > > petsc outputs a line like: > > 550 KSP Residual norm 1.374922291162e-07 % max 1.842011038215e+03 min > 6.509297234157e-04 max/min 2.829815526858e+06 > > for each iteration. Sorry about my mathematical illerateness, but where > can I see the condition number of the matrix? > > Thanks, > Florian > > > > > > > > >> On Aug 30, 2016, at 7:01 AM, Florian Lindner > wrote: > >> > >> Hello, > >> > >> there is a FAQ and a Stackoverflow article about getting the condition > number of a petsc matrix: > >> > >> http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber > >> http://scicomp.stackexchange.com/questions/34/how-can-i- > estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc > >> > >> Both tell me to add: > >> > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > >> > >> to my options. > >> > >> I add the line to .petscrc but nothing happens, no additional output at > all. I added -ksp_view, so my .petscrc looks > >> like that: > >> > >> -ksp_view > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > >> > >> The complete output is below, but something I wonder about: > >> > >> GMRES: restart=30, shouldn't that be 1000 > >> > >> And where can I read out the condition number approximation? > >> > >> Thanks, > >> Florian > >> > >> > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: > it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | > >> ongoing yes | dt complete no | > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: > Iteration #1 > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > > > -------------- next part -------------- An HTML attachment was scrubbed... 
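For reference, the same max/min number printed by -ksp_monitor_singular_value can also be read off programmatically once the solve has finished; the estimate generally improves as GMRES enlarges its Krylov space, so the value from the final iteration is the one to quote. For the run above that is 1.842011e+03 / 6.509297e-04, i.e. roughly 2.8e+06. A minimal C sketch, assuming ksp, b and x are the user's existing objects:

  PetscErrorCode ierr;
  PetscReal      smax, smin;

  ierr = KSPSetComputeSingularValues(ksp, PETSC_TRUE);CHKERRQ(ierr); /* must be set before KSPSolve() */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPComputeExtremeSingularValues(ksp, &smax, &smin);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "condition number estimate: %g\n", (double)(smax/smin));CHKERRQ(ierr);
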
URL: From mailinglists at xgm.de Tue Aug 30 09:22:05 2016 From: mailinglists at xgm.de (Florian Lindner) Date: Tue, 30 Aug 2016 16:22:05 +0200 Subject: [petsc-users] Condition number of matrix In-Reply-To: References: Message-ID: Thanks everybody, just to be sure, it's max/min of the last iteration? Florian Am 30.08.2016 um 16:10 schrieb Matthew Knepley: > On Tue, Aug 30, 2016 at 9:03 AM, Florian Lindner > wrote: > > Hi, > > Am 30.08.2016 um 14:05 schrieb Barry Smith: > > > > The format of .petscrc requires each option to be on its own line > > > >> -ksp_view > >> -pc_type none > >> -ksp_type gmres > >> -ksp_monitor_singular_value > >> -ksp_gmres_restart 1000 > > Oh man, didn't know that. Sorry! Is using a hash # ok for comments in .petscrc? > > I added the option accordingly: > > -ksp_view > -pc_type none > -ksp_type gmres > -ksp_monitor_singular_value > -ksp_gmres_restart 1000 > > petsc outputs a line like: > > 550 KSP Residual norm 1.374922291162e-07 % max 1.842011038215e+03 min 6.509297234157e-04 max/min 2.829815526858e+06 > > for each iteration. Sorry about my mathematical illerateness, but where can I see the condition number of the matrix? > > > Its max/min since this means max singular value/min singular value. > > Thanks, > > Matt > > > Thanks, > Florian > > > > > > > > >> On Aug 30, 2016, at 7:01 AM, Florian Lindner > wrote: > >> > >> Hello, > >> > >> there is a FAQ and a Stackoverflow article about getting the condition number of a petsc matrix: > >> > >> http://www.mcs.anl.gov/petsc/documentation/faq.html#conditionnumber > > >> > http://scicomp.stackexchange.com/questions/34/how-can-i-estimate-the-condition-number-of-a-large-sparse-matrix-using-petsc > > >> > >> Both tell me to add: > >> > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 > >> > >> to my options. > >> > >> I add the line to .petscrc but nothing happens, no additional output at all. I added -ksp_view, so my .petscrc looks > >> like that: > >> > >> -ksp_view > >> -pc_type none -ksp_type gmres -ksp_monitor_singular_value -ksp_gmres_restart 1000 > >> > >> The complete output is below, but something I wonder about: > >> > >> GMRES: restart=30, shouldn't that be 1000 > >> > >> And where can I read out the condition number approximation? > >> > >> Thanks, > >> Florian > >> > >> > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. 
> >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:395 in initialize: it 1 of 1 | dt# 1 | t 0 of 1 | dt 1 | max dt 1 | > >> ongoing yes | dt complete no | > >> (0) 13:58:35 [precice::impl::SolverInterfaceImpl]:446 in advance: Iteration #1 > >> KSP Object: 1 MPI processes > >> type: gmres > >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > >> GMRES: happy breakdown tolerance 1e-30 > >> maximum iterations=10000 > >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > >> left preconditioning > >> using nonzero initial guess > >> using PRECONDITIONED norm type for convergence test > >> PC Object: 1 MPI processes > >> type: none > >> linear system matrix = precond matrix: > >> Mat Object: C 1 MPI processes > >> type: seqsbaij > >> rows=14403, cols=14403 > >> total: nonzeros=1044787, allocated nonzeros=1123449 > >> total number of mallocs used during MatSetValues calls =72016 > >> block size is 1 > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any > results to which their experiments lead. > -- Norbert Wiener From mfadams at lbl.gov Tue Aug 30 09:36:23 2016 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 30 Aug 2016 10:36:23 -0400 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: On Tue, Aug 23, 2016 at 4:33 AM, Justin Chang wrote: > Redid some of those experiments for 8 and 20 cores and scaled it up to > even larger problems. Attached is the plot. > > Looking at this "dynamic plot" (if you ask me, I honestly think there > could be a better word for this out there), the lines curve up for the > smaller problems, have a "flat line" in the middle, then slowly tail down > as the problem gets bigger. I am guessing these downward curves have to do > with either memory bandwidth effects or simply the solver requiring more > effort to handle larger problems (or a combination of both). > I would guess it is the latter. It is hard to get "rollover" to the right. You could get it on KNL (cache configuration of HBM) when you spill out of HBM. Personally, if you are you going to go into this much detail (eg, more than just one plot) I would show a plot of iteration count vs problem size, and be done with it, and then fix the iteration count for the weak scaling and dynamic range plot (I agree we could use a better name). > I currently only have access to a small 80 node (20 cores per node) HPC > cluster so obviously I am unable to experiment with 10k cores or more. > > If our goal is to see how close flat the lines get, we can easily game the > system by scaling the problem until we find the "sweet spot(s)". In the > weak-scaling and strong-scaling studies there are perfect lines we can > compare to, but there does not seem to be such lines for this type of study > even in the seemingly flat regions. Seems these plots are useful if we > simply compare different solvers/preconditioners/etc or different HPC > platforms. 
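Spelled out, the quantity on the vertical axis of these "dynamic range" plots is just the solution rate at a fixed core count, optionally scaled by the iteration count to separate algorithmic growth from hardware effects (the dofs*iterations/time variant discussed later in the thread):

  R(N) = \frac{N}{T(N)}, \qquad R_{\mathrm{alg}}(N) = \frac{N \, k(N)}{T(N)}

where N is the number of degrees of freedom, T(N) the time to solution on the fixed process count, and k(N) the Krylov iteration count.
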
> > Also, the solver count iteration increases with problem size - it went > from 9 KSP iterations for 1,331 dofs to 48 KSP iterations for 3,442,951 > dofs. Algorithmic time-to-solution is not linearly proportional to problem > size so the RHS of the graph is obviously going to have lower N/time rates > at some point - similar to what we observe from weak-scaling. > > Also, the N/time rate seems very similar to the floating-point rate, > although I can see why it's more informative. > > Any thoughts on anything I said or did thus far? Just wanting to make sure > I understand these correctly :) > > On Mon, Aug 22, 2016 at 9:03 PM, Justin Chang wrote: > >> Thanks all. So this issue was one of our ATPESC2015 exam questions, and >> turned some friends into foes. Most eventually fell into the strong-scale >> is harder camp, but some of these "friends" also believed PETSc is *not* >> capable of handling dense matrices and is not portable. Just wanted to hear >> some expert opinions on this :) >> >> Anyway, in one of my applications, I am comparing the performance of some >> VI solvers (i.e., with variable bounds) with that of just standard linear >> solves (i.e., no variable bounds) for 3D advection-diffusion equations in >> highly heterogeneous and anisotropic porous media. The parallel efficiency >> in the strong-sense is roughly the same but the parallel efficiency in the >> weak-sense is significantly worse for VI solvers. I suppose one inference >> that can be made is that VI solvers take longer to solver as the problem >> size grows. And yes solver iteration counts do grow so that has some to do >> with it. >> >> As for these "dynamic range" plots, I tried something like this across 1 >> and 8 MPI processes with the following problem sizes for a 3D anisotropic >> diffusion problem with CG/BoomerAMG: >> >> 1,331 >> 9,261 >> 29,791 >> 68,921 >> 132,651 >> 226,981 >> 357,911 >> 531,441 >> 753,571 >> 1,030,301 >> >> Using a single Intel Xeon E5-2670 compute node for this. Attached is the >> plot, but instead of flat or incline lines, i get concave down curves. If >> my problem size gets too big, the N/time rate decreases, whereas for very >> small problems it increases. I am guessing bandwidth limitation have >> something to do with the decrease in performance. In that HPGMG >> presentation you attached the other day, it seems the rate should decrease >> as problem size decreases. Perhaps this study should be done with more MPI >> processes? >> >> >> On Mon, Aug 22, 2016 at 4:14 PM, Karl Rupp wrote: >> >>> Hi Justin, >>> >>> >>> I have seen some people claim that strong-scaling is harder to achieve >>>> than weak scaling >>>> (e.g., https://www.sharcnet.ca/help/index.php/Measuring_Parallel_Sc >>>> aling_Performance) >>>> and generally speaking it makes sense - communication overhead increases >>>> with concurrency. >>>> >>>> However, we know that most PETSc solvers/applications are not only >>>> memory-bandwidth bound, but may not scale as well w.r.t. problem size as >>>> other solvers (e.g., ILU(0) may beat out GAMG for small elliptic >>>> problems but GAMG will eventually beat out ILU(0) for larger problems), >>>> so wouldn't weak-scaling not only be the more interesting but more >>>> difficult performance metric to achieve? 
Strong-scaling issues arise >>>> mostly from communication overhead but weak-scaling issues may come from >>>> that and also solver/algorithmic scalability w.r.t problem size (e.g., >>>> problem size N takes 10*T seconds to compute but problem size 2*N takes >>>> 50*T seconds to compute). >>>> >>>> In other words, if one were to propose or design a new algorithm/solver >>>> capable of handling large-scale problems, would it be equally if not >>>> more important to show the weak-scaling potential? Because if you really >>>> think about it, a "truly efficient" algorithm will be less likely to >>>> scale in the strong sense but computation time will be close to linearly >>>> proportional to problem size (hence better scaling in the weak-sense). >>>> It seems if I am trying to convince someone that a proposed >>>> computational framework is "high performing" without getting too deep >>>> into performance modeling, a poor parallel efficiency (arising due to >>>> good sequential efficiency) in the strong sense may not look promising. >>>> >>> >>> These are all valid thoughts. Let me add another perspective: If you are >>> only interested in the machine aspects of scaling, you could run for a >>> fixed number of solver iterations. That allows you to focus on the actual >>> computational work done and your results will exclusively reflect the >>> machine's performance. Thus, even though fixing solver iterations and thus >>> not running solvers to convergence is a bad shortcut from the solver point >>> of view, it can be a handy way of eliminating algorithmic fluctuations. >>> (Clearly, this simplistic approach has not only been used, but also >>> abused...) >>> >>> Best regards, >>> Karli >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 30 13:55:13 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 30 Aug 2016 12:55:13 -0600 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: Message-ID: <8737lmccj2.fsf@jedbrown.org> Mark Adams writes: > I would guess it is the latter. In this case, definitely. > It is hard to get "rollover" to the right. You could get it on KNL > (cache configuration of HBM) when you spill out of HBM. Yes, but the same occurs if you start repeatedly spilling from some level of cache, which can happen even if the overall data structure is much larger than cache. Not all algorithms have the flexibility to choose tile sizes independently from problem size and specification; it's easy to forget that this luxury is not universal when focusing on dense linear algebra, for example. > Personally, if you are you going to go into this much detail (eg, more than > just one plot) I would show a plot of iteration count vs problem size, and > be done with it, and then fix the iteration count for the weak scaling and > dynamic range plot (I agree we could use a better name). Alternatively, plot the performance spectrum (dynamic range) for the end-to-end solve and per iteration. The end user ultimately doesn't care about the cost per iteration (and it's meaningless when comparing to an algorithm that converges differently), so I'd prefer that the spectrum for the end-to-end application always be shown. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From mono at dtu.dk Tue Aug 30 14:43:11 2016 From: mono at dtu.dk (=?Windows-1252?Q?Morten_Nobel-J=F8rgensen?=) Date: Tue, 30 Aug 2016 19:43:11 +0000 Subject: [petsc-users] Distribution of DMPlex for FEM In-Reply-To: References: , Message-ID: <6B03D347796DED499A2696FC095CE81A05B3738E@ait-pex02mbx04.win.dtu.dk> We have now hit a related problem. If we change the dof from 1 to 3 we the following error message. I'm using the next branch (pulled from git today). mpiexec -np 2 ./ex18k [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: Row too large: row 36 max 35 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Development GIT revision: v3.7.3-1857-g5b40c63 GIT Date: 2016-08-29 22:13:25 -0500 [1]PETSC ERROR: ./ex18k on a arch-linux2-c-debug named morten-VirtualBox by morten Tue Aug 30 21:31:36 2016 [1]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich --download-netcdf --download-hdf5 --download-exodusii [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 556 in /home/morten/petsc/src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: #2 MatSetValues() line 1239 in /home/morten/petsc/src/mat/interface/matrix.c [1]PETSC ERROR: #3 MatSetValuesLocal() line 2102 in /home/morten/petsc/src/mat/interface/matrix.c [1]PETSC ERROR: #4 CreateGlobalStiffnessMatrix() line 82 in /home/morten/topop_in_petsc_unstruct/ex18k.cc [1]PETSC ERROR: #5 main() line 113 in /home/morten/topop_in_petsc_unstruct/ex18k.cc [1]PETSC ERROR: No PETSc Option Table entries We are not fully sure if this is related to the way we use DMPlex or a bug in our code. Any help or tips will be appreciated :) Kind regards, Morten ________________________________ From: Morten Nobel-J?rgensen Sent: Thursday, July 14, 2016 9:45 AM To: Matthew Knepley Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Distribution of DMPlex for FEM Hi Matthew Thanks for your answer and your fix. It works :))) Kind regards, Morten Fra: Matthew Knepley > Dato: Thursday 14 July 2016 at 00:03 Til: Morten Nobel-Joergensen > Cc: "petsc-users at mcs.anl.gov" > Emne: Re: [petsc-users] Distribution of DMPlex for FEM On Wed, Jul 13, 2016 at 3:57 AM, Morten Nobel-J?rgensen > wrote: I?m having problems distributing a simple FEM model using DMPlex. For test case I use 1x1x2 hex box elements (/cells) with 12 vertices. Each vertex has one DOF. When I distribute the system to two processors, each get a single element and the local vector has the size 8 (one DOF for each vertex of a hex box) as expected. My problem is that when I manually assemble the global stiffness matrix (a 12x12 matrix) it seems like my ghost values are ignored. I?m sure that I?m missing something obvious but cannot see what it is. In the attached example, I?m assembling the global stiffness matrix using a simple local stiffness matrix of ones. This makes it very easy to see if the matrix is assembled correctly. If I run it on one process, then global stiffness matrix consists of 0?s, 1?s and 2?s and its trace is 16.0. But if I run it distributed on on two, then it consists only of 0's and 1?s and its trace is 12.0. 
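One thing worth double-checking for the dof 1 -> 3 case above: the PetscSection attached to the DM has to advertise 3 dofs on every vertex, and the matrix should then be obtained from the DM so that its sizes and local-to-global mapping reflect the enlarged layout; a stale 1-dof layout is a typical way to end up with a "Row too large" error from MatSetValuesLocal(). A rough C sketch of the section setup, with dm, s and A as placeholder names (the attach call is DMSetDefaultSection() in the 3.7-era sources, DMSetLocalSection() in later releases):

  PetscErrorCode ierr;
  PetscSection   s;
  Mat            A;
  PetscInt       vStart, vEnd, v;

  ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);CHKERRQ(ierr);   /* depth 0 = vertices */
  ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s);CHKERRQ(ierr);
  ierr = PetscSectionSetNumFields(s, 1);CHKERRQ(ierr);
  ierr = PetscSectionSetFieldComponents(s, 0, 3);CHKERRQ(ierr);
  ierr = PetscSectionSetChart(s, vStart, vEnd);CHKERRQ(ierr);
  for (v = vStart; v < vEnd; ++v) {
    ierr = PetscSectionSetDof(s, v, 3);CHKERRQ(ierr);                  /* 3 dofs per vertex */
    ierr = PetscSectionSetFieldDof(s, v, 0, 3);CHKERRQ(ierr);
  }
  ierr = PetscSectionSetUp(s);CHKERRQ(ierr);
  ierr = DMSetDefaultSection(dm, s);CHKERRQ(ierr);                     /* DMSetLocalSection() in newer PETSc */
  ierr = PetscSectionDestroy(&s);CHKERRQ(ierr);
  ierr = DMCreateMatrix(dm, &A);CHKERRQ(ierr);                         /* sizes and l2g map now match 3 dofs */
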
I hope that somebody can spot my mistake and help me in the right direction :) This is my fault, and Stefano Zampini had already tried to tell me this was broken. I normally use DMPlexMatSetClosure(), which handles global indices correctly. I have fixed this in the branch knepley/fix-plex-l2g which is also merged to 'next'. I am attaching a version of your sample where all objects are freed correctly. Let me know if that works for you. Thanks, Matt Kind regards, Morten -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex18k.cc Type: application/octet-stream Size: 4631 bytes Desc: ex18k.cc URL: From jychang48 at gmail.com Tue Aug 30 19:01:48 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 30 Aug 2016 19:01:48 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: <8737lmccj2.fsf@jedbrown.org> References: <8737lmccj2.fsf@jedbrown.org> Message-ID: Thanks everyone. I still think there is a even better phrase for this, like, static scaling? Because unlike strong/weak scaling, concurrency is fixed (hence "static") and we only scale the problem, so this is a mix between strong and weak scaling. ?\_(?)_/? Anyway, what I really wanted to say is, it's good to know that these "dynamic range/performance spectrum/static scaling" plots are designed to go past the sweet spots. I also agree that it would be interesting to see a time vs dofs*iterations/time plot. Would it then also be useful to look at the step to setting up the preconditioner? Justin On Tue, Aug 30, 2016 at 1:55 PM, Jed Brown wrote: > Mark Adams writes: > > I would guess it is the latter. > > In this case, definitely. > > > It is hard to get "rollover" to the right. You could get it on KNL > > (cache configuration of HBM) when you spill out of HBM. > > Yes, but the same occurs if you start repeatedly spilling from some > level of cache, which can happen even if the overall data structure is > much larger than cache. Not all algorithms have the flexibility to > choose tile sizes independently from problem size and specification; > it's easy to forget that this luxury is not universal when focusing on > dense linear algebra, for example. > > > Personally, if you are you going to go into this much detail (eg, more > than > > just one plot) I would show a plot of iteration count vs problem size, > and > > be done with it, and then fix the iteration count for the weak scaling > and > > dynamic range plot (I agree we could use a better name). > > Alternatively, plot the performance spectrum (dynamic range) for the > end-to-end solve and per iteration. The end user ultimately doesn't > care about the cost per iteration (and it's meaningless when comparing > to an algorithm that converges differently), so I'd prefer that the > spectrum for the end-to-end application always be shown. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mfadams at lbl.gov Tue Aug 30 19:26:14 2016 From: mfadams at lbl.gov (Mark Adams) Date: Tue, 30 Aug 2016 20:26:14 -0400 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: <8737lmccj2.fsf@jedbrown.org> Message-ID: > > > Anyway, what I really wanted to say is, it's good to know that these > "dynamic range/performance spectrum/static scaling" plots are designed to > go past the sweet spots. I also agree that it would be interesting to see a > time vs dofs*iterations/time plot. Would it then also be useful to look at > the step to setting up the preconditioner? > > Yes, I generally split up timing between "mesh setup" (symbolic factorization of LU), "matrix setup" (eg, factorizations), and solve time. The degree of amortization that you get for the two setup phases depends on your problem and so it is useful to separate them. > Justin > > On Tue, Aug 30, 2016 at 1:55 PM, Jed Brown wrote: > >> Mark Adams writes: >> > I would guess it is the latter. >> >> In this case, definitely. >> >> > It is hard to get "rollover" to the right. You could get it on KNL >> > (cache configuration of HBM) when you spill out of HBM. >> >> Yes, but the same occurs if you start repeatedly spilling from some >> level of cache, which can happen even if the overall data structure is >> much larger than cache. Not all algorithms have the flexibility to >> choose tile sizes independently from problem size and specification; >> it's easy to forget that this luxury is not universal when focusing on >> dense linear algebra, for example. >> >> > Personally, if you are you going to go into this much detail (eg, more >> than >> > just one plot) I would show a plot of iteration count vs problem size, >> and >> > be done with it, and then fix the iteration count for the weak scaling >> and >> > dynamic range plot (I agree we could use a better name). >> >> Alternatively, plot the performance spectrum (dynamic range) for the >> end-to-end solve and per iteration. The end user ultimately doesn't >> care about the cost per iteration (and it's meaningless when comparing >> to an algorithm that converges differently), so I'd prefer that the >> spectrum for the end-to-end application always be shown. >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 30 23:14:17 2016 From: jed at jedbrown.org (Jed Brown) Date: Tue, 30 Aug 2016 22:14:17 -0600 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: <8737lmccj2.fsf@jedbrown.org> Message-ID: <87d1kpbmna.fsf@jedbrown.org> Mark Adams writes: >> >> >> Anyway, what I really wanted to say is, it's good to know that these >> "dynamic range/performance spectrum/static scaling" plots are designed to >> go past the sweet spots. I also agree that it would be interesting to see a >> time vs dofs*iterations/time plot. Would it then also be useful to look at >> the step to setting up the preconditioner? >> >> > Yes, I generally split up timing between "mesh setup" (symbolic > factorization of LU), "matrix setup" (eg, factorizations), and solve time. > The degree of amortization that you get for the two setup phases depends on > your problem and so it is useful to separate them. Right, there is nothing wrong with splitting up the phases, but if you never show a spectrum for the total, then I will be suspicious. And if you only show "per iteration" instead of for a complete solve, then I will assume that you're only doing that because convergence is unusably slow. 
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 800 bytes Desc: not available URL: From jychang48 at gmail.com Wed Aug 31 02:01:50 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 31 Aug 2016 02:01:50 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: <87d1kpbmna.fsf@jedbrown.org> References: <8737lmccj2.fsf@jedbrown.org> <87d1kpbmna.fsf@jedbrown.org> Message-ID: Attached is the -log_view output (from firedrake). Event Stage 1: Linear_solver is where I assemble and solve the linear system of equations. I am using the HYPRE BoomerAMG preconditioner so log_view cannot "see into" the exact steps, but based on what it can see, how do I distinguish between these various setup and timing phases? For example, when I look at these lines: PCSetUp 1 1.0 2.2858e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 11 0 0 0 0 0 PCApply 38 1.0 1.4102e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 56 0 0 0 0 66 0 0 0 0 0 KSPSetUp 1 1.0 9.9111e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 1.7529e+01 1.0 2.44e+09 1.0 0.0e+00 0.0e+00 0.0e+00 70 7 0 0 0 82 7 0 0 0 139 SNESSolve 1 1.0 2.1056e+01 1.0 3.75e+10 1.0 0.0e+00 0.0e+00 0.0e+00 84100 0 0 0 99100 0 0 0 1781 SNESFunctionEval 1 1.0 1.0763e+00 1.0 1.07e+10 1.0 0.0e+00 0.0e+00 0.0e+00 4 29 0 0 0 5 29 0 0 0 9954 SNESJacobianEval 1 1.0 2.4495e+00 1.0 2.43e+10 1.0 0.0e+00 0.0e+00 0.0e+00 10 65 0 0 0 12 65 0 0 0 9937 So how do I break down "mesh setup", "matrix setup", and "solve time" phases? I am guessing "PCSetUp" has to do with one of the first two phases, but how would I categorize the rest of the events? I see that HYPRE doesn't have as much information as the other PCs like GAMG and ML but can one still breakdown the timing phases through log_view alone? Thanks, Justin On Tue, Aug 30, 2016 at 11:14 PM, Jed Brown wrote: > Mark Adams writes: > > >> > >> > >> Anyway, what I really wanted to say is, it's good to know that these > >> "dynamic range/performance spectrum/static scaling" plots are designed > to > >> go past the sweet spots. I also agree that it would be interesting to > see a > >> time vs dofs*iterations/time plot. Would it then also be useful to look > at > >> the step to setting up the preconditioner? > >> > >> > > Yes, I generally split up timing between "mesh setup" (symbolic > > factorization of LU), "matrix setup" (eg, factorizations), and solve > time. > > The degree of amortization that you get for the two setup phases depends > on > > your problem and so it is useful to separate them. > > Right, there is nothing wrong with splitting up the phases, but if you > never show a spectrum for the total, then I will be suspicious. And if > you only show "per iteration" instead of for a complete solve, then I > will assume that you're only doing that because convergence is unusably > slow. > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- Residual norms for linear_ solve. 
0 KSP Residual norm 5.261660052036e+02 1 KSP Residual norm 1.356995663739e+02 2 KSP Residual norm 4.098866223191e+01 3 KSP Residual norm 1.600475709119e+01 4 KSP Residual norm 6.956667251063e+00 5 KSP Residual norm 3.861942754258e+00 6 KSP Residual norm 2.331981130299e+00 7 KSP Residual norm 1.404876311943e+00 8 KSP Residual norm 8.215556397889e-01 9 KSP Residual norm 5.226439657305e-01 10 KSP Residual norm 3.421520551962e-01 11 KSP Residual norm 2.382992002722e-01 12 KSP Residual norm 1.743249670147e-01 13 KSP Residual norm 1.277911689618e-01 14 KSP Residual norm 9.453802371730e-02 15 KSP Residual norm 7.022732618304e-02 16 KSP Residual norm 5.276835142527e-02 17 KSP Residual norm 3.966717849679e-02 18 KSP Residual norm 2.987708356527e-02 19 KSP Residual norm 2.221046390150e-02 20 KSP Residual norm 1.631262945106e-02 21 KSP Residual norm 1.188030506469e-02 22 KSP Residual norm 8.655984108945e-03 23 KSP Residual norm 6.239072936196e-03 24 KSP Residual norm 4.455419528387e-03 25 KSP Residual norm 3.235023376588e-03 26 KSP Residual norm 2.345588803418e-03 27 KSP Residual norm 1.668600898579e-03 28 KSP Residual norm 1.180578845647e-03 29 KSP Residual norm 8.327223711005e-04 30 KSP Residual norm 5.853054571413e-04 31 KSP Residual norm 4.038722556707e-04 32 KSP Residual norm 2.731786184181e-04 33 KSP Residual norm 1.853188978548e-04 34 KSP Residual norm 1.277834040044e-04 35 KSP Residual norm 8.853670330190e-05 36 KSP Residual norm 6.151569062192e-05 37 KSP Residual norm 4.247283089736e-05 Linear linear_ solve converged due to CONVERGED_RTOL iterations 37 Wall-clock time: 2.126e+01 seconds ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- 3D_ex1.py on a arch-python-linux-x86_64 named pacotaco-xps with 1 processor, by justin Tue Aug 30 23:34:47 2016 Using Petsc Development GIT revision: v3.4.2-13575-gc28f300 GIT Date: 2016-07-10 20:22:41 -0500 Max Max/Min Avg Total Time (sec): 2.497e+01 1.00000 2.497e+01 Objects: 1.310e+02 1.00000 1.310e+02 Flops: 3.749e+10 1.00000 3.749e+10 3.749e+10 Flops/sec: 1.502e+09 1.00000 1.502e+09 1.502e+09 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 0.000e+00 0.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 3.7065e+00 14.8% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 1: Linear_solver: 2.1265e+01 85.2% 3.7494e+10 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 2: Nonlinear_solver: 9.5367e-07 0.0% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. 
Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecSet 4 1.0 3.1900e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 1 1.0 2.7848e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 DMPlexInterp 1 1.0 1.2410e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexStratify 2 1.0 4.7100e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetGraph 7 1.0 2.5320e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0 CreateMesh 8 1.0 1.2974e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 CreateExtMesh 1 1.0 4.5982e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 12 0 0 0 0 0 Mesh: reorder 1 1.0 1.7152e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 Mesh: numbering 1 1.0 7.5190e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 CreateFunctionSpace 5 1.0 4.4637e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 12 0 0 0 0 0 Trace: eval 4 1.0 1.3766e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 37 0 0 0 0 0 ParLoopExecute 2 1.0 1.3765e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 37 0 0 0 0 0 ParLoopCKernel 6 1.0 1.3747e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 37 0 0 0 0 0 ParLoopReductionBegin 2 1.0 9.0599e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ParLoopReductionEnd 2 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 CreateSparsity 1 1.0 6.4163e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 17 0 0 0 0 0 MatZeroInitial 1 1.0 6.9048e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 19 0 0 0 0 0 --- Event Stage 1: Linear_solver VecTDot 74 1.0 6.9256e-02 1.0 1.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2202 VecNorm 38 1.0 1.8549e-02 1.0 7.83e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4221 VecCopy 4 1.0 4.4966e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 45 1.0 1.4026e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 75 1.0 9.5319e-02 1.0 1.55e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1621 VecAYPX 36 1.0 4.9965e-02 1.0 7.42e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1485 MatMult 37 1.0 9.0438e-01 1.0 1.98e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 5 0 0 0 4 5 0 0 0 2189 MatConvert 1 1.0 1.0125e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 
0 0 0 0 MatAssemblyBegin 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 1.6134e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 1 1.0 8.9929e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCSetUp 1 1.0 2.2858e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 11 0 0 0 0 0 PCApply 38 1.0 1.4102e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 56 0 0 0 0 66 0 0 0 0 0 KSPSetUp 1 1.0 9.9111e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 1.7529e+01 1.0 2.44e+09 1.0 0.0e+00 0.0e+00 0.0e+00 70 7 0 0 0 82 7 0 0 0 139 SNESSolve 1 1.0 2.1056e+01 1.0 3.75e+10 1.0 0.0e+00 0.0e+00 0.0e+00 84100 0 0 0 99100 0 0 0 1781 SNESFunctionEval 1 1.0 1.0763e+00 1.0 1.07e+10 1.0 0.0e+00 0.0e+00 0.0e+00 4 29 0 0 0 5 29 0 0 0 9954 SNESJacobianEval 1 1.0 2.4495e+00 1.0 2.43e+10 1.0 0.0e+00 0.0e+00 0.0e+00 10 65 0 0 0 12 65 0 0 0 9937 Trace: eval 11 1.0 3.6623e+00 1.0 3.51e+10 1.0 0.0e+00 0.0e+00 0.0e+00 15 93 0 0 0 17 93 0 0 0 9572 ParLoopExecute 14 1.0 3.6407e+00 1.0 3.51e+10 1.0 0.0e+00 0.0e+00 0.0e+00 15 93 0 0 0 17 93 0 0 0 9629 ParLoopCKernel 31 1.0 3.6314e+00 1.0 3.51e+10 1.0 0.0e+00 0.0e+00 0.0e+00 15 93 0 0 0 17 93 0 0 0 9653 ParLoopReductionBegin 14 1.0 4.6015e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ParLoopReductionEnd 14 1.0 2.2411e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ApplyBC 6 1.0 1.6722e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 --- Event Stage 2: Nonlinear_solver ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 10 8 4608 0. Viewer 2 0 0 0. Index Set 36 27 21384 0. IS L to G Mapping 1 0 0 0. Section 24 8 5568 0. Vector 5 4 32975776 0. Matrix 2 0 0 0. Preconditioner 1 1 1400 0. Krylov Solver 1 1 1248 0. SNES 1 1 1344 0. SNESLineSearch 1 1 992 0. DMSNES 1 0 0 0. Distributed Mesh 8 4 19008 0. GraphPartitioner 2 1 612 0. Star Forest Bipartite Graph 19 11 8888 0. Discrete System 8 4 3520 0. --- Event Stage 1: Linear_solver Vector 8 2 16487888 0. DMKSP interface 1 0 0 0. --- Event Stage 2: Nonlinear_solver ======================================================================================================================== Average time to get PetscTime(): 0. 
#PETSc Option Table entries: -ksp_rtol 1e-3 -linear_ksp_atol 1e-50 -linear_ksp_converged_reason -linear_ksp_monitor -linear_ksp_rtol 1e-7 -linear_ksp_type cg -linear_pc_hypre_boomeramg_agg_nl 2 -linear_pc_hypre_boomeramg_strong_threshold 0.75 -linear_pc_hypre_type boomeramg -linear_pc_type hypre -linear_snes_atol 1e-8 -linear_snes_type ksponly -log_view -snes_converged_reason -snes_max_it 1000 -tao_converged_reason -tao_max_it 1000 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --prefix=/home/justin/Software/firedrake/lib/python2.7/site-packages/petsc PETSC_ARCH=arch-python-linux-x86_64 --with-shared-libraries=1 --with-debugging=0 --with-c2html=0 --with-cc=/usr/bin/mpicc --with-cxx=/usr/bin/mpicxx --with-fc=/usr/bin/mpif90 --download-ml --download-ctetgen --download-triangle --download-chaco --download-metis --download-parmetis --download-scalapack --download-hypre --download-mumps --download-netcdf --download-hdf5 --download-exodusii ----------------------------------------- Libraries compiled on Fri Aug 5 02:51:37 2016 on pacotaco-xps Machine characteristics: Linux-4.4.0-31-generic-x86_64-with-Ubuntu-16.04-xenial Using PETSc directory: /tmp/pip-kA7m2r-build Using PETSc arch: arch-python-linux-x86_64 ----------------------------------------- Using C compiler: /usr/bin/mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fvisibility=hidden -g -O ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/bin/mpif90 -fPIC -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -g -O ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/tmp/pip-kA7m2r-build/arch-python-linux-x86_64/include -I/tmp/pip-kA7m2r-build/include -I/tmp/pip-kA7m2r-build/include -I/tmp/pip-kA7m2r-build/arch-python-linux-x86_64/include -I/home/justin/Software/firedrake/lib/python2.7/site-packages/petsc/include -I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent -I/usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi ----------------------------------------- Using C linker: /usr/bin/mpicc Using Fortran linker: /usr/bin/mpif90 Using libraries: -Wl,-rpath,/tmp/pip-kA7m2r-build/arch-python-linux-x86_64/lib -L/tmp/pip-kA7m2r-build/arch-python-linux-x86_64/lib -lpetsc -Wl,-rpath,/home/justin/Software/firedrake/lib/python2.7/site-packages/petsc/lib -L/home/justin/Software/firedrake/lib/python2.7/site-packages/petsc/lib -lHYPRE -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lml -lmpi_cxx -lstdc++ -llapack -lblas -lparmetis -lmetis -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lhwloc -lctetgen -lssl -lcrypto -lchaco -lm -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu 
-Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/usr/lib/openmpi/lib -lmpi -lgcc_s -lpthread -ldl ----------------------------------------- From kyungjun.choi92 at gmail.com Wed Aug 31 03:49:43 2016 From: kyungjun.choi92 at gmail.com (Choi Kyungjun) Date: Wed, 31 Aug 2016 17:49:43 +0900 Subject: [petsc-users] GMRES with matrix-free method and preconditioning matrix for higher performance. Message-ID: Dear Petsc. I am implementing Petsc library for my CFD flow code. Thanks to Matt, I got what I wanted last week. It was the GMRES with matrix-free method, no preconditioning matrix and command line options are below. *-snes_mf -pc_type none -..monitor -..converged_reason* The solve worked, but performed very poorly. I learned that the efficiency of Krylov-subspace methods depends strongly depends on a good preconditioner. And in the Petsc manual, the matrix-free method is allowed only with no preconditioning, a user-provided preconditioner matrix, or a user-provided preconditioner shell. Here are my questions. 1) To improve the solver performance using GMRES, is there any way using snes_mf without preconditioning matrix? 2) For user-provided preconditioner matrix, I saw some example codes that provide approx. Jacobian matrix as preconditioner matrix. But this means that I should derive approx. Jacobian mat for system, am I right? 3) I'd like to know which is the fastest way to solve with GMRES method. Could you tell me or let me know any other examples? Thank you very much for your help. Sincerely, Kyungjun. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 31 05:07:43 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Aug 2016 05:07:43 -0500 Subject: [petsc-users] GMRES with matrix-free method and preconditioning matrix for higher performance. In-Reply-To: References: Message-ID: On Wed, Aug 31, 2016 at 3:49 AM, Choi Kyungjun wrote: > Dear Petsc. > > I am implementing Petsc library for my CFD flow code. > > Thanks to Matt, I got what I wanted last week. > > It was the GMRES with matrix-free method, no preconditioning matrix and > command line options are below. > > *-snes_mf -pc_type none -..monitor -..converged_reason* > > The solve worked, but performed very poorly. > > > I learned that the efficiency of Krylov-subspace methods depends strongly > depends on a good preconditioner. > > And in the Petsc manual, the matrix-free method is allowed only with no > preconditioning, a user-provided preconditioner matrix, or a user-provided > preconditioner shell. > > > Here are my questions. > > 1) To improve the solver performance using GMRES, is there any way using > snes_mf without preconditioning matrix? > Not really. The CHEBY preconditioner will work without an explicit matrix, however its not great by itself. > 2) For user-provided preconditioner matrix, I saw some example codes that > provide approx. Jacobian matrix as preconditioner matrix. But this means > that I should derive approx. Jacobian mat for system, am I right? > Yes. > 3) I'd like to know which is the fastest way to solve with GMRES method. > Could you tell me or let me know any other examples? > 1) The solve depends greatly on the physics/matrix you are using. Without knowing that, we can't say anything. For example, is the system elliptic? If so, then using Multigrid (MG) is generally a good idea. 
2) In general, I think its a mistake to think of GMRES or any KSP as a solver. We should think of them as accelerators for solvers, as they were originally intended. For example, MG is a good solver for elliptic CFD equations, as long as you somehow deal with incompressibility. Then you can use GMRES to cleanup some things you miss when implementing your MG solver. 3) The best thing to do in this case is to look at the literature, which is voluminous, and find the solver you want to implement. PETSc really speeds up the actually implementation and testing. Thanks, Matt > Thank you very much for your help. > > Sincerely, > > Kyungjun. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 31 05:13:39 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Aug 2016 05:13:39 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: <8737lmccj2.fsf@jedbrown.org> <87d1kpbmna.fsf@jedbrown.org> Message-ID: On Wed, Aug 31, 2016 at 2:01 AM, Justin Chang wrote: > Attached is the -log_view output (from firedrake). Event Stage 1: > Linear_solver is where I assemble and solve the linear system of equations. > > I am using the HYPRE BoomerAMG preconditioner so log_view cannot "see > into" the exact steps, but based on what it can see, how do I distinguish > between these various setup and timing phases? > > For example, when I look at these lines: > > PCSetUp 1 1.0 2.2858e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 9 0 0 0 0 11 0 0 0 0 0 > PCApply 38 1.0 1.4102e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 56 0 0 0 0 66 0 0 0 0 0 > KSPSetUp 1 1.0 9.9111e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 1.7529e+01 1.0 2.44e+09 1.0 0.0e+00 0.0e+00 > 0.0e+00 70 7 0 0 0 82 7 0 0 0 139 > SNESSolve 1 1.0 2.1056e+01 1.0 3.75e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 84100 0 0 0 99100 0 0 0 1781 > SNESFunctionEval 1 1.0 1.0763e+00 1.0 1.07e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 4 29 0 0 0 5 29 0 0 0 9954 > SNESJacobianEval 1 1.0 2.4495e+00 1.0 2.43e+10 1.0 0.0e+00 0.0e+00 > 0.0e+00 10 65 0 0 0 12 65 0 0 0 9937 > > So how do I break down "mesh setup", "matrix setup", and "solve time" > phases? I am guessing "PCSetUp" has to do with one of the first two phases, > but how would I categorize the rest of the events? I see that HYPRE doesn't > have as much information as the other PCs like GAMG and ML but can one > still breakdown the timing phases through log_view alone? > 1) It looks like you call PCSetUp() yourself, since otherwise KSPSetUp() would contain that time. Notice that you can ignore KSPSetUp() here. 2) The setup time is usually KSPSetUp(), but if here you add to it PCSetUp() since you called it. 3) The solve time for SNES can be split into a) KSPSolve() for the update calculation b) SNESFunctionEval, SNESJacobianEval for everything else (conv check, line search, J calc, etc.) or you can just take SNESSolve() - KSPSolve() 4) Note that PCApply() is most of KSPSolve(), which is generally good Thanks, Matt > Thanks, > Justin > > On Tue, Aug 30, 2016 at 11:14 PM, Jed Brown wrote: > >> Mark Adams writes: >> >> >> >> >> >> >> Anyway, what I really wanted to say is, it's good to know that these >> >> "dynamic range/performance spectrum/static scaling" plots are designed >> to >> >> go past the sweet spots. 
I also agree that it would be interesting to >> see a >> >> time vs dofs*iterations/time plot. Would it then also be useful to >> look at >> >> the step to setting up the preconditioner? >> >> >> >> >> > Yes, I generally split up timing between "mesh setup" (symbolic >> > factorization of LU), "matrix setup" (eg, factorizations), and solve >> time. >> > The degree of amortization that you get for the two setup phases >> depends on >> > your problem and so it is useful to separate them. >> >> Right, there is nothing wrong with splitting up the phases, but if you >> never show a spectrum for the total, then I will be suspicious. And if >> you only show "per iteration" instead of for a complete solve, then I >> will assume that you're only doing that because convergence is unusably >> slow. >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 31 05:23:25 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 31 Aug 2016 05:23:25 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: <8737lmccj2.fsf@jedbrown.org> <87d1kpbmna.fsf@jedbrown.org> Message-ID: Matt, So is the "solve phase" going to be KSPSolve() - PCSetUp()? In other words, if I want to look at time/iterations, should it just be over KSPSolve or should I exclude the PC setup? Justin On Wed, Aug 31, 2016 at 5:13 AM, Matthew Knepley wrote: > On Wed, Aug 31, 2016 at 2:01 AM, Justin Chang wrote: > >> Attached is the -log_view output (from firedrake). Event Stage 1: >> Linear_solver is where I assemble and solve the linear system of equations. >> >> I am using the HYPRE BoomerAMG preconditioner so log_view cannot "see >> into" the exact steps, but based on what it can see, how do I distinguish >> between these various setup and timing phases? >> >> For example, when I look at these lines: >> >> PCSetUp 1 1.0 2.2858e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 9 0 0 0 0 11 0 0 0 0 0 >> PCApply 38 1.0 1.4102e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 56 0 0 0 0 66 0 0 0 0 0 >> KSPSetUp 1 1.0 9.9111e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 1 1.0 1.7529e+01 1.0 2.44e+09 1.0 0.0e+00 0.0e+00 >> 0.0e+00 70 7 0 0 0 82 7 0 0 0 139 >> SNESSolve 1 1.0 2.1056e+01 1.0 3.75e+10 1.0 0.0e+00 0.0e+00 >> 0.0e+00 84100 0 0 0 99100 0 0 0 1781 >> SNESFunctionEval 1 1.0 1.0763e+00 1.0 1.07e+10 1.0 0.0e+00 0.0e+00 >> 0.0e+00 4 29 0 0 0 5 29 0 0 0 9954 >> SNESJacobianEval 1 1.0 2.4495e+00 1.0 2.43e+10 1.0 0.0e+00 0.0e+00 >> 0.0e+00 10 65 0 0 0 12 65 0 0 0 9937 >> >> So how do I break down "mesh setup", "matrix setup", and "solve time" >> phases? I am guessing "PCSetUp" has to do with one of the first two phases, >> but how would I categorize the rest of the events? I see that HYPRE doesn't >> have as much information as the other PCs like GAMG and ML but can one >> still breakdown the timing phases through log_view alone? >> > > 1) It looks like you call PCSetUp() yourself, since otherwise KSPSetUp() > would contain that time. Notice that you can ignore KSPSetUp() here. > > 2) The setup time is usually KSPSetUp(), but if here you add to it > PCSetUp() since you called it. 
> > 3) The solve time for SNES can be split into > > a) KSPSolve() for the update calculation > > b) SNESFunctionEval, SNESJacobianEval for everything else (conv check, > line search, J calc, etc.) or you can just take SNESSolve() - KSPSolve() > > 4) Note that PCApply() is most of KSPSolve(), which is generally good > > Thanks, > > Matt > > >> Thanks, >> Justin >> >> On Tue, Aug 30, 2016 at 11:14 PM, Jed Brown wrote: >> >>> Mark Adams writes: >>> >>> >> >>> >> >>> >> Anyway, what I really wanted to say is, it's good to know that these >>> >> "dynamic range/performance spectrum/static scaling" plots are >>> designed to >>> >> go past the sweet spots. I also agree that it would be interesting to >>> see a >>> >> time vs dofs*iterations/time plot. Would it then also be useful to >>> look at >>> >> the step to setting up the preconditioner? >>> >> >>> >> >>> > Yes, I generally split up timing between "mesh setup" (symbolic >>> > factorization of LU), "matrix setup" (eg, factorizations), and solve >>> time. >>> > The degree of amortization that you get for the two setup phases >>> depends on >>> > your problem and so it is useful to separate them. >>> >>> Right, there is nothing wrong with splitting up the phases, but if you >>> never show a spectrum for the total, then I will be suspicious. And if >>> you only show "per iteration" instead of for a complete solve, then I >>> will assume that you're only doing that because convergence is unusably >>> slow. >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 31 05:28:22 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Aug 2016 05:28:22 -0500 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: <8737lmccj2.fsf@jedbrown.org> <87d1kpbmna.fsf@jedbrown.org> Message-ID: On Wed, Aug 31, 2016 at 5:23 AM, Justin Chang wrote: > Matt, > > So is the "solve phase" going to be KSPSolve() - PCSetUp()? > Setup Phase: KSPSetUp + PCSetup Solve Phase: SNESSolve This contains SNESFunctionEval, SNESJacobianEval, KSPSolve Matt In other words, if I want to look at time/iterations, should it just be > over KSPSolve or should I exclude the PC setup? > > Justin > > > > On Wed, Aug 31, 2016 at 5:13 AM, Matthew Knepley > wrote: > >> On Wed, Aug 31, 2016 at 2:01 AM, Justin Chang >> wrote: >> >>> Attached is the -log_view output (from firedrake). Event Stage 1: >>> Linear_solver is where I assemble and solve the linear system of equations. >>> >>> I am using the HYPRE BoomerAMG preconditioner so log_view cannot "see >>> into" the exact steps, but based on what it can see, how do I distinguish >>> between these various setup and timing phases? 
>>> >>> For example, when I look at these lines: >>> >>> PCSetUp 1 1.0 2.2858e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 9 0 0 0 0 11 0 0 0 0 0 >>> PCApply 38 1.0 1.4102e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 56 0 0 0 0 66 0 0 0 0 0 >>> KSPSetUp 1 1.0 9.9111e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> KSPSolve 1 1.0 1.7529e+01 1.0 2.44e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 70 7 0 0 0 82 7 0 0 0 139 >>> SNESSolve 1 1.0 2.1056e+01 1.0 3.75e+10 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 84100 0 0 0 99100 0 0 0 1781 >>> SNESFunctionEval 1 1.0 1.0763e+00 1.0 1.07e+10 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 4 29 0 0 0 5 29 0 0 0 9954 >>> SNESJacobianEval 1 1.0 2.4495e+00 1.0 2.43e+10 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 10 65 0 0 0 12 65 0 0 0 9937 >>> >>> So how do I break down "mesh setup", "matrix setup", and "solve time" >>> phases? I am guessing "PCSetUp" has to do with one of the first two phases, >>> but how would I categorize the rest of the events? I see that HYPRE doesn't >>> have as much information as the other PCs like GAMG and ML but can one >>> still breakdown the timing phases through log_view alone? >>> >> >> 1) It looks like you call PCSetUp() yourself, since otherwise KSPSetUp() >> would contain that time. Notice that you can ignore KSPSetUp() here. >> >> 2) The setup time is usually KSPSetUp(), but if here you add to it >> PCSetUp() since you called it. >> >> 3) The solve time for SNES can be split into >> >> a) KSPSolve() for the update calculation >> >> b) SNESFunctionEval, SNESJacobianEval for everything else (conv check, >> line search, J calc, etc.) or you can just take SNESSolve() - KSPSolve() >> >> 4) Note that PCApply() is most of KSPSolve(), which is generally good >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Justin >>> >>> On Tue, Aug 30, 2016 at 11:14 PM, Jed Brown wrote: >>> >>>> Mark Adams writes: >>>> >>>> >> >>>> >> >>>> >> Anyway, what I really wanted to say is, it's good to know that these >>>> >> "dynamic range/performance spectrum/static scaling" plots are >>>> designed to >>>> >> go past the sweet spots. I also agree that it would be interesting >>>> to see a >>>> >> time vs dofs*iterations/time plot. Would it then also be useful to >>>> look at >>>> >> the step to setting up the preconditioner? >>>> >> >>>> >> >>>> > Yes, I generally split up timing between "mesh setup" (symbolic >>>> > factorization of LU), "matrix setup" (eg, factorizations), and solve >>>> time. >>>> > The degree of amortization that you get for the two setup phases >>>> depends on >>>> > your problem and so it is useful to separate them. >>>> >>>> Right, there is nothing wrong with splitting up the phases, but if you >>>> never show a spectrum for the total, then I will be suspicious. And if >>>> you only show "per iteration" instead of for a complete solve, then I >>>> will assume that you're only doing that because convergence is unusably >>>> slow. >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
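A side note on getting that breakdown without relying on the preconditioner's own timers: you can register logging stages around the setup and solve calls, and -log_view then reports every event, plus the stage totals, separately per stage, even when the PC internals are a black box like Hypre. A minimal sketch in Fortran, where ksp, b, x and the stage names are only placeholders and the error checking follows the usual CHKERRQ style:

      PetscLogStage :: stageSetup, stageSolve
      PetscErrorCode :: ierr

      ! the stage names show up as separate sections in -log_view
      call PetscLogStageRegister("Matrix setup", stageSetup, ierr); CHKERRQ(ierr)
      call PetscLogStageRegister("Solve", stageSolve, ierr); CHKERRQ(ierr)

      ! everything timed between push and pop is attributed to that stage
      call PetscLogStagePush(stageSetup, ierr); CHKERRQ(ierr)
      call KSPSetUp(ksp, ierr); CHKERRQ(ierr)   ! normally includes PCSetUp, unless PCSetUp was already called
      call PetscLogStagePop(ierr); CHKERRQ(ierr)

      call PetscLogStagePush(stageSolve, ierr); CHKERRQ(ierr)
      call KSPSolve(ksp, b, x, ierr); CHKERRQ(ierr)
      call PetscLogStagePop(ierr); CHKERRQ(ierr)

The stage summaries then give the "matrix setup" and "solve time" phases directly; the same PetscLogStageRegister/Push/Pop calls are available from C.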
URL: From kyungjun.choi92 at gmail.com Wed Aug 31 07:22:32 2016 From: kyungjun.choi92 at gmail.com (Choi Kyungjun) Date: Wed, 31 Aug 2016 21:22:32 +0900 Subject: [petsc-users] GMRES with matrix-free method and preconditioning matrix for higher performance. In-Reply-To: References: Message-ID: Thank you very much again Matt. Just another simple question. 2016-08-31 20:00 GMT+09:00 Matthew Knepley : > On Wed, Aug 31, 2016 at 5:46 AM, Choi Kyungjun > wrote: > >> Thanks Matt. >> >> I really appreciate your help every time. >> >> >> I think I forgot mentioning code info again. >> >> 1) >> I'm working on 2-D/3-D Compressible Euler equation solver, which is >> completely hyperbolic system. >> > > Okay, then MG is out. I would start by using a sparse direct solver, like > SuperLU. Then for parallelism > you could use ASM, and as the subsolver use SuperLU, so something like > > -ksp_type gmres -pc_type asm -sub_pc_type superlu > > You could get more sophisticated by > > - Trying to have blocks bigger than 1 process and using SuperLU_dist > > - Splitting up the fields using PCFIELDSPLIT. There are indications that > solving one of the fields first > can really help convergence. I am thinking of the work of David Keyes > and LuLu Liu on MSM methods. > For the above part, It's not compatible with -snes_mf command line option, is it? I applied cmd line options like below *-snes_mf -ksp_type gmres -pc_type asm -sub_pc_type superlu -snes_view -snes_monitor -ksp_monitor -snes_converged_reason -ksp_converged_reason* and my code flows like this *- call SNESCreate(PETSC_COMM_WORLD, Mixt%snes, ier)* *- call SNESSetFunction(Mixt%snes, Mixt%r, FormFunction, userctx, ier)* *- call SNESSetFromOptions(Mixt%snes, ier)* *- call SNESGetKSP(Mixt%snes, ksp, ier)* *- call KSPGetPC(ksp, pc, ier)* *- call KSPSetFromOptions(ksp, ier)* *- call PCSetFromOptions(pc, ier)* > > >> 2) >> And I'm trying to implement some implicit time scheme for convergence of >> my steady state problem. >> >> I used LUSGS implicit scheme before, but these days GMRES implicit scheme >> is popular for quadratic convergence characteristics. >> > > I am not sure I understand here. Nothing about GMRES is quadratic. > However, Newton's method can be quadratic > if your initial guess is good, and GMRES could be part of a solver for > that. > > >> For implicit time scheme, it is just same as matrix inversion, so I >> started PETSc library for GMRES, which is one of the greatest mathematical >> library. >> >> >> 3) >> As I'm using different numerical convective flux scheme (e.g. Roe's FDS, >> AUSM, etc), it would be really time consuming to derive Jacobian matrix for >> each scheme. >> > > Yes. However, the preconditioner matrix only needs to be approximate. I > think you should derive one for the easiest flux scheme and > always use that. The important thing is to couple the unknowns which > influence each other, rather than the precise method of influence. > > Thanks, > > Matt > > >> So I was fascinated by matrix-free method (I didn't completely understand >> this method back then), and I implemented GMRES with no preconditioning >> matrix with your help. >> >> After that, I wanted to ask you about any accelerating methods for my >> GMRES condition. >> >> I will try applying CHEBY preconditioner as you mentioned first (even if >> its performance wouldn't be that good). >> >> In order to constitute user-provided preconditioning matrix, could you >> tell me any similar examples? >> >> >> Thanks again. >> >> Your best, >> >> Kyungjun. 
>> >> >> 2016-08-31 19:07 GMT+09:00 Matthew Knepley : >> >>> On Wed, Aug 31, 2016 at 3:49 AM, Choi Kyungjun < >>> kyungjun.choi92 at gmail.com> wrote: >>> >>>> Dear Petsc. >>>> >>>> I am implementing Petsc library for my CFD flow code. >>>> >>>> Thanks to Matt, I got what I wanted last week. >>>> >>>> It was the GMRES with matrix-free method, no preconditioning matrix and >>>> command line options are below. >>>> >>>> *-snes_mf -pc_type none -..monitor -..converged_reason* >>>> >>>> The solve worked, but performed very poorly. >>>> >>>> >>>> I learned that the efficiency of Krylov-subspace methods depends >>>> strongly depends on a good preconditioner. >>>> >>>> And in the Petsc manual, the matrix-free method is allowed only with no >>>> preconditioning, a user-provided preconditioner matrix, or a user-provided >>>> preconditioner shell. >>>> >>>> >>>> Here are my questions. >>>> >>>> 1) To improve the solver performance using GMRES, is there any way >>>> using snes_mf without preconditioning matrix? >>>> >>> >>> Not really. The CHEBY preconditioner will work without an explicit >>> matrix, however its not great by itself. >>> >>> >>>> 2) For user-provided preconditioner matrix, I saw some example codes >>>> that provide approx. Jacobian matrix as preconditioner matrix. But this >>>> means that I should derive approx. Jacobian mat for system, am I right? >>>> >>> >>> Yes. >>> >>> >>>> 3) I'd like to know which is the fastest way to solve with GMRES >>>> method. Could you tell me or let me know any other examples? >>>> >>> >>> 1) The solve depends greatly on the physics/matrix you are using. >>> Without knowing that, we can't say anything. For example, is >>> the system elliptic? If so, then using Multigrid (MG) is generally a >>> good idea. >>> >>> 2) In general, I think its a mistake to think of GMRES or any KSP as a >>> solver. We should think of them as accelerators for solvers, >>> as they were originally intended. For example, MG is a good solver for >>> elliptic CFD equations, as long as you somehow deal with >>> incompressibility. Then you can use GMRES to cleanup some things you >>> miss when implementing your MG solver. >>> >>> 3) The best thing to do in this case is to look at the literature, which >>> is voluminous, and find the solver you want to implement. PETSc >>> really speeds up the actually implementation and testing. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thank you very much for your help. >>>> >>>> Sincerely, >>>> >>>> Kyungjun. >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 31 07:23:58 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Aug 2016 07:23:58 -0500 Subject: [petsc-users] GMRES with matrix-free method and preconditioning matrix for higher performance. In-Reply-To: References: Message-ID: On Wed, Aug 31, 2016 at 7:22 AM, Choi Kyungjun wrote: > Thank you very much again Matt. > > Just another simple question. > > 2016-08-31 20:00 GMT+09:00 Matthew Knepley : > >> On Wed, Aug 31, 2016 at 5:46 AM, Choi Kyungjun > > wrote: >> >>> Thanks Matt. 
>>> >>> I really appreciate your help every time. >>> >>> >>> I think I forgot mentioning code info again. >>> >>> 1) >>> I'm working on 2-D/3-D Compressible Euler equation solver, which is >>> completely hyperbolic system. >>> >> >> Okay, then MG is out. I would start by using a sparse direct solver, like >> SuperLU. Then for parallelism >> you could use ASM, and as the subsolver use SuperLU, so something like >> >> -ksp_type gmres -pc_type asm -sub_pc_type superlu >> >> You could get more sophisticated by >> >> - Trying to have blocks bigger than 1 process and using SuperLU_dist >> >> - Splitting up the fields using PCFIELDSPLIT. There are indications >> that solving one of the fields first >> can really help convergence. I am thinking of the work of David Keyes >> and LuLu Liu on MSM methods. >> > > For the above part, > > It's not compatible with -snes_mf command line option, is it? > No. I think MF is not a useful idea unless you have a preconditioning matrix. Thanks, Matt > I applied cmd line options like below > *-snes_mf -ksp_type gmres -pc_type asm -sub_pc_type superlu > -snes_view -snes_monitor -ksp_monitor -snes_converged_reason > -ksp_converged_reason* > > > and my code flows like this > > *- call SNESCreate(PETSC_COMM_WORLD, Mixt%snes, ier)* > *- call SNESSetFunction(Mixt%snes, Mixt%r, FormFunction, userctx, ier)* > *- call SNESSetFromOptions(Mixt%snes, ier)* > *- call SNESGetKSP(Mixt%snes, ksp, ier)* > *- call KSPGetPC(ksp, pc, ier)* > > *- call KSPSetFromOptions(ksp, ier)* > *- call PCSetFromOptions(pc, ier)* > > > > > > >> >> >>> 2) >>> And I'm trying to implement some implicit time scheme for convergence of >>> my steady state problem. >>> >>> I used LUSGS implicit scheme before, but these days GMRES implicit >>> scheme is popular for quadratic convergence characteristics. >>> >> >> I am not sure I understand here. Nothing about GMRES is quadratic. >> However, Newton's method can be quadratic >> if your initial guess is good, and GMRES could be part of a solver for >> that. >> >> >>> For implicit time scheme, it is just same as matrix inversion, so I >>> started PETSc library for GMRES, which is one of the greatest mathematical >>> library. >>> >>> >>> 3) >>> As I'm using different numerical convective flux scheme (e.g. Roe's FDS, >>> AUSM, etc), it would be really time consuming to derive Jacobian matrix for >>> each scheme. >>> >> >> Yes. However, the preconditioner matrix only needs to be approximate. I >> think you should derive one for the easiest flux scheme and >> always use that. The important thing is to couple the unknowns which >> influence each other, rather than the precise method of influence. >> >> Thanks, >> >> Matt >> >> >>> So I was fascinated by matrix-free method (I didn't completely >>> understand this method back then), and I implemented GMRES with no >>> preconditioning matrix with your help. >>> >>> After that, I wanted to ask you about any accelerating methods for my >>> GMRES condition. >>> >>> I will try applying CHEBY preconditioner as you mentioned first (even if >>> its performance wouldn't be that good). >>> >>> In order to constitute user-provided preconditioning matrix, could you >>> tell me any similar examples? >>> >>> >>> Thanks again. >>> >>> Your best, >>> >>> Kyungjun. >>> >>> >>> 2016-08-31 19:07 GMT+09:00 Matthew Knepley : >>> >>>> On Wed, Aug 31, 2016 at 3:49 AM, Choi Kyungjun < >>>> kyungjun.choi92 at gmail.com> wrote: >>>> >>>>> Dear Petsc. >>>>> >>>>> I am implementing Petsc library for my CFD flow code. 
>>>>> >>>>> Thanks to Matt, I got what I wanted last week. >>>>> >>>>> It was the GMRES with matrix-free method, no preconditioning matrix >>>>> and command line options are below. >>>>> >>>>> *-snes_mf -pc_type none -..monitor -..converged_reason* >>>>> >>>>> The solve worked, but performed very poorly. >>>>> >>>>> >>>>> I learned that the efficiency of Krylov-subspace methods depends >>>>> strongly depends on a good preconditioner. >>>>> >>>>> And in the Petsc manual, the matrix-free method is allowed only with >>>>> no preconditioning, a user-provided preconditioner matrix, or a >>>>> user-provided preconditioner shell. >>>>> >>>>> >>>>> Here are my questions. >>>>> >>>>> 1) To improve the solver performance using GMRES, is there any way >>>>> using snes_mf without preconditioning matrix? >>>>> >>>> >>>> Not really. The CHEBY preconditioner will work without an explicit >>>> matrix, however its not great by itself. >>>> >>>> >>>>> 2) For user-provided preconditioner matrix, I saw some example codes >>>>> that provide approx. Jacobian matrix as preconditioner matrix. But this >>>>> means that I should derive approx. Jacobian mat for system, am I right? >>>>> >>>> >>>> Yes. >>>> >>>> >>>>> 3) I'd like to know which is the fastest way to solve with GMRES >>>>> method. Could you tell me or let me know any other examples? >>>>> >>>> >>>> 1) The solve depends greatly on the physics/matrix you are using. >>>> Without knowing that, we can't say anything. For example, is >>>> the system elliptic? If so, then using Multigrid (MG) is generally a >>>> good idea. >>>> >>>> 2) In general, I think its a mistake to think of GMRES or any KSP as a >>>> solver. We should think of them as accelerators for solvers, >>>> as they were originally intended. For example, MG is a good solver for >>>> elliptic CFD equations, as long as you somehow deal with >>>> incompressibility. Then you can use GMRES to cleanup some things you >>>> miss when implementing your MG solver. >>>> >>>> 3) The best thing to do in this case is to look at the literature, >>>> which is voluminous, and find the solver you want to implement. PETSc >>>> really speeds up the actually implementation and testing. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thank you very much for your help. >>>>> >>>>> Sincerely, >>>>> >>>>> Kyungjun. >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kyungjun.choi92 at gmail.com Wed Aug 31 07:34:35 2016 From: kyungjun.choi92 at gmail.com (Choi Kyungjun) Date: Wed, 31 Aug 2016 21:34:35 +0900 Subject: [petsc-users] GMRES with matrix-free method and preconditioning matrix for higher performance. In-Reply-To: References: Message-ID: 2016-08-31 21:23 GMT+09:00 Matthew Knepley : > On Wed, Aug 31, 2016 at 7:22 AM, Choi Kyungjun > wrote: > >> Thank you very much again Matt. >> >> Just another simple question. 
>> >> 2016-08-31 20:00 GMT+09:00 Matthew Knepley : >> >>> On Wed, Aug 31, 2016 at 5:46 AM, Choi Kyungjun < >>> kyungjun.choi92 at gmail.com> wrote: >>> >>>> Thanks Matt. >>>> >>>> I really appreciate your help every time. >>>> >>>> >>>> I think I forgot mentioning code info again. >>>> >>>> 1) >>>> I'm working on 2-D/3-D Compressible Euler equation solver, which is >>>> completely hyperbolic system. >>>> >>> >>> Okay, then MG is out. I would start by using a sparse direct solver, >>> like SuperLU. Then for parallelism >>> you could use ASM, and as the subsolver use SuperLU, so something like >>> >>> -ksp_type gmres -pc_type asm -sub_pc_type superlu >>> >>> You could get more sophisticated by >>> >>> - Trying to have blocks bigger than 1 process and using SuperLU_dist >>> >>> - Splitting up the fields using PCFIELDSPLIT. There are indications >>> that solving one of the fields first >>> can really help convergence. I am thinking of the work of David >>> Keyes and LuLu Liu on MSM methods. >>> >> >> For the above part, >> >> It's not compatible with -snes_mf command line option, is it? >> > > No. I think MF is not a useful idea unless you have a preconditioning > matrix. > > Thanks, > > Matt > But in order to use KSP context, I have to make my system matrix, isn't it? Then what's the difference between having preconditioning matrix preA and making system matrix A? Because *-snes_mf* option requireed no system matrix and just computed the residual which felt very convenient. If is necessary to make my system matrix to use KSP - GMRES, as you recommended above, then I'll try. Thank you very much. Kyungjun. > > >> I applied cmd line options like below >> *-snes_mf -ksp_type gmres -pc_type asm -sub_pc_type superlu >> -snes_view -snes_monitor -ksp_monitor -snes_converged_reason >> -ksp_converged_reason* >> >> >> and my code flows like this >> >> *- call SNESCreate(PETSC_COMM_WORLD, Mixt%snes, ier)* >> *- call SNESSetFunction(Mixt%snes, Mixt%r, FormFunction, userctx, ier)* >> *- call SNESSetFromOptions(Mixt%snes, ier)* >> *- call SNESGetKSP(Mixt%snes, ksp, ier)* >> *- call KSPGetPC(ksp, pc, ier)* >> >> *- call KSPSetFromOptions(ksp, ier)* >> *- call PCSetFromOptions(pc, ier)* >> >> >> >> >> >> >>> >>> >>>> 2) >>>> And I'm trying to implement some implicit time scheme for convergence >>>> of my steady state problem. >>>> >>>> I used LUSGS implicit scheme before, but these days GMRES implicit >>>> scheme is popular for quadratic convergence characteristics. >>>> >>> >>> I am not sure I understand here. Nothing about GMRES is quadratic. >>> However, Newton's method can be quadratic >>> if your initial guess is good, and GMRES could be part of a solver for >>> that. >>> >>> >>>> For implicit time scheme, it is just same as matrix inversion, so I >>>> started PETSc library for GMRES, which is one of the greatest mathematical >>>> library. >>>> >>>> >>>> 3) >>>> As I'm using different numerical convective flux scheme (e.g. Roe's >>>> FDS, AUSM, etc), it would be really time consuming to derive Jacobian >>>> matrix for each scheme. >>>> >>> >>> Yes. However, the preconditioner matrix only needs to be approximate. I >>> think you should derive one for the easiest flux scheme and >>> always use that. The important thing is to couple the unknowns which >>> influence each other, rather than the precise method of influence. 
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> So I was fascinated by matrix-free method (I didn't completely >>>> understand this method back then), and I implemented GMRES with no >>>> preconditioning matrix with your help. >>>> >>>> After that, I wanted to ask you about any accelerating methods for my >>>> GMRES condition. >>>> >>>> I will try applying CHEBY preconditioner as you mentioned first (even >>>> if its performance wouldn't be that good). >>>> >>>> In order to constitute user-provided preconditioning matrix, could you >>>> tell me any similar examples? >>>> >>>> >>>> Thanks again. >>>> >>>> Your best, >>>> >>>> Kyungjun. >>>> >>>> >>>> 2016-08-31 19:07 GMT+09:00 Matthew Knepley : >>>> >>>>> On Wed, Aug 31, 2016 at 3:49 AM, Choi Kyungjun < >>>>> kyungjun.choi92 at gmail.com> wrote: >>>>> >>>>>> Dear Petsc. >>>>>> >>>>>> I am implementing Petsc library for my CFD flow code. >>>>>> >>>>>> Thanks to Matt, I got what I wanted last week. >>>>>> >>>>>> It was the GMRES with matrix-free method, no preconditioning matrix >>>>>> and command line options are below. >>>>>> >>>>>> *-snes_mf -pc_type none -..monitor -..converged_reason* >>>>>> >>>>>> The solve worked, but performed very poorly. >>>>>> >>>>>> >>>>>> I learned that the efficiency of Krylov-subspace methods depends >>>>>> strongly depends on a good preconditioner. >>>>>> >>>>>> And in the Petsc manual, the matrix-free method is allowed only with >>>>>> no preconditioning, a user-provided preconditioner matrix, or a >>>>>> user-provided preconditioner shell. >>>>>> >>>>>> >>>>>> Here are my questions. >>>>>> >>>>>> 1) To improve the solver performance using GMRES, is there any way >>>>>> using snes_mf without preconditioning matrix? >>>>>> >>>>> >>>>> Not really. The CHEBY preconditioner will work without an explicit >>>>> matrix, however its not great by itself. >>>>> >>>>> >>>>>> 2) For user-provided preconditioner matrix, I saw some example codes >>>>>> that provide approx. Jacobian matrix as preconditioner matrix. But this >>>>>> means that I should derive approx. Jacobian mat for system, am I right? >>>>>> >>>>> >>>>> Yes. >>>>> >>>>> >>>>>> 3) I'd like to know which is the fastest way to solve with GMRES >>>>>> method. Could you tell me or let me know any other examples? >>>>>> >>>>> >>>>> 1) The solve depends greatly on the physics/matrix you are using. >>>>> Without knowing that, we can't say anything. For example, is >>>>> the system elliptic? If so, then using Multigrid (MG) is generally a >>>>> good idea. >>>>> >>>>> 2) In general, I think its a mistake to think of GMRES or any KSP as a >>>>> solver. We should think of them as accelerators for solvers, >>>>> as they were originally intended. For example, MG is a good solver for >>>>> elliptic CFD equations, as long as you somehow deal with >>>>> incompressibility. Then you can use GMRES to cleanup some things you >>>>> miss when implementing your MG solver. >>>>> >>>>> 3) The best thing to do in this case is to look at the literature, >>>>> which is voluminous, and find the solver you want to implement. PETSc >>>>> really speeds up the actually implementation and testing. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Thank you very much for your help. >>>>>> >>>>>> Sincerely, >>>>>> >>>>>> Kyungjun. >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. 
>>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 31 07:46:42 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Aug 2016 07:46:42 -0500 Subject: [petsc-users] GMRES with matrix-free method and preconditioning matrix for higher performance. In-Reply-To: References: Message-ID: On Wed, Aug 31, 2016 at 7:34 AM, Choi Kyungjun wrote: > 2016-08-31 21:23 GMT+09:00 Matthew Knepley : > >> On Wed, Aug 31, 2016 at 7:22 AM, Choi Kyungjun > > wrote: >> >>> Thank you very much again Matt. >>> >>> Just another simple question. >>> >>> 2016-08-31 20:00 GMT+09:00 Matthew Knepley : >>> >>>> On Wed, Aug 31, 2016 at 5:46 AM, Choi Kyungjun < >>>> kyungjun.choi92 at gmail.com> wrote: >>>> >>>>> Thanks Matt. >>>>> >>>>> I really appreciate your help every time. >>>>> >>>>> >>>>> I think I forgot mentioning code info again. >>>>> >>>>> 1) >>>>> I'm working on 2-D/3-D Compressible Euler equation solver, which is >>>>> completely hyperbolic system. >>>>> >>>> >>>> Okay, then MG is out. I would start by using a sparse direct solver, >>>> like SuperLU. Then for parallelism >>>> you could use ASM, and as the subsolver use SuperLU, so something like >>>> >>>> -ksp_type gmres -pc_type asm -sub_pc_type superlu >>>> >>>> You could get more sophisticated by >>>> >>>> - Trying to have blocks bigger than 1 process and using SuperLU_dist >>>> >>>> - Splitting up the fields using PCFIELDSPLIT. There are indications >>>> that solving one of the fields first >>>> can really help convergence. I am thinking of the work of David >>>> Keyes and LuLu Liu on MSM methods. >>>> >>> >>> For the above part, >>> >>> It's not compatible with -snes_mf command line option, is it? >>> >> >> No. I think MF is not a useful idea unless you have a preconditioning >> matrix. >> >> Thanks, >> >> Matt >> > > > But in order to use KSP context, > > I have to make my system matrix, isn't it? > > Then what's the difference between having preconditioning matrix preA and > making system matrix A? > The preA can be an _approximation_ to A, since MF A provides an accurate matrix-vector product. > Because *-snes_mf* option requireed no system matrix and just computed > the residual which felt very convenient. > > > If is necessary to make my system matrix to use KSP - GMRES, as you > recommended above, then I'll try. > The KSP does not need the matrix, but as I mentioned KSPes are not good solvers. They are just for accelerating solves. In general, good solvers need the matrix. Thanks, Matt > Thank you very much. > > Kyungjun. 
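To make the "matrix-free operator plus approximate preconditioning matrix" variant concrete, here is a minimal sketch in the Fortran style above; preA comes from the discussion, while nloc and FormApproxJacobian are placeholder names and the preallocation calls are omitted for brevity. The idea is to assemble only an approximate Jacobian (for instance from the simplest flux linearization) into preA, hand it to SNES as the preconditioning matrix, and run with -snes_mf_operator instead of -snes_mf, so the true Jacobian action is still applied matrix-free while the preconditioner is built from preA:

      ! preA holds the approximate Jacobian; create it with the usual sequence
      ! and a preallocation that matches the stencil of the approximate flux Jacobian
      call MatCreate(PETSC_COMM_WORLD, preA, ier); CHKERRQ(ier)
      call MatSetSizes(preA, nloc, nloc, PETSC_DECIDE, PETSC_DECIDE, ier); CHKERRQ(ier)
      call MatSetFromOptions(preA, ier); CHKERRQ(ier)
      call MatSetUp(preA, ier); CHKERRQ(ier)

      ! pass preA for both the operator and the preconditioning matrix; with
      ! -snes_mf_operator the operator slot is replaced by the finite-difference
      ! (matrix-free) Jacobian and preA is only used to build the preconditioner
      call SNESSetJacobian(Mixt%snes, preA, preA, FormApproxJacobian, userctx, ier); CHKERRQ(ier)

FormApproxJacobian then only has to fill and assemble preA; SNES differences FormFunction to get the action of the true Jacobian.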
> > > >> >> >>> I applied cmd line options like below >>> *-snes_mf -ksp_type gmres -pc_type asm -sub_pc_type superlu >>> -snes_view -snes_monitor -ksp_monitor -snes_converged_reason >>> -ksp_converged_reason* >>> >>> >>> and my code flows like this >>> >>> *- call SNESCreate(PETSC_COMM_WORLD, Mixt%snes, ier)* >>> *- call SNESSetFunction(Mixt%snes, Mixt%r, FormFunction, userctx, ier)* >>> *- call SNESSetFromOptions(Mixt%snes, ier)* >>> *- call SNESGetKSP(Mixt%snes, ksp, ier)* >>> *- call KSPGetPC(ksp, pc, ier)* >>> >>> *- call KSPSetFromOptions(ksp, ier)* >>> *- call PCSetFromOptions(pc, ier)* >>> >>> >>> >>> >>> >>> >>>> >>>> >>>>> 2) >>>>> And I'm trying to implement some implicit time scheme for convergence >>>>> of my steady state problem. >>>>> >>>>> I used LUSGS implicit scheme before, but these days GMRES implicit >>>>> scheme is popular for quadratic convergence characteristics. >>>>> >>>> >>>> I am not sure I understand here. Nothing about GMRES is quadratic. >>>> However, Newton's method can be quadratic >>>> if your initial guess is good, and GMRES could be part of a solver for >>>> that. >>>> >>>> >>>>> For implicit time scheme, it is just same as matrix inversion, so I >>>>> started PETSc library for GMRES, which is one of the greatest mathematical >>>>> library. >>>>> >>>>> >>>>> 3) >>>>> As I'm using different numerical convective flux scheme (e.g. Roe's >>>>> FDS, AUSM, etc), it would be really time consuming to derive Jacobian >>>>> matrix for each scheme. >>>>> >>>> >>>> Yes. However, the preconditioner matrix only needs to be approximate. I >>>> think you should derive one for the easiest flux scheme and >>>> always use that. The important thing is to couple the unknowns which >>>> influence each other, rather than the precise method of influence. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> So I was fascinated by matrix-free method (I didn't completely >>>>> understand this method back then), and I implemented GMRES with no >>>>> preconditioning matrix with your help. >>>>> >>>>> After that, I wanted to ask you about any accelerating methods for my >>>>> GMRES condition. >>>>> >>>>> I will try applying CHEBY preconditioner as you mentioned first (even >>>>> if its performance wouldn't be that good). >>>>> >>>>> In order to constitute user-provided preconditioning matrix, could you >>>>> tell me any similar examples? >>>>> >>>>> >>>>> Thanks again. >>>>> >>>>> Your best, >>>>> >>>>> Kyungjun. >>>>> >>>>> >>>>> 2016-08-31 19:07 GMT+09:00 Matthew Knepley : >>>>> >>>>>> On Wed, Aug 31, 2016 at 3:49 AM, Choi Kyungjun < >>>>>> kyungjun.choi92 at gmail.com> wrote: >>>>>> >>>>>>> Dear Petsc. >>>>>>> >>>>>>> I am implementing Petsc library for my CFD flow code. >>>>>>> >>>>>>> Thanks to Matt, I got what I wanted last week. >>>>>>> >>>>>>> It was the GMRES with matrix-free method, no preconditioning matrix >>>>>>> and command line options are below. >>>>>>> >>>>>>> *-snes_mf -pc_type none -..monitor -..converged_reason* >>>>>>> >>>>>>> The solve worked, but performed very poorly. >>>>>>> >>>>>>> >>>>>>> I learned that the efficiency of Krylov-subspace methods depends >>>>>>> strongly depends on a good preconditioner. >>>>>>> >>>>>>> And in the Petsc manual, the matrix-free method is allowed only with >>>>>>> no preconditioning, a user-provided preconditioner matrix, or a >>>>>>> user-provided preconditioner shell. >>>>>>> >>>>>>> >>>>>>> Here are my questions. 
>>>>>>> >>>>>>> 1) To improve the solver performance using GMRES, is there any way >>>>>>> using snes_mf without preconditioning matrix? >>>>>>> >>>>>> >>>>>> Not really. The CHEBY preconditioner will work without an explicit >>>>>> matrix, however its not great by itself. >>>>>> >>>>>> >>>>>>> 2) For user-provided preconditioner matrix, I saw some example codes >>>>>>> that provide approx. Jacobian matrix as preconditioner matrix. But this >>>>>>> means that I should derive approx. Jacobian mat for system, am I right? >>>>>>> >>>>>> >>>>>> Yes. >>>>>> >>>>>> >>>>>>> 3) I'd like to know which is the fastest way to solve with GMRES >>>>>>> method. Could you tell me or let me know any other examples? >>>>>>> >>>>>> >>>>>> 1) The solve depends greatly on the physics/matrix you are using. >>>>>> Without knowing that, we can't say anything. For example, is >>>>>> the system elliptic? If so, then using Multigrid (MG) is generally a >>>>>> good idea. >>>>>> >>>>>> 2) In general, I think its a mistake to think of GMRES or any KSP as >>>>>> a solver. We should think of them as accelerators for solvers, >>>>>> as they were originally intended. For example, MG is a good solver >>>>>> for elliptic CFD equations, as long as you somehow deal with >>>>>> incompressibility. Then you can use GMRES to cleanup some things you >>>>>> miss when implementing your MG solver. >>>>>> >>>>>> 3) The best thing to do in this case is to look at the literature, >>>>>> which is voluminous, and find the solver you want to implement. PETSc >>>>>> really speeds up the actually implementation and testing. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thank you very much for your help. >>>>>>> >>>>>>> Sincerely, >>>>>>> >>>>>>> Kyungjun. >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ibarletta at inogs.it Wed Aug 31 10:32:46 2016 From: ibarletta at inogs.it (Ivano Barletta) Date: Wed, 31 Aug 2016 17:32:46 +0200 Subject: [petsc-users] Number of Iteration of KSP and relative tolerance Message-ID: Dear Petsc Users I'm using Petsc to solve an elliptic equation The code can be run in parallel but I'm running some tests in sequential by the moment When I look at the output, what it looks odd to me is that the relative tolerance that I set is not fulfilled. I've set -ksp_rtol 1e-8 in my runtime options but the solver stops when the ratio || r || / || b || is still 9e-8, then almost one order of magnitude greater of the rtol that I set (as you can see in the txt in attachment). My question is, isn't the solver supposed to make other few iterations to reach the relative tolerance? 
Thanks in advance for replies and suggestions Kind Regards Ivano P.S. my runtime options are these: -ksp_monitor_true_residual -ksp_type cg -ksp_converged_reason -ksp_view -ksp_rtol 1e-8 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- 0: norm of rhs: 439.879419781616 0: 0 KSP preconditioned resid norm 9,632653071666e-09 true resid norm 4,398794197816e+02 ||r(i)||/||b|| 1,000000000000e+00 0: 1 KSP preconditioned resid norm 2,171633143101e-09 true resid norm 2,556604849166e+02 ||r(i)||/||b|| 5,812058337339e-01 0: 2 KSP preconditioned resid norm 1,441343422857e-09 true resid norm 2,500950008116e+02 ||r(i)||/||b|| 5,685535389125e-01 0: 3 KSP preconditioned resid norm 1,214236434534e-09 true resid norm 2,208035386889e+02 ||r(i)||/||b|| 5,019637854358e-01 0: 4 KSP preconditioned resid norm 1,086804365216e-09 true resid norm 2,004011071514e+02 ||r(i)||/||b|| 4,555819120861e-01 0: 5 KSP preconditioned resid norm 7,736107826135e-10 true resid norm 1,700068699361e+02 ||r(i)||/||b|| 3,864851645493e-01 0: 6 KSP preconditioned resid norm 5,011384091121e-10 true resid norm 1,237597947404e+02 ||r(i)||/||b|| 2,813493634275e-01 0: 7 KSP preconditioned resid norm 4,318011974073e-10 true resid norm 1,000907681554e+02 ||r(i)||/||b|| 2,275413753276e-01 0: 8 KSP preconditioned resid norm 4,380667762565e-10 true resid norm 9,394931213238e+01 ||r(i)||/||b|| 2,135796945877e-01 0: 9 KSP preconditioned resid norm 3,780427851011e-10 true resid norm 8,101250279573e+01 ||r(i)||/||b|| 1,841697955225e-01 0: 10 KSP preconditioned resid norm 3,058003827193e-10 true resid norm 7,199892948255e+01 ||r(i)||/||b|| 1,636787861507e-01 0: 11 KSP preconditioned resid norm 2,474492702547e-10 true resid norm 6,123616634315e+01 ||r(i)||/||b|| 1,392112556063e-01 0: 12 KSP preconditioned resid norm 2,095988234942e-10 true resid norm 5,724994684966e+01 ||r(i)||/||b|| 1,301491824239e-01 0: 13 KSP preconditioned resid norm 1,884061587633e-10 true resid norm 4,972054611666e+01 ||r(i)||/||b|| 1,130322171957e-01 0: 14 KSP preconditioned resid norm 1,405045488203e-10 true resid norm 4,140683767500e+01 ||r(i)||/||b|| 9,413224582218e-02 0: 15 KSP preconditioned resid norm 1,286551939078e-10 true resid norm 3,451264974383e+01 ||r(i)||/||b|| 7,845934179181e-02 0: 16 KSP preconditioned resid norm 1,178683579708e-10 true resid norm 2,939054790173e+01 ||r(i)||/||b|| 6,681501015964e-02 0: 17 KSP preconditioned resid norm 8,234959252122e-11 true resid norm 2,285365043457e+01 ||r(i)||/||b|| 5,195435250395e-02 0: 18 KSP preconditioned resid norm 7,509832956946e-11 true resid norm 1,751214986415e+01 ||r(i)||/||b|| 3,981125071240e-02 0: 19 KSP preconditioned resid norm 5,409124168355e-11 true resid norm 1,352858531620e+01 ||r(i)||/||b|| 3,075521315117e-02 0: 20 KSP preconditioned resid norm 3,995162840424e-11 true resid norm 1,056894250592e+01 ||r(i)||/||b|| 2,402690835403e-02 0: 21 KSP preconditioned resid norm 3,018772359234e-11 true resid norm 8,491273353790e+00 ||r(i)||/||b|| 1,930363861534e-02 0: 22 KSP preconditioned resid norm 2,319456860895e-11 true resid norm 6,521467351121e+00 ||r(i)||/||b|| 1,482557959715e-02 0: 23 KSP preconditioned resid norm 1,714661331213e-11 true resid norm 5,068319413857e+00 ||r(i)||/||b|| 1,152206533412e-02 0: 24 KSP preconditioned resid norm 1,308225317474e-11 true resid norm 3,923590511912e+00 ||r(i)||/||b|| 8,919695569891e-03 0: 25 KSP preconditioned resid norm 1,065548274277e-11 true resid norm 3,349720325035e+00 ||r(i)||/||b|| 7,615087622646e-03 
0: 26 KSP preconditioned resid norm 9,288082413605e-12 true resid norm 2,839180689893e+00 ||r(i)||/||b|| 6,454452202611e-03 0: 27 KSP preconditioned resid norm 8,129874937446e-12 true resid norm 2,516801431301e+00 ||r(i)||/||b|| 5,721571226385e-03 0: 28 KSP preconditioned resid norm 6,251370974549e-12 true resid norm 2,069095898397e+00 ||r(i)||/||b|| 4,703779729964e-03 0: 29 KSP preconditioned resid norm 5,403561134116e-12 true resid norm 1,703050622152e+00 ||r(i)||/||b|| 3,871630600490e-03 0: 30 KSP preconditioned resid norm 4,223047249100e-12 true resid norm 1,498492398164e+00 ||r(i)||/||b|| 3,406598105699e-03 0: 31 KSP preconditioned resid norm 3,334181196298e-12 true resid norm 1,250453513567e+00 ||r(i)||/||b|| 2,842718839149e-03 0: 32 KSP preconditioned resid norm 2,524440622906e-12 true resid norm 9,582339369648e-01 ||r(i)||/||b|| 2,178401384271e-03 0: 33 KSP preconditioned resid norm 2,235471174413e-12 true resid norm 7,890045237748e-01 ||r(i)||/||b|| 1,793683651230e-03 0: 34 KSP preconditioned resid norm 1,856262698124e-12 true resid norm 6,585027575574e-01 ||r(i)||/||b|| 1,497007425090e-03 0: 35 KSP preconditioned resid norm 1,368876723850e-12 true resid norm 4,986313783422e-01 ||r(i)||/||b|| 1,133563781160e-03 0: 36 KSP preconditioned resid norm 1,232999096255e-12 true resid norm 4,243119220237e-01 ||r(i)||/||b|| 9,646096246884e-04 0: 37 KSP preconditioned resid norm 1,059574749098e-12 true resid norm 3,698114973392e-01 ||r(i)||/||b|| 8,407110692352e-04 0: 38 KSP preconditioned resid norm 7,934364061187e-13 true resid norm 3,099505530599e-01 ||r(i)||/||b|| 7,046261750863e-04 0: 39 KSP preconditioned resid norm 7,439958768077e-13 true resid norm 2,561115153046e-01 ||r(i)||/||b|| 5,822311837907e-04 0: 40 KSP preconditioned resid norm 6,138328821745e-13 true resid norm 2,374365553293e-01 ||r(i)||/||b|| 5,397764583921e-04 0: 41 KSP preconditioned resid norm 4,957137308350e-13 true resid norm 1,864936647074e-01 ||r(i)||/||b|| 4,239654239790e-04 0: 42 KSP preconditioned resid norm 3,626526442636e-13 true resid norm 1,431380534998e-01 ||r(i)||/||b|| 3,254029333105e-04 0: 43 KSP preconditioned resid norm 3,020992100659e-13 true resid norm 1,179620641634e-01 ||r(i)||/||b|| 2,681690910248e-04 0: 44 KSP preconditioned resid norm 2,225548801089e-13 true resid norm 9,360224549511e-02 ||r(i)||/||b|| 2,127906905524e-04 0: 45 KSP preconditioned resid norm 1,809693413599e-13 true resid norm 7,443272262501e-02 ||r(i)||/||b|| 1,692116504609e-04 0: 46 KSP preconditioned resid norm 1,372065895247e-13 true resid norm 5,900930182328e-02 ||r(i)||/||b|| 1,341488125373e-04 0: 47 KSP preconditioned resid norm 1,208983050044e-13 true resid norm 4,617672978018e-02 ||r(i)||/||b|| 1,049758813520e-04 0: 48 KSP preconditioned resid norm 8,503926168226e-14 true resid norm 3,758170008713e-02 ||r(i)||/||b|| 8,543636823424e-05 0: 49 KSP preconditioned resid norm 7,569271928792e-14 true resid norm 2,756193861109e-02 ||r(i)||/||b|| 6,265794072561e-05 0: 50 KSP preconditioned resid norm 4,935003512052e-14 true resid norm 2,057875530499e-02 ||r(i)||/||b|| 4,678271903515e-05 0: 51 KSP preconditioned resid norm 4,123111257025e-14 true resid norm 1,560285082507e-02 ||r(i)||/||b|| 3,547074521653e-05 0: 52 KSP preconditioned resid norm 2,918330157388e-14 true resid norm 1,175568462832e-02 ||r(i)||/||b|| 2,672478888454e-05 0: 53 KSP preconditioned resid norm 2,165351561155e-14 true resid norm 8,746606938065e-03 ||r(i)||/||b|| 1,988410128941e-05 0: 54 KSP preconditioned resid norm 1,886014127891e-14 true resid norm 
7,447830540951e-03 ||r(i)||/||b|| 1,693152760965e-05 0: 55 KSP preconditioned resid norm 1,485420723874e-14 true resid norm 5,811624702993e-03 ||r(i)||/||b|| 1,321185861771e-05 0: 56 KSP preconditioned resid norm 1,190524430303e-14 true resid norm 4,818891484988e-03 ||r(i)||/||b|| 1,095502828339e-05 0: 57 KSP preconditioned resid norm 1,135298601262e-14 true resid norm 4,028016670720e-03 ||r(i)||/||b|| 9,157092806750e-06 0: 58 KSP preconditioned resid norm 8,835732623967e-15 true resid norm 3,581369088290e-03 ||r(i)||/||b|| 8,141706402332e-06 0: 59 KSP preconditioned resid norm 6,601144858500e-15 true resid norm 2,916186245046e-03 ||r(i)||/||b|| 6,629512802607e-06 0: 60 KSP preconditioned resid norm 5,573564152983e-15 true resid norm 2,254185923764e-03 ||r(i)||/||b|| 5,124554190063e-06 0: 61 KSP preconditioned resid norm 4,348118360429e-15 true resid norm 1,856399919560e-03 ||r(i)||/||b|| 4,220247267948e-06 0: 62 KSP preconditioned resid norm 3,564748099839e-15 true resid norm 1,483306639981e-03 ||r(i)||/||b|| 3,372075558156e-06 0: 63 KSP preconditioned resid norm 2,502729819898e-15 true resid norm 1,124605293813e-03 ||r(i)||/||b|| 2,556621754141e-06 0: 64 KSP preconditioned resid norm 1,953973271399e-15 true resid norm 8,579066324068e-04 ||r(i)||/||b|| 1,950322278848e-06 0: 65 KSP preconditioned resid norm 1,510562005540e-15 true resid norm 6,504461230781e-04 ||r(i)||/||b|| 1,478691872880e-06 0: 66 KSP preconditioned resid norm 1,031183935808e-15 true resid norm 4,871796605801e-04 ||r(i)||/||b|| 1,107530015435e-06 0: 67 KSP preconditioned resid norm 8,524858201959e-16 true resid norm 3,639142205670e-04 ||r(i)||/||b|| 8,273044934625e-07 0: 68 KSP preconditioned resid norm 5,870210236903e-16 true resid norm 2,580840567039e-04 ||r(i)||/||b|| 5,867154613236e-07 0: 69 KSP preconditioned resid norm 4,018511060069e-16 true resid norm 1,933551250949e-04 ||r(i)||/||b|| 4,395639268390e-07 0: 70 KSP preconditioned resid norm 3,327707911589e-16 true resid norm 1,477827907372e-04 ||r(i)||/||b|| 3,359620479870e-07 0: 71 KSP preconditioned resid norm 2,458283001628e-16 true resid norm 1,169124786837e-04 ||r(i)||/||b|| 2,657830155858e-07 0: 72 KSP preconditioned resid norm 2,127388665524e-16 true resid norm 9,184205349023e-05 ||r(i)||/||b|| 2,087891575737e-07 0: 73 KSP preconditioned resid norm 1,495585985687e-16 true resid norm 6,863031725036e-05 ||r(i)||/||b|| 1,560207506058e-07 0: 74 KSP preconditioned resid norm 1,174238330276e-16 true resid norm 5,246571049336e-05 ||r(i)||/||b|| 1,192729373868e-07 0: 75 KSP preconditioned resid norm 9,317459720425e-17 true resid norm 4,120877969601e-05 ||r(i)||/||b|| 9,368199066115e-08 0:Linear solve converged due to CONVERGED_RTOL iterations 75 0:KSP Object: 1 MPI processes 0: type: cg 0: maximum iterations=10000, initial guess is zero 0: tolerances: relative=1e-08, absolute=1e-50, divergence=10000. 0: left preconditioning 0: using PRECONDITIONED norm type for convergence test 0:PC Object: 1 MPI processes 0: type: bjacobi 0: block Jacobi: number of blocks = 1 0: Local solve is same for all blocks, in the following KSP and PC objects: 0: KSP Object: (sub_) 1 MPI processes 0: type: preonly 0: maximum iterations=10000, initial guess is zero 0: tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
0: left preconditioning 0: using NONE norm type for convergence test 0: PC Object: (sub_) 1 MPI processes 0: type: ilu 0: ILU: out-of-place factorization 0: 0 levels of fill 0: tolerance for zero pivot 2,22045e-14 0: matrix ordering: natural 0: factor fill ratio given 1., needed 1. 0: Factored matrix follows: 0: Mat Object: 1 MPI processes 0: type: seqaij 0: rows=27118, cols=27118 0: package used to perform factorization: petsc 0: total: nonzeros=134928, allocated nonzeros=134928 0: total number of mallocs used during MatSetValues calls =0 0: not using I-node routines 0: linear system matrix = precond matrix: 0: Mat Object: 1 MPI processes 0: type: seqaij 0: rows=27118, cols=27118 0: total: nonzeros=134928, allocated nonzeros=135590 0: total number of mallocs used during MatSetValues calls =0 0: not using I-node routines 0: linear system matrix = precond matrix: 0: Mat Object: 1 MPI processes 0: type: mpiaij 0: rows=27118, cols=27118 0: total: nonzeros=134928, allocated nonzeros=189826 0: total number of mallocs used during MatSetValues calls =0 0: not using I-node (on process 0) routines 0: iterations 75 0: norm of residual: 4.120877969601395E-005 0: solver time: 0.139440812170506 0: calculating true residual of petsc ksp 0: true norm of residual: 4.120877960391687E-005 From knepley at gmail.com Wed Aug 31 10:45:26 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 31 Aug 2016 10:45:26 -0500 Subject: [petsc-users] Number of Iteration of KSP and relative tolerance In-Reply-To: References: Message-ID: On Wed, Aug 31, 2016 at 10:32 AM, Ivano Barletta wrote: > Dear Petsc Users > > I'm using Petsc to solve an elliptic equation > > The code can be run in parallel but I'm running > some tests in sequential by the moment > > When I look at the output, what it looks odd to > me is that the relative tolerance that I set is > not fulfilled. > I've set -ksp_rtol 1e-8 in my runtime options > but the solver stops when the ratio > || r || / || b || is still 9e-8, then almost > one order of magnitude greater of the rtol > that I set (as you can see in the txt in attachment). > > My question is, isn't the solver supposed to > make other few iterations to reach the relative tolerance? > Here is the code: https://bitbucket.org/petsc/petsc/src/a4f377b2def412a6138fd768d05e3a45fdbb68fb/src/ksp/ksp/interface/iterativ.c?at=master&fileviewer=file-view-default#iterativ.c-725 If you have a zero initial guess, we do not use ||b||, but ||r0||. In this case, the ratio ||r_75|| / ||r_0|| < 1.0e-8 when using preconditioned residuals. Thanks, Matt > Thanks in advance for replies and suggestions > Kind Regards > Ivano > > P.S. my runtime options are these: > -ksp_monitor_true_residual -ksp_type cg -ksp_converged_reason -ksp_view > -ksp_rtol 1e-8 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Aug 31 10:46:49 2016 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 31 Aug 2016 11:46:49 -0400 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: <8737lmccj2.fsf@jedbrown.org> <87d1kpbmna.fsf@jedbrown.org> Message-ID: And you can't get much more detail with hypre because it does not record performance data. Or can you get hypre to print its performance data? ML uses more PETSc stuff, you can get the PtAP time, which is most of the matrix setup. 
GAMG is native and has more timers. In addition to PtAP there is P0 smoothing, which is listed along with some other parts of the GAMG (mesh) setup. On Wed, Aug 31, 2016 at 6:28 AM, Matthew Knepley wrote: > On Wed, Aug 31, 2016 at 5:23 AM, Justin Chang wrote: > >> Matt, >> >> So is the "solve phase" going to be KSPSolve() - PCSetUp()? >> > > Setup Phase: KSPSetUp + PCSetup > > Solve Phase: SNESSolve > This contains SNESFunctionEval, SNESJacobianEval, KSPSolve > > Matt > > In other words, if I want to look at time/iterations, should it just be >> over KSPSolve or should I exclude the PC setup? >> >> Justin >> >> >> >> On Wed, Aug 31, 2016 at 5:13 AM, Matthew Knepley >> wrote: >> >>> On Wed, Aug 31, 2016 at 2:01 AM, Justin Chang >>> wrote: >>> >>>> Attached is the -log_view output (from firedrake). Event Stage 1: >>>> Linear_solver is where I assemble and solve the linear system of equations. >>>> >>>> I am using the HYPRE BoomerAMG preconditioner so log_view cannot "see >>>> into" the exact steps, but based on what it can see, how do I distinguish >>>> between these various setup and timing phases? >>>> >>>> For example, when I look at these lines: >>>> >>>> PCSetUp 1 1.0 2.2858e+00 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 9 0 0 0 0 11 0 0 0 0 0 >>>> PCApply 38 1.0 1.4102e+01 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 56 0 0 0 0 66 0 0 0 0 0 >>>> KSPSetUp 1 1.0 9.9111e-04 1.0 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> KSPSolve 1 1.0 1.7529e+01 1.0 2.44e+09 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 70 7 0 0 0 82 7 0 0 0 139 >>>> SNESSolve 1 1.0 2.1056e+01 1.0 3.75e+10 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 84100 0 0 0 99100 0 0 0 1781 >>>> SNESFunctionEval 1 1.0 1.0763e+00 1.0 1.07e+10 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 4 29 0 0 0 5 29 0 0 0 9954 >>>> SNESJacobianEval 1 1.0 2.4495e+00 1.0 2.43e+10 1.0 0.0e+00 >>>> 0.0e+00 0.0e+00 10 65 0 0 0 12 65 0 0 0 9937 >>>> >>>> So how do I break down "mesh setup", "matrix setup", and "solve time" >>>> phases? I am guessing "PCSetUp" has to do with one of the first two phases, >>>> but how would I categorize the rest of the events? I see that HYPRE doesn't >>>> have as much information as the other PCs like GAMG and ML but can one >>>> still breakdown the timing phases through log_view alone? >>>> >>> >>> 1) It looks like you call PCSetUp() yourself, since otherwise KSPSetUp() >>> would contain that time. Notice that you can ignore KSPSetUp() here. >>> >>> 2) The setup time is usually KSPSetUp(), but if here you add to it >>> PCSetUp() since you called it. >>> >>> 3) The solve time for SNES can be split into >>> >>> a) KSPSolve() for the update calculation >>> >>> b) SNESFunctionEval, SNESJacobianEval for everything else (conv check, >>> line search, J calc, etc.) or you can just take SNESSolve() - KSPSolve() >>> >>> 4) Note that PCApply() is most of KSPSolve(), which is generally good >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Justin >>>> >>>> On Tue, Aug 30, 2016 at 11:14 PM, Jed Brown wrote: >>>> >>>>> Mark Adams writes: >>>>> >>>>> >> >>>>> >> >>>>> >> Anyway, what I really wanted to say is, it's good to know that these >>>>> >> "dynamic range/performance spectrum/static scaling" plots are >>>>> designed to >>>>> >> go past the sweet spots. I also agree that it would be interesting >>>>> to see a >>>>> >> time vs dofs*iterations/time plot. Would it then also be useful to >>>>> look at >>>>> >> the step to setting up the preconditioner? 
>>>>> >> >>>>> >> >>>>> > Yes, I generally split up timing between "mesh setup" (symbolic >>>>> > factorization of LU), "matrix setup" (eg, factorizations), and solve >>>>> time. >>>>> > The degree of amortization that you get for the two setup phases >>>>> depends on >>>>> > your problem and so it is useful to separate them. >>>>> >>>>> Right, there is nothing wrong with splitting up the phases, but if you >>>>> never show a spectrum for the total, then I will be suspicious. And if >>>>> you only show "per iteration" instead of for a complete solve, then I >>>>> will assume that you're only doing that because convergence is unusably >>>>> slow. >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Wed Aug 31 10:54:34 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Wed, 31 Aug 2016 16:54:34 +0100 Subject: [petsc-users] strong-scaling vs weak-scaling In-Reply-To: References: <8737lmccj2.fsf@jedbrown.org> <87d1kpbmna.fsf@jedbrown.org> Message-ID: <57C6FDBA.9040608@imperial.ac.uk> On 31/08/16 16:46, Mark Adams wrote: > And you can't get much more detail with hypre because it does not > record performance data. Or can you get hypre to print its performance > data? -pc_hypre_boomeramg_print_statistics -pc_hypre_boomeramg_print_debug Should give you some info, not sure it's useful. Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: OpenPGP digital signature URL: From bsmith at mcs.anl.gov Wed Aug 31 12:51:37 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 31 Aug 2016 12:51:37 -0500 Subject: [petsc-users] Number of Iteration of KSP and relative tolerance In-Reply-To: References: Message-ID: From the KSP view using PRECONDITIONED norm type for convergence test it is using the ratio of the preconditioned residual norms for the convergence test, not the true residual norms. If you want to use the true residual norm use -ksp_pc_side right See also KSPSetNormType() and KSPSetPCSide() Barry > On Aug 31, 2016, at 10:32 AM, Ivano Barletta wrote: > > Dear Petsc Users > > I'm using Petsc to solve an elliptic equation > > The code can be run in parallel but I'm running > some tests in sequential by the moment > > When I look at the output, what it looks odd to > me is that the relative tolerance that I set is > not fulfilled. > I've set -ksp_rtol 1e-8 in my runtime options > but the solver stops when the ratio > || r || / || b || is still 9e-8, then almost > one order of magnitude greater of the rtol > that I set (as you can see in the txt in attachment). > > My question is, isn't the solver supposed to > make other few iterations to reach the relative tolerance? > > Thanks in advance for replies and suggestions > Kind Regards > Ivano > > P.S. my runtime options are these: > -ksp_monitor_true_residual -ksp_type cg -ksp_converged_reason -ksp_view -ksp_rtol 1e-8 > >