[petsc-users] Null space for Poisson with zero Neumann BCs
Åsmund Ervik
asmund.ervik at ntnu.no
Mon Nov 18 11:56:57 CST 2013
Hi again,
Never mind this question.
It turned out the residual tolerance was overly strict for this way of
removing the singularity. Relaxing it a notch (from 1e-15 to 1e-13) makes
both methods equally fast. I still don't know why my "dirty" method was
faster at the stricter tolerance.
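
In case anyone is curious, the change boils down to something like the
sketch below (assuming the tolerance in question is the KSP relative
tolerance; the PETSC_DEFAULT_* names follow newer PETSc, and older
versions spell the defaults slightly differently):

  ! Relax the relative tolerance of the pressure Poisson solve.
  ! ksp is the KSP object used for that solve.
  call KSPSetTolerances(ksp, 1.0d-13, PETSC_DEFAULT_REAL, &
                        PETSC_DEFAULT_REAL, PETSC_DEFAULT_INTEGER, ierr)
  ! or, equivalently, at run time: -ksp_rtol 1e-13
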
- Åsmund
On 18 Nov 2013 18:33, Åsmund Ervik wrote:
> Good people of PETSc,
>
> I have read in the list archives that setting the null space is the
> preferred way of handling the singularity of the pressure Poisson
> equation, and that setting the pressure value at one point is not
> recommended.
>
> I have tried this advice using the following code:
>
> call MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,nullspace,ierr)
> call MatSetNullSpace(A,nullspace,ierr)
> call MatNullSpaceRemove(nullspace,rhs_vec,PETSC_NULL_OBJECT,ierr)
> call MatNullSpaceDestroy(nullspace,ierr)
>
> after assembling the mat/vec and before KSPSetOperators, KSPSolve etc.
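>
> For reference, the calls end up ordered around the solve roughly like
> this (ksp and sol_vec stand in for my actual solver object and solution
> vector, and the SAME_NONZERO_PATTERN flag is filled in for illustration;
> the real code differs in some details):
>
> ! Attach the constant null space to A and make the RHS consistent,
> ! then solve; finally release the null space object.
> call MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,nullspace,ierr)
> call MatSetNullSpace(A,nullspace,ierr)
> call MatNullSpaceRemove(nullspace,rhs_vec,PETSC_NULL_OBJECT,ierr)
> call KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN,ierr)
> call KSPSolve(ksp,rhs_vec,sol_vec,ierr)
> call MatNullSpaceDestroy(nullspace,ierr)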
>
> With this I get a better-looking pressure field, but the runtime is
> horrible (about 10x longer) compared to my old "set the value at some
> point" method. I also see that some time steps take a long time while
> others go fast, even though the Poisson equation should be essentially
> the same for all of them, since the velocity field is changing slowly.
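>
> For reference, the old method pins the pressure in a single cell,
> roughly along the lines below (irow is a placeholder for the global
> index of the pinned cell, owned by this process; my actual code differs
> in detail):
>
> PetscInt :: pinned(1)
> pinned(1) = irow
> ! Replace row irow of A with an identity row and zero the RHS there,
> ! which fixes the pressure to zero at that cell.
> call MatZeroRows(A,1,pinned,1.0d0,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr)
> call VecSetValue(rhs_vec,irow,0.0d0,INSERT_VALUES,ierr)
> call VecAssemblyBegin(rhs_vec,ierr)
> call VecAssemblyEnd(rhs_vec,ierr)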
>
> Below is the -log_summary output for the "old method" (first) and the
> "set nullspace" method (second). (And yes, I know I'm running an old
> version of "dev", but I've been too swamped to change to the git setup.)
> Any suggestions on improving the runtime?
>
> Best regards,
> Åsmund
>
>
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
> ./run on a arch-linux2-c-opt named ty with 1 processor, by asmunder Mon Nov 18 18:29:26 2013
> Using Petsc Development HG revision: 3a41f882cfc717ec37b4c7f6b31f43b10211af66  HG Date: Sun Feb 17 13:07:58 2013 -0600
>
> Max Max/Min Avg Total
> Time (sec): 3.617e+00 1.00000 3.617e+00
> Objects: 4.400e+01 1.00000 4.400e+01
> Flops: 2.051e+07 1.00000 2.051e+07 2.051e+07
> Flops/sec: 5.670e+06 1.00000 5.670e+06 5.670e+06
> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Reductions: 1.620e+02 1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N --> 2N flops
>                             and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 3.6173e+00 100.0%  2.0508e+07 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  1.610e+02  99.4%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
> Count: number of times phase was executed
> Time and Flops: Max - maximum over all processors
> Ratio - ratio of maximum to minimum over all processors
> Mess: number of messages sent
> Avg. len: average message length (bytes)
> Reduct: number of global reductions
> Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> ThreadCommRunKer    1628 1.0 9.5713e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecDot               234 1.0 5.8794e-04 1.0 1.92e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   0  9  0  0  0  3260
> VecDotNorm2          117 1.0 5.3787e-04 1.0 1.92e+06 1.0 0.0e+00 0.0e+00 1.2e+02  0  9  0  0 72   0  9  0  0 73  3564
> VecNorm              177 1.0 4.3344e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecCopy               60 1.0 1.6022e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet               394 1.0 5.6601e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY               30 1.0 1.1802e-04 1.0 2.46e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  2082
> VecAXPBYCZ           234 1.0 1.3099e-03 1.0 3.83e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0 19  0  0  0   0 19  0  0  0  2927
> VecWAXPY             234 1.0 1.1146e-03 1.0 1.92e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   0  9  0  0  0  1720
> VecAssemblyBegin      30 1.0 4.7684e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyEnd        30 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMult              264 1.0 5.7216e-03 1.0 1.07e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0 52  0  0  0   0 52  0  0  0  1866
> MatConvert            30 1.0 1.8499e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin      30 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd        30 1.0 8.7762e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRowIJ           30 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetUp              30 1.0 1.9836e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  4   0  0  0  0  4     0
> KSPSolve              30 1.0 2.1702e-01 1.0 2.05e+07 1.0 0.0e+00 0.0e+00 1.6e+02  6 100 0  0 97   6 100 0  0 98    95
> PCSetUp               30 1.0 8.2594e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  2  0  0  0  2   2  0  0  0  2     0
> PCApply              294 1.0 1.2338e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type Creations Destructions Memory Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
> Vector 40 40 1370240 0
> Matrix 1 1 313964 0
> Krylov Solver 1 1 1288 0
> Preconditioner 1 1 1248 0
> Viewer 1 0 0 0
> ========================================================================================================================
> Average time to get PetscTime(): 0
> #PETSc Option Table entries:
> -ksp_type bcgs
> -log_summary
> -pc_hypre_type boomeramg
> -pc_type hypre
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at: Tue Feb 19 15:18:58 2013
> Configure options: --with-threadcomm --with-pthreadclasses --with-openmp
> --with-cc=icc --with-fc=ifort --with-debugging=0
> --with-shared-libraries=1 --download-mpich --download-hypre
> COPTFLAGS=-O3 FOPTFLAGS=-O3
> -----------------------------------------
> Libraries compiled on Tue Feb 19 15:18:58 2013 on ty
> Machine characteristics: Linux-3.7.9-1-ARCH-x86_64-with-glibc2.2.5
> Using PETSc directory: /opt/petsc/petsc-dev-ifort
> Using PETSc arch: arch-linux2-c-opt
> -----------------------------------------
>
> Using C compiler: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
> -fPIC -wd1572 -O3 -fopenmp ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90 -fPIC -O3
> -fopenmp ${FOPTFLAGS} ${FFLAGS}
> -----------------------------------------
>
> Using include paths:
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -----------------------------------------
>
> Using C linker: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
> Using Fortran linker:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90
> Using libraries:
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lpetsc
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lHYPRE
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2
> -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2 -lmpichcxx -llapack -lblas
> -lX11 -lpthread -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl
> -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lirng -lipgo -ldecimal
> -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
> -----------------------------------------
>
>
>
>
>
>
>
>
> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>
> ./run on a arch-linux2-c-opt named ty with 1 processor, by asmunder Mon Nov 18 18:22:38 2013
> Using Petsc Development HG revision: 3a41f882cfc717ec37b4c7f6b31f43b10211af66  HG Date: Sun Feb 17 13:07:58 2013 -0600
>
> Max Max/Min Avg Total
> Time (sec): 9.378e+01 1.00000 9.378e+01
> Objects: 7.400e+01 1.00000 7.400e+01
> Flops: 1.630e+10 1.00000 1.630e+10 1.630e+10
> Flops/sec: 1.738e+08 1.00000 1.738e+08 1.738e+08
> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
> MPI Reductions: 3.004e+05 1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N --> 2N flops
>                             and VecAXPY() for complex vectors of length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 9.3784e+01 100.0%  1.6300e+10 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  3.004e+05 100.0%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
> Count: number of times phase was executed
> Time and Flops: Max - maximum over all processors
> Ratio - ratio of maximum to minimum over all processors
> Mess: number of messages sent
> Avg. len: average message length (bytes)
> Reduct: number of global reductions
> Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> ThreadCommRunKer 1101474 1.0 6.9823e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  7  0  0  0  0   7  0  0  0  0     0
> VecDot            200206 1.0 4.7812e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1 10  0  0  0   1 10  0  0  0  3430
> VecDotNorm2       100103 1.0 4.3580e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 1.0e+05  0 10  0  0 33   0 10  0  0 33  3763
> VecNorm           100163 1.0 2.1225e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecCopy               60 1.0 1.6689e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet            200366 1.0 2.3949e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY               30 1.0 1.1563e-04 1.0 2.46e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2125
> VecAXPBYCZ        200206 1.0 1.1021e+00 1.0 3.28e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1 20  0  0  0   1 20  0  0  0  2976
> VecWAXPY          200206 1.0 9.3084e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1 10  0  0  0   1 10  0  0  0  1762
> VecAssemblyBegin      30 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyEnd        30 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMult           200236 1.0 4.2707e+00 1.0 8.10e+09 1.0 0.0e+00 0.0e+00 0.0e+00  5 50  0  0  0   5 50  0  0  0  1896
> MatConvert            30 1.0 1.9252e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin      30 1.0 1.0967e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd        30 1.0 8.7786e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRowIJ           30 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetUp              30 1.0 1.8668e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve              30 1.0 8.9967e+01 1.0 1.63e+10 1.0 0.0e+00 0.0e+00 3.0e+05 96 100 0  0 100 96 100 0  0 100   181
> PCSetUp               30 1.0 8.3021e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCApply           200266 1.0 8.0928e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 86  0  0  0  0  86  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type Creations Destructions Memory Descendants' Mem.
> Reports information only for process 0.
>
> --- Event Stage 0: Main Stage
>
> Vector 40 40 1370240 0
> Matrix 1 1 313964 0
> Matrix Null Space 30 30 18120 0
> Krylov Solver 1 1 1288 0
> Preconditioner 1 1 1248 0
> Viewer 1 0 0 0
> ========================================================================================================================
> Average time to get PetscTime(): 9.53674e-08
> #PETSc Option Table entries:
> -ksp_type bcgs
> -log_summary
> -pc_hypre_type boomeramg
> -pc_type hypre
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at: Tue Feb 19 15:18:58 2013
> Configure options: --with-threadcomm --with-pthreadclasses --with-openmp
> --with-cc=icc --with-fc=ifort --with-debugging=0
> --with-shared-libraries=1 --download-mpich --download-hypre
> COPTFLAGS=-O3 FOPTFLAGS=-O3
> -----------------------------------------
> Libraries compiled on Tue Feb 19 15:18:58 2013 on ty
> Machine characteristics: Linux-3.7.9-1-ARCH-x86_64-with-glibc2.2.5
> Using PETSc directory: /opt/petsc/petsc-dev-ifort
> Using PETSc arch: arch-linux2-c-opt
> -----------------------------------------
>
> Using C compiler: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
> -fPIC -wd1572 -O3 -fopenmp ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90 -fPIC -O3
> -fopenmp ${FOPTFLAGS} ${FFLAGS}
> -----------------------------------------
>
> Using include paths:
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -----------------------------------------
>
> Using C linker: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
> Using Fortran linker:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90
> Using libraries:
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lpetsc
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lHYPRE
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2
> -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2 -lmpichcxx -llapack -lblas
> -lX11 -lpthread -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl
> -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lirng -lipgo -ldecimal
> -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
> -----------------------------------------
>
>
>