[petsc-users] Null space for Poisson with zero Neumann BCs

Åsmund Ervik asmund.ervik at ntnu.no
Mon Nov 18 11:56:57 CST 2013


Hi again,

Never mind this question.

It turned out the residual tolerance was overly strict for this way of
removing the singularity. Relaxing it a notch (from 1e-15 to 1e-13) makes
both methods equally fast. I still don't know why my "dirty" method was
faster with the smaller residual.
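
For concreteness, a minimal sketch of setting such a tolerance as the KSP
relative tolerance (written against the newer PETSc Fortran modules; the
subroutine name is just a placeholder):

  subroutine relax_pressure_rtol(ksp, ierr)
#include <petsc/finclude/petscksp.h>
    use petscksp
    implicit none
    KSP            :: ksp
    PetscErrorCode :: ierr

    ! relax only the relative tolerance from 1e-15 to 1e-13; leave the
    ! absolute tolerance, divergence tolerance and iteration limit at
    ! their defaults
    call KSPSetTolerances(ksp, 1.0d-13, PETSC_DEFAULT_REAL, &
                          PETSC_DEFAULT_REAL, PETSC_DEFAULT_INTEGER, ierr)
  end subroutine relax_pressure_rtol

The same thing can be done from the command line with -ksp_rtol 1e-13.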

 - Åsmund

On 18. nov. 2013 18:33, Åsmund Ervik wrote:
> Good people of PETSc,
> 
> I have read in the list archives that setting the null space is the
> preferred way of handling the singularity of the pressure Poisson
> equation with pure Neumann BCs, and that pinning the pressure value at
> one point is not recommended.
> 
> I have tried this advice using the following code:
> 
> call MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,0,nullspace,ierr)
> call MatSetNullSpace(A,nullspace,ierr)
> call MatNullSpaceRemove(nullspace,rhs_vec,PETSC_NULL_OBJECT,ierr)
> call MatNullSpaceDestroy(nullspace,ierr)
> 
> after assembling the mat/vec and before KSPSetOperators, KSPSolve etc.
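> 
> For concreteness, a fuller, self-contained sketch of that per-time-step
> sequence (written against the newer PETSc Fortran modules, where
> MatNullSpaceRemove and KSPSetOperators take slightly different arguments
> than the calls above; the subroutine and variable names are just
> placeholders):
> 
>   subroutine solve_pressure(ksp, A, rhs_vec, p_vec, ierr)
> #include <petsc/finclude/petscksp.h>
>     use petscksp
>     implicit none
>     KSP            :: ksp
>     Mat            :: A
>     Vec            :: rhs_vec, p_vec
>     PetscErrorCode :: ierr
>     MatNullSpace   :: nullspace
> 
>     ! the null space of the pure-Neumann pressure Poisson operator is the
>     ! constant vector: has_cnst = PETSC_TRUE, no additional basis vectors
>     call MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL_VEC, &
>                             nullspace, ierr)
>     ! attach the null space to the operator
>     call MatSetNullSpace(A, nullspace, ierr)
>     ! project the constant component out of the RHS so the system is consistent
>     call MatNullSpaceRemove(nullspace, rhs_vec, ierr)
>     call MatNullSpaceDestroy(nullspace, ierr)
> 
>     call KSPSetOperators(ksp, A, A, ierr)
>     call KSPSolve(ksp, rhs_vec, p_vec, ierr)
>   end subroutine solve_pressure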
> 
> With this I get a better-looking pressure field, but the runtime is
> horrible (10x longer) compared to my old "set the value at some point"
> method. I also see that some time steps take a long time while others
> go quickly, even though the Poisson equation should be essentially the
> same for every step, since the velocity field changes slowly.
> 
> Below is log_summary for the "old method" (first) and "set nullspace"
> (second). (And yes, I know I'm running an old version of "dev", but I've
> been too swamped to switch to the git setup.) Any suggestions for
> improving the runtime?
> 
> Best regards,
> Åsmund
> 
> 
> 
> ---------------------------------------------- PETSc Performance
> Summary: ----------------------------------------------
> 
> ./run on a arch-linux2-c-opt named ty with 1 processor, by asmunder Mon
> Nov 18 18:29:26 2013
> Using Petsc Development HG revision:
> 3a41f882cfc717ec37b4c7f6b31f43b10211af66  HG Date: Sun Feb 17 13:07:58
> 2013 -0600
> 
>                          Max       Max/Min        Avg      Total
> Time (sec):           3.617e+00      1.00000   3.617e+00
> Objects:              4.400e+01      1.00000   4.400e+01
> Flops:                2.051e+07      1.00000   2.051e+07  2.051e+07
> Flops/sec:            5.670e+06      1.00000   5.670e+06  5.670e+06
> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Reductions:       1.620e+02      1.00000
> 
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N
> --> 2N flops
>                             and VecAXPY() for complex vectors of length
> N --> 8N flops
> 
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages
> ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts
> %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 3.6173e+00 100.0%  2.0508e+07 100.0%  0.000e+00
> 0.0%  0.000e+00        0.0%  1.610e+02  99.4%
> 
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length (bytes)
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this
> phase
>       %M - percent messages in this phase     %L - percent message
> lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops
>          --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
> 
> --- Event Stage 0: Main Stage
> 
> ThreadCommRunKer    1628 1.0 9.5713e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecDot               234 1.0 5.8794e-04 1.0 1.92e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  9  0  0  0   0  9  0  0  0  3260
> VecDotNorm2          117 1.0 5.3787e-04 1.0 1.92e+06 1.0 0.0e+00 0.0e+00
> 1.2e+02  0  9  0  0 72   0  9  0  0 73  3564
> VecNorm              177 1.0 4.3344e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecCopy               60 1.0 1.6022e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet               394 1.0 5.6601e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY               30 1.0 1.1802e-04 1.0 2.46e+05 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  1  0  0  0   0  1  0  0  0  2082
> VecAXPBYCZ           234 1.0 1.3099e-03 1.0 3.83e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0 19  0  0  0   0 19  0  0  0  2927
> VecWAXPY             234 1.0 1.1146e-03 1.0 1.92e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  9  0  0  0   0  9  0  0  0  1720
> VecAssemblyBegin      30 1.0 4.7684e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyEnd        30 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMult              264 1.0 5.7216e-03 1.0 1.07e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0 52  0  0  0   0 52  0  0  0  1866
> MatConvert            30 1.0 1.8499e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin      30 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd        30 1.0 8.7762e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRowIJ           30 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetUp              30 1.0 1.9836e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 6.0e+00  0  0  0  0  4   0  0  0  0  4     0
> KSPSolve              30 1.0 2.1702e-01 1.0 2.05e+07 1.0 0.0e+00 0.0e+00
> 1.6e+02  6100  0  0 97   6100  0  0 98    95
> PCSetUp               30 1.0 8.2594e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 4.0e+00  2  0  0  0  2   2  0  0  0  2     0
> PCApply              294 1.0 1.2338e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
> 
> Memory usage is given in bytes:
> 
> Object Type          Creations   Destructions     Memory  Descendants' Mem.
> Reports information only for process 0.
> 
> --- Event Stage 0: Main Stage
> 
>               Vector    40             40      1370240     0
>               Matrix     1              1       313964     0
>        Krylov Solver     1              1         1288     0
>       Preconditioner     1              1         1248     0
>               Viewer     1              0            0     0
> ========================================================================================================================
> Average time to get PetscTime(): 0
> #PETSc Option Table entries:
> -ksp_type bcgs
> -log_summary
> -pc_hypre_type boomeramg
> -pc_type hypre
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at: Tue Feb 19 15:18:58 2013
> Configure options: --with-threadcomm --with-pthreadclasses --with-openmp
> --with-cc=icc --with-fc=ifort --with-debugging=0
> --with-shared-libraries=1 --download-mpich --download-hypre
> COPTFLAGS=-O3 FOPTFLAGS=-O3
> -----------------------------------------
> Libraries compiled on Tue Feb 19 15:18:58 2013 on ty
> Machine characteristics: Linux-3.7.9-1-ARCH-x86_64-with-glibc2.2.5
> Using PETSc directory: /opt/petsc/petsc-dev-ifort
> Using PETSc arch: arch-linux2-c-opt
> -----------------------------------------
> 
> Using C compiler: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
>  -fPIC -wd1572 -O3 -fopenmp  ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90  -fPIC  -O3
> -fopenmp  ${FOPTFLAGS} ${FFLAGS}
> -----------------------------------------
> 
> Using include paths:
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -----------------------------------------
> 
> Using C linker: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
> Using Fortran linker:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90
> Using libraries:
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lpetsc
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lHYPRE
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2
> -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2 -lmpichcxx -llapack -lblas
> -lX11 -lpthread -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl
> -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lirng -lipgo -ldecimal
> -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
> -----------------------------------------
> 
> 
> 
> 
> 
> 
> 
> 
> ---------------------------------------------- PETSc Performance
> Summary: ----------------------------------------------
> 
> ./run on a arch-linux2-c-opt named ty with 1 processor, by asmunder Mon
> Nov 18 18:22:38 2013
> Using Petsc Development HG revision:
> 3a41f882cfc717ec37b4c7f6b31f43b10211af66  HG Date: Sun Feb 17 13:07:58
> 2013 -0600
> 
>                          Max       Max/Min        Avg      Total
> Time (sec):           9.378e+01      1.00000   9.378e+01
> Objects:              7.400e+01      1.00000   7.400e+01
> Flops:                1.630e+10      1.00000   1.630e+10  1.630e+10
> Flops/sec:            1.738e+08      1.00000   1.738e+08  1.738e+08
> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Reductions:       3.004e+05      1.00000
> 
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N
> --> 2N flops
>                             and VecAXPY() for complex vectors of length
> N --> 8N flops
> 
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages
> ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts
> %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 9.3784e+01 100.0%  1.6300e+10 100.0%  0.000e+00
> 0.0%  0.000e+00        0.0%  3.004e+05 100.0%
> 
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops: Max - maximum over all processors
>                    Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length (bytes)
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>       %T - percent time in this phase         %f - percent flops in this
> phase
>       %M - percent messages in this phase     %L - percent message
> lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops
>          --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
> 
> --- Event Stage 0: Main Stage
> 
> ThreadCommRunKer 1101474 1.0 6.9823e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  7  0  0  0  0   7  0  0  0  0     0
> VecDot            200206 1.0 4.7812e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  1 10  0  0  0   1 10  0  0  0  3430
> VecDotNorm2       100103 1.0 4.3580e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00
> 1.0e+05  0 10  0  0 33   0 10  0  0 33  3763
> VecNorm           100163 1.0 2.1225e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecCopy               60 1.0 1.6689e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet            200366 1.0 2.3949e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY               30 1.0 1.1563e-04 1.0 2.46e+05 1.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0  2125
> VecAXPBYCZ        200206 1.0 1.1021e+00 1.0 3.28e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  1 20  0  0  0   1 20  0  0  0  2976
> VecWAXPY          200206 1.0 9.3084e-01 1.0 1.64e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  1 10  0  0  0   1 10  0  0  0  1762
> VecAssemblyBegin      30 1.0 2.8610e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAssemblyEnd        30 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatMult           200236 1.0 4.2707e+00 1.0 8.10e+09 1.0 0.0e+00 0.0e+00
> 0.0e+00  5 50  0  0  0   5 50  0  0  0  1896
> MatConvert            30 1.0 1.9252e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin      30 1.0 1.0967e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd        30 1.0 8.7786e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetRowIJ           30 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetUp              30 1.0 1.8668e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve              30 1.0 8.9967e+01 1.0 1.63e+10 1.0 0.0e+00 0.0e+00
> 3.0e+05 96100  0  0100  96100  0  0100   181
> PCSetUp               30 1.0 8.3021e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCApply           200266 1.0 8.0928e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 86  0  0  0  0  86  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
> 
> Memory usage is given in bytes:
> 
> Object Type          Creations   Destructions     Memory  Descendants' Mem.
> Reports information only for process 0.
> 
> --- Event Stage 0: Main Stage
> 
>               Vector    40             40      1370240     0
>               Matrix     1              1       313964     0
>    Matrix Null Space    30             30        18120     0
>        Krylov Solver     1              1         1288     0
>       Preconditioner     1              1         1248     0
>               Viewer     1              0            0     0
> ========================================================================================================================
> Average time to get PetscTime(): 9.53674e-08
> #PETSc Option Table entries:
> -ksp_type bcgs
> -log_summary
> -pc_hypre_type boomeramg
> -pc_type hypre
> #End of PETSc Option Table entries
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> Configure run at: Tue Feb 19 15:18:58 2013
> Configure options: --with-threadcomm --with-pthreadclasses --with-openmp
> --with-cc=icc --with-fc=ifort --with-debugging=0
> --with-shared-libraries=1 --download-mpich --download-hypre
> COPTFLAGS=-O3 FOPTFLAGS=-O3
> -----------------------------------------
> Libraries compiled on Tue Feb 19 15:18:58 2013 on ty
> Machine characteristics: Linux-3.7.9-1-ARCH-x86_64-with-glibc2.2.5
> Using PETSc directory: /opt/petsc/petsc-dev-ifort
> Using PETSc arch: arch-linux2-c-opt
> -----------------------------------------
> 
> Using C compiler: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
>  -fPIC -wd1572 -O3 -fopenmp  ${COPTFLAGS} ${CFLAGS}
> Using Fortran compiler:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90  -fPIC  -O3
> -fopenmp  ${FOPTFLAGS} ${FFLAGS}
> -----------------------------------------
> 
> Using include paths:
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/include
> -I/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/include
> -----------------------------------------
> 
> Using C linker: /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpicc
> Using Fortran linker:
> /opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/bin/mpif90
> Using libraries:
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lpetsc
> -Wl,-rpath,/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib
> -L/opt/petsc/petsc-dev-ifort/arch-linux2-c-opt/lib -lHYPRE
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/compiler/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/ipp/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/mkl/lib/intel64
> -Wl,-rpath,/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -L/opt/intel/composer_xe_2013.1.117/tbb/lib/intel64
> -Wl,-rpath,/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2
> -L/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2 -lmpichcxx -llapack -lblas
> -lX11 -lpthread -lmpichf90 -lifport -lifcore -lm -lm -lmpichcxx -ldl
> -lmpich -lopa -lmpl -lrt -lpthread -limf -lsvml -lirng -lipgo -ldecimal
> -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
> -----------------------------------------
> 
> 
> 

