[petsc-users] Order of PetscRandom and SVD

Torquil Macdonald Sørensen torquil at gmail.com
Tue Jun 30 16:05:17 CDT 2015


Hi!

I'm experiencing a problem using PetscRandom together with SVD from
SLEPc. For some reason, the order of two seemingly unrelated parts of
the program below affects the random values placed in a vector. The
test program below, run with "mpiexec -n 2", prints a vector whose two
components have identical values. If I instead move the four lines of
SVD code up above the line "PetscRandom r;", the two vector components
are different, as expected. Why does the order matter, when the SVD
code is unrelated to the PetscRandom r and the Vec u?

#include "petscdmda.h"
#include "slepcsvd.h"

int main(int argc, char **argv)
{
    PetscErrorCode ierr = SlepcInitialize(&argc, &argv, 0, 0); CHKERRQ(ierr);

    DM da;
    ierr = DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 2, 1, 1, 0, &da); CHKERRQ(ierr);
    ierr = DMSetFromOptions(da); CHKERRQ(ierr);

    Vec u;
    ierr = DMCreateGlobalVector(da, &u); CHKERRQ(ierr);

    PetscRandom r;
    ierr = PetscRandomCreate(PETSC_COMM_WORLD, &r); CHKERRQ(ierr);
    ierr = PetscRandomSetFromOptions(r); CHKERRQ(ierr);

    SVD svd;
    ierr = SVDCreate(PETSC_COMM_WORLD, &svd); CHKERRQ(ierr);
    ierr = SVDSetFromOptions(svd); CHKERRQ(ierr);
    ierr = SVDDestroy(&svd); CHKERRQ(ierr);

    ierr = VecSetRandom(u, r); CHKERRQ(ierr);
    ierr = VecView(u, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);

    ierr = PetscRandomDestroy(&r); CHKERRQ(ierr);
    ierr = VecDestroy(&u); CHKERRQ(ierr);
    ierr = DMDestroy(&da); CHKERRQ(ierr);
    ierr = SlepcFinalize(); CHKERRQ(ierr);

    return 0;
}

Output of the program as written above:

Vec Object: 2 MPI processes
  type: mpi
Vec Object:Vec_0xd00a30_0 2 MPI processes
  type: mpi
Process [0]
0.720032 + 0.061794 i
Process [1]
0.720032 + 0.061794 i

Output of the program with the four lines of SVD code moved above the
line "PetscRandom r":

Vec Object: 2 MPI processes
  type: mpi
Vec Object:Vec_0x117aa30_0 2 MPI processes
  type: mpi
Process [0]
0.720032 + 0.061794 i
Process [1]
0.541144 + 0.529699 i
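
In case it is useful: one workaround I'm considering is to seed the
generator explicitly with a rank-dependent value right after
PetscRandomSetFromOptions(), along these lines (just a sketch; the seed
constant is arbitrary and I haven't verified that this avoids the
problem):

    PetscMPIInt rank;
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank); CHKERRQ(ierr);
    /* arbitrary rank-dependent seed, then re-seed the generator */
    ierr = PetscRandomSetSeed(r, 0x12345678 + 1000*rank); CHKERRQ(ierr);
    ierr = PetscRandomSeed(r); CHKERRQ(ierr);

But I'd still like to understand why the creation order matters in the
first place.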

Other information:

$ gcc --version
gcc (Debian 5.1.1-12) 5.1.1 20150622

I'm using the PETSc version obtained with:
git checkout 75ae60a9cdad23ddd49e9e39341b43db353bd940

because the newest Git revision wouldn't compile for me. SLEPc is the
latest from the master branch. The -log_summary output of the program is:

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./randtest.exe on a arch-linux2-c-debug named lenovo with 2 processors, by tmac Tue Jun 30 22:42:49 2015
Using Petsc Development GIT revision: v3.6-89-g75ae60a  GIT Date: 2015-06-29 18:37:23 -0500

                         Max       Max/Min        Avg      Total
Time (sec):           1.274e-02      1.00000   1.274e-02
Objects:              3.900e+01      1.00000   3.900e+01
Flops:                0.000e+00      0.00000   0.000e+00  0.000e+00
Flops/sec:            0.000e+00      0.00000   0.000e+00  0.000e+00
Memory:               1.273e+05      1.00000              2.546e+05
MPI Messages:         6.500e+00      1.00000   6.500e+00  1.300e+01
MPI Message Lengths:  3.600e+01      1.00000   5.538e+00  7.200e+01
MPI Reductions:       9.000e+01      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.2723e-02  99.9%  0.0000e+00   0.0%  1.300e+01 100.0%  5.538e+00      100.0%  8.900e+01  98.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################




      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   The code for various complex numbers numerical       #
      #   kernels uses C++, which generally is not well        #
      #   optimized.  For performance that is about 4-5 times  #
      #   faster, specify --with-fortran-kernels=1             #
      #   when running ./configure.py.                         #
      #                                                        #
      ##########################################################


Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecView                1 1.0 1.0881e-03 1.2 0.00e+00 0.0 5.0e+00 8.0e+00 1.5e+01  8  0 38 56 17   8  0 38 56 17     0
VecSet                 1 1.0 1.5020e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        1 1.0 2.1935e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterEnd          1 1.0 7.1526e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSetRandom           1 1.0 1.6928e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       1 1.0 8.4877e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  2   1  0  0  0  2     0
MatAssemblyEnd         1 1.0 6.7306e-04 1.0 0.00e+00 0.0 4.0e+00 4.0e+00 1.4e+01  5  0 31 22 16   5  0 31 22 16     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector     7              7        11064     0
      Vector Scatter     3              3         3280     0
              Matrix     3              3         8064     0
    Distributed Mesh     1              1         4972     0
Star Forest Bipartite Graph     2              2         1712     0
     Discrete System     1              1          848     0
           Index Set     6              6         4660     0
   IS L to G Mapping     1              1          632     0
         PetscRandom     3              3         1920     0
          SVD Solver     1              1          984     0
          EPS Solver     1              1         1272     0
  Spectral Transform     1              1          784     0
       Krylov Solver     1              1         1304     0
      Preconditioner     1              1          848     0
       Basis Vectors     3              3         3024     0
       Direct Solver     2              2         2640     0
              Region     1              1          648     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 1.43051e-06
Average time for zero size MPI_Send(): 4.52995e-06
#PETSc Option Table entries:
-ksp_converged_reason
-log_summary
-objects_dump 0
-options_table
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
sizeof(PetscScalar) 16 sizeof(PetscInt) 4
Configure options: --prefix=/home/tmac/usr/stow/petsc
--with-scalar-type=complex --with-shared-libraries=yes
--with-debugging=yes --download-superlu=yes --download-ptscotch=yes
--with-sowing=no --download-sowing=no
-----------------------------------------
Libraries compiled on Tue Jun 30 20:18:19 2015 on lenovo
Machine characteristics: Linux-4.0.0-2-amd64-x86_64-with-debian-stretch-sid
Using PETSc directory: /home/tmac/tmp/src/petsc
Using PETSc arch: arch-linux2-c-debug
-----------------------------------------

Using C compiler: mpicc  -fPIC -Wall -Wwrite-strings
-Wno-strict-aliasing -Wno-unknown-pragmas -g3 -O0  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90  -fPIC -Wall -Wno-unused-variable
-ffree-line-length-0 -g -O0   ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths:
-I/home/tmac/tmp/src/petsc/arch-linux2-c-debug/include
-I/home/tmac/tmp/src/petsc/include -I/home/tmac/tmp/src/petsc/include
-I/home/tmac/tmp/src/petsc/arch-linux2-c-debug/include
-I/home/tmac/usr/stow/petsc/include -I/usr/lib/openmpi/include
-I/usr/lib/openmpi/include/openmpi
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries:
-Wl,-rpath,/home/tmac/tmp/src/petsc/arch-linux2-c-debug/lib
-L/home/tmac/tmp/src/petsc/arch-linux2-c-debug/lib -lpetsc
-Wl,-rpath,/home/tmac/usr/stow/petsc/lib -L/home/tmac/usr/stow/petsc/lib
-lsuperlu_4.3 -llapack -lblas -lptesmumps -lptscotch -lptscotcherr
-lscotch -lscotcherr -lX11 -lssl -lcrypto -lm
-Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
-Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5
-L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu
-L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu
-L/lib/x86_64-linux-gnu -lmpi_f90 -lmpi_f77 -lgfortran -lm -lgfortran
-lm -lquadmath -lm -lmpi_cxx -lstdc++ -lrt -lm -lz
-Wl,-rpath,/usr/lib/openmpi/lib -L/usr/lib/openmpi/lib
-Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/5
-L/usr/lib/gcc/x86_64-linux-gnu/5 -Wl,-rpath,/usr/lib/x86_64-linux-gnu
-L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu
-L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu
-L/usr/lib/x86_64-linux-gnu -ldl -lmpi -lhwloc -lgcc_s -lpthread -ldl
-----------------------------------------

Best regards
Torquil Sørensen

