<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    On 6/6/2012 10:31 PM, Barry Smith wrote:

    <blockquote

      cite="mid:69439126-7332-4891-BF58-697966F0D975@mcs.anl.gov"

      type="cite">

      <pre wrap="">

On Jun 6, 2012, at 3:04 PM, TAY wee-beng wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">Hi,

I have used 3 KSP, 2 to solve momentum eqns and 1 for the multigrid. I have used

call KSPSetOptionsPrefix(ksp,"mg_",ierr) for the multigrid.

I run with :

-log_summary -mg_ksp_view so as to single out the multigrid ksp, but I'm not sure if it's really working...

</pre>

      </blockquote>

      <pre wrap="">

   Are you sure the KSPSetOptionsPrefix() is called before the KSPSetFromOptions()? It appears the prefix is not being set correctly.

   From the limited data below it is running multigrid with one level (hence you cannot expect great performance). You need to at least provide MG with a bit more information like how many levels you would like it to use.

   Barry

</pre>

    </blockquote>

    <br>

    Yes, I had called KSPSetOptionsPrefix after KSPSetFromOptions. I've
    changed it so that the prefix is now set first. I've included the new output below. Btw, in my code, I
    followed the example in ex29.c, which uses:<br>

    <br>

    <b>call KSPCreate(MPI_COMM_WORLD,ksp,ierr)<br>

          <br>

      call

DMDACreate2d(MPI_COMM_WORLD,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,1,num_procs,i1,i1,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da,ierr)<br>

        <br>

      call DMSetFunction(da,ComputeRHS,ierr)<br>

      <br>

      call DMSetJacobian(da,ComputeMatrix,ierr)  <br>

          <br>

      call KSPSetDM(ksp,da,ierr)<br>

      <br>

      call KSPSetOptionsPrefix(ksp,"mg_",ierr)<br>

          <br>

      call KSPSetFromOptions(ksp,ierr)<br>

          <br>

      tol=1.e-5<br>

      <br>

      call

KSPSetTolerances(ksp,tol,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr)<br>

      <br>

      call KSPSetUp(ksp,ierr) <br>

        <br>

      call KSPSolve(ksp,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr) </b><br>

    <br>

    I just read the manual and it says for multigrid, I have to use:<br>

    <br>

    KSPGetPC(KSP ksp,PC *pc);<br>

    PCSetType(PC pc,PCMG);<br>

    PCMGSetLevels(pc,int levels,MPI_Comm *comms)<br>

    <br>

    I added the following after KSPCreate:<br>

    <br>

    <b>call KSPGetPC(ksp,pc,ierr)<br>

          <br>

      call PCSetType(pc_uv,PCMG,ierr) <br>

      <br>

      mg_lvl = 2<br>

          <br>

      call PCMGSetLevels(pc,mg_lvl,MPI_COMM_WORLD,ierr)</b><br>

    <br>

    However, I get the error:<br>

    <br>

    Caught signal number 11 SEGV: Segmentation Violation, probably

    memory access out of range<br>

    <br>

    after calling <b>PCMGSetLevels</b><br>

    <br>

    What's the problem? Are there any examples I can follow?<br>

    <br>

    Thanks!<br>

    <br>

    <b>---------------------------------------------- PETSc Performance

      Summary: ----------------------------------------------</b><br>

    <br>

    ./a.out on a petsc-3.2 named n12-50 with 4 processors, by wtay Wed

    Jun  6 22:45:37 2012<br>

    Using Petsc Development HG revision:

    c76fb3cac2a4ad0dfc9436df80f678898c867e86  HG Date: Thu May 31

    00:33:26 2012 -0500<br>

    <br>

                             Max       Max/Min        Avg      Total<br>

    Time (sec):           1.062e+01      1.00001   1.062e+01<br>

    Objects:              2.700e+01      1.00000   2.700e+01<br>

    Flops:                4.756e+08      1.00811   4.744e+08  1.897e+09<br>

    Flops/sec:            4.477e+07      1.00811   4.466e+07  1.786e+08<br>

    MPI Messages:         4.080e+02      2.00000   3.060e+02  1.224e+03<br>

    MPI Message Lengths:  2.328e+06      2.00000   5.706e+03  6.984e+06<br>

    MPI Reductions:       8.750e+02      1.00000<br>

    <br>

    Flop counting convention: 1 flop = 1 real number operation of type

    (multiply/divide/add/subtract)<br>

                                e.g., VecAXPY() for real vectors of

    length N --> 2N flops<br>

                                and VecAXPY() for complex vectors of

    length N --> 8N flops<br>

    <br>

    <pre wrap="">
Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.0623e+01 100.0%  1.8975e+09 100.0%  1.224e+03 100.0%  5.706e+03      100.0%  8.740e+02  99.9%
</pre>

    <br>

------------------------------------------------------------------------------------------------------------------------<br>

    See the 'Profiling' chapter of the users' manual for details on

    interpreting output.<br>

    Phase summary info:<br>

       Count: number of times phase was executed<br>

       Time and Flops: Max - maximum over all processors<br>

                       Ratio - ratio of maximum to minimum over all

    processors<br>

       Mess: number of messages sent<br>

       Avg. len: average message length<br>

       Reduct: number of global reductions<br>

       Global: entire computation<br>

       Stage: stages of a computation. Set stages with

    PetscLogStagePush() and PetscLogStagePop().<br>

          %T - percent time in this phase         %f - percent flops in

    this phase<br>

          %M - percent messages in this phase     %L - percent message

    lengths in this phase<br>

          %R - percent reductions in this phase<br>

       Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max

    time over all processors)<br>

------------------------------------------------------------------------------------------------------------------------<br>

    <pre wrap="">
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              202 1.0 5.5212e-01 1.0 1.38e+08 1.0 1.2e+03 5.7e+03 0.0e+00  5 29 99100  0   5 29 99100  0   996
MatSolve             252 1.0 6.8899e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00  6 36  0  0  0   6 36  0  0  0   989
MatLUFactorNum        50 1.0 4.5529e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 0.0e+00  4 15  0  0  0   4 15  0  0  0   640
MatILUFactorSym        1 1.0 9.7420e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin      50 1.0 1.7412e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02  0  0  0  0 11   0  0  0  0 11     0
MatAssemblyEnd        50 1.0 1.0649e-01 1.0 0.00e+00 0.0 1.2e+01 1.4e+03 8.0e+00  1  0  1  0  1   1  0  1  0  1     0
MatGetRowIJ            1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 7.0190e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp             100 1.0 2.9013e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve              50 1.0 2.0438e+00 1.0 4.76e+08 1.0 1.2e+03 5.7e+03 4.6e+02 19100 99100 52  19100 99100 53   928
VecDot               202 1.0 6.9886e-02 1.4 1.63e+07 1.0 0.0e+00 0.0e+00 2.0e+02  1  3  0  0 23   1  3  0  0 23   932
VecDotNorm2          101 1.0 4.0677e-02 2.1 1.63e+07 1.0 0.0e+00 0.0e+00 1.0e+02  0  3  0  0 12   0  3  0  0 12  1602
VecNorm              151 1.0 3.5888e-02 1.4 1.22e+07 1.0 0.0e+00 0.0e+00 1.5e+02  0  3  0  0 17   0  3  0  0 17  1357
VecCopy              100 1.0 2.2957e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               403 1.0 6.1034e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
VecAXPBYCZ           202 1.0 6.6927e-02 1.0 3.26e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  7  0  0  0  1947
VecWAXPY             202 1.0 7.0219e-02 1.1 1.63e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  3  0  0  0   1  3  0  0  0   928
VecAssemblyBegin     100 1.0 4.0812e-0213.3 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+02  0  0  0  0 34   0  0  0  0 34     0
VecAssemblyEnd       100 1.0 4.8542e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      202 1.0 7.2360e-03 1.7 0.00e+00 0.0 1.2e+03 5.7e+03 0.0e+00  0  0 99100  0   0  0 99100  0     0
VecScatterEnd        202 1.0 2.7255e-02 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCSetUp              100 1.0 4.6843e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 5.0e+00  4 15  0  0  1   4 15  0  0  1   622
PCSetUpOnBlocks       50 1.0 4.6814e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 3.0e+00  4 15  0  0  0   4 15  0  0  0   623
PCApply              252 1.0 7.3618e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00  7 36  0  0  0   7 36  0  0  0   926
</pre>

------------------------------------------------------------------------------------------------------------------------<br>

    <br>

    Memory usage is given in bytes:<br>

    <br>

    <pre wrap="">
Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix     4              4     16900896     0
       Krylov Solver     2              2         2168     0
              Vector    12             12      2604080     0
      Vector Scatter     1              1         1060     0
           Index Set     5              5       167904     0
      Preconditioner     2              2         1800     0
              Viewer     1              0            0     0
</pre>

========================================================================================================================<br>

    Average time to get PetscTime(): 1.19209e-06<br>

    Average time for MPI_Barrier(): 6.62804e-06<br>

    Average time for zero size MPI_Send(): 2.02656e-05<br>

    #PETSc Option Table entries:<br>

    -log_summary<br>

    -mg_ksp_view<br>

    #End of PETSc Option Table entries<br>

    Compiled without FORTRAN kernels<br>

    Compiled with full precision matrices (default)<br>

    sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8

    sizeof(PetscScalar) 8 sizeof(PetscInt) 4<br>

    Configure run at: Thu May 31 09:53:43 2012<br>

    Configure options: --with-mpi-dir=/opt/openmpi-1.5.3/

    --with-blas-lapack-dir=/opt/intelcpro-11.1.059/mkl/lib/em64t/

    --with-debugging=0 --download-hypre=1

    --prefix=/home/wtay/Lib/petsc-3.2-dev_shared_rel

    --known-mpi-shared=1 --with-shared-libraries<br>

    -----------------------------------------<br>

    Libraries compiled on Thu May 31 09:53:43 2012 on hpc12<br>

    Machine characteristics:

    Linux-2.6.32-220.2.1.el6.x86_64-x86_64-with-centos-6.2-Final<br>

    Using PETSc directory: /home/wtay/Codes/petsc-dev<br>

    Using PETSc arch: petsc-3.2-dev_shared_rel<br>

    -----------------------------------------<br>

    <br>

    Using C compiler: /opt/openmpi-1.5.3/bin/mpicc  -fPIC -wd1572

    -Qoption,cpp,--extended_float_type -O3  ${COPTFLAGS} ${CFLAGS}<br>

    Using Fortran compiler: /opt/openmpi-1.5.3/bin/mpif90  -fPIC -O3  

    ${FOPTFLAGS} ${FFLAGS}<br>

    -----------------------------------------<br>

    <br>

    Using include paths:

    -I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include

    -I/home/wtay/Codes/petsc-dev/include

    -I/home/wtay/Codes/petsc-dev/include

    -I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include

    -I/opt/openmpi-1.5.3/include<br>

    -----------------------------------------<br>

    <br>

    Using C linker: /opt/openmpi-1.5.3/bin/mpicc<br>

    Using Fortran linker: /opt/openmpi-1.5.3/bin/mpif90<br>

    Using libraries:

    -Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib

    -L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lpetsc

    -lX11 -lpthread

    -Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib

    -L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lHYPRE

    -lmpi_cxx -Wl,-rpath,/opt/openmpi-1.5.3/lib

    -Wl,-rpath,/opt/intelcpro-11.1.059/lib/intel64

    -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lstdc++

    -Wl,-rpath,/opt/intelcpro-11.1.059/mkl/lib/em64t

    -L/opt/intelcpro-11.1.059/mkl/lib/em64t -lmkl_intel_lp64

    -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -ldl

    -L/opt/openmpi-1.5.3/lib -lmpi -lnsl -lutil

    -L/opt/intelcpro-11.1.059/lib/intel64 -limf

    -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lsvml -lipgo -ldecimal

    -lgcc_s -lirc -lpthread -lirc_s -lmpi_f90 -lmpi_f77 -lm -lm -lifport

    -lifcore -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl

    -lmpi -lnsl -lutil -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc

    -lpthread -lirc_s -ldl<br>

    <br>
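    As a sanity check on how the summary above is read, here is a minimal
    sketch in plain Python (not PETSc code; the helper name total_mflops is
    hypothetical) that recomputes three derived figures from numbers copied
    out of that log. It assumes the KSPSolve event accounts for all flops of
    the main stage, which is what its %f value of 100 suggests.<br>
    <pre wrap="">
```python
# Recompute derived figures from the -log_summary output above.
# All input numbers are copied from that log; the helper name is hypothetical.

def total_mflops(total_flops, max_time_sec):
    """Flops summed over all processes / max time, scaled by 1e-6 to Mflop/s."""
    return 1e-6 * total_flops / max_time_sec

# Whole run: "Flops: Total" and "Time (sec): Max" from the header block.
whole_run = total_mflops(1.897e9, 1.062e1)

# KSPSolve row: %f is 100, so it is assumed to account for all 1.8975e9
# flops of the main stage; 2.0438 s is its maximum time.
ksp_solve = total_mflops(1.8975e9, 2.0438)

# Average MPI message length: total message bytes / total message count.
avg_msg_len = 6.984e6 / 1.224e3

print(f"whole run : {whole_run:.1f} Mflop/s")
print(f"KSPSolve  : {ksp_solve:.0f} Mflop/s")
print(f"avg length: {avg_msg_len:.0f} bytes")
```
</pre>
    The three results line up with the log's own "Flops/sec" total
    (1.786e+08), the KSPSolve "Total Mflop/s" entry (928) and the average
    message length (5.706e+03). Note that the legend prints the scale factor
    as 10e-6, but the tabulated 928 only comes out with a factor of 1e-6.<br>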

    <blockquote

      cite="mid:69439126-7332-4891-BF58-697966F0D975@mcs.anl.gov"

      type="cite">

      <pre wrap="">

</pre>

      <blockquote type="cite">

        <pre wrap="">

Here's the output:

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a petsc-3.2 named n12-50 with 4 processors, by wtay Wed Jun  6 21:57:33 2012

Using Petsc Development HG revision: c76fb3cac2a4ad0dfc9436df80f678898c867e86  HG Date: Thu May 31 00:33:26 2012 -0500

                        Max       Max/Min        Avg      Total

Time (sec):           1.064e+01      1.00000   1.064e+01

Objects:              2.700e+01      1.00000   2.700e+01

Flops:                4.756e+08      1.00811   4.744e+08  1.897e+09

Flops/sec:            4.468e+07      1.00811   4.457e+07  1.783e+08

MPI Messages:         4.080e+02      2.00000   3.060e+02  1.224e+03

MPI Message Lengths:  2.328e+06      2.00000   5.706e+03  6.984e+06

MPI Reductions:       8.750e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)

                           e.g., VecAXPY() for real vectors of length N --> 2N flops

                           and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --

                       Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total

0:      Main Stage: 1.0644e+01 100.0%  1.8975e+09 100.0%  1.224e+03 100.0%  5.706e+03      100.0%  8.740e+02  99.9%

------------------------------------------------------------------------------------------------------------------------

See the 'Profiling' chapter of the users' manual for details on interpreting output.

Phase summary info:

  Count: number of times phase was executed

  Time and Flops: Max - maximum over all processors

                  Ratio - ratio of maximum to minimum over all processors

  Mess: number of messages sent

  Avg. len: average message length

  Reduct: number of global reductions

  Global: entire computation

  Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().

     %T - percent time in this phase         %f - percent flops in this phase

     %M - percent messages in this phase     %L - percent message lengths in this phase

     %R - percent reductions in this phase

  Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)

------------------------------------------------------------------------------------------------------------------------

Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total

                  Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s

------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              202 1.0 5.5096e-01 1.0 1.38e+08 1.0 1.2e+03 5.7e+03 0.0e+00  5 29 99100  0   5 29 99100  0   998

MatSolve             252 1.0 6.9136e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00  6 36  0  0  0   6 36  0  0  0   986

MatLUFactorNum        50 1.0 4.6002e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 0.0e+00  4 15  0  0  0   4 15  0  0  0   634

MatILUFactorSym        1 1.0 9.5899e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0

MatAssemblyBegin      50 1.0 1.6270e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02  0  0  0  0 11   0  0  0  0 11     0

MatAssemblyEnd        50 1.0 1.0896e-01 1.0 0.00e+00 0.0 1.2e+01 1.4e+03 8.0e+00  1  0  1  0  1   1  0  1  0  1     0

MatGetRowIJ            1 1.0 2.8610e-06 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

MatGetOrdering         1 1.0 7.2002e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0

KSPSetUp             100 1.0 2.9130e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

KSPSolve              50 1.0 2.0737e+00 1.0 4.76e+08 1.0 1.2e+03 5.7e+03 4.6e+02 19100 99100 52  19100 99100 53   915

VecDot               202 1.0 7.3588e-02 1.1 1.63e+07 1.0 0.0e+00 0.0e+00 2.0e+02  1  3  0  0 23   1  3  0  0 23   885

VecDotNorm2          101 1.0 3.9155e-02 1.7 1.63e+07 1.0 0.0e+00 0.0e+00 1.0e+02  0  3  0  0 12   0  3  0  0 12  1664

VecNorm              151 1.0 5.8769e-02 1.7 1.22e+07 1.0 0.0e+00 0.0e+00 1.5e+02  0  3  0  0 17   0  3  0  0 17   829

VecCopy              100 1.0 2.3459e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

VecSet               403 1.0 5.9994e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0

VecAXPBYCZ           202 1.0 6.6376e-02 1.0 3.26e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  7  0  0  0  1963

VecWAXPY             202 1.0 6.9311e-02 1.0 1.63e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  3  0  0  0   1  3  0  0  0   940

VecAssemblyBegin     100 1.0 4.0355e-0214.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+02  0  0  0  0 34   0  0  0  0 34     0

VecAssemblyEnd       100 1.0 5.0378e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

VecScatterBegin      202 1.0 6.2275e-03 1.5 0.00e+00 0.0 1.2e+03 5.7e+03 0.0e+00  0  0 99100  0   0  0 99100  0     0

VecScatterEnd        202 1.0 2.0878e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

PCSetUp              100 1.0 4.7225e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 5.0e+00  4 15  0  0  1   4 15  0  0  1   617

PCSetUpOnBlocks       50 1.0 4.7191e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 3.0e+00  4 15  0  0  0   4 15  0  0  0   618

PCApply              252 1.0 7.3425e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00  7 36  0  0  0   7 36  0  0  0   928

------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.

Reports information only for process 0.

--- Event Stage 0: Main Stage

             Matrix     4              4     16900896     0

      Krylov Solver     2              2         2168     0

             Vector    12             12      2604080     0

     Vector Scatter     1              1         1060     0

          Index Set     5              5       167904     0

     Preconditioner     2              2         1800     0

             Viewer     1              0            0     0

========================================================================================================================

Average time to get PetscTime(): 1.09673e-06

Average time for MPI_Barrier(): 4.00543e-06

Average time for zero size MPI_Send(): 1.22786e-05

#PETSc Option Table entries:

-log_summary

-mg_ksp_view

#End of PETSc Option Table entries

Compiled without FORTRAN kernels

Compiled with full precision matrices (default)

sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4

Configure run at: Thu May 31 09:53:43 2012

Configure options: --with-mpi-dir=/opt/openmpi-1.5.3/ --with-blas-lapack-dir=/opt/intelcpro-11.1.059/mkl/lib/em64t/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.2-dev_shared_rel --known-mpi-shared=1 --with-shared-libraries

-----------------------------------------

Libraries compiled on Thu May 31 09:53:43 2012 on hpc12

Machine characteristics: Linux-2.6.32-220.2.1.el6.x86_64-x86_64-with-centos-6.2-Final

Using PETSc directory: /home/wtay/Codes/petsc-dev

Using PETSc arch: petsc-3.2-dev_shared_rel

-----------------------------------------

Using C compiler: /opt/openmpi-1.5.3/bin/mpicc  -fPIC -wd1572 -Qoption,cpp,--extended_float_type -O3  ${COPTFLAGS} ${CFLAGS}

Using Fortran compiler: /opt/openmpi-1.5.3/bin/mpif90  -fPIC -O3   ${FOPTFLAGS} ${FFLAGS}

-----------------------------------------

Using include paths: -I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include -I/home/wtay/Codes/petsc-dev/include -I/home/wtay/Codes/petsc-dev/include -I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include -I/opt/openmpi-1.5.3/include

-----------------------------------------

Using C linker: /opt/openmpi-1.5.3/bin/mpicc

Using Fortran linker: /opt/openmpi-1.5.3/bin/mpif90

Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lpetsc -lX11 -lpthread -Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lHYPRE -lmpi_cxx -Wl,-rpath,/opt/openmpi-1.5.3/lib -Wl,-rpath,/opt/intelcpro-11.1.059/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lstdc++ -Wl,-rpath,/opt/intelcpro-11.1.059/mkl/lib/em64t -L/opt/intelcpro-11.1.059/mkl/lib/em64t -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -ldl -L/opt/openmpi-1.5.3/lib -lmpi -lnsl -lutil -L/opt/intelcpro-11.1.059/lib/intel64 -limf -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -lmpi_f90 -lmpi_f77 -lm -lm -lifport -lifcore -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lnsl -lutil -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -ldl

-----------------------------------------

Yours sincerely,

TAY wee-beng

On 5/6/2012 1:34 AM, Barry Smith wrote:

</pre>

        <blockquote type="cite">

          <pre wrap="">   Also run with -ksp_view to see exactly what solver options it is using. For example the number of levels, smoother on each level etc. My guess is that the below is running on one level (because I don't see you supplying options to control the number of levels etc).

   Barry

On Jun 4, 2012, at 4:15 PM, Jed Brown wrote:

</pre>

          <blockquote type="cite">

            <pre wrap="">Always send -log_summary when asking about performance.

On Mon, Jun 4, 2012 at 4:11 PM, TAY wee-beng <a class="moz-txt-link-rfc2396E" href="mailto:zonexo@gmail.com">&lt;zonexo@gmail.com&gt;</a> wrote:

Hi,

I tried using PETSc multigrid on my 2D CFD code. I converted the KSP example ex29 to Fortran and then added it to my code to solve the Poisson equation.

The main subroutines are:

call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)

call DMDACreate2d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_STAR,i3,i3,PETSC_DECIDE,PETSC_DECIDE,i1,i1,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da,ierr)

call DMSetFunction(da,ComputeRHS,ierr)

call DMSetJacobian(da,ComputeMatrix,ierr)

call KSPSetDM(ksp,da,ierr)

call KSPSetFromOptions(ksp,ierr)

call KSPSetUp(ksp,ierr)

call KSPSolve(ksp,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr)

call KSPGetSolution(ksp,x,ierr)

call VecView(x,PETSC_VIEWER_STDOUT_WORLD,ierr)

call KSPDestroy(ksp,ierr)

call DMDestroy(da,ierr)

call PetscFinalize(ierr)

Since the LHS matrix doesn't change, I only set up at the 1st time step, thereafter I only called ComputeRHS every time step.

I was using HYPRE's geometric multigrid and the speed was much faster.

What other options can I tweak to improve the speed? Or should I call the subroutines above at every timestep?

Thanks!

-- 

Yours sincerely,

TAY wee-beng

</pre>

          </blockquote>

        </blockquote>

      </blockquote>

      <pre wrap="">

</pre>

    </blockquote>

  </body>

</html>