[petsc-users] GAMG speed

Michele Rosso mrosso at uci.edu
Tue Aug 13 19:05:50 CDT 2013


Hi Matt,

I attached the output of the commands you suggested.
The options I used are:

-log_summary -ksp_monitor -ksp_view -ksp_converged_reason -pc_type mg  
-pc_mg_galerkin -pc_mg_levels 5 -options_left
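
For reference, the corresponding run command looks roughly like this (assuming the Cray aprun launcher on this machine; substitute mpiexec or your local equivalent -- the executable name ./hit and the 8 processes are taken from the attached log):

     aprun -n 8 ./hit -log_summary -ksp_monitor -ksp_view -ksp_converged_reason \
           -pc_type mg -pc_mg_galerkin -pc_mg_levels 5 -options_left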

and here are the lines of code where I set up the solution process:

     call DMDACreate3d( PETSC_COMM_WORLD ,                                  &
                      & DMDA_BOUNDARY_PERIODIC , DMDA_BOUNDARY_PERIODIC,    &
                      & DMDA_BOUNDARY_PERIODIC , DMDA_STENCIL_STAR,         &
                      & N_Z , N_Y , N_X , N_B3 , N_B2 , 1_ip , 1_ip , 1_ip ,&
                      & NNZ , NNY , NNX , da , ierr)


     ! Create Global Vectors
     call DMCreateGlobalVector(da,b,ierr)
     call VecDuplicate(b,x,ierr)

     ! Set initial guess for first use of the module to 0
     call VecSet(x,0.0_rp,ierr)

     ! Create matrix
     call DMCreateMatrix(da,MATAIJ,A,ierr)

     ! Create solver
     call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
     call KSPSetDM(ksp,da,ierr)
     call KSPSetDMActive(ksp,PETSC_FALSE,ierr)
     call KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN,ierr)
     call KSPSetType(ksp,KSPCG,ierr)
     call KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED,ierr)
     call KSPSetInitialGuessNonzero(ksp,PETSC_TRUE,ierr)
     call KSPSetTolerances(ksp, tol ,PETSC_DEFAULT_DOUBLE_PRECISION,&
         & PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr)

     ! Nullspace removal
     call MatNullSpaceCreate( PETSC_COMM_WORLD,PETSC_TRUE,PETSC_NULL_INTEGER,&
                            & PETSC_NULL_INTEGER,nullspace,ierr)
     call KSPSetNullspace(ksp,nullspace,ierr)
     call MatNullSpaceDestroy(nullspace,ierr)

     ! To allow using option from command line
     call KSPSetFromOptions(ksp,ierr)
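
The solve itself is not shown above; a minimal sketch of how the solver is then invoked on each subsequent solve (assuming b has already been filled with the right-hand side, and that reason and its are declared locally as KSPConvergedReason and PetscInt) would be:

     ! Solve A x = b, reusing the setup above; x still holds the previous
     ! solution, which serves as the nonzero initial guess
     call KSPSolve(ksp,b,x,ierr)

     ! Query how the iteration terminated and how many iterations were used
     call KSPGetConvergedReason(ksp,reason,ierr)
     call KSPGetIterationNumber(ksp,its,ierr)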


Hope I did not omit anything useful.
Thank you for your time.

Best,
Michele




On 08/13/2013 04:26 PM, Matthew Knepley wrote:
> On Tue, Aug 13, 2013 at 6:09 PM, Michele Rosso <mrosso at uci.edu> wrote:
>
>     Hi Karli,
>
>     thank you for your hint: now it works.
>     Now I would like to speed up the solution: I was counting on
>     increasing the number of levels/the number of processors used, but
>     now I see I cannot do that.
>     Do you have any hint to achieve better speed?
>     Thanks!
>
>
> "Better speed" is not very helpful for us, and thus we cannot offer 
> much help. You could
>
>  1) Send the output of -log_summary -ksp_monitor -ksp_view
>
>  2) Describe the operator succinctly
>
>     Matt
>
>     Best,
>     Michele

-------------- next part --------------
  0 KSP Residual norm 3.653385002401e-05 
  1 KSP Residual norm 9.460380827787e-07 
  2 KSP Residual norm 2.745875833479e-08 
  3 KSP Residual norm 4.613281252783e-10 
Linear solve converged due to CONVERGED_RTOL iterations 3
KSP Object: 8 MPI processes
  type: cg
  maximum iterations=10000
  tolerances:  relative=0.0001, absolute=1e-50, divergence=10000
  left preconditioning
  has attached null space
  using nonzero initial guess
  using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
  type: mg
    MG: type is MULTIPLICATIVE, levels=5 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     8 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     8 MPI processes
      type: redundant
        Redundant preconditioner: First (color=0) of 8 PCs follows
      KSP Object:      (mg_coarse_redundant_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_coarse_redundant_)       1 MPI processes
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: nd
          factor fill ratio given 5, needed 8.69546
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=512, cols=512
                package used to perform factorization: petsc
                total: nonzeros=120206, allocated nonzeros=120206
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=512, cols=512
          total: nonzeros=13824, allocated nonzeros=13824
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       8 MPI processes
        type: mpiaij
        rows=512, cols=512
        total: nonzeros=13824, allocated nonzeros=13824
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 32 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     8 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates:  min = 0.140194, max = 1.54213
        Chebyshev: estimated using:  [0 0.1; 0 1.1]
        KSP Object:        (mg_levels_1_est_)         8 MPI processes
          type: gmres
            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            GMRES: happy breakdown tolerance 1e-30
          maximum iterations=10
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using nonzero initial guess
          using NONE norm type for convergence test
        PC Object:        (mg_levels_1_)         8 MPI processes
          type: sor
            SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
          linear system matrix = precond matrix:
          Matrix Object:           8 MPI processes
            type: mpiaij
            rows=4096, cols=4096
            total: nonzeros=110592, allocated nonzeros=110592
            total number of mallocs used during MatSetValues calls =0
              not using I-node (on process 0) routines
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     8 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       8 MPI processes
        type: mpiaij
        rows=4096, cols=4096
        total: nonzeros=110592, allocated nonzeros=110592
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:    (mg_levels_2_)     8 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates:  min = 0.139949, max = 1.53944
        Chebyshev: estimated using:  [0 0.1; 0 1.1]
        KSP Object:        (mg_levels_2_est_)         8 MPI processes
          type: gmres
            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            GMRES: happy breakdown tolerance 1e-30
          maximum iterations=10
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using nonzero initial guess
          using NONE norm type for convergence test
        PC Object:        (mg_levels_2_)         8 MPI processes
          type: sor
            SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
          linear system matrix = precond matrix:
          Matrix Object:           8 MPI processes
            type: mpiaij
            rows=32768, cols=32768
            total: nonzeros=884736, allocated nonzeros=884736
            total number of mallocs used during MatSetValues calls =0
              not using I-node (on process 0) routines
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_2_)     8 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       8 MPI processes
        type: mpiaij
        rows=32768, cols=32768
        total: nonzeros=884736, allocated nonzeros=884736
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object:    (mg_levels_3_)     8 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates:  min = 0.135788, max = 1.49366
        Chebyshev: estimated using:  [0 0.1; 0 1.1]
        KSP Object:        (mg_levels_3_est_)         8 MPI processes
          type: gmres
            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            GMRES: happy breakdown tolerance 1e-30
          maximum iterations=10
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using nonzero initial guess
          using NONE norm type for convergence test
        PC Object:        (mg_levels_3_)         8 MPI processes
          type: sor
            SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
          linear system matrix = precond matrix:
          Matrix Object:           8 MPI processes
            type: mpiaij
            rows=262144, cols=262144
            total: nonzeros=7077888, allocated nonzeros=7077888
            total number of mallocs used during MatSetValues calls =0
              not using I-node (on process 0) routines
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_3_)     8 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       8 MPI processes
        type: mpiaij
        rows=262144, cols=262144
        total: nonzeros=7077888, allocated nonzeros=7077888
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object:    (mg_levels_4_)     8 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates:  min = 0.138904, max = 1.52794
        Chebyshev: estimated using:  [0 0.1; 0 1.1]
        KSP Object:        (mg_levels_4_est_)         8 MPI processes
          type: gmres
            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            GMRES: happy breakdown tolerance 1e-30
          maximum iterations=10
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using nonzero initial guess
          using NONE norm type for convergence test
        PC Object:        (mg_levels_4_)         8 MPI processes
          type: sor
            SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
          linear system matrix = precond matrix:
          Matrix Object:           8 MPI processes
            type: mpiaij
            rows=2097152, cols=2097152
            total: nonzeros=14680064, allocated nonzeros=14680064
            total number of mallocs used during MatSetValues calls =0
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_4_)     8 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       8 MPI processes
        type: mpiaij
        rows=2097152, cols=2097152
        total: nonzeros=14680064, allocated nonzeros=14680064
        total number of mallocs used during MatSetValues calls =0
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   8 MPI processes
    type: mpiaij
    rows=2097152, cols=2097152
    total: nonzeros=14680064, allocated nonzeros=14680064
    total number of mallocs used during MatSetValues calls =0



************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./hit on a arch-cray-xt5-pkgs-opt named nid14554 with 8 processors, by Unknown Tue Aug 13 19:53:41 2013
Using Petsc Release Version 3.4.2, Jul, 02, 2013 

                         Max       Max/Min        Avg      Total 
Time (sec):           6.402e+00      1.00011   6.402e+00
Objects:              2.970e+02      1.00000   2.970e+02
Flops:                6.953e+08      1.00000   6.953e+08  5.562e+09
Flops/sec:            1.086e+08      1.00011   1.086e+08  8.688e+08
MPI Messages:         1.170e+03      1.00000   1.170e+03  9.360e+03
MPI Message Lengths:  1.565e+07      1.00000   1.338e+04  1.252e+08
MPI Reductions:       6.260e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 6.4021e+00 100.0%  5.5620e+09 100.0%  9.360e+03 100.0%  1.338e+04      100.0%  6.250e+02  99.8% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecMDot               40 1.0 9.1712e-02 1.0 3.29e+07 1.0 0.0e+00 0.0e+00 4.0e+01  1  5  0  0  6   1  5  0  0  6  2874
VecTDot               10 1.0 2.3873e-02 1.1 5.24e+06 1.0 0.0e+00 0.0e+00 1.0e+01  0  1  0  0  2   0  1  0  0  2  1757
VecNorm               52 1.0 2.3764e-02 1.3 1.08e+07 1.0 0.0e+00 0.0e+00 5.2e+01  0  2  0  0  8   0  2  0  0  8  3630
VecScale             124 1.0 2.6341e-02 1.4 9.29e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  2820
VecCopy               27 1.0 1.7691e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               109 1.0 1.7006e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              178 1.0 1.1252e-01 1.2 3.04e+07 1.0 0.0e+00 0.0e+00 0.0e+00  2  4  0  0  0   2  4  0  0  0  2162
VecAYPX              164 1.0 1.0078e-01 1.1 1.68e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0  1334
VecMAXPY              44 1.0 1.1766e-01 1.0 3.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00  2  6  0  0  0   2  6  0  0  0  2647
VecScatterBegin      228 1.0 5.5004e-02 1.1 0.00e+00 0.0 7.4e+03 1.4e+04 0.0e+00  1  0 79 82  0   1  0 79 82  0     0
VecScatterEnd        228 1.0 4.0928e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          44 1.0 1.8650e-02 1.3 9.88e+06 1.0 0.0e+00 0.0e+00 4.4e+01  0  1  0  0  7   0  1  0  0  7  4240
MatMult              170 1.0 9.7667e-01 1.0 2.41e+08 1.0 6.0e+03 1.6e+04 0.0e+00 15 35 65 77  0  15 35 65 77  0  1977
MatMultAdd            20 1.0 5.1495e-02 1.1 1.01e+07 1.0 4.8e+02 2.8e+03 0.0e+00  1  1  5  1  0   1  1  5  1  0  1570
MatMultTranspose      24 1.0 6.8663e-02 1.2 1.21e+07 1.0 5.8e+02 2.8e+03 0.0e+00  1  2  6  1  0   1  2  6  1  0  1413
MatSolve               5 1.0 3.2754e-03 1.0 1.20e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2930
MatSOR               164 1.0 1.7211e+00 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 27 32  0  0  0  27 32  0  0  0  1050
MatLUFactorSym         1 1.0 3.0711e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum         1 1.0 2.4564e-02 1.0 1.95e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0  6355
MatAssemblyBegin      20 1.0 8.0438e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.2e+01  0  0  0  0  4   0  0  0  0  4     0
MatAssemblyEnd        20 1.0 1.3442e-01 1.0 0.00e+00 0.0 5.6e+02 2.1e+03 7.2e+01  2  0  6  1 12   2  0  6  1 12     0
MatGetRowIJ            1 1.0 1.1206e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 4.0507e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView               24 1.2 1.7951e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01  0  0  0  0  3   0  0  0  0  3     0
MatPtAP                4 1.0 6.4214e-01 1.0 4.06e+07 1.0 1.1e+03 1.7e+04 1.0e+02 10  6 12 16 16  10  6 12 16 16   506
MatPtAPSymbolic        4 1.0 3.7196e-01 1.0 0.00e+00 0.0 7.2e+02 2.0e+04 6.0e+01  6  0  8 12 10   6  0  8 12 10     0
MatPtAPNumeric         4 1.0 2.7023e-01 1.0 4.06e+07 1.0 4.2e+02 1.2e+04 4.0e+01  4  6  4  4  6   4  6  4  4  6  1201
MatGetRedundant        1 1.0 8.0895e-04 1.1 0.00e+00 0.0 1.7e+02 7.1e+03 4.0e+00  0  0  2  1  1   0  0  2  1  1     0
MatGetLocalMat         4 1.0 4.0415e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  1  0  0  0  1   1  0  0  0  1     0
MatGetBrAoCol          4 1.0 1.7636e-02 1.0 0.00e+00 0.0 4.3e+02 2.7e+04 8.0e+00  0  0  5  9  1   0  0  5  9  1     0
MatGetSymTrans         8 1.0 1.3187e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog        40 1.0 1.8928e-01 1.0 6.59e+07 1.0 0.0e+00 0.0e+00 4.0e+01  3  9  0  0  6   3  9  0  0  6  2785
KSPSetUp              11 1.0 3.2629e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.2e+01  0  0  0  0 12   0  0  0  0 12     0
KSPSolve               2 1.0 3.3489e+00 1.0 6.33e+08 1.0 7.3e+03 1.4e+04 2.3e+02 52 91 78 79 36  52 91 78 79 36  1512
PCSetUp                1 1.0 8.6804e-01 1.0 6.21e+07 1.0 1.9e+03 1.1e+04 3.2e+02 14  9 21 17 52  14  9 21 17 52   572
PCApply                5 1.0 3.1772e+00 1.0 5.96e+08 1.0 7.1e+03 1.3e+04 2.0e+02 49 86 76 72 33  49 86 76 72 33  1501
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container     1              1          564     0
              Vector   139            139     71560728     0
      Vector Scatter    21             21        22092     0
              Matrix    37             37     75834272     0
   Matrix Null Space     1              1          596     0
    Distributed Mesh     5              5      2740736     0
     Bipartite Graph    10             10         7920     0
           Index Set    50             50      1546832     0
   IS L to G Mapping     5              5      1361108     0
       Krylov Solver    11             11       129320     0
     DMKSP interface     3              3         1944     0
      Preconditioner    11             11         9840     0
              Viewer     3              2         1456     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 2.43187e-06
Average time for zero size MPI_Send(): 2.5034e-06
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor
-ksp_view
-log_summary
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Wed Jul 31 22:48:06 2013
Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-cc=cc --with-cxx=CC --with-fc=ftn --with-clib-autodetect=0 --with-cxxlib-autodetect=0 --with-fortranlib-autodetect=0 --with-debugging=0 --COPTFLAGS="-fastsse -Mipa=fast -mp" --CXXOPTFLAGS="-fastsse -Mipa=fast -mp" --FOPTFLAGS="-fastsse -Mipa=fast -mp" --with-blas-lapack-lib="-L/opt/acml/4.4.0/pgi64/lib -lacml -lacml_mv" --with-shared-libraries=0 --with-x=0 --with-batch --known-mpi-shared-libraries=0 PETSC_ARCH=arch-cray-xt5-pkgs-opt
-----------------------------------------
Libraries compiled on Wed Jul 31 22:48:06 2013 on krakenpf1 
Machine characteristics: Linux-2.6.27.48-0.12.1_1.0301.5943-cray_ss_s-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /nics/c/home/mrosso/LIBS/petsc-3.4.2
Using PETSc arch: arch-cray-xt5-pkgs-opt
-----------------------------------------

Using C compiler: cc  -fastsse -Mipa=fast -mp  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn  -fastsse -Mipa=fast -mp   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/nics/c/home/mrosso/LIBS/petsc-3.4.2/arch-cray-xt5-pkgs-opt/include -I/nics/c/home/mrosso/LIBS/petsc-3.4.2/include -I/nics/c/home/mrosso/LIBS/petsc-3.4.2/include -I/nics/c/home/mrosso/LIBS/petsc-3.4.2/arch-cray-xt5-pkgs-opt/include -I/opt/cray/portals/2.2.0-1.0301.26633.6.9.ss/include -I/opt/cray/pmi/2.1.4-1.0000.8596.15.1.ss/include -I/opt/cray/mpt/5.3.5/xt/seastar/mpich2-pgi/109/include -I/opt/acml/4.4.0/pgi64/include -I/opt/xt-libsci/11.0.04/pgi/109/istanbul/include -I/opt/fftw/3.3.0.0/x86_64/include -I/usr/include/alps
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/nics/c/home/mrosso/LIBS/petsc-3.4.2/arch-cray-xt5-pkgs-opt/lib -L/nics/c/home/mrosso/LIBS/petsc-3.4.2/arch-cray-xt5-pkgs-opt/lib -lpetsc -L/opt/acml/4.4.0/pgi64/lib -lacml -lacml_mv -lpthread -ldl 
-----------------------------------------

#PETSc Option Table entries:
-ksp_converged_reason
-ksp_monitor
-ksp_view
-log_summary
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_type mg
#End of PETSc Option Table entries
There are no unused options.


