[petsc-users] How to speed up geometric multigrid

Michele Rosso mrosso at uci.edu
Tue Oct 1 21:02:28 CDT 2013


Barry,

I repeated the previous runs since I noticed they were not using the option

-mg_levels_ksp_max_it 3

They are faster than before now, but still slower than my initial test, and in any case the solution time increases considerably as the simulation progresses.
I attached the diagnostics (run1.txt and run2.txt; please see the files for the list of the options I used).
I also ran a case using your last proposed options (run3.txt): it hits the divergence condition, since 30 iterations do not seem to be enough to lower the error below the required tolerance, and thus after some time steps my solution blows up.
Please let me know what you think about it.
In the meantime I will try running my initial test with the option
-mg_levels_ksp_max_it 3 instead of -mg_levels_ksp_max_it 1.
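For context, here is a minimal sketch (not the actual application code) of how such runtime options are picked up by a PETSc program, so solver variants like the one above can be switched without recompiling. The routine name is hypothetical and the tolerance values are placeholders, not the exact ones from the attached runs.

/* Minimal sketch: apply the options database to a KSP that already holds the
   assembled Poisson matrix and right-hand side (PETSc 3.4-era API assumed). */
#include <petscksp.h>

PetscErrorCode solve_pressure(KSP ksp, Vec b, Vec x)
{
  PetscErrorCode ierr;

  /* Placeholder tolerances: rtol and atol along the lines discussed in this thread */
  ierr = KSPSetTolerances(ksp, 1.0e-3, 1.0e-9, PETSC_DEFAULT, PETSC_DEFAULT);CHKERRQ(ierr);
  /* Pick up -pc_type mg, -mg_levels_ksp_max_it, -pc_mg_galerkin, ... from the
     command line so each run can use a different solver configuration */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  return 0;
}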
As usual, thank you very much.

Michele

On 09/30/2013 07:17 PM, Barry Smith wrote:
>    I wasn't expecting this. Try
>
> -pc_mg_type full -ksp_type richardson -mg_levels_pc_type bjacobi -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 3
> -mg_coarse_pc_factor_mat_solver_package superlu_dist -mg_coarse_pc_type lu -pc_mg_galerkin -pc_mg_levels 5 -pc_type mg
> -log_summary  -pc_mg_log  -ksp_monitor_true_residual  -options_left   -ksp_view -ksp_max_it 30
>
>
>
>
> On Sep 30, 2013, at 8:00 PM, Michele Rosso <mrosso at uci.edu> wrote:
>
>> Barry,
>>
>> sorry again for the very late answer. I tried all the variations you proposed: all of them converge very slowly, except the last one (CG instead of fgmres), which diverges.
>> I attached the diagnostics for the two options that converge: each of the attached files starts with a list of the options I used for the run.
>> As you pointed out earlier, the residual I used to require was too small, therefore I increased atol to 1e-9.
>> After some tests, I noticed that any further increase of the absolute tolerance changes the solution significantly.
>> What would you suggest to try next?
>> Thank you very much,
>>
>> Michele
>>
>>
>>
>>
>>
>>
>> On 09/24/2013 05:08 PM, Barry Smith wrote:
>>>   Thanks. The balance of work on the different levels and across processes looks ok, so it is a matter of improving the convergence rate.
>>>
>>>   The initial residual norm is very small. Are you sure you need to decrease it to 10^-12 ????
>>>
>>>   Start with a really robust multigrid smoother use
>>>
>>> -pc_mg_type full -ksp_type fgmres -mg_levels_pc_type bjacobi -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 3   PLUS -mg_coarse_pc_factor_mat_solver_package superlu_dist
>>> -mg_coarse_pc_type lu -pc_mg_galerkin -pc_mg_levels 5 -pc_mg_log -pc_type mg
>>>
>>>   run with the -log_summary and -pc_mg_log
>>>
>>> Now back off a little on the smoother and use -mg_levels_pc_type sor instead; how does that change the convergence and the time?
>>>
>>> Back off even more and replace the -ksp_type fgmres with -ksp_type cg and the -mg_levels_ksp_type gmres with -mg_levels_ksp_type richardson; how does that change the convergence and the time?
>>>
>>>   There are some additional variants we can try based on the results from above.
>>>
>>>   Barry
>>>
>>>
>>>
>>> On Sep 24, 2013, at 4:29 PM, Michele Rosso <mrosso at uci.edu> wrote:
>>>
>>>> Barry,
>>>>
>>>> I re-ran the test case with the option -pc_mg_log as you suggested.
>>>> I attached the new output ("final_new.txt").
>>>> Thanks for your help.
>>>>
>>>> Michele
>>>>
>>>> On 09/23/2013 09:35 AM, Barry Smith wrote:
>>>>>    Run with the additional option -pc_mg_log and send us the log file.
>>>>>
>>>>>    Barry
>>>>>
>>>>> Maybe we should make this the default somehow.
>>>>>
>>>>>
>>>>> On Sep 23, 2013, at 10:55 AM, Michele Rosso <mrosso at uci.edu> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am successfully using PETSc to solve a 3D Poisson's equation with CG + MG. The equation arises from a projection algorithm for a multiphase incompressible flow simulation.
>>>>>> I set up the solver as suggested in a previous thread (title: "GAMG speed") and ran a test case (a liquid droplet with surface tension falling under gravity in a quiescent fluid).
>>>>>> The solution of the Poisson equation via multigrid is correct, but it becomes progressively slower as the simulation progresses (I am performing successive solves) due to an increase in the number of iterations.
>>>>>> Since the solution of the Poisson equation is mission-critical, I need to speed it up as much as I can.
>>>>>> Could you please help me out with this?
>>>>>>
>>>>>> I ran the test case with the following options:
>>>>>>
>>>>>> -pc_type mg  -pc_mg_galerkin  -pc_mg_levels 5   -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1
>>>>>> -mg_coarse_pc_type lu   -mg_coarse_pc_factor_mat_solver_package superlu_dist
>>>>>> -log_summary -ksp_view  -ksp_monitor_true_residual  -options_left
>>>>>>
>>>>>> Please find the diagnostics for the final solve in the attached file "final.txt".
>>>>>> Thank you,
>>>>>>
>>>>>> Michele
>>>>>> <final.txt>
>>>> <final_new.txt>
>> <final1.txt><final2.txt>
>
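For reference, the multigrid configuration discussed above (full MG with Galerkin coarse operators, GMRES/block-Jacobi smoothers capped at 3 iterations, and a SuperLU_DIST LU coarse solve) can also be assembled programmatically instead of through the options database. The sketch below is illustrative only: it assumes the PETSc 3.4-era API and a KSP whose attached DM provides the interpolation between levels; it is not the code used for these runs.

/* Sketch of an in-code equivalent of
     -pc_type mg -pc_mg_type full -pc_mg_levels 5 -pc_mg_galerkin
     -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 3 -mg_levels_pc_type bjacobi
     -mg_coarse_pc_type lu -mg_coarse_pc_factor_mat_solver_package superlu_dist
   (PETSc 3.4 assumed; the DM attached to the KSP supplies the grid hierarchy). */
#include <petscksp.h>

PetscErrorCode configure_mg(KSP ksp, PetscInt nlevels)
{
  PetscErrorCode ierr;
  PC             pc, cpc, spc;
  KSP            cksp, sksp;
  PetscInt       l;

  ierr = KSPSetType(ksp, KSPFGMRES);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCMG);CHKERRQ(ierr);
  ierr = PCMGSetLevels(pc, nlevels, NULL);CHKERRQ(ierr);
  ierr = PCMGSetType(pc, PC_MG_FULL);CHKERRQ(ierr);
  ierr = PCMGSetGalerkin(pc, PETSC_TRUE);CHKERRQ(ierr);   /* Galerkin coarse operators */

  /* Coarse grid: direct LU factorization through SuperLU_DIST */
  ierr = PCMGGetCoarseSolve(pc, &cksp);CHKERRQ(ierr);
  ierr = KSPGetPC(cksp, &cpc);CHKERRQ(ierr);
  ierr = PCSetType(cpc, PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(cpc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);

  /* Levels 1..nlevels-1: 3 GMRES iterations per smoothing step, preconditioned
     by block Jacobi (whose sub-blocks default to ILU(0), as in the -ksp_view output) */
  for (l = 1; l < nlevels; l++) {
    ierr = PCMGGetSmoother(pc, l, &sksp);CHKERRQ(ierr);
    ierr = KSPSetType(sksp, KSPGMRES);CHKERRQ(ierr);
    ierr = KSPSetTolerances(sksp, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, 3);CHKERRQ(ierr);
    ierr = KSPGetPC(sksp, &spc);CHKERRQ(ierr);
    ierr = PCSetType(spc, PCBJACOBI);CHKERRQ(ierr);
  }
  return 0;
}

Swapping PCBJACOBI for PCSOR, or KSPFGMRES for KSPCG together with KSPRICHARDSON on the levels, gives the "backed off" variants discussed earlier in the thread.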

-------------- next part --------------
OPTIONS:

-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg



  0 KSP unpreconditioned resid norm 1.036334906411e-06 true resid norm 1.036334906411e-06 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP unpreconditioned resid norm 7.265278566968e-07 true resid norm 7.265278566968e-07 ||r(i)||/||b|| 7.010550857663e-01
  2 KSP unpreconditioned resid norm 5.582868896378e-07 true resid norm 5.582868896378e-07 ||r(i)||/||b|| 5.387128101002e-01
  3 KSP unpreconditioned resid norm 3.634998226503e-07 true resid norm 3.634998226503e-07 ||r(i)||/||b|| 3.507551665022e-01
  4 KSP unpreconditioned resid norm 3.158574836963e-07 true resid norm 3.158574836963e-07 ||r(i)||/||b|| 3.047832141350e-01
  5 KSP unpreconditioned resid norm 2.777502658984e-07 true resid norm 2.777502658984e-07 ||r(i)||/||b|| 2.680120723331e-01
  6 KSP unpreconditioned resid norm 2.284215345140e-07 true resid norm 2.284215345140e-07 ||r(i)||/||b|| 2.204128540889e-01
  7 KSP unpreconditioned resid norm 1.969414140052e-07 true resid norm 1.969414140052e-07 ||r(i)||/||b|| 1.900364571210e-01
  8 KSP unpreconditioned resid norm 1.782653889853e-07 true resid norm 1.782653889853e-07 ||r(i)||/||b|| 1.720152316423e-01
  9 KSP unpreconditioned resid norm 1.536397911477e-07 true resid norm 1.536397911477e-07 ||r(i)||/||b|| 1.482530311362e-01
 10 KSP unpreconditioned resid norm 1.302441466920e-07 true resid norm 1.302441466920e-07 ||r(i)||/||b|| 1.256776606542e-01
 11 KSP unpreconditioned resid norm 1.142838199081e-07 true resid norm 1.142838199081e-07 ||r(i)||/||b|| 1.102769183988e-01
 12 KSP unpreconditioned resid norm 1.031156644411e-07 true resid norm 1.031156644411e-07 ||r(i)||/||b|| 9.950032928847e-02
 13 KSP unpreconditioned resid norm 9.027533291890e-08 true resid norm 9.027533291890e-08 ||r(i)||/||b|| 8.711019223651e-02
 14 KSP unpreconditioned resid norm 7.493550605983e-08 true resid norm 7.493550605983e-08 ||r(i)||/||b|| 7.230819457710e-02
 15 KSP unpreconditioned resid norm 6.742982059332e-08 true resid norm 6.742982059333e-08 ||r(i)||/||b|| 6.506566571884e-02
 16 KSP unpreconditioned resid norm 5.379988711018e-08 true resid norm 5.379988711019e-08 ||r(i)||/||b|| 5.191361091608e-02
 17 KSP unpreconditioned resid norm 4.872990226442e-08 true resid norm 4.872990226444e-08 ||r(i)||/||b|| 4.702138465372e-02
 18 KSP unpreconditioned resid norm 4.543576707229e-08 true resid norm 4.543576707230e-08 ||r(i)||/||b|| 4.384274503465e-02
 19 KSP unpreconditioned resid norm 4.243835036633e-08 true resid norm 4.243835036635e-08 ||r(i)||/||b|| 4.095042066401e-02
 20 KSP unpreconditioned resid norm 3.855396651833e-08 true resid norm 3.855396651834e-08 ||r(i)||/||b|| 3.720222707913e-02
 21 KSP unpreconditioned resid norm 3.540838965507e-08 true resid norm 3.540838965509e-08 ||r(i)||/||b|| 3.416693718994e-02
 22 KSP unpreconditioned resid norm 3.114021467696e-08 true resid norm 3.114021467699e-08 ||r(i)||/||b|| 3.004840856401e-02
 23 KSP unpreconditioned resid norm 2.687679086480e-08 true resid norm 2.687679086485e-08 ||r(i)||/||b|| 2.593446452356e-02
 24 KSP unpreconditioned resid norm 2.231320925522e-08 true resid norm 2.231320925524e-08 ||r(i)||/||b|| 2.153088650898e-02
 25 KSP unpreconditioned resid norm 1.849367046766e-08 true resid norm 1.849367046769e-08 ||r(i)||/||b|| 1.784526445387e-02
 26 KSP unpreconditioned resid norm 1.597347720732e-08 true resid norm 1.597347720735e-08 ||r(i)||/||b|| 1.541343161224e-02
 27 KSP unpreconditioned resid norm 1.351813033069e-08 true resid norm 1.351813033073e-08 ||r(i)||/||b|| 1.304417157726e-02
 28 KSP unpreconditioned resid norm 1.135895547453e-08 true resid norm 1.135895547456e-08 ||r(i)||/||b|| 1.096069948458e-02
 29 KSP unpreconditioned resid norm 9.644960881002e-09 true resid norm 9.644960881027e-09 ||r(i)||/||b|| 9.306799202997e-03
 30 KSP unpreconditioned resid norm 8.454149815651e-09 true resid norm 8.454149815651e-09 ||r(i)||/||b|| 8.157739127910e-03
 31 KSP unpreconditioned resid norm 7.380097753084e-09 true resid norm 7.380097753084e-09 ||r(i)||/||b|| 7.121344371812e-03
 32 KSP unpreconditioned resid norm 6.949063499474e-09 true resid norm 6.949063499474e-09 ||r(i)||/||b|| 6.705422596965e-03
 33 KSP unpreconditioned resid norm 6.732114039970e-09 true resid norm 6.732114039970e-09 ||r(i)||/||b|| 6.496079595816e-03
 34 KSP unpreconditioned resid norm 5.348043445752e-09 true resid norm 5.348043445752e-09 ||r(i)||/||b|| 5.160535858308e-03
 35 KSP unpreconditioned resid norm 4.753111075163e-09 true resid norm 4.753111075163e-09 ||r(i)||/||b|| 4.586462393342e-03
 36 KSP unpreconditioned resid norm 4.053751219961e-09 true resid norm 4.053751219961e-09 ||r(i)||/||b|| 3.911622772602e-03
 37 KSP unpreconditioned resid norm 3.648046750367e-09 true resid norm 3.648046750367e-09 ||r(i)||/||b|| 3.520142694990e-03
 38 KSP unpreconditioned resid norm 3.154002693751e-09 true resid norm 3.154002693751e-09 ||r(i)||/||b|| 3.043420301911e-03
 39 KSP unpreconditioned resid norm 2.796711364093e-09 true resid norm 2.796711364093e-09 ||r(i)||/||b|| 2.698655952618e-03
 40 KSP unpreconditioned resid norm 2.490859697022e-09 true resid norm 2.490859697022e-09 ||r(i)||/||b|| 2.403527741479e-03
 41 KSP unpreconditioned resid norm 2.214978925969e-09 true resid norm 2.214978925969e-09 ||r(i)||/||b|| 2.137319617689e-03
 42 KSP unpreconditioned resid norm 2.062308903139e-09 true resid norm 2.062308903139e-09 ||r(i)||/||b|| 1.990002353853e-03
 43 KSP unpreconditioned resid norm 1.893218419438e-09 true resid norm 1.893218419438e-09 ||r(i)||/||b|| 1.826840346422e-03
 44 KSP unpreconditioned resid norm 1.633410997565e-09 true resid norm 1.633410997565e-09 ||r(i)||/||b|| 1.576142024610e-03
 45 KSP unpreconditioned resid norm 1.497658367438e-09 true resid norm 1.497658367438e-09 ||r(i)||/||b|| 1.445149013290e-03
 46 KSP unpreconditioned resid norm 1.384720917112e-09 true resid norm 1.384720917112e-09 ||r(i)||/||b|| 1.336171259450e-03
 47 KSP unpreconditioned resid norm 1.149245204430e-09 true resid norm 1.149245204429e-09 ||r(i)||/||b|| 1.108951553517e-03
 48 KSP unpreconditioned resid norm 1.044541114051e-09 true resid norm 1.044541114051e-09 ||r(i)||/||b|| 1.007918490045e-03
 49 KSP unpreconditioned resid norm 8.702453784205e-10 true resid norm 8.702453784202e-10 ||r(i)||/||b|| 8.397337318626e-04
KSP Object: 128 MPI processes
  type: fgmres
    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=0.001, absolute=1e-50, divergence=10000
  right preconditioning
  has attached null space
  using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
  type: mg
    MG: type is FULL, levels=5 cycles=v
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     128 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     128 MPI processes
      type: lu
        LU: out-of-place factorization
        tolerance for zero pivot 2.22045e-14
        matrix ordering: natural
        factor fill ratio given 0, needed 0
          Factored matrix follows:
            Matrix Object:             128 MPI processes
              type: mpiaij
              rows=1024, cols=1024
              package used to perform factorization: superlu_dist
              total: nonzeros=0, allocated nonzeros=0
              total number of mallocs used during MatSetValues calls =0
                SuperLU_DIST run parameters:
                  Process grid nprow 16 x npcol 8 
                  Equilibrate matrix TRUE 
                  Matrix input mode 1 
                  Replace tiny pivots TRUE 
                  Use iterative refinement FALSE 
                  Processors in row 16 col partition 8 
                  Row permutation LargeDiag 
                  Column permutation METIS_AT_PLUS_A
                  Parallel symbolic factorization FALSE 
                  Repeated factorization SamePattern_SameRowPerm
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=1024, cols=1024
        total: nonzeros=27648, allocated nonzeros=27648
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_1_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_1_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=64, cols=64
                package used to perform factorization: petsc
                total: nonzeros=768, allocated nonzeros=768
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 16 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=64, cols=64
          total: nonzeros=768, allocated nonzeros=768
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 16 nodes, limit used is 5
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=8192, cols=8192
        total: nonzeros=221184, allocated nonzeros=221184
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 16 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:    (mg_levels_2_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_2_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_2_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_2_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=512, cols=512
                package used to perform factorization: petsc
                total: nonzeros=9600, allocated nonzeros=9600
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=512, cols=512
          total: nonzeros=9600, allocated nonzeros=9600
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=65536, cols=65536
        total: nonzeros=1769472, allocated nonzeros=1769472
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object:    (mg_levels_3_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_3_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_3_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_3_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=4096, cols=4096
                package used to perform factorization: petsc
                total: nonzeros=92928, allocated nonzeros=92928
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=4096, cols=4096
          total: nonzeros=92928, allocated nonzeros=92928
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=524288, cols=524288
        total: nonzeros=14155776, allocated nonzeros=14155776
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object:    (mg_levels_4_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_4_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_4_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_4_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=32768, cols=32768
                package used to perform factorization: petsc
                total: nonzeros=221184, allocated nonzeros=221184
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=32768, cols=32768
          total: nonzeros=221184, allocated nonzeros=221184
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=4194304, cols=4194304
        total: nonzeros=29360128, allocated nonzeros=29360128
        total number of mallocs used during MatSetValues calls =0
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   128 MPI processes
    type: mpiaij
    rows=4194304, cols=4194304
    total: nonzeros=29360128, allocated nonzeros=29360128
    total number of mallocs used during MatSetValues calls =0
 

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./hit on a interlagos-64idx-pgi-opt named nid12058 with 128 processors, by Unknown Tue Oct  1 12:47:42 2013
Using Petsc Release Version 3.4.2, Jul, 02, 2013 

                         Max       Max/Min        Avg      Total 
Time (sec):           5.564e+01      1.00031   5.563e+01
Objects:              1.344e+03      1.00000   1.344e+03
Flops:                7.519e+09      1.00000   7.519e+09  9.624e+11
Flops/sec:            1.352e+08      1.00031   1.352e+08  1.730e+10
MPI Messages:         2.584e+05      1.09854   2.354e+05  3.014e+07
MPI Message Lengths:  4.162e+08      1.00022   1.767e+03  5.326e+10
MPI Reductions:       4.880e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.4617e+01  26.3%  1.3758e+11  14.3%  4.615e+05   1.5%  2.117e+02       12.0%  3.681e+03   7.5% 
 1:        MG Apply: 4.1013e+01  73.7%  8.2479e+11  85.7%  2.968e+07  98.5%  1.556e+03       88.0%  4.512e+04  92.5% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecMDot              322 1.0 6.3548e-01 1.1 2.20e+08 1.0 0.0e+00 0.0e+00 3.2e+02  1  3  0  0  1   4 20  0  0  9 44353
VecNorm              786 1.0 2.7858e-01 2.9 5.15e+07 1.0 0.0e+00 0.0e+00 7.9e+02  0  1  0  0  2   1  5  0  0 21 23668
VecScale             372 1.0 1.7262e-02 1.1 1.22e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 90388
VecCopy              414 1.0 5.8294e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               528 1.0 8.1342e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecAXPY              368 1.0 6.1460e-02 1.3 2.41e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  2  0  0  0 50228
VecAYPX              368 1.0 4.9858e-02 1.5 1.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 30958
VecWAXPY               4 1.0 1.1320e-03 2.1 1.31e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 14821
VecMAXPY             690 1.0 1.0875e+00 1.2 4.54e+08 1.0 0.0e+00 0.0e+00 0.0e+00  2  6  0  0  0   7 42  0  0  0 53393
VecScatterBegin      790 1.0 1.3550e-01 1.3 0.00e+00 0.0 3.8e+05 1.6e+04 0.0e+00  0  0  1 12  0   1  0 82 97  0     0
VecScatterEnd        790 1.0 3.5084e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0
MatMult              694 1.0 2.4980e+00 1.1 2.96e+08 1.0 3.6e+05 1.6e+04 0.0e+00  4  4  1 11  0  16 28 77 91  0 15149
MatMultTranspose       4 1.0 2.2562e-03 1.1 2.53e+05 1.0 1.5e+03 9.9e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0 14338
MatLUFactorSym         1 1.0 5.0211e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum         1 1.0 1.2263e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatAssemblyBegin      63 1.0 2.0496e-01 8.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02  0  0  0  0  0   1  0  0  0  3     0
MatAssemblyEnd        63 1.0 2.0343e-01 1.1 0.00e+00 0.0 1.2e+04 1.1e+03 7.2e+01  0  0  0  0  0   1  0  3  0  2     0
MatGetRowIJ            1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 2.0981e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView              690 2.1 2.1315e-01 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+02  0  0  0  0  1   1  0  0  0  9     0
MatPtAP                4 1.0 2.0783e-01 1.0 5.11e+06 1.0 2.5e+04 6.0e+03 1.0e+02  0  0  0  0  0   1  0  5  2  3  3144
MatPtAPSymbolic        4 1.0 1.4338e-01 1.0 0.00e+00 0.0 1.5e+04 7.8e+03 6.0e+01  0  0  0  0  0   1  0  3  2  2     0
MatPtAPNumeric         4 1.0 7.0436e-02 1.1 5.11e+06 1.0 9.7e+03 3.1e+03 4.0e+01  0  0  0  0  0   0  0  2  0  1  9277
MatGetLocalMat         4 1.0 2.3359e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol          4 1.0 3.0560e-02 3.2 0.00e+00 0.0 1.1e+04 8.4e+03 8.0e+00  0  0  0  0  0   0  0  2  1  0     0
MatGetSymTrans         8 1.0 1.0388e-02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog       322 1.0 1.0764e+00 1.1 4.40e+08 1.0 0.0e+00 0.0e+00 3.2e+02  2  6  0  0  1   7 41  0  0  9 52372
KSPSetUp               6 1.0 3.7848e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+01  0  0  0  0  0   0  0  0  0  2     0
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
KSPSolve              46 1.0 4.6268e+01 1.0 7.52e+09 1.0 3.0e+07 1.8e+03 4.8e+04 83100100 99 99 31670065158291309 20800
PCSetUp                1 1.0 4.2101e-01 1.0 5.36e+06 1.0 3.4e+04 4.6e+03 3.3e+02  1  0  0  0  1   3  0  7  2  9  1629
Warning -- total time of even greater than time of entire stage -- something is wrong with the timer
PCApply              322 1.0 4.1021e+01 1.0 6.44e+09 1.0 3.0e+07 1.6e+03 4.5e+04 74 86 98 88 92 28160064317351226 20106
MGSetup Level 0        1 1.0 1.2542e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   1  0  0  0  0     0
MGSetup Level 1        1 1.0 2.4819e-03 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 2        1 1.0 5.2500e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 3        1 1.0 5.1689e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 4        1 1.0 1.1669e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 1: MG Apply

VecMDot            19320 1.0 2.9361e+00 1.6 3.30e+08 1.0 0.0e+00 0.0e+00 1.9e+04  4  4  0  0 40   6  5  0  0 43 14401
VecNorm            25760 1.0 1.5193e+00 1.5 2.20e+08 1.0 0.0e+00 0.0e+00 2.6e+04  2  3  0  0 53   3  3  0  0 57 18556
VecScale           25760 1.0 1.7513e-01 1.1 1.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0 80490
VecCopy             8050 1.0 2.0040e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             46368 1.0 5.5128e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
VecAXPY            10948 1.0 1.9405e-01 1.2 1.07e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0 70388
VecAYPX             3220 1.0 5.9333e-02 1.1 1.38e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 29698
VecMAXPY           25760 1.0 5.2558e-01 1.1 4.96e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  8  0  0  0 120695
VecScatterBegin    36064 1.0 2.7229e+00 2.4 0.00e+00 0.0 3.0e+07 1.6e+03 0.0e+00  4  0 98 88  0   6  0100100  0     0
VecScatterEnd      36064 1.0 2.0120e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   4  0  0  0  0     0
VecNormalize       25760 1.0 1.7211e+00 1.4 3.30e+08 1.0 0.0e+00 0.0e+00 2.6e+04  3  4  0  0 53   4  5  0  0 57 24571
MatMult            28336 1.0 1.9003e+01 1.1 2.75e+09 1.0 2.7e+07 1.7e+03 0.0e+00 33 37 89 85  0  45 43 90 96  0 18501
MatMultAdd          3220 1.0 7.4839e-01 1.2 9.29e+07 1.0 1.2e+06 5.3e+02 0.0e+00  1  1  4  1  0   2  1  4  1  0 15893
MatMultTranspose    4508 1.0 1.4513e+00 1.2 1.74e+08 1.0 1.7e+06 6.6e+02 0.0e+00  2  2  6  2  0   3  3  6  2  0 15372
MatSolve           27370 1.0 1.4336e+01 1.1 2.15e+09 1.0 0.0e+00 0.0e+00 0.0e+00 25 29  0  0  0  34 33  0  0  0 19206
MatLUFactorNum         4 1.0 1.0956e-02 1.2 1.86e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 21770
MatILUFactorSym        4 1.0 5.2656e-02 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            4 1.0 1.0967e-05 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         4 1.0 5.1585e-0230.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     19320 1.0 3.2936e+00 1.5 6.61e+08 1.0 0.0e+00 0.0e+00 1.9e+04  5  9  0  0 40   7 10  0  0 43 25678
KSPSetUp               4 1.0 5.0068e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            8050 1.0 3.6157e+01 1.0 5.79e+09 1.0 2.3e+07 1.7e+03 4.5e+04 65 77 77 74 92  88 90 78 84100 20482
PCSetUp                4 1.0 7.6262e-02 2.3 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01  0  0  0  0  0   0  0  0  0  0  3127
PCSetUpOnBlocks     6440 1.0 8.1504e-02 2.1 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01  0  0  0  0  0   0  0  0  0  0  2926
PCApply            27370 1.0 1.5573e+01 1.1 2.15e+09 1.0 0.0e+00 0.0e+00 0.0e+00 27 29  0  0  0  37 33  0  0  0 17681
MGSmooth Level 0    1610 1.0 2.2521e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   5  0  0  0  0     0
MGSmooth Level 1    2576 1.0 1.4212e+00 1.0 5.43e+07 1.0 9.6e+06 1.9e+02 1.8e+04  3  1 32  3 37   3  1 32  4 40  4890
MGResid Level 1     1288 1.0 7.1253e-02 1.3 4.45e+06 1.0 1.3e+06 1.9e+02 0.0e+00  0  0  4  0  0   0  0  4  1  0  7996
MGInterp Level 1    3220 1.0 1.6670e-01 3.5 1.37e+06 1.0 1.2e+06 6.4e+01 0.0e+00  0  0  4  0  0   0  0  4  0  0  1052
MGSmooth Level 2    1932 1.0 2.2821e+00 1.0 3.82e+08 1.0 7.3e+06 6.4e+02 1.4e+04  4  5 24  9 28   5  6 24 10 30 21402
MGResid Level 2      966 1.0 1.4905e-01 1.5 2.67e+07 1.0 9.9e+05 6.4e+02 0.0e+00  0  0  3  1  0   0  0  3  1  0 22936
MGInterp Level 2    2576 1.0 1.8992e-01 2.3 8.74e+06 1.0 9.9e+05 2.1e+02 0.0e+00  0  0  3  0  0   0  0  3  0  0  5889
MGSmooth Level 3    1288 1.0 1.2010e+01 1.0 2.23e+09 1.0 4.9e+06 2.3e+03 9.0e+03 22 30 16 21 18  29 35 17 24 20 23726
MGResid Level 3      644 1.0 8.5489e-01 1.1 1.42e+08 1.0 6.6e+05 2.3e+03 0.0e+00  1  2  2  3  0   2  2  2  3  0 21327
MGInterp Level 3    1932 1.0 4.4246e-01 1.4 5.21e+07 1.0 7.4e+05 7.7e+02 0.0e+00  1  1  2  1  0   1  1  2  1  0 15071
MGSmooth Level 4     644 1.0 1.8477e+01 1.0 3.12e+09 1.0 1.3e+06 1.6e+04 4.5e+03 33 42  4 41  9  45 48  4 46 10 21640
MGResid Level 4      322 1.0 1.1910e+00 1.1 1.48e+08 1.0 1.6e+05 1.6e+04 0.0e+00  2  2  1  5  0   3  2  1  6  0 15876
MGInterp Level 4    1288 1.0 2.1723e+00 1.1 2.74e+08 1.0 4.9e+05 2.9e+03 0.0e+00  4  4  2  3  0   5  4  2  3  0 16165
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   979            991    247462152     0
      Vector Scatter    19             19        22572     0
              Matrix    38             42     19508928     0
   Matrix Null Space     1              1          652     0
    Distributed Mesh     5              5       830792     0
     Bipartite Graph    10             10         8560     0
           Index Set    47             59       844496     0
   IS L to G Mapping     5              5       405756     0
       Krylov Solver    11             11       102272     0
     DMKSP interface     3              3         2088     0
      Preconditioner    11             11        11864     0
              Viewer   185            184       144256     0

--- Event Stage 1: MG Apply

              Vector    12              0            0     0
              Matrix     4              0            0     0
           Index Set    14              2         1792     0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 0.000448608
Average time for zero size MPI_Send(): 2.44565e-06
#PETSc Option Table entries:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure run at: Wed Aug 28 23:25:43 2013
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-batch="1 " --known-mpi-shared="0 " --known-memcmp-ok  --with-blas-lapack-lib="-L/opt/acml/5.3.0/pgi64/lib  -lacml" --COPTFLAGS="-O3 -fastsse" --FOPTFLAGS="-O3 -fastsse" --CXXOPTFLAGS="-O3 -fastsse" --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries=0 --with-dynamic-loading=0 --with-mpi-compilers="1 " --known-mpi-shared-libraries=0 --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --with-cc=cc --with-cxx=CC --with-fc=ftn PETSC_ARCH=interlagos-64idx-pgi-opt
-----------------------------------------
Libraries compiled on Wed Aug 28 23:25:43 2013 on h2ologin3 
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2
Using PETSc arch: interlagos-64idx-pgi-opt
-----------------------------------------

Using C compiler: cc  -O3 -fastsse  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn  -O3 -fastsse   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/opt/cray/udreg/2.3.2-1.0402.7311.2.1.gem/include -I/opt/cray/ugni/5.0-1.0402.7128.7.6.gem/include -I/opt/cray/pmi/4.0.1-1.0000.9421.73.3.gem/include -I/opt/cray/dmapp/4.0.1-1.0402.7439.5.1.gem/include -I/opt/cray/gni-headers/2.1-1.0402.7082.6.2.gem/include -I/opt/cray/xpmem/0.1-2.0402.44035.2.1.gem/include -I/opt/cray/rca/1.0.0-2.0402.42153.2.106.gem/include -I/opt/cray-hss-devel/7.0.0/include -I/opt/cray/krca/1.0.0-2.0402.42157.2.94.gem/include -I/opt/cray/mpt/6.0.1/gni/mpich2-pgi/121/include -I/opt/acml/5.3.0/pgi64_fma4/include -I/opt/cray/libsci/12.1.01/pgi/121/interlagos/include -I/opt/fftw/3.3.0.3/interlagos/include -I/usr/include/alps -I/opt/pgi/13.6.0/linux86-64/13.6/include -I/opt/cray/xe-sysroot/4.2.24/usr/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/pgi64/lib -lacml -lpthread -lparmetis -lmetis -ldl 
-----------------------------------------

#PETSc Option Table entries:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
There are no unused options.
-------------- next part --------------
OPTIONS:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type sor
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg


  0 KSP unpreconditioned resid norm 5.609297891476e-07 true resid norm 5.609297891476e-07 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP unpreconditioned resid norm 2.581822560386e-07 true resid norm 2.581822560386e-07 ||r(i)||/||b|| 4.602755300819e-01
  2 KSP unpreconditioned resid norm 2.428085555532e-07 true resid norm 2.428085555532e-07 ||r(i)||/||b|| 4.328679992592e-01
  3 KSP unpreconditioned resid norm 2.335657385482e-07 true resid norm 2.335657385482e-07 ||r(i)||/||b|| 4.163903273940e-01
  4 KSP unpreconditioned resid norm 2.234855409323e-07 true resid norm 2.234855409323e-07 ||r(i)||/||b|| 3.984198116345e-01
  5 KSP unpreconditioned resid norm 2.078097503833e-07 true resid norm 2.078097503833e-07 ||r(i)||/||b|| 3.704737284484e-01
  6 KSP unpreconditioned resid norm 2.060299595561e-07 true resid norm 2.060299595561e-07 ||r(i)||/||b|| 3.673007986779e-01
  7 KSP unpreconditioned resid norm 1.304371586686e-07 true resid norm 1.304371586686e-07 ||r(i)||/||b|| 2.325374069842e-01
  8 KSP unpreconditioned resid norm 1.282069723612e-07 true resid norm 1.282069723612e-07 ||r(i)||/||b|| 2.285615327294e-01
  9 KSP unpreconditioned resid norm 8.994010091283e-08 true resid norm 8.994010091283e-08 ||r(i)||/||b|| 1.603411026708e-01
 10 KSP unpreconditioned resid norm 7.624200700050e-08 true resid norm 7.624200700050e-08 ||r(i)||/||b|| 1.359207666905e-01
 11 KSP unpreconditioned resid norm 7.614198614178e-08 true resid norm 7.614198614178e-08 ||r(i)||/||b|| 1.357424540734e-01
 12 KSP unpreconditioned resid norm 6.892442767094e-08 true resid norm 6.892442767094e-08 ||r(i)||/||b|| 1.228753205917e-01
 13 KSP unpreconditioned resid norm 6.891270426886e-08 true resid norm 6.891270426886e-08 ||r(i)||/||b|| 1.228544206461e-01
 14 KSP unpreconditioned resid norm 5.299112064401e-08 true resid norm 5.299112064401e-08 ||r(i)||/||b|| 9.447014879446e-02
 15 KSP unpreconditioned resid norm 4.214889484631e-08 true resid norm 4.214889484631e-08 ||r(i)||/||b|| 7.514112400835e-02
 16 KSP unpreconditioned resid norm 2.789939104957e-08 true resid norm 2.789939104957e-08 ||r(i)||/||b|| 4.973775967928e-02
 17 KSP unpreconditioned resid norm 2.786722600854e-08 true resid norm 2.786722600854e-08 ||r(i)||/||b|| 4.968041731371e-02
 18 KSP unpreconditioned resid norm 2.457366893493e-08 true resid norm 2.457366893493e-08 ||r(i)||/||b|| 4.380881424085e-02
 19 KSP unpreconditioned resid norm 2.430122634853e-08 true resid norm 2.430122634853e-08 ||r(i)||/||b|| 4.332311604534e-02
 20 KSP unpreconditioned resid norm 1.694910033682e-08 true resid norm 1.694910033683e-08 ||r(i)||/||b|| 3.021608169997e-02
 21 KSP unpreconditioned resid norm 1.383837294163e-08 true resid norm 1.383837294164e-08 ||r(i)||/||b|| 2.467041902458e-02
 22 KSP unpreconditioned resid norm 1.156264774412e-08 true resid norm 1.156264774414e-08 ||r(i)||/||b|| 2.061336011715e-02
 23 KSP unpreconditioned resid norm 7.091619620968e-09 true resid norm 7.091619620978e-09 ||r(i)||/||b|| 1.264261545416e-02
 24 KSP unpreconditioned resid norm 7.065272914870e-09 true resid norm 7.065272914878e-09 ||r(i)||/||b|| 1.259564575027e-02
 25 KSP unpreconditioned resid norm 6.618876503623e-09 true resid norm 6.618876503634e-09 ||r(i)||/||b|| 1.179983062353e-02
 26 KSP unpreconditioned resid norm 6.462504586080e-09 true resid norm 6.462504586091e-09 ||r(i)||/||b|| 1.152105791335e-02
 27 KSP unpreconditioned resid norm 6.440405749487e-09 true resid norm 6.440405749498e-09 ||r(i)||/||b|| 1.148166111713e-02
 28 KSP unpreconditioned resid norm 5.744119795026e-09 true resid norm 5.744119795044e-09 ||r(i)||/||b|| 1.024035432986e-02
 29 KSP unpreconditioned resid norm 2.587935833329e-09 true resid norm 2.587935833348e-09 ||r(i)||/||b|| 4.613653764548e-03
 30 KSP unpreconditioned resid norm 7.019604467372e-10 true resid norm 7.019604467372e-10 ||r(i)||/||b|| 1.251423012859e-03
 31 KSP unpreconditioned resid norm 7.001655404576e-10 true resid norm 7.001655404576e-10 ||r(i)||/||b|| 1.248223135950e-03
 32 KSP unpreconditioned resid norm 6.876834537207e-10 true resid norm 6.876834537207e-10 ||r(i)||/||b|| 1.225970642004e-03
 33 KSP unpreconditioned resid norm 6.378517201634e-10 true resid norm 6.378517201634e-10 ||r(i)||/||b|| 1.137132904160e-03
 34 KSP unpreconditioned resid norm 5.450153316415e-10 true resid norm 5.450153316415e-10 ||r(i)||/||b|| 9.716284322674e-04
KSP Object: 128 MPI processes
  type: fgmres
    GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=0.001, absolute=1e-50, divergence=10000
  right preconditioning
  has attached null space
  using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
  type: mg
    MG: type is FULL, levels=5 cycles=v
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     128 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     128 MPI processes
      type: lu
        LU: out-of-place factorization
        tolerance for zero pivot 2.22045e-14
        matrix ordering: natural
        factor fill ratio given 0, needed 0
          Factored matrix follows:
            Matrix Object:             128 MPI processes
              type: mpiaij
              rows=1024, cols=1024
              package used to perform factorization: superlu_dist
              total: nonzeros=0, allocated nonzeros=0
              total number of mallocs used during MatSetValues calls =0
                SuperLU_DIST run parameters:
                  Process grid nprow 16 x npcol 8 
                  Equilibrate matrix TRUE 
                  Matrix input mode 1 
                  Replace tiny pivots TRUE 
                  Use iterative refinement FALSE 
                  Processors in row 16 col partition 8 
                  Row permutation LargeDiag 
                  Column permutation METIS_AT_PLUS_A
                  Parallel symbolic factorization FALSE 
                  Repeated factorization SamePattern_SameRowPerm
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=1024, cols=1024
        total: nonzeros=27648, allocated nonzeros=27648
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     128 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=8192, cols=8192
        total: nonzeros=221184, allocated nonzeros=221184
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 16 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:    (mg_levels_2_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_2_)     128 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=65536, cols=65536
        total: nonzeros=1769472, allocated nonzeros=1769472
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object:    (mg_levels_3_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_3_)     128 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=524288, cols=524288
        total: nonzeros=14155776, allocated nonzeros=14155776
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object:    (mg_levels_4_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_4_)     128 MPI processes
      type: sor
        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=4194304, cols=4194304
        total: nonzeros=29360128, allocated nonzeros=29360128
        total number of mallocs used during MatSetValues calls =0
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   128 MPI processes
    type: mpiaij
    rows=4194304, cols=4194304
    total: nonzeros=29360128, allocated nonzeros=29360128
    total number of mallocs used during MatSetValues calls =0
 

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./hit on a interlagos-64idx-pgi-opt named nid12058 with 128 processors, by Unknown Tue Oct  1 13:05:12 2013
Using Petsc Release Version 3.4.2, Jul, 02, 2013 

                         Max       Max/Min        Avg      Total 
Time (sec):           2.265e+02      1.00006   2.265e+02
Objects:              3.366e+03      1.00000   3.366e+03
Flops:                2.626e+10      1.00000   2.626e+10  3.361e+12
Flops/sec:            1.159e+08      1.00006   1.159e+08  1.484e+10
MPI Messages:         8.260e+05      1.00000   8.260e+05  1.057e+08
MPI Message Lengths:  1.464e+09      1.00000   1.773e+03  1.874e+11
MPI Reductions:       1.710e+05      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 7.1195e+01  31.4%  4.1158e+11  12.2%  1.404e+06   1.3%  2.136e+02       12.1%  1.244e+04   7.3% 
 1:        MG Apply: 1.5530e+02  68.6%  2.9491e+12  87.8%  1.043e+08  98.7%  1.559e+03       87.9%  1.585e+05  92.7% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecMDot             1132 1.0 1.6774e+00 1.2 5.77e+08 1.0 0.0e+00 0.0e+00 1.1e+03  1  2  0  0  1   2 18  0  0  9 44049
VecNorm             2873 1.0 9.3535e-01 2.5 1.88e+08 1.0 0.0e+00 0.0e+00 2.9e+03  0  1  0  0  2   1  6  0  0 23 25766
VecScale            1339 1.0 6.0955e-02 1.0 4.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 92136
VecCopy             1534 1.0 2.0990e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet              1958 1.0 3.0763e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1333 1.0 2.2397e-01 1.3 8.74e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  3  0  0  0 49926
VecAYPX             1333 1.0 1.7871e-01 1.4 4.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  1  0  0  0 31285
VecWAXPY               6 1.0 1.6773e-03 1.9 1.97e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 15004
VecMAXPY            2465 1.0 2.9482e+00 1.2 1.22e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1  5  0  0  0   4 38  0  0  0 52831
VecScatterBegin     2877 1.0 5.2887e-01 1.3 0.00e+00 0.0 1.4e+06 1.6e+04 0.0e+00  0  0  1 12  0   1  0 98 99  0     0
VecScatterEnd       2877 1.0 1.1966e+00 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatMult             2471 1.0 8.7551e+00 1.1 1.05e+09 1.0 1.3e+06 1.6e+04 0.0e+00  4  4  1 11  0  12 33 90 92  0 15389
MatMultTranspose       4 1.0 2.2500e-03 1.1 2.53e+05 1.0 1.5e+03 9.9e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0 14377
MatLUFactorSym         1 1.0 5.0402e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum         1 1.0 1.2829e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin     218 1.0 8.1404e-0118.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.2e+02  0  0  0  0  0   1  0  0  0  3     0
MatAssemblyEnd       218 1.0 7.6480e-01 1.2 0.00e+00 0.0 1.2e+04 1.1e+03 7.2e+01  0  0  0  0  0   1  0  1  0  1     0
MatGetRowIJ            1 1.0 3.0994e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 2.2173e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView             1407 1.0 4.3674e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+03  0  0  0  0  1   1  0  0  0 11     0
MatPtAP                4 1.0 2.1245e-01 1.0 5.11e+06 1.0 2.5e+04 6.0e+03 1.0e+02  0  0  0  0  0   0  0  2  1  1  3076
MatPtAPSymbolic        4 1.0 1.4296e-01 1.0 0.00e+00 0.0 1.5e+04 7.8e+03 6.0e+01  0  0  0  0  0   0  0  1  1  0     0
MatPtAPNumeric         4 1.0 7.0840e-02 1.0 5.11e+06 1.0 9.7e+03 3.1e+03 4.0e+01  0  0  0  0  0   0  0  1  0  0  9224
MatGetLocalMat         4 1.0 2.2167e-02 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol          4 1.0 2.9867e-02 3.1 0.00e+00 0.0 1.1e+04 8.4e+03 8.0e+00  0  0  0  0  0   0  0  1  0  0     0
MatGetSymTrans         8 1.0 9.5510e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog      1132 1.0 2.8047e+00 1.1 1.15e+09 1.0 0.0e+00 0.0e+00 1.1e+03  1  4  0  0  1   4 36  0  0  9 52688
KSPSetUp               6 1.0 3.5489e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+01  0  0  0  0  0   0  0  0  0  0     0
Warning -- total time of event greater than time of entire stage -- something is wrong with the timer
KSPSolve             201 1.0 1.7101e+02 1.0 2.63e+10 1.0 1.1e+08 1.8e+03 1.7e+05 75100100 99 98 24081775248221352 19652
PCSetUp                1 1.0 4.2664e-01 1.0 5.36e+06 1.0 3.4e+04 4.6e+03 3.2e+02  0  0  0  0  0   1  0  2  1  3  1607
Warning -- total time of event greater than time of entire stage -- something is wrong with the timer
PCApply             1132 1.0 1.5532e+02 1.0 2.30e+10 1.0 1.0e+08 1.6e+03 1.6e+05 69 88 99 88 93 21871774317301274 18987
MGSetup Level 0        1 1.0 1.3110e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 1        1 1.0 1.7891e-03 7.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 2        1 1.0 2.5201e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 3        1 1.0 3.9697e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 4        1 1.0 1.0642e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 1: MG Apply

VecMDot            67920 1.0 1.0526e+01 1.5 1.16e+09 1.0 0.0e+00 0.0e+00 6.8e+04  4  4  0  0 40   6  5  0  0 43 14122
VecNorm            90560 1.0 5.5366e+00 1.4 7.74e+08 1.0 0.0e+00 0.0e+00 9.1e+04  2  3  0  0 53   3  3  0  0 57 17901
VecScale           90560 1.0 6.1576e-01 1.1 3.87e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0 80480
VecCopy            28300 1.0 7.0869e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             72448 1.0 5.1643e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY            38488 1.0 7.2360e-01 1.2 3.75e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  2  0  0  0 66359
VecAYPX            11320 1.0 2.0807e-01 1.1 4.84e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 29771
VecMAXPY           90560 1.0 1.7752e+00 1.1 1.74e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  8  0  0  0 125622
VecScatterBegin   126784 1.0 9.4371e+00 2.3 0.00e+00 0.0 1.0e+08 1.6e+03 0.0e+00  4  0 99 88  0   5  0100100  0     0
VecScatterEnd     126784 1.0 6.8194e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   4  0  0  0  0     0
VecNormalize       90560 1.0 6.2612e+00 1.4 1.16e+09 1.0 0.0e+00 0.0e+00 9.1e+04  2  4  0  0 53   4  5  0  0 57 23745
MatMult            99616 1.0 5.7795e+01 1.1 9.66e+09 1.0 9.4e+07 1.7e+03 0.0e+00 25 37 89 85  0  36 42 90 96  0 21385
MatMultAdd         11320 1.0 2.6260e+00 1.2 3.27e+08 1.0 4.3e+06 5.3e+02 0.0e+00  1  1  4  1  0   2  1  4  1  0 15923
MatMultTranspose   15848 1.0 4.8380e+00 1.1 6.13e+08 1.0 6.1e+06 6.6e+02 0.0e+00  2  2  6  2  0   3  3  6  2  0 16212
MatSolve            5660 1.0 7.7166e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   5  0  0  0  0     0
MatSOR             90560 1.0 6.6838e+01 1.0 7.96e+09 1.0 0.0e+00 0.0e+00 0.0e+00 29 30  0  0  0  42 35  0  0  0 15237
KSPGMRESOrthog     67920 1.0 1.1756e+01 1.4 2.32e+09 1.0 0.0e+00 0.0e+00 6.8e+04  4  9  0  0 40   7 10  0  0 43 25290
KSPSolve           28300 1.0 1.3913e+02 1.0 2.07e+10 1.0 8.1e+07 1.7e+03 1.6e+05 61 79 77 74 93  89 90 78 84100 19070
PCApply            96220 1.0 7.4543e+01 1.0 7.96e+09 1.0 0.0e+00 0.0e+00 0.0e+00 32 30  0  0  0  47 35  0  0  0 13662
MGSmooth Level 0    5660 1.0 7.8165e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   5  0  0  0  0     0
MGSmooth Level 1    9056 1.0 3.9531e+00 1.1 1.93e+08 1.0 3.4e+07 1.9e+02 6.3e+04  2  1 32  3 37   2  1 32  4 40  6255
MGResid Level 1     4528 1.0 2.5980e-01 1.4 1.56e+07 1.0 4.6e+06 1.9e+02 0.0e+00  0  0  4  0  0   0  0  4  1  0  7710
MGInterp Level 1   11320 1.0 8.6609e-01 5.1 4.82e+06 1.0 4.3e+06 6.4e+01 0.0e+00  0  0  4  0  0   0  0  4  0  0   712
MGSmooth Level 2    6792 1.0 7.4435e+00 1.0 1.36e+09 1.0 2.6e+07 6.4e+02 4.8e+04  3  5 24  9 28   5  6 24 10 30 23301
MGResid Level 2     3396 1.0 5.4746e-01 1.4 9.39e+07 1.0 3.5e+06 6.4e+02 0.0e+00  0  0  3  1  0   0  0  3  1  0 21953
MGInterp Level 2    9056 1.0 6.0639e-01 2.0 3.07e+07 1.0 3.5e+06 2.1e+02 0.0e+00  0  0  3  0  0   0  0  3  0  0  6484
MGSmooth Level 3    4528 1.0 3.5954e+01 1.0 7.90e+09 1.0 1.7e+07 2.3e+03 3.2e+04 16 30 16 21 19  23 34 17 24 20 28111
MGResid Level 3     2264 1.0 2.0338e+00 1.2 5.01e+08 1.0 2.3e+06 2.3e+03 0.0e+00  1  2  2  3  0   1  2  2  3  0 31516
MGInterp Level 3    6792 1.0 1.5462e+00 1.4 1.83e+08 1.0 2.6e+06 7.7e+02 0.0e+00  1  1  2  1  0   1  1  2  1  0 15161
MGSmooth Level 4    2264 1.0 8.5215e+01 1.0 1.13e+10 1.0 4.6e+06 1.6e+04 1.6e+04 38 43  4 41  9  55 49  4 46 10 16948
MGResid Level 4     1132 1.0 4.1353e+00 1.1 5.19e+08 1.0 5.8e+05 1.6e+04 0.0e+00  2  2  1  5  0   3  2  1  6  0 16074
MGInterp Level 4    4528 1.0 7.5078e+00 1.1 9.64e+08 1.0 1.7e+06 2.9e+03 0.0e+00  3  4  2  3  0   5  4  2  3  0 16442
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector  3211           3223    861601128     0
      Vector Scatter    19             19        22572     0
              Matrix    38             38     14004608     0
   Matrix Null Space     1              1          652     0
    Distributed Mesh     5              5       830792     0
     Bipartite Graph    10             10         8560     0
           Index Set    47             47       534480     0
   IS L to G Mapping     5              5       405756     0
       Krylov Solver     7              7        97216     0
     DMKSP interface     3              3         2088     0
      Preconditioner     7              7         7352     0
              Viewer     1              0            0     0

--- Event Stage 1: MG Apply

              Vector    12              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 1.62125e-05
Average time for zero size MPI_Send(): 2.36742e-06
#PETSc Option Table entries:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type sor
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure run at: Wed Aug 28 23:25:43 2013
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-batch="1 " --known-mpi-shared="0 " --known-memcmp-ok  --with-blas-lapack-lib="-L/opt/acml/5.3.0/pgi64/lib  -lacml" --COPTFLAGS="-O3 -fastsse" --FOPTFLAGS="-O3 -fastsse" --CXXOPTFLAGS="-O3 -fastsse" --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries=0 --with-dynamic-loading=0 --with-mpi-compilers="1 " --known-mpi-shared-libraries=0 --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --with-cc=cc --with-cxx=CC --with-fc=ftn PETSC_ARCH=interlagos-64idx-pgi-opt
-----------------------------------------
Libraries compiled on Wed Aug 28 23:25:43 2013 on h2ologin3 
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2
Using PETSc arch: interlagos-64idx-pgi-opt
-----------------------------------------

Using C compiler: cc  -O3 -fastsse  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn  -O3 -fastsse   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/opt/cray/udreg/2.3.2-1.0402.7311.2.1.gem/include -I/opt/cray/ugni/5.0-1.0402.7128.7.6.gem/include -I/opt/cray/pmi/4.0.1-1.0000.9421.73.3.gem/include -I/opt/cray/dmapp/4.0.1-1.0402.7439.5.1.gem/include -I/opt/cray/gni-headers/2.1-1.0402.7082.6.2.gem/include -I/opt/cray/xpmem/0.1-2.0402.44035.2.1.gem/include -I/opt/cray/rca/1.0.0-2.0402.42153.2.106.gem/include -I/opt/cray-hss-devel/7.0.0/include -I/opt/cray/krca/1.0.0-2.0402.42157.2.94.gem/include -I/opt/cray/mpt/6.0.1/gni/mpich2-pgi/121/include -I/opt/acml/5.3.0/pgi64_fma4/include -I/opt/cray/libsci/12.1.01/pgi/121/interlagos/include -I/opt/fftw/3.3.0.3/interlagos/include -I/usr/include/alps -I/opt/pgi/13.6.0/linux86-64/13.6/include -I/opt/cray/xe-sysroot/4.2.24/usr/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/pgi64/lib -lacml -lpthread -lparmetis -lmetis -ldl 
-----------------------------------------

#PETSc Option Table entries:
-ksp_monitor_true_residual
-ksp_type fgmres
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type sor
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
There are no unused options.
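
(A side note for readers following along: everything in the option table above is a runtime option, so it only takes effect because the application calls KSPSetFromOptions() on its solver and attaches a DM -- the log's "Distributed Mesh" and "DMKSP interface" objects suggest a DMDA is used, which is what lets PCMG with -pc_mg_galerkin build the grid hierarchy. A minimal sketch of that pattern, hypothetical and condensed in the spirit of PETSc's ksp tutorial ex45 for the 3.4-era API -- this is not the actual ./hit code, and the grid size, Dirichlet boundary rows, and constant right-hand side are placeholders:

/* poisson_mg_sketch.c -- hypothetical sketch, not the ./hit application */
#include <petscdmda.h>
#include <petscksp.h>

static PetscErrorCode ComputeRHS(KSP ksp, Vec b, void *ctx)
{
  PetscFunctionBeginUser;
  VecSet(b, 1.0);                                   /* placeholder right-hand side */
  PetscFunctionReturn(0);
}

static PetscErrorCode ComputeMatrix(KSP ksp, Mat J, Mat B, MatStructure *flg, void *ctx)
{
  DM          da;
  PetscInt    i, j, k, mx, my, mz, xs, ys, zs, xm, ym, zm;
  PetscScalar v[7], Hx, Hy, Hz;
  MatStencil  row, col[7];

  PetscFunctionBeginUser;
  KSPGetDM(ksp, &da);
  DMDAGetInfo(da, 0, &mx, &my, &mz, 0, 0, 0, 0, 0, 0, 0, 0, 0);
  Hx = 1.0/(PetscReal)(mx-1);  Hy = 1.0/(PetscReal)(my-1);  Hz = 1.0/(PetscReal)(mz-1);
  DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);
  for (k = zs; k < zs+zm; k++) {
    for (j = ys; j < ys+ym; j++) {
      for (i = xs; i < xs+xm; i++) {
        row.i = i; row.j = j; row.k = k;
        if (i == 0 || j == 0 || k == 0 || i == mx-1 || j == my-1 || k == mz-1) {
          v[0] = 2.0*(Hy*Hz/Hx + Hx*Hz/Hy + Hx*Hy/Hz);   /* Dirichlet boundary row */
          MatSetValuesStencil(B, 1, &row, 1, &row, v, INSERT_VALUES);
        } else {                                          /* 7-point Laplacian stencil */
          v[0] = -Hx*Hy/Hz; col[0].i = i;   col[0].j = j;   col[0].k = k-1;
          v[1] = -Hx*Hz/Hy; col[1].i = i;   col[1].j = j-1; col[1].k = k;
          v[2] = -Hy*Hz/Hx; col[2].i = i-1; col[2].j = j;   col[2].k = k;
          v[3] = 2.0*(Hy*Hz/Hx + Hx*Hz/Hy + Hx*Hy/Hz);
                            col[3].i = i;   col[3].j = j;   col[3].k = k;
          v[4] = -Hy*Hz/Hx; col[4].i = i+1; col[4].j = j;   col[4].k = k;
          v[5] = -Hx*Hz/Hy; col[5].i = i;   col[5].j = j+1; col[5].k = k;
          v[6] = -Hx*Hy/Hz; col[6].i = i;   col[6].j = j;   col[6].k = k+1;
          MatSetValuesStencil(B, 1, &row, 7, col, v, INSERT_VALUES);
        }
      }
    }
  }
  MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
  *flg = SAME_NONZERO_PATTERN;
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  KSP ksp;
  DM  da;

  PetscInitialize(&argc, &argv, NULL, NULL);
  /* Negative sizes mean the defaults can be overridden with -da_grid_x/y/z at run time. */
  DMDACreate3d(PETSC_COMM_WORLD, DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_NONE,
               DMDA_STENCIL_STAR, -17, -17, -17, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
               1, 1, 0, 0, 0, &da);
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetDM(ksp, da);                      /* lets PCMG with -pc_mg_galerkin build the hierarchy */
  KSPSetComputeRHS(ksp, ComputeRHS, NULL);
  KSPSetComputeOperators(ksp, ComputeMatrix, NULL);
  KSPSetFromOptions(ksp);                 /* -pc_type mg, -mg_levels_*, etc. take effect here */
  KSPSolve(ksp, NULL, NULL);              /* with a DM attached, b and x are created internally */
  KSPDestroy(&ksp);
  DMDestroy(&da);
  PetscFinalize();
  return 0;
}

With this structure the whole solver configuration, including the option lists recorded in these logs, can be switched on the command line without recompiling, e.g. by appending -pc_type mg -pc_mg_levels 5 -pc_mg_galerkin and the rest of the list above to the launch command.)
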
-------------- next part --------------
OPTIONS:
-ksp_atol 1e-9
-ksp_max_it 30
-ksp_monitor_true_residual
-ksp_type richardson
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg


  0 KSP unpreconditioned resid norm 6.954195782521e-06 true resid norm 6.954195782521e-06 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP unpreconditioned resid norm 4.019686111644e-06 true resid norm 4.019686111644e-06 ||r(i)||/||b|| 5.780231442070e-01
  2 KSP unpreconditioned resid norm 3.085633839543e-06 true resid norm 3.085633839543e-06 ||r(i)||/||b|| 4.437082210568e-01
  3 KSP unpreconditioned resid norm 3.108638870884e-06 true resid norm 3.108638870884e-06 ||r(i)||/||b|| 4.470163003891e-01
  4 KSP unpreconditioned resid norm 2.907823260441e-06 true resid norm 2.907823260441e-06 ||r(i)||/||b|| 4.181394012158e-01
  5 KSP unpreconditioned resid norm 3.095105825237e-06 true resid norm 3.095105825237e-06 ||r(i)||/||b|| 4.450702744114e-01
  6 KSP unpreconditioned resid norm 2.816885443509e-06 true resid norm 2.816885443509e-06 ||r(i)||/||b|| 4.050627177608e-01
  7 KSP unpreconditioned resid norm 3.330756254322e-06 true resid norm 3.330756254322e-06 ||r(i)||/||b|| 4.789563536151e-01
  8 KSP unpreconditioned resid norm 2.734103597927e-06 true resid norm 2.734103597927e-06 ||r(i)||/||b|| 3.931588473249e-01
  9 KSP unpreconditioned resid norm 3.029844525689e-06 true resid norm 3.029844525689e-06 ||r(i)||/||b|| 4.356858248519e-01
 10 KSP unpreconditioned resid norm 2.626258637892e-06 true resid norm 2.626258637892e-06 ||r(i)||/||b|| 3.776509491569e-01
 11 KSP unpreconditioned resid norm 2.620796722232e-06 true resid norm 2.620796722232e-06 ||r(i)||/||b|| 3.768655361731e-01
 12 KSP unpreconditioned resid norm 2.599366584696e-06 true resid norm 2.599366584696e-06 ||r(i)||/||b|| 3.737839235457e-01
 13 KSP unpreconditioned resid norm 2.815136272808e-06 true resid norm 2.815136272808e-06 ||r(i)||/||b|| 4.048111903728e-01
 14 KSP unpreconditioned resid norm 2.592704976330e-06 true resid norm 2.592704976330e-06 ||r(i)||/||b|| 3.728259970544e-01
 15 KSP unpreconditioned resid norm 2.647297548295e-06 true resid norm 2.647297548295e-06 ||r(i)||/||b|| 3.806763040737e-01
 16 KSP unpreconditioned resid norm 2.577657729007e-06 true resid norm 2.577657729007e-06 ||r(i)||/||b|| 3.706622317833e-01
 17 KSP unpreconditioned resid norm 2.637186195877e-06 true resid norm 2.637186195877e-06 ||r(i)||/||b|| 3.792223110120e-01
 18 KSP unpreconditioned resid norm 2.569979492081e-06 true resid norm 2.569979492081e-06 ||r(i)||/||b|| 3.695581160572e-01
 19 KSP unpreconditioned resid norm 2.639092183189e-06 true resid norm 2.639092183189e-06 ||r(i)||/||b|| 3.794963883275e-01
 20 KSP unpreconditioned resid norm 2.557359938672e-06 true resid norm 2.557359938672e-06 ||r(i)||/||b|| 3.677434485091e-01
 21 KSP unpreconditioned resid norm 2.619919367497e-06 true resid norm 2.619919367497e-06 ||r(i)||/||b|| 3.767393742469e-01
 22 KSP unpreconditioned resid norm 2.540615865281e-06 true resid norm 2.540615865281e-06 ||r(i)||/||b|| 3.653356829077e-01
 23 KSP unpreconditioned resid norm 2.578329382313e-06 true resid norm 2.578329382313e-06 ||r(i)||/||b|| 3.707588142389e-01
 24 KSP unpreconditioned resid norm 2.525920830833e-06 true resid norm 2.525920830833e-06 ||r(i)||/||b|| 3.632225651716e-01
 25 KSP unpreconditioned resid norm 2.658560319798e-06 true resid norm 2.658560319798e-06 ||r(i)||/||b|| 3.822958689890e-01
 26 KSP unpreconditioned resid norm 2.522426607571e-06 true resid norm 2.522426607571e-06 ||r(i)||/||b|| 3.627201025762e-01
 27 KSP unpreconditioned resid norm 2.616030191476e-06 true resid norm 2.616030191476e-06 ||r(i)||/||b|| 3.761801182031e-01
 28 KSP unpreconditioned resid norm 2.507602013260e-06 true resid norm 2.507602013260e-06 ||r(i)||/||b|| 3.605883543806e-01
 29 KSP unpreconditioned resid norm 2.624604598576e-06 true resid norm 2.624604598576e-06 ||r(i)||/||b|| 3.774131014793e-01
 30 KSP unpreconditioned resid norm 2.502026180934e-06 true resid norm 2.502026180934e-06 ||r(i)||/||b|| 3.597865603990e-01
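
(A rough estimate from the history above: the average contraction factor per Richardson iteration is (2.50e-06 / 6.95e-06)^(1/30) ~= 0.966, so reaching the 1e-9 absolute tolerance from this starting residual would need on the order of 260 iterations at this rate -- far more than the 30 allowed here.)
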
KSP Object: 128 MPI processes
  type: richardson
    Richardson: damping factor=1
  maximum iterations=30, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-09, divergence=10000
  left preconditioning
  has attached null space
  using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
  type: mg
    MG: type is FULL, levels=5 cycles=v
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     128 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     128 MPI processes
      type: lu
        LU: out-of-place factorization
        tolerance for zero pivot 2.22045e-14
        matrix ordering: natural
        factor fill ratio given 0, needed 0
          Factored matrix follows:
            Matrix Object:             128 MPI processes
              type: mpiaij
              rows=1024, cols=1024
              package used to perform factorization: superlu_dist
              total: nonzeros=0, allocated nonzeros=0
              total number of mallocs used during MatSetValues calls =0
                SuperLU_DIST run parameters:
                  Process grid nprow 16 x npcol 8 
                  Equilibrate matrix TRUE 
                  Matrix input mode 1 
                  Replace tiny pivots TRUE 
                  Use iterative refinement FALSE 
                  Processors in row 16 col partition 8 
                  Row permutation LargeDiag 
                  Column permutation METIS_AT_PLUS_A
                  Parallel symbolic factorization FALSE 
                  Repeated factorization SamePattern_SameRowPerm
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=1024, cols=1024
        total: nonzeros=27648, allocated nonzeros=27648
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_1_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_1_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=64, cols=64
                package used to perform factorization: petsc
                total: nonzeros=768, allocated nonzeros=768
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 16 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=64, cols=64
          total: nonzeros=768, allocated nonzeros=768
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 16 nodes, limit used is 5
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=8192, cols=8192
        total: nonzeros=221184, allocated nonzeros=221184
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 16 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:    (mg_levels_2_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_2_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_2_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_2_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=512, cols=512
                package used to perform factorization: petsc
                total: nonzeros=9600, allocated nonzeros=9600
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=512, cols=512
          total: nonzeros=9600, allocated nonzeros=9600
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=65536, cols=65536
        total: nonzeros=1769472, allocated nonzeros=1769472
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object:    (mg_levels_3_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_3_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_3_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_3_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=4096, cols=4096
                package used to perform factorization: petsc
                total: nonzeros=92928, allocated nonzeros=92928
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=4096, cols=4096
          total: nonzeros=92928, allocated nonzeros=92928
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=524288, cols=524288
        total: nonzeros=14155776, allocated nonzeros=14155776
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object:    (mg_levels_4_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_4_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_4_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_4_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=32768, cols=32768
                package used to perform factorization: petsc
                total: nonzeros=221184, allocated nonzeros=221184
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=32768, cols=32768
          total: nonzeros=221184, allocated nonzeros=221184
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=4194304, cols=4194304
        total: nonzeros=29360128, allocated nonzeros=29360128
        total number of mallocs used during MatSetValues calls =0
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   128 MPI processes
    type: mpiaij
    rows=4194304, cols=4194304
    total: nonzeros=29360128, allocated nonzeros=29360128
    total number of mallocs used during MatSetValues calls =0

  0 KSP unpreconditioned resid norm 2.917180555663e-04 true resid norm 2.917180555663e-04 ||r(i)||/||b|| 1.000000000000e+00
  1 KSP unpreconditioned resid norm 1.049878179956e-02 true resid norm 1.049878179956e-02 ||r(i)||/||b|| 3.598948230742e+01
  2 KSP unpreconditioned resid norm 4.603139618725e-01 true resid norm 4.603139618725e-01 ||r(i)||/||b|| 1.577941279565e+03
  3 KSP unpreconditioned resid norm 2.274779569665e+01 true resid norm 2.274779569665e+01 ||r(i)||/||b|| 7.797870328075e+04
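
(Reading the numbers above the same way: the true residual now grows by roughly a factor of 35-50 per iteration, 2.9e-04 -> 1.0e-02 -> 4.6e-01 -> 2.3e+01, so on this right-hand side the Richardson iteration with this preconditioner is genuinely diverging rather than merely stalling.)
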
KSP Object: 128 MPI processes
  type: richardson
    Richardson: damping factor=1
  maximum iterations=30, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-09, divergence=10000
  left preconditioning
  has attached null space
  using UNPRECONDITIONED norm type for convergence test
PC Object: 128 MPI processes
  type: mg
    MG: type is FULL, levels=5 cycles=v
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     128 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     128 MPI processes
      type: lu
        LU: out-of-place factorization
        tolerance for zero pivot 2.22045e-14
        matrix ordering: natural
        factor fill ratio given 0, needed 0
          Factored matrix follows:
            Matrix Object:             128 MPI processes
              type: mpiaij
              rows=1024, cols=1024
              package used to perform factorization: superlu_dist
              total: nonzeros=0, allocated nonzeros=0
              total number of mallocs used during MatSetValues calls =0
                SuperLU_DIST run parameters:
                  Process grid nprow 16 x npcol 8 
                  Equilibrate matrix TRUE 
                  Matrix input mode 1 
                  Replace tiny pivots TRUE 
                  Use iterative refinement FALSE 
                  Processors in row 16 col partition 8 
                  Row permutation LargeDiag 
                  Column permutation METIS_AT_PLUS_A
                  Parallel symbolic factorization FALSE 
                  Repeated factorization SamePattern_SameRowPerm
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=1024, cols=1024
        total: nonzeros=27648, allocated nonzeros=27648
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_1_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_1_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=64, cols=64
                package used to perform factorization: petsc
                total: nonzeros=768, allocated nonzeros=768
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 16 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=64, cols=64
          total: nonzeros=768, allocated nonzeros=768
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 16 nodes, limit used is 5
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=8192, cols=8192
        total: nonzeros=221184, allocated nonzeros=221184
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 16 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 2 -------------------------------
    KSP Object:    (mg_levels_2_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_2_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_2_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_2_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=512, cols=512
                package used to perform factorization: petsc
                total: nonzeros=9600, allocated nonzeros=9600
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=512, cols=512
          total: nonzeros=9600, allocated nonzeros=9600
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=65536, cols=65536
        total: nonzeros=1769472, allocated nonzeros=1769472
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 3 -------------------------------
    KSP Object:    (mg_levels_3_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_3_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_3_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_3_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=4096, cols=4096
                package used to perform factorization: petsc
                total: nonzeros=92928, allocated nonzeros=92928
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=4096, cols=4096
          total: nonzeros=92928, allocated nonzeros=92928
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=524288, cols=524288
        total: nonzeros=14155776, allocated nonzeros=14155776
        total number of mallocs used during MatSetValues calls =0
          not using I-node (on process 0) routines
  Up solver (post-smoother) same as down solver (pre-smoother)
  Down solver (pre-smoother) on level 4 -------------------------------
    KSP Object:    (mg_levels_4_)     128 MPI processes
      type: gmres
        GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
        GMRES: happy breakdown tolerance 1e-30
      maximum iterations=3
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_4_)     128 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 128
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_levels_4_sub_)       1 MPI processes
        type: preonly
        maximum iterations=10000, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_levels_4_sub_)       1 MPI processes
        type: ilu
          ILU: out-of-place factorization
          0 levels of fill
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot
          matrix ordering: natural
          factor fill ratio given 1, needed 1
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=32768, cols=32768
                package used to perform factorization: petsc
                total: nonzeros=221184, allocated nonzeros=221184
                total number of mallocs used during MatSetValues calls =0
                  not using I-node routines
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=32768, cols=32768
          total: nonzeros=221184, allocated nonzeros=221184
          total number of mallocs used during MatSetValues calls =0
            not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:       128 MPI processes
        type: mpiaij
        rows=4194304, cols=4194304
        total: nonzeros=29360128, allocated nonzeros=29360128
        total number of mallocs used during MatSetValues calls =0
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   128 MPI processes
    type: mpiaij
    rows=4194304, cols=4194304
    total: nonzeros=29360128, allocated nonzeros=29360128
    total number of mallocs used during MatSetValues calls =0
 

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./hit on a interlagos-64idx-pgi-opt named nid25319 with 128 processors, by Unknown Tue Oct  1 20:23:44 2013
Using Petsc Release Version 3.4.2, Jul, 02, 2013 

                         Max       Max/Min        Avg      Total 
Time (sec):           4.942e+01      1.00014   4.942e+01
Objects:              1.206e+03      1.00000   1.206e+03
Flops:                6.078e+09      1.00000   6.078e+09  7.779e+11
Flops/sec:            1.230e+08      1.00014   1.230e+08  1.574e+10
MPI Messages:         2.322e+05      1.11091   2.092e+05  2.678e+07
MPI Message Lengths:  3.703e+08      1.00025   1.769e+03  4.739e+10
MPI Reductions:       4.337e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.3192e+01  26.7%  4.5330e+10   5.8%  4.225e+05   1.6%  2.144e+02       12.1%  3.293e+03   7.6% 
 1:        MG Apply: 3.6228e+01  73.3%  7.3260e+11  94.2%  2.636e+07  98.4%  1.555e+03       87.9%  4.008e+04  92.4% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecNorm              710 1.0 2.7513e-01 4.2 4.65e+07 1.0 0.0e+00 0.0e+00 7.1e+02  0  1  0  0  2   1 13  0  0 22 21647
VecCopy              378 1.0 6.4831e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               160 1.0 2.7622e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              286 1.0 5.0174e-02 1.2 1.87e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  5  0  0  0 47817
VecAYPX              618 1.0 8.3843e-02 1.9 2.03e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  6  0  0  0 30916
VecScatterBegin      714 1.0 1.3197e-01 1.3 0.00e+00 0.0 3.4e+05 1.6e+04 0.0e+00  0  0  1 12  0   1  0 81 97  0     0
VecScatterEnd        714 1.0 3.0915e-01 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatMult              618 1.0 2.2833e+00 1.1 2.63e+08 1.0 3.2e+05 1.6e+04 0.0e+00  4  4  1 11  0  16 74 75 90  0 14758
MatMultTranspose       4 1.0 2.2891e-03 1.1 2.53e+05 1.0 1.5e+03 9.9e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0 14132
MatLUFactorSym         1 1.0 5.1403e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum         1 1.0 1.1998e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatAssemblyBegin      63 1.0 1.9263e-01 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02  0  0  0  0  0   1  0  0  0  3     0
MatAssemblyEnd        63 1.0 2.1651e-01 1.2 0.00e+00 0.0 1.2e+04 1.1e+03 7.2e+01  0  0  0  0  0   1  0  3  0  2     0
MatGetRowIJ            1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 2.0981e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView              690 2.1 2.0276e-01 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.2e+02  0  0  0  0  1   1  0  0  0 10     0
MatPtAP                4 1.0 2.0942e-01 1.0 5.11e+06 1.0 2.5e+04 6.0e+03 1.0e+02  0  0  0  0  0   2  1  6  3  3  3120
MatPtAPSymbolic        4 1.0 1.4736e-01 1.1 0.00e+00 0.0 1.5e+04 7.8e+03 6.0e+01  0  0  0  0  0   1  0  4  2  2     0
MatPtAPNumeric         4 1.0 6.9803e-02 1.1 5.11e+06 1.0 9.7e+03 3.1e+03 4.0e+01  0  0  0  0  0   1  1  2  1  1  9361
MatGetLocalMat         4 1.0 2.3130e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol          4 1.0 2.8250e-02 3.0 0.00e+00 0.0 1.1e+04 8.4e+03 8.0e+00  0  0  0  0  0   0  0  3  2  0     0
MatGetSymTrans         8 1.0 9.6016e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               6 1.0 1.7802e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.4e+01  0  0  0  0  0   0  0  0  0  1     0
Warning -- total time of event greater than time of entire stage -- something is wrong with the timer
KSPSolve              46 1.0 3.9455e+01 1.0 6.08e+09 1.0 2.7e+07 1.8e+03 4.3e+04 80100100 99 99 299171663218181298 19717
PCSetUp                1 1.0 4.1868e-01 1.0 5.36e+06 1.0 3.4e+04 4.6e+03 3.3e+02  1  0  0  0  1   3  2  8  3 10  1638
Warning -- total time of event greater than time of entire stage -- something is wrong with the timer
PCApply              286 1.0 3.6235e+01 1.0 5.72e+09 1.0 2.6e+07 1.6e+03 4.0e+04 73 94 98 88 92 275161662387251217 20218
MGSetup Level 0        1 1.0 1.2280e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   1  0  0  0  0     0
MGSetup Level 1        1 1.0 2.5809e-03 5.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 2        1 1.0 5.2381e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 3        1 1.0 5.4312e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
MGSetup Level 4        1 1.0 1.1581e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 1: MG Apply

VecMDot            17160 1.0 2.7126e+00 2.0 2.93e+08 1.0 0.0e+00 0.0e+00 1.7e+04  5  5  0  0 40   6  5  0  0 43 13846
VecNorm            22880 1.0 1.5497e+00 1.7 1.96e+08 1.0 0.0e+00 0.0e+00 2.3e+04  2  3  0  0 53   3  3  0  0 57 16159
VecScale           22880 1.0 1.6083e-01 1.1 9.78e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 77850
VecCopy             7150 1.0 1.9496e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             41184 1.0 4.9180e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
VecAXPY             9724 1.0 1.8098e-01 1.3 9.48e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 67033
VecAYPX             2860 1.0 5.3519e-02 1.2 1.22e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 29243
VecMAXPY           22880 1.0 4.8374e-01 1.2 4.40e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  8  0  0  0 116471
VecScatterBegin    32032 1.0 2.1183e+00 2.1 0.00e+00 0.0 2.6e+07 1.6e+03 0.0e+00  4  0 98 88  0   5  0100100  0     0
VecScatterEnd      32032 1.0 1.8369e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   4  0  0  0  0     0
VecNormalize       22880 1.0 1.7390e+00 1.6 2.93e+08 1.0 0.0e+00 0.0e+00 2.3e+04  3  5  0  0 53   4  5  0  0 57 21600
MatMult            25168 1.0 1.6588e+01 1.1 2.44e+09 1.0 2.4e+07 1.7e+03 0.0e+00 33 40 89 84  0  44 43 90 96  0 18825
MatMultAdd          2860 1.0 6.8537e-01 1.2 8.25e+07 1.0 1.1e+06 5.3e+02 0.0e+00  1  1  4  1  0   2  1  4  1  0 15414
MatMultTranspose    4004 1.0 1.2716e+00 1.2 1.55e+08 1.0 1.5e+06 6.6e+02 0.0e+00  2  3  6  2  0   3  3  6  2  0 15583
MatSolve           24310 1.0 1.3130e+01 1.1 1.91e+09 1.0 0.0e+00 0.0e+00 0.0e+00 25 31  0  0  0  34 33  0  0  0 18627
MatLUFactorNum         4 1.0 1.0485e-02 1.1 1.86e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 22747
MatILUFactorSym        4 1.0 4.9901e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            4 1.0 1.2398e-05 6.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         4 1.0 5.0456e-0224.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     17160 1.0 3.0344e+00 1.8 5.87e+08 1.0 0.0e+00 0.0e+00 1.7e+04  5 10  0  0 40   7 10  0  0 43 24755
KSPSetUp               4 1.0 5.9605e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            7150 1.0 3.1997e+01 1.0 5.14e+09 1.0 2.1e+07 1.7e+03 4.0e+04 64 85 77 74 92  88 90 78 84100 20558
PCSetUp                4 1.0 7.4455e-02 1.9 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01  0  0  0  0  0   0  0  0  0  0  3203
PCSetUpOnBlocks     5720 1.0 7.9641e-02 1.8 1.86e+06 1.0 0.0e+00 0.0e+00 1.4e+01  0  0  0  0  0   0  0  0  0  0  2995
PCApply            24310 1.0 1.4260e+01 1.1 1.91e+09 1.0 0.0e+00 0.0e+00 0.0e+00 27 31  0  0  0  37 33  0  0  0 17151
MGSmooth Level 0    1430 1.0 2.0421e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   5  0  0  0  0     0
MGSmooth Level 1    2288 1.0 1.2168e+00 1.1 4.82e+07 1.0 8.5e+06 1.9e+02 1.6e+04  2  1 32  3 37   3  1 32  4 40  5072
MGResid Level 1     1144 1.0 5.2052e-02 1.3 3.95e+06 1.0 1.2e+06 1.9e+02 0.0e+00  0  0  4  0  0   0  0  4  1  0  9722
MGInterp Level 1    2860 1.0 1.5052e-01 3.6 1.22e+06 1.0 1.1e+06 6.4e+01 0.0e+00  0  0  4  0  0   0  0  4  0  0  1035
MGSmooth Level 2    1716 1.0 1.9140e+00 1.0 3.39e+08 1.0 6.4e+06 6.4e+02 1.2e+04  4  6 24  9 28   5  6 24 10 30 22666
MGResid Level 2      858 1.0 1.1919e-01 1.5 2.37e+07 1.0 8.8e+05 6.4e+02 0.0e+00  0  0  3  1  0   0  0  3  1  0 25474
MGInterp Level 2    2288 1.0 1.6201e-01 2.2 7.76e+06 1.0 8.8e+05 2.1e+02 0.0e+00  0  0  3  0  0   0  0  3  0  0  6132
MGSmooth Level 3    1144 1.0 1.0630e+01 1.0 1.98e+09 1.0 4.4e+06 2.3e+03 8.0e+03 21 33 16 21 18  29 35 17 24 20 23810
MGResid Level 3      572 1.0 7.5736e-01 1.1 1.27e+08 1.0 5.9e+05 2.3e+03 0.0e+00  1  2  2  3  0   2  2  2  3  0 21382
MGInterp Level 3    1716 1.0 3.9544e-01 1.4 4.63e+07 1.0 6.6e+05 7.7e+02 0.0e+00  1  1  2  1  0   1  1  2  1  0 14978
MGSmooth Level 4     572 1.0 1.6467e+01 1.0 2.77e+09 1.0 1.2e+06 1.6e+04 4.0e+03 33 46  4 41  9  45 48  4 46 10 21568
MGResid Level 4      286 1.0 1.0729e+00 1.1 1.31e+08 1.0 1.5e+05 1.6e+04 0.0e+00  2  2  1  5  0   3  2  1  6  0 15654
MGInterp Level 4    1144 1.0 1.9588e+00 1.1 2.44e+08 1.0 4.4e+05 2.9e+03 0.0e+00  4  4  2  3  0   5  4  2  3  0 15922
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   841            853    211053336     0
      Vector Scatter    19             19        22572     0
              Matrix    38             42     19508928     0
   Matrix Null Space     1              1          652     0
    Distributed Mesh     5              5       830792     0
     Bipartite Graph    10             10         8560     0
           Index Set    47             59       844496     0
   IS L to G Mapping     5              5       405756     0
       Krylov Solver    11             11        84080     0
     DMKSP interface     3              3         2088     0
      Preconditioner    11             11        11864     0
              Viewer   185            184       144256     0

--- Event Stage 1: MG Apply

              Vector    12              0            0     0
              Matrix     4              0            0     0
           Index Set    14              2         1792     0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 1.37806e-05
Average time for zero size MPI_Send(): 2.67848e-06
#PETSc Option Table entries:
-ksp_atol 1e-9
-ksp_max_it 30
-ksp_monitor_true_residual
-ksp_type richardson
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure run at: Wed Aug 28 23:25:43 2013
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=0 --known-mpi-c-double-complex=0 --with-batch="1 " --known-mpi-shared="0 " --known-memcmp-ok  --with-blas-lapack-lib="-L/opt/acml/5.3.0/pgi64/lib  -lacml" --COPTFLAGS="-O3 -fastsse" --FOPTFLAGS="-O3 -fastsse" --CXXOPTFLAGS="-O3 -fastsse" --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries=0 --with-dynamic-loading=0 --with-mpi-compilers="1 " --known-mpi-shared-libraries=0 --with-64-bit-indices --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " --with-cc=cc --with-cxx=CC --with-fc=ftn PETSC_ARCH=interlagos-64idx-pgi-opt
-----------------------------------------
Libraries compiled on Wed Aug 28 23:25:43 2013 on h2ologin3 
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2
Using PETSc arch: interlagos-64idx-pgi-opt
-----------------------------------------

Using C compiler: cc  -O3 -fastsse  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn  -O3 -fastsse   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/include -I/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/include -I/opt/cray/udreg/2.3.2-1.0402.7311.2.1.gem/include -I/opt/cray/ugni/5.0-1.0402.7128.7.6.gem/include -I/opt/cray/pmi/4.0.1-1.0000.9421.73.3.gem/include -I/opt/cray/dmapp/4.0.1-1.0402.7439.5.1.gem/include -I/opt/cray/gni-headers/2.1-1.0402.7082.6.2.gem/include -I/opt/cray/xpmem/0.1-2.0402.44035.2.1.gem/include -I/opt/cray/rca/1.0.0-2.0402.42153.2.106.gem/include -I/opt/cray-hss-devel/7.0.0/include -I/opt/cray/krca/1.0.0-2.0402.42157.2.94.gem/include -I/opt/cray/mpt/6.0.1/gni/mpich2-pgi/121/include -I/opt/acml/5.3.0/pgi64_fma4/include -I/opt/cray/libsci/12.1.01/pgi/121/interlagos/include -I/opt/fftw/3.3.0.3/interlagos/include -I/usr/include/alps -I/opt/pgi/13.6.0/linux86-64/13.6/include -I/opt/cray/xe-sysroot/4.2.24/usr/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -L/mnt/a/u/sciteam/mrosso/LIBS/petsc-3.4.2/interlagos-64idx-pgi-opt/lib -lsuperlu_dist_3.3 -L/opt/acml/5.3.0/pgi64/lib -lacml -lpthread -lparmetis -lmetis -ldl 
-----------------------------------------

#PETSc Option Table entries:
-ksp_atol 1e-9
-ksp_max_it 30
-ksp_monitor_true_residual
-ksp_type richardson
-ksp_view
-log_summary
-mg_coarse_pc_factor_mat_solver_package superlu_dist
-mg_coarse_pc_type lu
-mg_levels_ksp_max_it 3
-mg_levels_ksp_type gmres
-mg_levels_pc_type bjacobi
-options_left
-pc_mg_galerkin
-pc_mg_levels 5
-pc_mg_log
-pc_mg_type full
-pc_type mg
#End of PETSc Option Table entries
There are no unused options.

