[petsc-users] Function evaluation slowness ?

Timothée Nicolas timothee.nicolas at gmail.com
Tue Aug 25 01:21:24 CDT 2015


Here is the log summary (attached). At the beginning are personal prints,
you can skip. I seem to have a memory crash in the present state after
typically 45 iterations (that's why I used 40 here), the log summary
indicates some creations without destruction of Petsc objects (I will fix
this immediately), that may cause the memory crash, but I don't think it's
the cause of the slow function evaluations.

The log_summary is consistent with 0.7s per function evaluation
(4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately
the same amount of time (is it normal ?). And the other long operation is
VecScatterEnd. I assume it is the time used in process communications ? In
which case I suppose it is normal that it takes a significant amount of
time.

So this ~10 times increase does not look normal right ?

Best

Timothee NICOLAS


2015-08-25 14:56 GMT+09:00 Barry Smith <bsmith at mcs.anl.gov>:

>
> > On Aug 25, 2015, at 12:45 AM, Timothée Nicolas <
> timothee.nicolas at gmail.com> wrote:
> >
> > Hi,
> >
> > I am testing PETSc on the supercomputer where I used to run my explicit
> MHD code. For my tests I use 256 processes on a problem of size 128*128*640
> = 10485760, that is, 40960 grid points per process, and 8 degrees of
> freedom (or physical fields). The explicit code was using Runge-Kutta 4 for
> the time scheme, which means 4 function evaluation per time step (plus one
> operation to put everything together, but let's forget this one).
> >
> > I could thus easily determine that the typical time required for a
> function evaluation was of the order of 50 ms.
> >
> > Now with the implicit Newton-Krylov solver written in PETSc, in the
> present state where for now I have not implemented any Jacobian or
> preconditioner whatsoever (so I run with -snes_mf), I measure a typical
> time between two time steps of between 5 and 20 seconds, and the number of
> function evaluations for each time step obtained with
> SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of
> course)
> >
> > This means a time per function evaluation of about 0.5 to 1 second, that
> is, 10 to 20 times slower.
> >
> > So I have some questions about this.
> >
> > 1. First does SNESGetNumberFunctionEvals take into account the function
> evaluations required to evaluate the Jacobian when -snes_mf is used, as
> well as the operations required by the GMRES (Krylov) method ? If it were
> the case, I would somehow intuitively expect a number larger than 17, which
> could explain the increase in time.
>
> PetscErrorCode  SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs)
> {
>   *nfuncs = snes->nfuncs;
> }
>
> PetscErrorCode  SNESComputeFunction(SNES snes,Vec x,Vec y)
> {
> ...
>   snes->nfuncs++;
> }
>
> PetscErrorCode  MatCreateSNESMF(SNES snes,Mat *J)
> {
> .....
>   if (snes->pc && snes->pcside == PC_LEFT) {
>     ierr = MatMFFDSetFunction(*J,(PetscErrorCode
> (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr);
>   } else {
>     ierr = MatMFFDSetFunction(*J,(PetscErrorCode
> (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr);
>   }
> }
>
>   So, yes I would expect all the function evaluations needed for the
> matrix-free Jacobian matrix vector product to be counted. You can also look
> at the number of GMRES Krylov iterations it took (which should have one
> multiply per iteration) to double check that the numbers make sense.
>
>   What does your -log_summary output look like? One thing that GMRES does
> is it introduces a global reduction with each multiple (hence a barrier
> across all your processes) on some systems this can be deadly.
>
>   Barry
>
>
> >
> > 2. In any case, I thought that all things considered, the function
> evaluation would be the most time consuming part of a Newton-Krylov solver,
> am I completely wrong about that ? Is the 10-20 factor legit ?
> >
> > I realize of course that preconditioning should make all this smoother,
> in particular allowing larger time steps, but here I am just concerned
> about the sheer Function evaluation time.
> >
> > Best regards
> >
> > Timothee NICOLAS
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150825/e0d8ea11/attachment-0001.html>
-------------- next part --------------
 
 This is an implicit MHD code based on the MIPS code 
 
 Setting all options 
 
 Start: Reading HINT2 equilibrium file 
 
 DATA: lr,lz,lphi=         128         128         640
 lsymmetry=           1
 r_min,r_max=   2.70000000000000        4.60000000000000     
 z_min,z_max= -0.950000000000000       0.950000000000000     
 phi_min,phi_max=  0.000000000000000E+000   6.28318530717959     
 dr,dz,dphi=  1.496062992125984E-002  1.496062992125984E-002
  9.817477042468103E-003
 pmax=  1.135504809500237E-002
 bmax=   2.98086676166910     
 
 End: Reading HINT2 equilibrium file 
 
 Creating nonlinear solver, getting geometrical info, and setting vectors 
 
 Allocating arrays for which it is required 
 Masks definition 
 Major radius and vacuum definition 
 Initializing PETSc Vecs with equilibrium values 
 Set the initial force local vectors (used to enforce the equilibrium) 
 Add a random perturbation to the velocity 
 Entering the main MHD Loop 
 
 Iteration number =    1 
 Time (tau_A) = 1.0000E-02 
  0 SNES Function norm 5.382589763410e-05 
  1 SNES Function norm 4.917642384947e-11 
  2 SNES Function norm 4.187109891461e-16 
 Kinetic Energy =  9.0028E-16 
 Magnetic Energy =  1.3544E-16 
 Total CPU time since PetscInitialize: 7.7581E+00 
 CPU time used for SNESSolve: 5.0903E-01 
 Number of linear iterations :   11 
 Number of function evaluations :   16 
 
 
 Iteration number =    2 
 Time (tau_A) = 2.0000E-02 
  0 SNES Function norm 8.317717709972e-16 
  1 SNES Function norm 1.373112118906e-16 
 Kinetic Energy =  8.9656E-16 
 Magnetic Energy =  5.4177E-16 
 Total CPU time since PetscInitialize: 5.0395E+01 
 CPU time used for SNESSolve: 3.0864E+01 
 Number of linear iterations :    6 
 Number of function evaluations :    9 
 Too few linear iterations; Time step increased to dt =  1.3333E-02 
 
 
 Iteration number =    3 
 Time (tau_A) = 3.3333E-02 
  0 SNES Function norm 1.732338962433e-05 
  1 SNES Function norm 3.798031074240e-11 
  2 SNES Function norm 5.941618837381e-16 
 Kinetic Energy =  8.8707E-16 
 Magnetic Energy =  1.4783E-15 
 Total CPU time since PetscInitialize: 6.2962E+01 
 CPU time used for SNESSolve: 8.4275E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =    4 
 Time (tau_A) = 4.6667E-02 
  0 SNES Function norm 8.650290226831e-07 
  1 SNES Function norm 3.159589761232e-12 
  2 SNES Function norm 5.559310443305e-16 
 Kinetic Energy =  8.7490E-16 
 Magnetic Energy =  2.8512E-15 
 Total CPU time since PetscInitialize: 7.5298E+01 
 CPU time used for SNESSolve: 7.2959E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =    5 
 Time (tau_A) = 6.0000E-02 
  0 SNES Function norm 1.814549199007e-06 
  1 SNES Function norm 6.550542684026e-12 
  2 SNES Function norm 5.561417252951e-16 
 Kinetic Energy =  8.5968E-16 
 Magnetic Energy =  4.6147E-15 
 Total CPU time since PetscInitialize: 8.6579E+01 
 CPU time used for SNESSolve: 8.3290E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =    6 
 Time (tau_A) = 7.3333E-02 
  0 SNES Function norm 5.143158573116e-07 
  1 SNES Function norm 1.854573359134e-12 
  2 SNES Function norm 5.555619916593e-16 
 Kinetic Energy =  8.4209E-16 
 Magnetic Energy =  6.7217E-15 
 Total CPU time since PetscInitialize: 9.8827E+01 
 CPU time used for SNESSolve: 9.6296E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =    7 
 Time (tau_A) = 8.6667E-02 
  0 SNES Function norm 1.241769129665e-07 
  1 SNES Function norm 2.229921948896e-13 
  2 SNES Function norm 5.547572310422e-16 
 Kinetic Energy =  8.2295E-16 
 Magnetic Energy =  9.1167E-15 
 Total CPU time since PetscInitialize: 1.1428E+02 
 CPU time used for SNESSolve: 1.1251E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =    8 
 Time (tau_A) = 1.0000E-01 
  0 SNES Function norm 4.127596671841e-08 
  1 SNES Function norm 9.722821486344e-14 
  2 SNES Function norm 5.547721956164e-16 
 Kinetic Energy =  8.0310E-16 
 Magnetic Energy =  1.1740E-14 
 Total CPU time since PetscInitialize: 1.2603E+02 
 CPU time used for SNESSolve: 3.1856E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =    9 
 Time (tau_A) = 1.1333E-01 
  0 SNES Function norm 6.526056598290e-08 
  1 SNES Function norm 1.551059106556e-13 
  2 SNES Function norm 5.554664894406e-16 
 Kinetic Energy =  7.8340E-16 
 Magnetic Energy =  1.4531E-14 
 Total CPU time since PetscInitialize: 1.3916E+02 
 CPU time used for SNESSolve: 1.0360E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   10 
 Time (tau_A) = 1.2667E-01 
  0 SNES Function norm 7.216736548862e-08 
  1 SNES Function norm 1.709747035544e-13 
  2 SNES Function norm 5.551116565705e-16 
 Kinetic Energy =  7.6463E-16 
 Magnetic Energy =  1.7432E-14 
 Total CPU time since PetscInitialize: 1.5486E+02 
 CPU time used for SNESSolve: 1.4236E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   11 
 Time (tau_A) = 1.4000E-01 
  0 SNES Function norm 7.957568936931e-08 
  1 SNES Function norm 7.164583362638e-13 
  2 SNES Function norm 5.547681147827e-16 
 Kinetic Energy =  7.4749E-16 
 Magnetic Energy =  2.0389E-14 
 Total CPU time since PetscInitialize: 1.6355E+02 
 CPU time used for SNESSolve: 5.0159E+00 
 Number of linear iterations :   11 
 Number of function evaluations :   16 
 
 
 Iteration number =   12 
 Time (tau_A) = 1.5333E-01 
  0 SNES Function norm 8.580881227555e-08 
  1 SNES Function norm 7.177441407594e-13 
  2 SNES Function norm 5.549122251796e-16 
 Kinetic Energy =  7.3253E-16 
 Magnetic Energy =  2.3356E-14 
 Total CPU time since PetscInitialize: 1.7703E+02 
 CPU time used for SNESSolve: 8.7550E+00 
 Number of linear iterations :   11 
 Number of function evaluations :   16 
 
 
 Iteration number =   13 
 Time (tau_A) = 1.6667E-01 
  0 SNES Function norm 9.036148523559e-08 
  1 SNES Function norm 7.596864452710e-13 
  2 SNES Function norm 5.552150394126e-16 
 Kinetic Energy =  7.2017E-16 
 Magnetic Energy =  2.6296E-14 
 Total CPU time since PetscInitialize: 1.9497E+02 
 CPU time used for SNESSolve: 1.0766E+01 
 Number of linear iterations :   11 
 Number of function evaluations :   16 
 
 
 Iteration number =   14 
 Time (tau_A) = 1.8000E-01 
  0 SNES Function norm 9.318522525551e-08 
  1 SNES Function norm 8.310283593293e-13 
  2 SNES Function norm 5.549000838284e-16 
 Kinetic Energy =  7.1065E-16 
 Magnetic Energy =  2.9180E-14 
 Total CPU time since PetscInitialize: 2.0960E+02 
 CPU time used for SNESSolve: 8.9322E+00 
 Number of linear iterations :   11 
 Number of function evaluations :   16 
 
 
 Iteration number =   15 
 Time (tau_A) = 1.9333E-01 
  0 SNES Function norm 9.426477938676e-08 
  1 SNES Function norm 9.230910004645e-13 
  2 SNES Function norm 5.545358678033e-16 
 Kinetic Energy =  7.0402E-16 
 Magnetic Energy =  3.1992E-14 
 Total CPU time since PetscInitialize: 2.2222E+02 
 CPU time used for SNESSolve: 3.2721E+00 
 Number of linear iterations :   11 
 Number of function evaluations :   16 
 
 
 Iteration number =   16 
 Time (tau_A) = 2.0667E-01 
  0 SNES Function norm 9.362326845768e-08 
  1 SNES Function norm 1.296102793360e-13 
  2 SNES Function norm 5.547628931478e-16 
 Kinetic Energy =  7.0019E-16 
 Magnetic Energy =  3.4725E-14 
 Total CPU time since PetscInitialize: 2.4133E+02 
 CPU time used for SNESSolve: 4.5367E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   17 
 Time (tau_A) = 2.2000E-01 
  0 SNES Function norm 9.133250113813e-08 
  1 SNES Function norm 1.149831225960e-13 
  2 SNES Function norm 5.548205596917e-16 
 Kinetic Energy =  6.9892E-16 
 Magnetic Energy =  3.7381E-14 
 Total CPU time since PetscInitialize: 2.5612E+02 
 CPU time used for SNESSolve: 7.0432E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   18 
 Time (tau_A) = 2.3333E-01 
  0 SNES Function norm 8.751516705490e-08 
  1 SNES Function norm 1.044612716823e-13 
  2 SNES Function norm 5.546955425427e-16 
 Kinetic Energy =  6.9986E-16 
 Magnetic Energy =  3.9970E-14 
 Total CPU time since PetscInitialize: 2.7746E+02 
 CPU time used for SNESSolve: 1.7779E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   19 
 Time (tau_A) = 2.4667E-01 
  0 SNES Function norm 8.234829805103e-08 
  1 SNES Function norm 9.821327679226e-14 
  2 SNES Function norm 5.548194234735e-16 
 Kinetic Energy =  7.0258E-16 
 Magnetic Energy =  4.2510E-14 
 Total CPU time since PetscInitialize: 2.8796E+02 
 CPU time used for SNESSolve: 7.1363E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   20 
 Time (tau_A) = 2.6000E-01 
  0 SNES Function norm 7.607020340067e-08 
  1 SNES Function norm 9.607454720992e-14 
  2 SNES Function norm 5.546700693492e-16 
 Kinetic Energy =  7.0658E-16 
 Magnetic Energy =  4.5021E-14 
 Total CPU time since PetscInitialize: 3.0141E+02 
 CPU time used for SNESSolve: 9.5897E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   21 
 Time (tau_A) = 2.7333E-01 
  0 SNES Function norm 6.899369214087e-08 
  1 SNES Function norm 9.762527001937e-14 
  2 SNES Function norm 5.546534365057e-16 
 Kinetic Energy =  7.1138E-16 
 Magnetic Energy =  4.7526E-14 
 Total CPU time since PetscInitialize: 3.1839E+02 
 CPU time used for SNESSolve: 1.2120E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   22 
 Time (tau_A) = 2.8667E-01 
  0 SNES Function norm 6.153013990112e-08 
  1 SNES Function norm 1.022809264903e-13 
  2 SNES Function norm 5.559191241468e-16 
 Kinetic Energy =  7.1650E-16 
 Magnetic Energy =  5.0049E-14 
 Total CPU time since PetscInitialize: 3.3729E+02 
 CPU time used for SNESSolve: 1.4010E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   23 
 Time (tau_A) = 3.0000E-01 
  0 SNES Function norm 5.422883432610e-08 
  1 SNES Function norm 1.093821278111e-13 
  2 SNES Function norm 5.545153520497e-16 
 Kinetic Energy =  7.2152E-16 
 Magnetic Energy =  5.2614E-14 
 Total CPU time since PetscInitialize: 3.4506E+02 
 CPU time used for SNESSolve: 3.1828E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   24 
 Time (tau_A) = 3.1333E-01 
  0 SNES Function norm 4.782401973928e-08 
  1 SNES Function norm 1.182852644926e-13 
  2 SNES Function norm 5.545048290555e-16 
 Kinetic Energy =  7.2610E-16 
 Magnetic Energy =  5.5240E-14 
 Total CPU time since PetscInitialize: 3.5751E+02 
 CPU time used for SNESSolve: 3.4274E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   25 
 Time (tau_A) = 3.2667E-01 
  0 SNES Function norm 4.322955973001e-08 
  1 SNES Function norm 1.283072300260e-13 
  2 SNES Function norm 5.545681547284e-16 
 Kinetic Energy =  7.2996E-16 
 Magnetic Energy =  5.7944E-14 
 Total CPU time since PetscInitialize: 3.7469E+02 
 CPU time used for SNESSolve: 2.5263E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   26 
 Time (tau_A) = 3.4000E-01 
  0 SNES Function norm 4.132314115916e-08 
  1 SNES Function norm 1.386969354039e-13 
  2 SNES Function norm 5.540520147662e-16 
 Kinetic Energy =  7.3297E-16 
 Magnetic Energy =  6.0737E-14 
 Total CPU time since PetscInitialize: 3.8264E+02 
 CPU time used for SNESSolve: 3.1395E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   27 
 Time (tau_A) = 3.5333E-01 
  0 SNES Function norm 4.245684761661e-08 
  1 SNES Function norm 1.485504132442e-13 
  2 SNES Function norm 5.545487475751e-16 
 Kinetic Energy =  7.3503E-16 
 Magnetic Energy =  6.3627E-14 
 Total CPU time since PetscInitialize: 3.9247E+02 
 CPU time used for SNESSolve: 5.9592E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   28 
 Time (tau_A) = 3.6667E-01 
  0 SNES Function norm 4.616289783185e-08 
  1 SNES Function norm 1.567803392197e-13 
  2 SNES Function norm 5.548329671559e-16 
 Kinetic Energy =  7.3618E-16 
 Magnetic Energy =  6.6616E-14 
 Total CPU time since PetscInitialize: 4.0173E+02 
 CPU time used for SNESSolve: 5.8294E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   29 
 Time (tau_A) = 3.8000E-01 
  0 SNES Function norm 5.149150954830e-08 
  1 SNES Function norm 1.622328092002e-13 
  2 SNES Function norm 5.545725405186e-16 
 Kinetic Energy =  7.3651E-16 
 Magnetic Energy =  6.9706E-14 
 Total CPU time since PetscInitialize: 4.1289E+02 
 CPU time used for SNESSolve: 7.4448E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   30 
 Time (tau_A) = 3.9333E-01 
  0 SNES Function norm 5.751421272236e-08 
  1 SNES Function norm 1.639703075993e-13 
  2 SNES Function norm 5.551145331563e-16 
 Kinetic Energy =  7.3617E-16 
 Magnetic Energy =  7.2890E-14 
 Total CPU time since PetscInitialize: 4.2473E+02 
 CPU time used for SNESSolve: 9.8094E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   31 
 Time (tau_A) = 4.0667E-01 
  0 SNES Function norm 6.353011943021e-08 
  1 SNES Function norm 1.616610594001e-13 
  2 SNES Function norm 5.542606202109e-16 
 Kinetic Energy =  7.3536E-16 
 Magnetic Energy =  7.6164E-14 
 Total CPU time since PetscInitialize: 4.3767E+02 
 CPU time used for SNESSolve: 9.9831E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   32 
 Time (tau_A) = 4.2000E-01 
  0 SNES Function norm 6.905745342631e-08 
  1 SNES Function norm 1.557783742592e-13 
  2 SNES Function norm 5.544338434073e-16 
 Kinetic Energy =  7.3427E-16 
 Magnetic Energy =  7.9520E-14 
 Total CPU time since PetscInitialize: 4.6580E+02 
 CPU time used for SNESSolve: 2.7991E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   33 
 Time (tau_A) = 4.3333E-01 
  0 SNES Function norm 7.377639093458e-08 
  1 SNES Function norm 1.474466471932e-13 
  2 SNES Function norm 5.543376070989e-16 
 Kinetic Energy =  7.3312E-16 
 Magnetic Energy =  8.2950E-14 
 Total CPU time since PetscInitialize: 4.8419E+02 
 CPU time used for SNESSolve: 1.4762E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   34 
 Time (tau_A) = 4.4667E-01 
  0 SNES Function norm 7.748319943328e-08 
  1 SNES Function norm 1.381088301393e-13 
  2 SNES Function norm 5.550610629314e-16 
 Kinetic Energy =  7.3209E-16 
 Magnetic Energy =  8.6449E-14 
 Total CPU time since PetscInitialize: 4.9802E+02 
 CPU time used for SNESSolve: 1.3800E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   35 
 Time (tau_A) = 4.6000E-01 
  0 SNES Function norm 8.006178457271e-08 
  1 SNES Function norm 1.290725563320e-13 
  2 SNES Function norm 5.542186770005e-16 
 Kinetic Energy =  7.3134E-16 
 Magnetic Energy =  9.0010E-14 
 Total CPU time since PetscInitialize: 5.2157E+02 
 CPU time used for SNESSolve: 1.5918E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   36 
 Time (tau_A) = 4.7333E-01 
  0 SNES Function norm 8.146728382443e-08 
  1 SNES Function norm 1.213819113579e-13 
  2 SNES Function norm 5.543903059087e-16 
 Kinetic Energy =  7.3097E-16 
 Magnetic Energy =  9.3631E-14 
 Total CPU time since PetscInitialize: 5.3299E+02 
 CPU time used for SNESSolve: 7.5341E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   37 
 Time (tau_A) = 4.8667E-01 
  0 SNES Function norm 8.171674223552e-08 
  1 SNES Function norm 1.157134150395e-13 
  2 SNES Function norm 5.542672888432e-16 
 Kinetic Energy =  7.3106E-16 
 Magnetic Energy =  9.7311E-14 
 Total CPU time since PetscInitialize: 5.5695E+02 
 CPU time used for SNESSolve: 1.8279E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   38 
 Time (tau_A) = 5.0000E-01 
  0 SNES Function norm 8.088378896911e-08 
  1 SNES Function norm 1.124615100310e-13 
  2 SNES Function norm 5.544771436497e-16 
 Kinetic Energy =  7.3164E-16 
 Magnetic Energy =  1.0105E-13 
 Total CPU time since PetscInitialize: 5.7506E+02 
 CPU time used for SNESSolve: 4.0146E+00 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   39 
 Time (tau_A) = 5.1333E-01 
  0 SNES Function norm 7.909554806664e-08 
  1 SNES Function norm 1.117368923673e-13 
  2 SNES Function norm 5.552869590808e-16 
 Kinetic Energy =  7.3268E-16 
 Magnetic Energy =  1.0485E-13 
 Total CPU time since PetscInitialize: 5.9949E+02 
 CPU time used for SNESSolve: 1.7531E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Iteration number =   40 
 Time (tau_A) = 5.2667E-01 
  0 SNES Function norm 7.653064797874e-08 
  1 SNES Function norm 1.134332671074e-13 
  2 SNES Function norm 5.551034395178e-16 
 Kinetic Energy =  7.3414E-16 
 Magnetic Energy =  1.0872E-13 
 Total CPU time since PetscInitialize: 6.2212E+02 
 CPU time used for SNESSolve: 1.4068E+01 
 Number of linear iterations :   12 
 Number of function evaluations :   17 
 
 
 Exiting the main MHD Loop 
 
 Deallocating remaining arrays 
 
 Destroying remaining Petsc elements 
 
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out.local on a arch-linux2-cxx-opt named helios1497 with 256 processors, by tnicolas Tue Aug 25 14:55:05 2015
Using Petsc Release Version 3.6.0, Jun, 09, 2015 

                         Max       Max/Min        Avg      Total 
Time (sec):           6.518e+02      1.00002   6.518e+02
Objects:              1.666e+03      1.00000   1.666e+03
Flops:                4.582e+09      1.00000   4.582e+09  1.173e+12
Flops/sec:            7.029e+06      1.00002   7.029e+06  1.799e+09
MPI Messages:         8.081e+03      1.49454   6.746e+03  1.727e+06
MPI Message Lengths:  7.891e+08      3.82180   3.905e+04  6.744e+10
MPI Reductions:       3.368e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 6.5182e+02 100.0%  1.1729e+12 100.0%  1.727e+06 100.0%  3.905e+04      100.0%  3.367e+03 100.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve             40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80  46 93 99 95 80  2187
SNESFunctionEval     666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40  45 13 99 95 40   299
SNESJacobianEval      79 1.0 2.8245e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
SNESLineSearch        79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02  1 11 23 23 19   1 11 23 23 19 33068
VecView                2 1.0 3.2757e+01 6.9 0.00e+00 0.0 1.6e+04 1.9e+05 9.0e+00  3  0  1  5  0   3  0  1  5  0     0
VecDot                79 1.0 2.1557e-01 5.1 4.53e+07 1.0 0.0e+00 0.0e+00 7.9e+01  0  1  0  0  2   0  1  0  0  2 53797
VecMDot              468 1.0 2.1886e+00 2.4 9.31e+08 1.0 0.0e+00 0.0e+00 4.7e+02  0 20  0  0 14   0 20  0  0 14 108862
VecNorm              781 1.0 6.5641e-01 1.3 4.48e+08 1.0 0.0e+00 0.0e+00 7.8e+02  0 10  0  0 23   0 10  0  0 23 174663
VecScale            1202 1.0 5.7695e-01 2.1 3.45e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  8  0  0  0   0  8  0  0  0 152921
VecCopy             1092 1.0 1.3964e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               827 1.0 1.2017e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1661 1.0 2.1384e+00 1.3 9.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 21  0  0  0   0 21  0  0  0 114027
VecWAXPY            1365 1.0 2.9492e+00 1.1 5.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 12  0  0  0   0 12  0  0  0 47586
VecMAXPY             547 1.0 1.7110e+00 1.0 1.20e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0 26  0  0  0   0 26  0  0  0 179407
VecAssemblyBegin       5 1.0 2.7040e-0213.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         5 1.0 8.6308e-0511.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult      80 1.0 1.4662e-01 1.2 2.29e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 40050
VecScatterBegin     1335 1.0 2.4852e+00 1.2 0.00e+00 0.0 1.7e+06 3.8e+04 0.0e+00  0  0 99 97  0   0  0 99 97  0     0
VecScatterEnd       1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31  0  0  0  0  31  0  0  0  0     0
VecReduceArith       158 1.0 7.9838e-02 1.1 9.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 290521
VecReduceComm         79 1.0 1.2613e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.9e+01  0  0  0  0  2   0  0  0  0  2     0
VecNormalize         547 1.0 4.6664e-01 1.2 4.25e+08 1.0 0.0e+00 0.0e+00 4.7e+02  0  9  0  0 14   0  9  0  0 14 233267
MatMult MF           547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03  2 28 81 78 34   2 28 81 78 34 25962
MatMult              547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03  2 28 81 78 34   2 28 81 78 34 25960
MatAssemblyBegin      79 1.0 1.8835e-0519.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd        79 1.0 2.2821e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog       468 1.0 3.4735e+00 1.6 1.86e+09 1.0 0.0e+00 0.0e+00 4.7e+02  0 41  0  0 14   0 41  0  0 14 137185
KSPSetUp              79 1.0 7.9496e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve              79 1.0 1.4808e+01 1.0 3.69e+09 1.0 1.2e+06 3.8e+04 1.9e+03  2 81 69 67 58   2 81 69 67 58 63857
PCSetUp               79 1.0 2.1935e-05 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply              547 1.0 6.8284e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

                SNES     1              1         1332     0
      SNESLineSearch     1              1          864     0
              DMSNES     2              1          664     0
              Vector  1630            290    654183152     0
      Vector Scatter     3              1         1600     0
             MatMFFD     1              1          784     0
              Matrix     1              1         2304     0
    Distributed Mesh     3              2         9456     0
Star Forest Bipartite Graph     6              4         3312     0
     Discrete System     3              2         1696     0
           Index Set     6              6       187248     0
   IS L to G Mapping     2              1          948     0
       Krylov Solver     1              1        18360     0
     DMKSP interface     1              0            0     0
      Preconditioner     1              1          880     0
              Viewer     3              2         1536     0
         PetscRandom     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 9.39369e-06
Average time for zero size MPI_Send(): 0.000582926
#PETSc Option Table entries:
-dt 0.01
-log_summary
-nts 40
-skip 500
-snes_mf
-snes_monitor
-time_limit 35950
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --prefix=/csc/softs/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real --with-clanguage=cxx --with-debugging=0 --with-x=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-fortran --known-mpi-shared-libraries=1 --with-scalar-type=real --with-precision=double --CFLAGS="-g -O3 -mavx -mkl" --CXXFLAGS="-g -O3 -mavx -mkl" --FFLAGS="-g -O3 -mavx -mkl"
-----------------------------------------
Libraries compiled on Mon Jun 22 11:05:39 2015 on helios85 
Machine characteristics: Linux-2.6.32-504.16.2.el6.Bull.74.x86_64-x86_64-with-redhat-6.4-Santiago
Using PETSc directory: /csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0
Using PETSc arch: arch-linux2-cxx-opt
-----------------------------------------

Using C compiler: mpicxx -g -O3 -mavx -mkl   -fPIC   ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -g -O3 -mavx -mkl -fPIC   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/include -I/opt/mpi/bullxmpi/1.2.8.2/include
-----------------------------------------

Using C linker: mpicxx
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/lib -L/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/lib -lpetsc -lhwloc -lxml2 -lssl -lcrypto -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -lmpi_f90 -lmpi_f77 -lm -lifport -lifcore -lm -lmpi_cxx -ldl -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -lmpi -lnuma -lrt -lnsl -lutil -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -limf -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -ldl 
-----------------------------------------


More information about the petsc-users mailing list