[petsc-users] Investigate parallel code to improve parallelism

TAY wee-beng zonexo at gmail.com
Tue Mar 1 02:03:15 CST 2016


On 29/2/2016 11:21 AM, Barry Smith wrote:
>> On Feb 28, 2016, at 8:26 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>
>>
>> On 29/2/2016 9:41 AM, Barry Smith wrote:
>>>> On Feb 28, 2016, at 7:08 PM, TAY Wee Beng <zonexo at gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've attached the files for x cells running y procs. hypre is called natively, so I'm not sure if PETSc catches it in its logging.
>>>    So you are directly creating hypre matrices and calling the hypre solver in another piece of your code?
>> Yes, because I'm using the simple struct (structured grid) layout for Cartesian grids. It's about twice as fast compared to BoomerAMG.
>     Understood
>
>> I can't create a PETSc matrix and use the hypre struct layout, right?
>>>     In the PETSc part of the code, if you compare 2x_y to x_y, you see that doubling the problem size resulted in 2.2 times as much time for the KSPSolve. Most of this large increase is due to the increased time in the scatter, which went up by a factor of 150/54 = 2.78, while the amount of data transferred only increased by 1e5/6.4e4 = 1.5625. Normally I would not expect to see this behavior, nor such a large increase in the communication time.
>>>
>>> Barry
>>>
>>>
>>>
>> So ideally it should be 2 instead of 2.2, is that so?
>    Ideally
>
>> May I know where you are looking? I can't find the numbers.
>    The column labeled Avg len gives the average message length, which increases from 6.4e4 to 1e5, while the max time increased by a factor of 2.78 (I took the sum of the VecScatterBegin and VecScatterEnd rows).
>
>> So where do you think the error comes from?
>    It is not really an error; it is just taking more time than one would hope.
>> Or how can I troubleshoot further?
>
>     If you run the same problem several times, how different are the timings between runs?
Hi,

I have re-run x_y and 2x_y and attached the files with the suffix _2 
for the 2nd run. The timings are essentially the same as before.

Should I try running on another cluster?

I also tried running the same problem with more cells and more time 
steps (to reduce start-up effects) on another cluster, but I forgot to 
run it with -log_summary. Anyway, the results show:

1. Using 1.5 million cells with 48 procs and 3M with 96 procs took 
65 min and 69 min. Using the weak scaling formula I attached earlier, 
this gives about 88% efficiency.

2. Using 3 million cells with 48 procs and 6M with 96 procs took 
114 min and 121 min. Using the same formula, this gives about 88% 
efficiency.

3. Using 3.75 million cells with 48 procs and 7.5M with 96 procs took 
134 min and 143 min. Using the same formula, this gives about 87% 
efficiency.

4. Using 4.5 million cells with 48 procs and 9M with 96 procs took 
160 min and 176 min (extrapolated). Using the same formula, this gives 
about 80% efficiency.

So it seems that I should run with 3.75 million cells on 48 procs and 
scale along this ratio; beyond that, my efficiency decreases. Is that 
so? Maybe I should also run with -log_summary to get a better estimate...
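
As a rough cross-check of the arithmetic (not the weak scaling formula 
from the attached pdf, which extrapolates to a target processor count), 
the simple two-run ratio T(48 procs)/T(96 procs with 2x cells) can be 
computed directly from the timings above. A minimal C sketch, assuming 
only that simple ratio:

#include <stdio.h>

/* Raw two-run weak-scaling ratios T_48/T_96 for the four cases above.
   Note: this is NOT the weak-scaling formula from the attached pdf
   (which extrapolates to a target processor count); it is only the
   simple same-work-per-process ratio between the two runs. */
int main(void)
{
  const double t48[] = {65.0, 114.0, 134.0, 160.0}; /* minutes, 48 procs */
  const double t96[] = {69.0, 121.0, 143.0, 176.0}; /* minutes, 96 procs, 2x cells */
  for (int i = 0; i < 4; i++)
    printf("case %d: T48/T96 = %.0f%%\n", i + 1, 100.0 * t48[i] / t96[i]);
  return 0;
}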

Thanks.
>
>
>> Thanks
>>>> Thanks
>>>>
>>>> On 29/2/2016 1:11 AM, Barry Smith wrote:
>>>>>    As I said before, send the -log_summary output for the two processor sizes and we'll look at where it is spending its time and how it could possibly be improved.
>>>>>
>>>>>    Barry
>>>>>
>>>>>> On Feb 28, 2016, at 10:29 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 27/2/2016 12:53 AM, Barry Smith wrote:
>>>>>>>> On Feb 26, 2016, at 10:27 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 26/2/2016 11:32 PM, Barry Smith wrote:
>>>>>>>>>> On Feb 26, 2016, at 9:28 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have got a 3D code. When I ran with 48 procs and 11 million cells, it runs for 83 min. When I ran with 96 procs and 22 million cells, it ran for 99 min.
>>>>>>>>>     This is actually pretty good!
>>>>>>>> But if I'm not wrong, as I increase the number of cells, the parallel efficiency will keep decreasing. I hope it scales up to maybe 300 - 400 procs.
>>>>>> Hi,
>>>>>>
>>>>>> I think I may have mentioned this before: I need to submit a proposal to request computing nodes. In the proposal, I'm supposed to run some simulations to estimate the time it takes to run my code. An Excel file then uses my input to estimate the efficiency when I run my code with more cells. They use 2 methods to estimate it:
>>>>>>
>>>>>> 1. Strong scaling, whereby I run 2 cases - first with n cells and x procs, then with n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf.
>>>>>>
>>>>>> 2. Weak scaling, whereby I run 2 cases - first with n cells and x procs, then with 2n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf.
>>>>>>
>>>>>> So if I use 48 and 96 procs and get maybe 80% efficiency, by the time I hit 800 procs, I get 32% efficiency for strong scaling. They expect at least 50% efficiency for my code. To reach that, I need to achieve 89% efficiency when I use 48 and 96 procs.
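
For strong scaling, a common way to extrapolate from the two runs is to 
fit an Amdahl-type model T(p) = T(x)*(f + (1 - f)*x/p). The exact 
formula the spreadsheet uses is in the attached pdf and may well differ, 
so the number this sketch prints is only illustrative and will not 
necessarily match the 32% mentioned above. A minimal C sketch under 
that Amdahl assumption:

#include <stdio.h>

/* Two-run strong-scaling extrapolation assuming an Amdahl-type model
   T(p) = T(x) * (f + (1 - f) * x / p).  This is an illustrative model,
   not necessarily the formula in the attached pdf. */
int main(void)
{
  double x   = 48.0;                 /* procs in the first run               */
  double T1  = 1.0;                  /* normalised time of the first run     */
  double T2  = T1 / 1.6;             /* time on 2x procs at 80% efficiency   */
  double f   = 2.0 * T2 / T1 - 1.0;  /* fitted non-scaling fraction          */
  double y   = 800.0;                /* target processor count               */
  double Ty  = T1 * (f + (1.0 - f) * x / y);
  double eff = (T1 / Ty) / (y / x);  /* efficiency relative to the x-proc run */
  printf("f = %.2f, predicted efficiency on %.0f procs = %.0f%%\n",
         f, y, 100.0 * eff);
  return 0;
}
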
>>>>>>
>>>>>> So now my question is: how accurate is this type of calculation, especially with respect to PETSc?
>>>>>>
>>>>>> Similarly, for weak scaling, is it accurate?
>>>>>>
>>>>>> Can I argue that this estimation is not suitable for PETSc or hypre?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>>>>> So it's not that parallel. I want to find out which part of the code I need to improve, and also whether PETSc and hypre are working well in parallel. What's the best way to do it?
>>>>>>>>>    Run both with -log_summary and send the output for each case. This will show where the time is being spent and which parts are scaling less well.
>>>>>>>>>
>>>>>>>>>     Barry
>>>>>>>> That's only for the PETSc part, right? For the other parts of the code, including the hypre part, I will not be able to find out anything. If so, what can I use to check these parts?
>>>>>>>     You will still be able to see what percentage of the time is spent in hypre, whether it increases with the problem size, and by how much. So the information will still be useful.
>>>>>>>
>>>>>>>    Barry
>>>>>>>
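
One way to make the natively-called hypre struct solver show up in the 
-log_summary output is to wrap that call in a user-defined PETSc log 
stage and event. A minimal C sketch, with error checking omitted; 
my_hypre_struct_solve() is just a placeholder for the routine that calls 
the hypre struct solver, and the same logging routines should also be 
callable from Fortran:

#include <petscsys.h>

/* Sketch: wrap the natively-called hypre struct solve in a PETSc log
   stage and event so that -log_summary reports its time separately.
   Error checking (CHKERRQ) is omitted; my_hypre_struct_solve() is a
   placeholder for the actual hypre struct-solver call. */
static void my_hypre_struct_solve(void) { /* ... native hypre calls ... */ }

int main(int argc, char **argv)
{
  PetscLogStage stage;
  PetscLogEvent event;
  PetscClassId  classid;

  PetscInitialize(&argc, &argv, NULL, NULL);
  PetscClassIdRegister("HypreStruct", &classid);
  PetscLogStageRegister("HYPRE struct solve", &stage);
  PetscLogEventRegister("HypreStructSolve", classid, &event);

  PetscLogStagePush(stage);
  PetscLogEventBegin(event, 0, 0, 0, 0);
  my_hypre_struct_solve();              /* time spent here is now logged */
  PetscLogEventEnd(event, 0, 0, 0, 0);
  PetscLogStagePop();

  PetscFinalize();
  return 0;
}
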
>>>>>>>>>> I thought of doing profiling, but I wonder whether it still works well when the code is compiled with optimization.
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>> <temp.pdf>
>>>> -- 
>>>> Thank you
>>>>
>>>> Yours sincerely,
>>>>
>>>> TAY wee-beng
>>>>
>>>> <2x_2y.txt><2x_y.txt><4x_2y.txt><x_y.txt>

-------------- next part --------------
  0.000000000000000E+000  0.600000000000000        17.5000000000000     
   120.000000000000       0.000000000000000E+000  0.250000000000000     
   1.00000000000000       0.400000000000000                0     -400000
 AB,AA,BB   -2.51050002424745        2.47300002246629     
   2.51050002424745        2.43950002087513     
 size_x,size_y,size_z           79         137         141
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           0           1          35
           1          24           1       66360
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           1          36          69
           1          24       66361      130824
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           2          70         103
           1          24      130825      195288
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           3         104         137
           1          24      195289      259752
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           4           1          35
          25          48      259753      326112
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           5          36          69
          25          48      326113      390576
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           6          70         103
          25          48      390577      455040
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           7         104         137
          25          48      455041      519504
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           8           1          35
          49          72      519505      585864
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           9          36          69
          49          72      585865      650328
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          10          70         103
          49          72      650329      714792
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          11         104         137
          49          72      714793      779256
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          12           1          35
          73          95      779257      842851
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          13          36          69
          73          95      842852      904629
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          14          70         103
          73          95      904630      966407
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          15         104         137
          73          95      966408     1028185
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          16           1          35
          96         118     1028186     1091780
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          17          36          69
          96         118     1091781     1153558
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          18          70         103
          96         118     1153559     1215336
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          19         104         137
          96         118     1215337     1277114
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          20           1          35
         119         141     1277115     1340709
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          21          36          69
         119         141     1340710     1402487
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          22          70         103
         119         141     1402488     1464265
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          23         104         137
         119         141     1464266     1526043
 body_cg_ini  0.850000999999998       9.999999998273846E-007
   6.95771875020604     
        3104  surfaces with wrong vertex ordering
 Warning - length difference between element and cell
 max_element_length,min_element_length,min_delta
  7.847540176996057E-002  3.349995610000001E-002  4.700000000000000E-002
 maximum ngh_surfaces and ngh_vertics are           47          22
 minimum ngh_surfaces and ngh_vertics are           22           9
 min IIB_cell_no           0
 max IIB_cell_no         112
 final initial IIB_cell_no        5600
 min I_cell_no           0
 max I_cell_no          96
 final initial I_cell_no        4800
 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
        5600        4800        5600        4800
 IIB_I_cell_no_uvw_total1        1221        1206        1212         775
         761         751
    1      0.01904762      0.28410536      0.31610359      1.14440147 -0.14430869E+03 -0.13111542E+02  0.15251948E+07
    2      0.01348578      0.34638018      0.42392119      1.23447223 -0.16528393E+03 -0.10238827E+02  0.15250907E+07
    3      0.01252674      0.38305826      0.49569053      1.27891383 -0.16912542E+03 -0.95950253E+01  0.15250695E+07
    4      0.01199639      0.41337279      0.54168038      1.29584768 -0.17048065E+03 -0.94814301E+01  0.15250602E+07
    5      0.01165251      0.43544137      0.57347276      1.30255981 -0.17129184E+03 -0.95170304E+01  0.15250538E+07
  300      0.00236362      3.56353622      5.06727508      4.03923148 -0.78697893E+03  0.15046453E+05  0.15263125E+07
  600      0.00253142      2.94537779      5.74258126      4.71794271 -0.38271069E+04 -0.49150195E+04  0.15289768E+07
  900      0.00220341      3.10439489      6.70144317      4.01105348 -0.71943943E+04  0.13728311E+05  0.15320532E+07
 1200      0.00245748      3.53496741      7.33163591      4.01935315 -0.85017750E+04 -0.77550358E+04  0.15350351E+07
 1500      0.00244299      3.71751725      5.93463559      4.12005108 -0.95364451E+04  0.81223334E+04  0.15373061E+07
 1800      0.00237474      3.49908653      5.20866314      4.69712853 -0.10382365E+05 -0.18966840E+04  0.15385160E+07
 escape_time reached, so abort
 cd_cl_cs_mom_implicit1
  -1.03894256791350       -1.53179673343374       6.737940408853320E-002
  0.357464909626058      -0.103698436387821       -2.42688484514611     
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a petsc-3.6.3_static_rel named n12-09 with 24 processors, by wtay Sat Feb 27 16:09:41 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015 

                         Max       Max/Min        Avg      Total 
Time (sec):           2.922e+03      1.00001   2.922e+03
Objects:              2.008e+04      1.00000   2.008e+04
Flops:                1.651e+11      1.08049   1.582e+11  3.797e+12
Flops/sec:            5.652e+07      1.08049   5.414e+07  1.299e+09
MPI Messages:         8.293e+04      1.89333   6.588e+04  1.581e+06
MPI Message Lengths:  4.109e+09      2.03497   4.964e+04  7.849e+10
MPI Reductions:       4.427e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.9219e+03 100.0%  3.7965e+12 100.0%  1.581e+06 100.0%  4.964e+04      100.0%  4.427e+04 100.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot              3998 1.0 4.4655e+01 5.1 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9   820
VecDotNorm2         1999 1.0 4.0603e+01 7.6 1.59e+09 1.1 0.0e+00 0.0e+00 2.0e+03  1  1  0  0  5   1  1  0  0  5   902
VecNorm             3998 1.0 3.0557e+01 6.2 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9  1198
VecCopy             3998 1.0 4.4206e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             12002 1.0 9.3725e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPBYCZ          3998 1.0 9.1178e+00 1.5 3.18e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  8030
VecWAXPY            3998 1.0 9.3186e+00 1.5 1.59e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  3928
VecAssemblyBegin    3998 1.0 1.5680e+01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04  0  0  0  0 27   0  0  0  0 27     0
VecAssemblyEnd      3998 1.0 1.1443e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin    16002 1.0 9.0984e+00 1.4 0.00e+00 0.0 1.2e+06 6.4e+04 0.0e+00  0  0 77100  0   0  0 77100  0     0
VecScatterEnd      16002 1.0 4.4821e+01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatMult             3998 1.0 1.4268e+02 1.3 6.05e+10 1.1 3.0e+05 1.1e+05 0.0e+00  4 37 19 43  0   4 37 19 43  0  9753
MatSolve            5997 1.0 2.0469e+02 1.4 8.84e+10 1.1 0.0e+00 0.0e+00 0.0e+00  6 53  0  0  0   6 53  0  0  0  9921
MatLUFactorNum       104 1.0 2.2332e+01 1.1 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  6922
MatILUFactorSym        1 1.0 1.0867e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale               1 1.0 3.8305e-02 1.9 7.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4603
MatAssemblyBegin     105 1.0 2.0776e+00 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd       105 1.0 2.4702e+00 1.1 0.00e+00 0.0 1.5e+02 2.8e+04 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 7.1249e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp             105 1.0 9.8758e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            1999 1.0 4.1857e+02 1.0 1.65e+11 1.1 3.0e+05 1.1e+05 1.0e+04 14100 19 43 23  14100 19 43 23  9070
PCSetUp              208 1.0 2.2440e+01 1.1 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  6888
PCSetUpOnBlocks     1999 1.0 2.7087e-01 1.1 6.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  5487
PCApply             5997 1.0 2.3123e+02 1.3 9.50e+10 1.1 0.0e+00 0.0e+00 0.0e+00  6 58  0  0  0   6 58  0  0  0  9444
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector  4032           4032     31782464     0
      Vector Scatter  2010             15      3738624     0
              Matrix     4              4    190398024     0
    Distributed Mesh  2003              8        39680     0
Star Forest Bipartite Graph  4006             16        13696     0
     Discrete System  2003              8         6784     0
           Index Set  4013           4013     14715400     0
   IS L to G Mapping  2003              8      2137148     0
       Krylov Solver     2              2         2296     0
      Preconditioner     2              2         1896     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 8.15392e-06
Average time for zero size MPI_Send(): 1.12454e-05
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan  7 04:05:35 2016 on hpc12 
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------

Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc  -wd1572 -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90  -O3   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------

Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl 
-----------------------------------------

-------------- next part --------------
  0.000000000000000E+000  0.600000000000000        17.5000000000000     
   120.000000000000       0.000000000000000E+000  0.250000000000000     
   1.00000000000000       0.400000000000000                0     -400000
 AB,AA,BB   -2.78150003711926        2.76500003633555     
   2.78150003711926        2.70650003355695     
 size_x,size_y,size_z          100         172         171
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           0           1          29
           1          43           1      124700
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           1          30          58
           1          43      124701      249400
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           2          59          87
           1          43      249401      374100
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           3          88         116
           1          43      374101      498800
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           4         117         144
           1          43      498801      619200
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           5         145         172
           1          43      619201      739600
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           6           1          29
          44          86      739601      864300
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           7          30          58
          44          86      864301      989000
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           8          59          87
          44          86      989001     1113700
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           9          88         116
          44          86     1113701     1238400
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          10         117         144
          44          86     1238401     1358800
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          11         145         172
          44          86     1358801     1479200
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          12           1          29
          87         129     1479201     1603900
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          13          30          58
          87         129     1603901     1728600
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          14          59          87
          87         129     1728601     1853300
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          15          88         116
          87         129     1853301     1978000
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          16         117         144
          87         129     1978001     2098400
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          17         145         172
          87         129     2098401     2218800
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          18           1          29
         130         171     2218801     2340600
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          19          30          58
         130         171     2340601     2462400
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          20          59          87
         130         171     2462401     2584200
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          21          88         116
         130         171     2584201     2706000
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          22         117         144
         130         171     2706001     2823600
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          23         145         172
         130         171     2823601     2941200
 body_cg_ini  0.850000999999998       9.999999998273846E-007
   6.95771875020604     
        3104  surfaces with wrong vertex ordering
 Warning - length difference between element and cell
 max_element_length,min_element_length,min_delta
  7.847540176996057E-002  3.349995610000001E-002  3.500000000000000E-002
 maximum ngh_surfaces and ngh_vertics are           28          12
 minimum ngh_surfaces and ngh_vertics are           14           5
 min IIB_cell_no           0
 max IIB_cell_no         229
 final initial IIB_cell_no       11450
 min I_cell_no           0
 max I_cell_no         200
 final initial I_cell_no       10000
 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
       11450       10000       11450       10000
 IIB_I_cell_no_uvw_total1        2230        2227        2166        1930
        1926        1847
    1      0.01411765      0.30104754      0.32529731      1.15440698 -0.30539502E+03 -0.29715696E+02  0.29394159E+07
    2      0.00973086      0.41244573      0.45086899      1.22116550 -0.34890134E+03 -0.25062690E+02  0.29392110E+07
    3      0.00918177      0.45383616      0.51179402      1.27757073 -0.35811483E+03 -0.25027396E+02  0.29391677E+07
    4      0.00885764      0.47398774      0.55169119      1.31019526 -0.36250500E+03 -0.25910050E+02  0.29391470E+07
    5      0.00872241      0.48832538      0.57967282      1.32679047 -0.36545763E+03 -0.26947216E+02  0.29391325E+07
  300      0.00163886      4.27898628      6.83028522      3.60837060 -0.19609891E+04  0.43984454E+05  0.29435194E+07
  600      0.00160193      3.91014241      4.97460210      5.10461274 -0.61092521E+03  0.18910563E+05  0.29467790E+07
  900      0.00150521      3.27352854      5.85427996      4.49166453 -0.89281765E+04 -0.12171584E+05  0.29507471E+07
 1200      0.00165280      3.05922213      7.37243530      5.16434634 -0.10954640E+05  0.22049957E+05  0.29575213E+07
 1500      0.00153718      3.54908044      5.42918256      4.84940953 -0.16430153E+05  0.24407130E+05  0.29608940E+07
 1800      0.00155455      3.30956962      8.35799538      4.50638757 -0.20003619E+05 -0.20349497E+05  0.29676102E+07
 escape_time reached, so abort
 cd_cl_cs_mom_implicit1
  -1.29348921431473       -2.44525665200003      -0.238725356553914     
  0.644444280391413      -3.056662699041206E-002  -2.91791118488116     
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a petsc-3.6.3_static_rel named n12-06 with 24 processors, by wtay Mon Feb 29 21:45:09 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015 

                         Max       Max/Min        Avg      Total 
Time (sec):           5.933e+03      1.00000   5.933e+03
Objects:              2.008e+04      1.00000   2.008e+04
Flops:                3.129e+11      1.06806   3.066e+11  7.360e+12
Flops/sec:            5.273e+07      1.06807   5.169e+07  1.241e+09
MPI Messages:         8.298e+04      1.89703   6.585e+04  1.580e+06
MPI Message Lengths:  6.456e+09      2.05684   7.780e+04  1.229e+11
MPI Reductions:       4.427e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 5.9326e+03 100.0%  7.3595e+12 100.0%  1.580e+06 100.0%  7.780e+04      100.0%  4.427e+04 100.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot              3998 1.0 1.0612e+02 2.0 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9   665
VecDotNorm2         1999 1.0 9.4306e+01 2.1 2.99e+09 1.1 0.0e+00 0.0e+00 2.0e+03  1  1  0  0  5   1  1  0  0  5   748
VecNorm             3998 1.0 8.7330e+01 2.0 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9   808
VecCopy             3998 1.0 7.4317e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             12002 1.0 1.1626e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPBYCZ          3998 1.0 1.7543e+01 1.4 5.98e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  8044
VecWAXPY            3998 1.0 1.6637e+01 1.4 2.99e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  4241
VecAssemblyBegin    3998 1.0 3.0367e+01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04  0  0  0  0 27   0  0  0  0 27     0
VecAssemblyEnd      3998 1.0 1.5386e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin    16002 1.0 1.7833e+01 1.4 0.00e+00 0.0 1.2e+06 1.0e+05 0.0e+00  0  0 77100  0   0  0 77100  0     0
VecScatterEnd      16002 1.0 1.2689e+02 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
MatMult             3998 1.0 3.1700e+02 1.3 1.15e+11 1.1 3.0e+05 1.7e+05 0.0e+00  5 37 19 43  0   5 37 19 43  0  8482
MatSolve            5997 1.0 3.6841e+02 1.3 1.67e+11 1.1 0.0e+00 0.0e+00 0.0e+00  6 54  0  0  0   6 54  0  0  0 10707
MatLUFactorNum       104 1.0 4.3137e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  7016
MatILUFactorSym        1 1.0 3.5212e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale               1 1.0 9.1592e-02 3.0 1.45e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3720
MatAssemblyBegin     105 1.0 5.1547e+00 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd       105 1.0 4.7898e+00 1.1 0.00e+00 0.0 1.5e+02 4.3e+04 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 2.0590e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp             105 1.0 4.5063e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            1999 1.0 9.1527e+02 1.0 3.13e+11 1.1 3.0e+05 1.7e+05 1.0e+04 15100 19 43 23  15100 19 43 23  8040
PCSetUp              208 1.0 4.3499e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  6958
PCSetUpOnBlocks     1999 1.0 8.2526e-01 1.3 1.25e+08 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3526
PCApply             5997 1.0 4.1644e+02 1.3 1.80e+11 1.1 0.0e+00 0.0e+00 0.0e+00  6 58  0  0  0   6 58  0  0  0 10192
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector  4032           4032     53827712     0
      Vector Scatter  2010             15      7012720     0
              Matrix     4              4    359683260     0
    Distributed Mesh  2003              8        39680     0
Star Forest Bipartite Graph  4006             16        13696     0
     Discrete System  2003              8         6784     0
           Index Set  4013           4013     25819112     0
   IS L to G Mapping  2003              8      3919440     0
       Krylov Solver     2              2         2296     0
      Preconditioner     2              2         1896     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 1.03951e-05
Average time for zero size MPI_Send(): 1.83781e-06
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan  7 04:05:35 2016 on hpc12 
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------

Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc  -wd1572 -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90  -O3   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------

Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl 
-----------------------------------------

-------------- next part --------------
  0.000000000000000E+000  0.600000000000000        17.5000000000000     
   120.000000000000       0.000000000000000E+000  0.250000000000000     
   1.00000000000000       0.400000000000000                0     -400000
 AB,AA,BB   -2.51050002424745        2.47300002246629     
   2.51050002424745        2.43950002087513     
 size_x,size_y,size_z           79         137         141
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           0           1          35
           1          24           1       66360
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           1          36          69
           1          24       66361      130824
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           2          70         103
           1          24      130825      195288
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           3         104         137
           1          24      195289      259752
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           4           1          35
          25          48      259753      326112
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           5          36          69
          25          48      326113      390576
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           6          70         103
          25          48      390577      455040
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           7         104         137
          25          48      455041      519504
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           8           1          35
          49          72      519505      585864
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end           9          36          69
          49          72      585865      650328
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          10          70         103
          49          72      650329      714792
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          11         104         137
          49          72      714793      779256
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          12           1          35
          73          95      779257      842851
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          13          36          69
          73          95      842852      904629
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          14          70         103
          73          95      904630      966407
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          15         104         137
          73          95      966408     1028185
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          16           1          35
          96         118     1028186     1091780
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          17          36          69
          96         118     1091781     1153558
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          18          70         103
          96         118     1153559     1215336
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          19         104         137
          96         118     1215337     1277114
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          20           1          35
         119         141     1277115     1340709
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          21          36          69
         119         141     1340710     1402487
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          22          70         103
         119         141     1402488     1464265
 myid,jsta,jend,ksta,kend,ijk_sta,ijk_end          23         104         137
         119         141     1464266     1526043
 body_cg_ini  0.850000999999998       9.999999998273846E-007
   6.95771875020604     
        3104  surfaces with wrong vertex ordering
 Warning - length difference between element and cell
 max_element_length,min_element_length,min_delta
  7.847540176996057E-002  3.349995610000001E-002  4.700000000000000E-002
 maximum ngh_surfaces and ngh_vertics are           47          22
 minimum ngh_surfaces and ngh_vertics are           22           9
 min IIB_cell_no           0
 max IIB_cell_no         112
 final initial IIB_cell_no        5600
 min I_cell_no           0
 max I_cell_no          96
 final initial I_cell_no        4800
 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
        5600        4800        5600        4800
 IIB_I_cell_no_uvw_total1        1221        1206        1212         775
         761         751
    1      0.01904762      0.28410536      0.31610359      1.14440147 -0.14430869E+03 -0.13111542E+02  0.15251948E+07
    2      0.01348578      0.34638018      0.42392119      1.23447223 -0.16528393E+03 -0.10238827E+02  0.15250907E+07
    3      0.01252674      0.38305826      0.49569053      1.27891383 -0.16912542E+03 -0.95950253E+01  0.15250695E+07
    4      0.01199639      0.41337279      0.54168038      1.29584768 -0.17048065E+03 -0.94814301E+01  0.15250602E+07
    5      0.01165251      0.43544137      0.57347276      1.30255981 -0.17129184E+03 -0.95170304E+01  0.15250538E+07
  300      0.00236362      3.56353622      5.06727508      4.03923148 -0.78697893E+03  0.15046453E+05  0.15263125E+07
  600      0.00253142      2.94537779      5.74258126      4.71794271 -0.38271069E+04 -0.49150195E+04  0.15289768E+07
  900      0.00220341      3.10439489      6.70144317      4.01105348 -0.71943943E+04  0.13728311E+05  0.15320532E+07
 1200      0.00245748      3.53496741      7.33163591      4.01935315 -0.85017750E+04 -0.77550358E+04  0.15350351E+07
 1500      0.00244299      3.71751725      5.93463559      4.12005108 -0.95364451E+04  0.81223334E+04  0.15373061E+07
 1800      0.00237474      3.49908653      5.20866314      4.69712853 -0.10382365E+05 -0.18966840E+04  0.15385160E+07
 escape_time reached, so abort
 cd_cl_cs_mom_implicit1
  -1.03894256791350       -1.53179673343374       6.737940408853320E-002
  0.357464909626058      -0.103698436387821       -2.42688484514611     
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a petsc-3.6.3_static_rel named n12-06 with 24 processors, by wtay Mon Feb 29 20:55:15 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015 

                         Max       Max/Min        Avg      Total 
Time (sec):           2.938e+03      1.00001   2.938e+03
Objects:              2.008e+04      1.00000   2.008e+04
Flops:                1.651e+11      1.08049   1.582e+11  3.797e+12
Flops/sec:            5.620e+07      1.08049   5.384e+07  1.292e+09
MPI Messages:         8.293e+04      1.89333   6.588e+04  1.581e+06
MPI Message Lengths:  4.109e+09      2.03497   4.964e+04  7.849e+10
MPI Reductions:       4.427e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.9382e+03 100.0%  3.7965e+12 100.0%  1.581e+06 100.0%  4.964e+04      100.0%  4.427e+04 100.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot              3998 1.0 3.7060e+01 4.0 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9   988
VecDotNorm2         1999 1.0 3.3165e+01 5.1 1.59e+09 1.1 0.0e+00 0.0e+00 2.0e+03  1  1  0  0  5   1  1  0  0  5  1104
VecNorm             3998 1.0 3.0081e+01 5.7 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9  1217
VecCopy             3998 1.0 4.2268e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             12002 1.0 9.0293e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPBYCZ          3998 1.0 8.8463e+00 1.5 3.18e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  8276
VecWAXPY            3998 1.0 9.0856e+00 1.6 1.59e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  4029
VecAssemblyBegin    3998 1.0 1.2290e+01 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04  0  0  0  0 27   0  0  0  0 27     0
VecAssemblyEnd      3998 1.0 1.1405e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin    16002 1.0 9.0506e+00 1.4 0.00e+00 0.0 1.2e+06 6.4e+04 0.0e+00  0  0 77100  0   0  0 77100  0     0
VecScatterEnd      16002 1.0 4.8845e+01 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatMult             3998 1.0 1.3324e+02 1.1 6.05e+10 1.1 3.0e+05 1.1e+05 0.0e+00  4 37 19 43  0   4 37 19 43  0 10444
MatSolve            5997 1.0 1.9260e+02 1.4 8.84e+10 1.1 0.0e+00 0.0e+00 0.0e+00  6 53  0  0  0   6 53  0  0  0 10543
MatLUFactorNum       104 1.0 2.3135e+01 1.2 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  6681
MatILUFactorSym        1 1.0 1.4099e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale               1 1.0 4.5088e-02 2.6 7.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3911
MatAssemblyBegin     105 1.0 2.4788e+0011.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd       105 1.0 2.4778e+00 1.1 0.00e+00 0.0 1.5e+02 2.8e+04 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 7.9679e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp             105 1.0 9.6669e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            1999 1.0 3.9941e+02 1.0 1.65e+11 1.1 3.0e+05 1.1e+05 1.0e+04 14100 19 43 23  14100 19 43 23  9505
PCSetUp              208 1.0 2.3286e+01 1.2 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  6638
PCSetUpOnBlocks     1999 1.0 3.7027e-01 1.3 6.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4014
PCApply             5997 1.0 2.1975e+02 1.4 9.50e+10 1.1 0.0e+00 0.0e+00 0.0e+00  6 58  0  0  0   6 58  0  0  0  9937
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector  4032           4032     31782464     0
      Vector Scatter  2010             15      3738624     0
              Matrix     4              4    190398024     0
    Distributed Mesh  2003              8        39680     0
Star Forest Bipartite Graph  4006             16        13696     0
     Discrete System  2003              8         6784     0
           Index Set  4013           4013     14715400     0
   IS L to G Mapping  2003              8      2137148     0
       Krylov Solver     2              2         2296     0
      Preconditioner     2              2         1896     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 7.20024e-06
Average time for zero size MPI_Send(): 2.08616e-06
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan  7 04:05:35 2016 on hpc12 
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------

Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc  -wd1572 -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90  -O3   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------

Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl 
-----------------------------------------

-------------- next part --------------
  0.000000000000000E+000  0.600000000000000        17.5000000000000     
   120.000000000000       0.000000000000000E+000  0.250000000000000     
   1.00000000000000       0.400000000000000                0     -400000
 AB,AA,BB   -2.78150003711926        2.76500003633555     
   2.78150003711926        2.70650003355695     
 size_x,size_y,size_z          100         172         171
 myid  jsta  jend  ksta  kend   ijk_sta   ijk_end
    0     1    29     1    43         1    124700
    1    30    58     1    43    124701    249400
    2    59    87     1    43    249401    374100
    3    88   116     1    43    374101    498800
    4   117   144     1    43    498801    619200
    5   145   172     1    43    619201    739600
    6     1    29    44    86    739601    864300
    7    30    58    44    86    864301    989000
    8    59    87    44    86    989001   1113700
    9    88   116    44    86   1113701   1238400
   10   117   144    44    86   1238401   1358800
   11   145   172    44    86   1358801   1479200
   12     1    29    87   129   1479201   1603900
   13    30    58    87   129   1603901   1728600
   14    59    87    87   129   1728601   1853300
   15    88   116    87   129   1853301   1978000
   16   117   144    87   129   1978001   2098400
   17   145   172    87   129   2098401   2218800
   18     1    29   130   171   2218801   2340600
   19    30    58   130   171   2340601   2462400
   20    59    87   130   171   2462401   2584200
   21    88   116   130   171   2584201   2706000
   22   117   144   130   171   2706001   2823600
   23   145   172   130   171   2823601   2941200
 body_cg_ini  0.850000999999998       9.999999998273846E-007   6.95771875020604
        3104  surfaces with wrong vertex ordering
 Warning - length difference between element and cell
 max_element_length,min_element_length,min_delta
  7.847540176996057E-002  3.349995610000001E-002  3.500000000000000E-002
 maximum ngh_surfaces and ngh_vertics are           28          12
 minimum ngh_surfaces and ngh_vertics are           14           5
 min IIB_cell_no           0
 max IIB_cell_no         229
 final initial IIB_cell_no       11450
 min I_cell_no           0
 max I_cell_no         200
 final initial I_cell_no       10000
 size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
       11450       10000       11450       10000
 IIB_I_cell_no_uvw_total1        2230        2227        2166        1930        1926        1847
    1      0.01411765      0.30104754      0.32529731      1.15440698 -0.30539502E+03 -0.29715696E+02  0.29394159E+07
    2      0.00973086      0.41244573      0.45086899      1.22116550 -0.34890134E+03 -0.25062690E+02  0.29392110E+07
    3      0.00918177      0.45383616      0.51179402      1.27757073 -0.35811483E+03 -0.25027396E+02  0.29391677E+07
    4      0.00885764      0.47398774      0.55169119      1.31019526 -0.36250500E+03 -0.25910050E+02  0.29391470E+07
    5      0.00872241      0.48832538      0.57967282      1.32679047 -0.36545763E+03 -0.26947216E+02  0.29391325E+07
  300      0.00163886      4.27898628      6.83028522      3.60837060 -0.19609891E+04  0.43984454E+05  0.29435194E+07
  600      0.00160193      3.91014241      4.97460210      5.10461274 -0.61092521E+03  0.18910563E+05  0.29467790E+07
  900      0.00150521      3.27352854      5.85427996      4.49166453 -0.89281765E+04 -0.12171584E+05  0.29507471E+07
 1200      0.00165280      3.05922213      7.37243530      5.16434634 -0.10954640E+05  0.22049957E+05  0.29575213E+07
 1500      0.00153718      3.54908044      5.42918256      4.84940953 -0.16430153E+05  0.24407130E+05  0.29608940E+07
 1800      0.00155455      3.30956962      8.35799538      4.50638757 -0.20003619E+05 -0.20349497E+05  0.29676102E+07
 escape_time reached, so abort
 cd_cl_cs_mom_implicit1
  -1.29348921431473       -2.44525665200003      -0.238725356553914       0.644444280391413      -3.056662699041206E-002  -2.91791118488116
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a petsc-3.6.3_static_rel named n12-09 with 24 processors, by wtay Sat Feb 27 16:58:01 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015 

                         Max       Max/Min        Avg      Total 
Time (sec):           5.791e+03      1.00001   5.791e+03
Objects:              2.008e+04      1.00000   2.008e+04
Flops:                3.129e+11      1.06806   3.066e+11  7.360e+12
Flops/sec:            5.402e+07      1.06807   5.295e+07  1.271e+09
MPI Messages:         8.298e+04      1.89703   6.585e+04  1.580e+06
MPI Message Lengths:  6.456e+09      2.05684   7.780e+04  1.229e+11
MPI Reductions:       4.427e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops
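
As a quick consistency check on the totals above: 7.360e+12 total flops over the 5.791e+03 s maximum runtime gives roughly 1.27e+09 flop/s, which matches the 1.271e+09 reported in the Total column of the Flops/sec row.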

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 5.7911e+03 100.0%  7.3595e+12 100.0%  1.580e+06 100.0%  7.780e+04      100.0%  4.427e+04 100.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
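
The stage mechanism mentioned above (PetscLogStagePush()/PetscLogStagePop()) is what would split, for example, matrix assembly and the KSPSolve calls into separate sections of the event table that follows, instead of everything landing in the single Main Stage. A minimal Fortran sketch, with illustrative stage names and assuming the usual PETSc Fortran includes and declarations are already in place in the calling code:

      PetscLogStage  solve_stage
      PetscErrorCode ierr

!     register the stage once during setup
      call PetscLogStageRegister('Poisson solve', solve_stage, ierr)

!     events logged between push and pop are reported under this stage
!     in the -log_summary output
      call PetscLogStagePush(solve_stage, ierr)
!     ... KSPSolve and related calls to be profiled separately ...
      call PetscLogStagePop(ierr)
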
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot              3998 1.0 1.1437e+02 2.3 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9   617
VecDotNorm2         1999 1.0 1.0442e+02 2.6 2.99e+09 1.1 0.0e+00 0.0e+00 2.0e+03  1  1  0  0  5   1  1  0  0  5   676
VecNorm             3998 1.0 8.5426e+01 2.2 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03  1  1  0  0  9   1  1  0  0  9   826
VecCopy             3998 1.0 7.3321e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             12002 1.0 1.2399e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPBYCZ          3998 1.0 1.8118e+01 1.4 5.98e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  7788
VecWAXPY            3998 1.0 1.6979e+01 1.3 2.99e+09 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  4155
VecAssemblyBegin    3998 1.0 4.1001e+01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04  0  0  0  0 27   0  0  0  0 27     0
VecAssemblyEnd      3998 1.0 1.4657e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin    16002 1.0 1.9519e+01 1.5 0.00e+00 0.0 1.2e+06 1.0e+05 0.0e+00  0  0 77100  0   0  0 77100  0     0
VecScatterEnd      16002 1.0 1.3223e+02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
MatMult             3998 1.0 3.0904e+02 1.3 1.15e+11 1.1 3.0e+05 1.7e+05 0.0e+00  5 37 19 43  0   5 37 19 43  0  8700
MatSolve            5997 1.0 3.9285e+02 1.4 1.67e+11 1.1 0.0e+00 0.0e+00 0.0e+00  6 54  0  0  0   6 54  0  0  0 10040
MatLUFactorNum       104 1.0 4.2097e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  7190
MatILUFactorSym        1 1.0 2.9875e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale               1 1.0 1.3492e-01 3.3 1.45e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2525
MatAssemblyBegin     105 1.0 5.9000e+00 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd       105 1.0 4.7665e+00 1.1 0.00e+00 0.0 1.5e+02 4.3e+04 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 1.0 3.6001e-05 18.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 1.6249e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp             105 1.0 2.7945e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            1999 1.0 9.1973e+02 1.0 3.13e+11 1.1 3.0e+05 1.7e+05 1.0e+04 16100 19 43 23  16100 19 43 23  8001
PCSetUp              208 1.0 4.2401e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  7138
PCSetUpOnBlocks     1999 1.0 7.2389e-01 1.2 1.25e+08 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4020
PCApply             5997 1.0 4.4054e+02 1.3 1.80e+11 1.1 0.0e+00 0.0e+00 0.0e+00  6 58  0  0  0   6 58  0  0  0  9634
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector  4032           4032     53827712     0
      Vector Scatter  2010             15      7012720     0
              Matrix     4              4    359683260     0
    Distributed Mesh  2003              8        39680     0
Star Forest Bipartite Graph  4006             16        13696     0
     Discrete System  2003              8         6784     0
           Index Set  4013           4013     25819112     0
   IS L to G Mapping  2003              8      3919440     0
       Krylov Solver     2              2         2296     0
      Preconditioner     2              2         1896     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 7.20024e-06
Average time for zero size MPI_Send(): 1.83781e-06
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
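
For the later runs mentioned in the message above that were launched without profiling, the same option can simply be appended to the run command, optionally with a file name so the summary does not get mixed into the solver output; with the OpenMPI launcher used for this build, that would look something like (process count and file name illustrative):

    mpirun -np 48 ./a.out -log_summary log_48procs.txt
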
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan  7 04:05:35 2016 on hpc12 
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------

Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc  -wd1572 -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90  -O3   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------

Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl 
-----------------------------------------


