[petsc-users] Investigate parallel code to improve parallelism
TAY wee-beng
zonexo at gmail.com
Tue Mar 1 02:03:15 CST 2016
On 29/2/2016 11:21 AM, Barry Smith wrote:
>> On Feb 28, 2016, at 8:26 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>
>>
>> On 29/2/2016 9:41 AM, Barry Smith wrote:
>>>> On Feb 28, 2016, at 7:08 PM, TAY Wee Beng <zonexo at gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've attached the files for x cells running on y procs. hypre is called natively, so I'm not sure if PETSc catches it.
>>> So you are directly creating hypre matrices and calling the hypre solver in another piece of your code?
>> Yes, because I'm using the simple structured (struct) layout for Cartesian grids. It's about twice as fast compared to BoomerAMG.
> Understood
>
>> I can't create a PETSc matrix and use the hypre struct layout, right?
>>> In the PETSc part of the code, if you compare 2x_y to x_y, you see that doubling the problem size resulted in 2.2 times as much time for the KSPSolve. Most of this large increase is due to the increased time in the scatter, which went up by a factor of 150/54 = 2.78, while the amount of data transferred only increased by 1e5/6.4e4 = 1.5625. Normally I would not expect to see this behavior and would not expect such a large increase in the communication time.
>>>
>>> Barry
>>>
>>>
>>>
>> So ideally it should be 2 instead of 2.2, is that so?
> Ideally
>
>> May I know where you are looking? I can't find those numbers.
> The column labeled Avg len gives the average message length, which increases from 6.4e4 to 1e5, while the max time increases by a factor of 2.77 (I took the sum of the VecScatterBegin and VecScatterEnd rows).
>
>> So where do you think the error comes from?
> It is not really an error; it is just taking more time than one would hope it would take.
>> Or how can I troubleshoot further?
>
> If you run the same problem several times, how different are the timings between runs?
Hi,
I have re-run x_y and 2x_y. I have attached the files with the suffix _2 for the second run. The results are exactly the same.
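For reference, here is how I read the numbers Barry quoted, taken from the attached x_y and 2x_y -log_summary files (max times copied by hand into a rough Python check, so the rounding differs slightly from his 150/54):

# Ratios from the attached x_y and 2x_y -log_summary files (max times, in s).
ksp_x,  ksp_2x  = 4.1857e+02, 9.1973e+02    # KSPSolve
scat_x  = 9.0984e+00 + 4.4821e+01           # VecScatterBegin + VecScatterEnd (x_y)
scat_2x = 1.9519e+01 + 1.3223e+02           # VecScatterBegin + VecScatterEnd (2x_y)
len_x,  len_2x  = 6.4e4, 1.0e5              # Avg len of scatter messages (bytes)

print("KSPSolve time ratio :", ksp_2x / ksp_x)    # ~2.2
print("scatter time ratio  :", scat_2x / scat_x)  # ~2.8
print("message length ratio:", len_2x / len_x)    # ~1.56

So the scatter time grows much faster than the amount of data moved, which is the behaviour Barry flagged.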
Should I try running on another cluster?
I also tried running the same problem with more cells and more time steps (to reduce start-up effects) on another cluster, but I forgot to run it with -log_summary. Anyway, the results show:
1. 1.5 million cells on 48 procs vs. 3 million cells on 96 procs took 65 min and 69 min; the weak scaling formula I attached earlier gives about 88% efficiency.
2. 3 million cells on 48 procs vs. 6 million cells on 96 procs took 114 min and 121 min; the same formula gives about 88% efficiency.
3. 3.75 million cells on 48 procs vs. 7.5 million cells on 96 procs took 134 min and 143 min; the same formula gives about 87% efficiency.
4. 4.5 million cells on 48 procs vs. 9 million cells on 96 procs took 160 min and 176 min (extrapolated); the same formula gives about 80% efficiency.
So it seems that I should run with 3.75 million cells on 48 procs and scale along that ratio; beyond that, my efficiency decreases. Is that so? Maybe I should also run with -log_summary to get a better estimate. A quick check of the raw run-time ratios for the four cases is sketched below.
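Here is a minimal Python sketch that only computes the raw 48-proc/96-proc run-time ratio for each case (the formula from the attached pdf is not reproduced here; it presumably gives lower numbers because it extrapolates the efficiency to a larger processor count):

# Raw two-point weak-scaling ratios (time on 48 procs / time on 96 procs)
# for the four cases above; the 176 min figure is extrapolated.
runs = {
    "1.5M  -> 3M   cells": (65.0, 69.0),
    "3M    -> 6M   cells": (114.0, 121.0),
    "3.75M -> 7.5M cells": (134.0, 143.0),
    "4.5M  -> 9M   cells": (160.0, 176.0),
}
for case, (t48, t96) in runs.items():
    print(f"{case}: t48/t96 = {t48 / t96:.2f}")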
Thanks.
>
>
>> Thanks
>>>> Thanks
>>>>
>>>> On 29/2/2016 1:11 AM, Barry Smith wrote:
>>>>> As I said before, send the -log_summary output for the two processor sizes and we'll look at where it is spending its time and how it could possibly be improved.
>>>>>
>>>>> Barry
>>>>>
>>>>>> On Feb 28, 2016, at 10:29 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 27/2/2016 12:53 AM, Barry Smith wrote:
>>>>>>>> On Feb 26, 2016, at 10:27 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 26/2/2016 11:32 PM, Barry Smith wrote:
>>>>>>>>>> On Feb 26, 2016, at 9:28 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have a 3D code. When I ran with 48 procs and 11 million cells, it ran for 83 min. When I ran with 96 procs and 22 million cells, it ran for 99 min.
>>>>>>>>> This is actually pretty good!
>>>>>>>> But if I'm not wrong, as I increase the number of cells, the parallel efficiency will keep decreasing. I hope it scales up to maybe 300-400 procs.
>>>>>> Hi,
>>>>>>
>>>>>> I think I may have mentioned this before: I need to submit a proposal to request computing nodes. In the proposal, I'm supposed to run some simulations to estimate the time it takes to run my code. Then an Excel file uses my input to estimate the efficiency when I run my code with more cells. They use two methods to estimate it:
>>>>>>
>>>>>> 1. Strong scaling, whereby I run two cases: first with n cells and x procs, then with n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf.
>>>>>>
>>>>>> 2. Weak scaling, whereby I run two cases: first with n cells and x procs, then with 2n cells and 2x procs. From there, they can estimate my expected efficiency when I have y procs. The formula is attached in the pdf.
>>>>>>
>>>>>> So if I use 48 and 96 procs and get maybe 80% efficiency, by the time I hit 800 procs, I get 32% efficiency for strong scaling. They expect at least 50% efficiency for my code. To reach that, I need to achieve 89% efficiency when I use 48 and 96 procs.
>>>>>>
>>>>>> So now my question is: how accurate is this type of calculation, especially with respect to PETSc?
>>>>>>
>>>>>> Similarly, for weak scaling, is it accurate?
>>>>>>
>>>>>> Can I argue that this estimation is not suitable for PETSc or hypre?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>>>>> So it's not that parallel. I want to find out which parts of the code I need to improve, and also whether PETSc and hypre are working well in parallel. What's the best way to do that?
>>>>>>>>> Run both with -log_summary and send the output for each case. This will show where the time is being spent and which parts are scaling less well.
>>>>>>>>>
>>>>>>>>> Barry
>>>>>>>> That's only for the PETSc part, right? So for other parts of the code, including the hypre part, I will not be able to find out. If so, what can I use to check these parts?
>>>>>>> You will still be able to see what percentage of the time is spent in hypre and whether (and by how much) it increases with the problem size. So the information will still be useful.
>>>>>>>
>>>>>>> Barry
>>>>>>>
>>>>>>>>>> I thought of doing profiling, but if the code is built with optimization, I wonder whether profiling still works well.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thank you.
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> TAY wee-beng
>>>>>>>>>>
>>>>>> <temp.pdf>
>>>> --
>>>> Thank you
>>>>
>>>> Yours sincerely,
>>>>
>>>> TAY wee-beng
>>>>
>>>> <2x_2y.txt><2x_y.txt><4x_2y.txt><x_y.txt>
-------------- next part --------------
0.000000000000000E+000 0.600000000000000 17.5000000000000
120.000000000000 0.000000000000000E+000 0.250000000000000
1.00000000000000 0.400000000000000 0 -400000
AB,AA,BB -2.51050002424745 2.47300002246629
2.51050002424745 2.43950002087513
size_x,size_y,size_z 79 137 141
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 35
1 24 1 66360
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 36 69
1 24 66361 130824
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 70 103
1 24 130825 195288
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 104 137
1 24 195289 259752
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 1 35
25 48 259753 326112
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 36 69
25 48 326113 390576
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 70 103
25 48 390577 455040
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 104 137
25 48 455041 519504
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 1 35
49 72 519505 585864
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 36 69
49 72 585865 650328
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 70 103
49 72 650329 714792
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 104 137
49 72 714793 779256
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 35
73 95 779257 842851
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 36 69
73 95 842852 904629
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 70 103
73 95 904630 966407
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 104 137
73 95 966408 1028185
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 1 35
96 118 1028186 1091780
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 36 69
96 118 1091781 1153558
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 70 103
96 118 1153559 1215336
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 104 137
96 118 1215337 1277114
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 1 35
119 141 1277115 1340709
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 36 69
119 141 1340710 1402487
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 70 103
119 141 1402488 1464265
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 104 137
119 141 1464266 1526043
body_cg_ini 0.850000999999998 9.999999998273846E-007
6.95771875020604
3104 surfaces with wrong vertex ordering
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
7.847540176996057E-002 3.349995610000001E-002 4.700000000000000E-002
maximum ngh_surfaces and ngh_vertics are 47 22
minimum ngh_surfaces and ngh_vertics are 22 9
min IIB_cell_no 0
max IIB_cell_no 112
final initial IIB_cell_no 5600
min I_cell_no 0
max I_cell_no 96
final initial I_cell_no 4800
size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
5600 4800 5600 4800
IIB_I_cell_no_uvw_total1 1221 1206 1212 775
761 751
1 0.01904762 0.28410536 0.31610359 1.14440147 -0.14430869E+03 -0.13111542E+02 0.15251948E+07
2 0.01348578 0.34638018 0.42392119 1.23447223 -0.16528393E+03 -0.10238827E+02 0.15250907E+07
3 0.01252674 0.38305826 0.49569053 1.27891383 -0.16912542E+03 -0.95950253E+01 0.15250695E+07
4 0.01199639 0.41337279 0.54168038 1.29584768 -0.17048065E+03 -0.94814301E+01 0.15250602E+07
5 0.01165251 0.43544137 0.57347276 1.30255981 -0.17129184E+03 -0.95170304E+01 0.15250538E+07
300 0.00236362 3.56353622 5.06727508 4.03923148 -0.78697893E+03 0.15046453E+05 0.15263125E+07
600 0.00253142 2.94537779 5.74258126 4.71794271 -0.38271069E+04 -0.49150195E+04 0.15289768E+07
900 0.00220341 3.10439489 6.70144317 4.01105348 -0.71943943E+04 0.13728311E+05 0.15320532E+07
1200 0.00245748 3.53496741 7.33163591 4.01935315 -0.85017750E+04 -0.77550358E+04 0.15350351E+07
1500 0.00244299 3.71751725 5.93463559 4.12005108 -0.95364451E+04 0.81223334E+04 0.15373061E+07
1800 0.00237474 3.49908653 5.20866314 4.69712853 -0.10382365E+05 -0.18966840E+04 0.15385160E+07
escape_time reached, so abort
cd_cl_cs_mom_implicit1
-1.03894256791350 -1.53179673343374 6.737940408853320E-002
0.357464909626058 -0.103698436387821 -2.42688484514611
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./a.out on a petsc-3.6.3_static_rel named n12-09 with 24 processors, by wtay Sat Feb 27 16:09:41 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015
Max Max/Min Avg Total
Time (sec): 2.922e+03 1.00001 2.922e+03
Objects: 2.008e+04 1.00000 2.008e+04
Flops: 1.651e+11 1.08049 1.582e+11 3.797e+12
Flops/sec: 5.652e+07 1.08049 5.414e+07 1.299e+09
MPI Messages: 8.293e+04 1.89333 6.588e+04 1.581e+06
MPI Message Lengths: 4.109e+09 2.03497 4.964e+04 7.849e+10
MPI Reductions: 4.427e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.9219e+03 100.0% 3.7965e+12 100.0% 1.581e+06 100.0% 4.964e+04 100.0% 4.427e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 3998 1.0 4.4655e+01 5.1 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 820
VecDotNorm2 1999 1.0 4.0603e+01 7.6 1.59e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 902
VecNorm 3998 1.0 3.0557e+01 6.2 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 1198
VecCopy 3998 1.0 4.4206e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 12002 1.0 9.3725e+00 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPBYCZ 3998 1.0 9.1178e+00 1.5 3.18e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 8030
VecWAXPY 3998 1.0 9.3186e+00 1.5 1.59e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3928
VecAssemblyBegin 3998 1.0 1.5680e+01 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0
VecAssemblyEnd 3998 1.0 1.1443e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 16002 1.0 9.0984e+00 1.4 0.00e+00 0.0 1.2e+06 6.4e+04 0.0e+00 0 0 77100 0 0 0 77100 0 0
VecScatterEnd 16002 1.0 4.4821e+01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatMult 3998 1.0 1.4268e+02 1.3 6.05e+10 1.1 3.0e+05 1.1e+05 0.0e+00 4 37 19 43 0 4 37 19 43 0 9753
MatSolve 5997 1.0 2.0469e+02 1.4 8.84e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 53 0 0 0 6 53 0 0 0 9921
MatLUFactorNum 104 1.0 2.2332e+01 1.1 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6922
MatILUFactorSym 1 1.0 1.0867e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 1 1.0 3.8305e-02 1.9 7.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4603
MatAssemblyBegin 105 1.0 2.0776e+00 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 105 1.0 2.4702e+00 1.1 0.00e+00 0.0 1.5e+02 2.8e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 7.1249e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 105 1.0 9.8758e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1999 1.0 4.1857e+02 1.0 1.65e+11 1.1 3.0e+05 1.1e+05 1.0e+04 14100 19 43 23 14100 19 43 23 9070
PCSetUp 208 1.0 2.2440e+01 1.1 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6888
PCSetUpOnBlocks 1999 1.0 2.7087e-01 1.1 6.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 5487
PCApply 5997 1.0 2.3123e+02 1.3 9.50e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 9444
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 4032 4032 31782464 0
Vector Scatter 2010 15 3738624 0
Matrix 4 4 190398024 0
Distributed Mesh 2003 8 39680 0
Star Forest Bipartite Graph 4006 16 13696 0
Discrete System 2003 8 6784 0
Index Set 4013 4013 14715400 0
IS L to G Mapping 2003 8 2137148 0
Krylov Solver 2 2 2296 0
Preconditioner 2 2 1896 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 8.15392e-06
Average time for zero size MPI_Send(): 1.12454e-05
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------
Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------
Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl
-----------------------------------------
-------------- next part --------------
0.000000000000000E+000 0.600000000000000 17.5000000000000
120.000000000000 0.000000000000000E+000 0.250000000000000
1.00000000000000 0.400000000000000 0 -400000
AB,AA,BB -2.78150003711926 2.76500003633555
2.78150003711926 2.70650003355695
size_x,size_y,size_z 100 172 171
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 29
1 43 1 124700
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 30 58
1 43 124701 249400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 59 87
1 43 249401 374100
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 88 116
1 43 374101 498800
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 117 144
1 43 498801 619200
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 145 172
1 43 619201 739600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 1 29
44 86 739601 864300
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 30 58
44 86 864301 989000
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 59 87
44 86 989001 1113700
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 88 116
44 86 1113701 1238400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 117 144
44 86 1238401 1358800
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 145 172
44 86 1358801 1479200
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 29
87 129 1479201 1603900
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 30 58
87 129 1603901 1728600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 59 87
87 129 1728601 1853300
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 88 116
87 129 1853301 1978000
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 117 144
87 129 1978001 2098400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 145 172
87 129 2098401 2218800
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 1 29
130 171 2218801 2340600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 30 58
130 171 2340601 2462400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 59 87
130 171 2462401 2584200
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 88 116
130 171 2584201 2706000
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 117 144
130 171 2706001 2823600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 145 172
130 171 2823601 2941200
body_cg_ini 0.850000999999998 9.999999998273846E-007
6.95771875020604
3104 surfaces with wrong vertex ordering
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
7.847540176996057E-002 3.349995610000001E-002 3.500000000000000E-002
maximum ngh_surfaces and ngh_vertics are 28 12
minimum ngh_surfaces and ngh_vertics are 14 5
min IIB_cell_no 0
max IIB_cell_no 229
final initial IIB_cell_no 11450
min I_cell_no 0
max I_cell_no 200
final initial I_cell_no 10000
size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
11450 10000 11450 10000
IIB_I_cell_no_uvw_total1 2230 2227 2166 1930
1926 1847
1 0.01411765 0.30104754 0.32529731 1.15440698 -0.30539502E+03 -0.29715696E+02 0.29394159E+07
2 0.00973086 0.41244573 0.45086899 1.22116550 -0.34890134E+03 -0.25062690E+02 0.29392110E+07
3 0.00918177 0.45383616 0.51179402 1.27757073 -0.35811483E+03 -0.25027396E+02 0.29391677E+07
4 0.00885764 0.47398774 0.55169119 1.31019526 -0.36250500E+03 -0.25910050E+02 0.29391470E+07
5 0.00872241 0.48832538 0.57967282 1.32679047 -0.36545763E+03 -0.26947216E+02 0.29391325E+07
300 0.00163886 4.27898628 6.83028522 3.60837060 -0.19609891E+04 0.43984454E+05 0.29435194E+07
600 0.00160193 3.91014241 4.97460210 5.10461274 -0.61092521E+03 0.18910563E+05 0.29467790E+07
900 0.00150521 3.27352854 5.85427996 4.49166453 -0.89281765E+04 -0.12171584E+05 0.29507471E+07
1200 0.00165280 3.05922213 7.37243530 5.16434634 -0.10954640E+05 0.22049957E+05 0.29575213E+07
1500 0.00153718 3.54908044 5.42918256 4.84940953 -0.16430153E+05 0.24407130E+05 0.29608940E+07
1800 0.00155455 3.30956962 8.35799538 4.50638757 -0.20003619E+05 -0.20349497E+05 0.29676102E+07
escape_time reached, so abort
cd_cl_cs_mom_implicit1
-1.29348921431473 -2.44525665200003 -0.238725356553914
0.644444280391413 -3.056662699041206E-002 -2.91791118488116
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./a.out on a petsc-3.6.3_static_rel named n12-06 with 24 processors, by wtay Mon Feb 29 21:45:09 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015
Max Max/Min Avg Total
Time (sec): 5.933e+03 1.00000 5.933e+03
Objects: 2.008e+04 1.00000 2.008e+04
Flops: 3.129e+11 1.06806 3.066e+11 7.360e+12
Flops/sec: 5.273e+07 1.06807 5.169e+07 1.241e+09
MPI Messages: 8.298e+04 1.89703 6.585e+04 1.580e+06
MPI Message Lengths: 6.456e+09 2.05684 7.780e+04 1.229e+11
MPI Reductions: 4.427e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 5.9326e+03 100.0% 7.3595e+12 100.0% 1.580e+06 100.0% 7.780e+04 100.0% 4.427e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 3998 1.0 1.0612e+02 2.0 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 665
VecDotNorm2 1999 1.0 9.4306e+01 2.1 2.99e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 748
VecNorm 3998 1.0 8.7330e+01 2.0 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 808
VecCopy 3998 1.0 7.4317e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 12002 1.0 1.1626e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPBYCZ 3998 1.0 1.7543e+01 1.4 5.98e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 8044
VecWAXPY 3998 1.0 1.6637e+01 1.4 2.99e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4241
VecAssemblyBegin 3998 1.0 3.0367e+01 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0
VecAssemblyEnd 3998 1.0 1.5386e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 16002 1.0 1.7833e+01 1.4 0.00e+00 0.0 1.2e+06 1.0e+05 0.0e+00 0 0 77100 0 0 0 77100 0 0
VecScatterEnd 16002 1.0 1.2689e+02 2.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatMult 3998 1.0 3.1700e+02 1.3 1.15e+11 1.1 3.0e+05 1.7e+05 0.0e+00 5 37 19 43 0 5 37 19 43 0 8482
MatSolve 5997 1.0 3.6841e+02 1.3 1.67e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 54 0 0 0 6 54 0 0 0 10707
MatLUFactorNum 104 1.0 4.3137e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 7016
MatILUFactorSym 1 1.0 3.5212e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 1 1.0 9.1592e-02 3.0 1.45e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3720
MatAssemblyBegin 105 1.0 5.1547e+00 4.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 105 1.0 4.7898e+00 1.1 0.00e+00 0.0 1.5e+02 4.3e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.0590e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 105 1.0 4.5063e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1999 1.0 9.1527e+02 1.0 3.13e+11 1.1 3.0e+05 1.7e+05 1.0e+04 15100 19 43 23 15100 19 43 23 8040
PCSetUp 208 1.0 4.3499e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6958
PCSetUpOnBlocks 1999 1.0 8.2526e-01 1.3 1.25e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3526
PCApply 5997 1.0 4.1644e+02 1.3 1.80e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 10192
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 4032 4032 53827712 0
Vector Scatter 2010 15 7012720 0
Matrix 4 4 359683260 0
Distributed Mesh 2003 8 39680 0
Star Forest Bipartite Graph 4006 16 13696 0
Discrete System 2003 8 6784 0
Index Set 4013 4013 25819112 0
IS L to G Mapping 2003 8 3919440 0
Krylov Solver 2 2 2296 0
Preconditioner 2 2 1896 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 1.03951e-05
Average time for zero size MPI_Send(): 1.83781e-06
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------
Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------
Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl
-----------------------------------------
-------------- next part --------------
0.000000000000000E+000 0.600000000000000 17.5000000000000
120.000000000000 0.000000000000000E+000 0.250000000000000
1.00000000000000 0.400000000000000 0 -400000
AB,AA,BB -2.51050002424745 2.47300002246629
2.51050002424745 2.43950002087513
size_x,size_y,size_z 79 137 141
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 35
1 24 1 66360
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 36 69
1 24 66361 130824
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 70 103
1 24 130825 195288
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 104 137
1 24 195289 259752
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 1 35
25 48 259753 326112
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 36 69
25 48 326113 390576
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 70 103
25 48 390577 455040
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 104 137
25 48 455041 519504
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 1 35
49 72 519505 585864
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 36 69
49 72 585865 650328
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 70 103
49 72 650329 714792
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 104 137
49 72 714793 779256
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 35
73 95 779257 842851
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 36 69
73 95 842852 904629
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 70 103
73 95 904630 966407
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 104 137
73 95 966408 1028185
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 1 35
96 118 1028186 1091780
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 36 69
96 118 1091781 1153558
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 70 103
96 118 1153559 1215336
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 104 137
96 118 1215337 1277114
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 1 35
119 141 1277115 1340709
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 36 69
119 141 1340710 1402487
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 70 103
119 141 1402488 1464265
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 104 137
119 141 1464266 1526043
body_cg_ini 0.850000999999998 9.999999998273846E-007
6.95771875020604
3104 surfaces with wrong vertex ordering
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
7.847540176996057E-002 3.349995610000001E-002 4.700000000000000E-002
maximum ngh_surfaces and ngh_vertics are 47 22
minimum ngh_surfaces and ngh_vertics are 22 9
min IIB_cell_no 0
max IIB_cell_no 112
final initial IIB_cell_no 5600
min I_cell_no 0
max I_cell_no 96
final initial I_cell_no 4800
size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
5600 4800 5600 4800
IIB_I_cell_no_uvw_total1 1221 1206 1212 775
761 751
1 0.01904762 0.28410536 0.31610359 1.14440147 -0.14430869E+03 -0.13111542E+02 0.15251948E+07
2 0.01348578 0.34638018 0.42392119 1.23447223 -0.16528393E+03 -0.10238827E+02 0.15250907E+07
3 0.01252674 0.38305826 0.49569053 1.27891383 -0.16912542E+03 -0.95950253E+01 0.15250695E+07
4 0.01199639 0.41337279 0.54168038 1.29584768 -0.17048065E+03 -0.94814301E+01 0.15250602E+07
5 0.01165251 0.43544137 0.57347276 1.30255981 -0.17129184E+03 -0.95170304E+01 0.15250538E+07
300 0.00236362 3.56353622 5.06727508 4.03923148 -0.78697893E+03 0.15046453E+05 0.15263125E+07
600 0.00253142 2.94537779 5.74258126 4.71794271 -0.38271069E+04 -0.49150195E+04 0.15289768E+07
900 0.00220341 3.10439489 6.70144317 4.01105348 -0.71943943E+04 0.13728311E+05 0.15320532E+07
1200 0.00245748 3.53496741 7.33163591 4.01935315 -0.85017750E+04 -0.77550358E+04 0.15350351E+07
1500 0.00244299 3.71751725 5.93463559 4.12005108 -0.95364451E+04 0.81223334E+04 0.15373061E+07
1800 0.00237474 3.49908653 5.20866314 4.69712853 -0.10382365E+05 -0.18966840E+04 0.15385160E+07
escape_time reached, so abort
cd_cl_cs_mom_implicit1
-1.03894256791350 -1.53179673343374 6.737940408853320E-002
0.357464909626058 -0.103698436387821 -2.42688484514611
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./a.out on a petsc-3.6.3_static_rel named n12-06 with 24 processors, by wtay Mon Feb 29 20:55:15 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015
Max Max/Min Avg Total
Time (sec): 2.938e+03 1.00001 2.938e+03
Objects: 2.008e+04 1.00000 2.008e+04
Flops: 1.651e+11 1.08049 1.582e+11 3.797e+12
Flops/sec: 5.620e+07 1.08049 5.384e+07 1.292e+09
MPI Messages: 8.293e+04 1.89333 6.588e+04 1.581e+06
MPI Message Lengths: 4.109e+09 2.03497 4.964e+04 7.849e+10
MPI Reductions: 4.427e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.9382e+03 100.0% 3.7965e+12 100.0% 1.581e+06 100.0% 4.964e+04 100.0% 4.427e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 3998 1.0 3.7060e+01 4.0 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 988
VecDotNorm2 1999 1.0 3.3165e+01 5.1 1.59e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 1104
VecNorm 3998 1.0 3.0081e+01 5.7 1.59e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 1217
VecCopy 3998 1.0 4.2268e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 12002 1.0 9.0293e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPBYCZ 3998 1.0 8.8463e+00 1.5 3.18e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 8276
VecWAXPY 3998 1.0 9.0856e+00 1.6 1.59e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4029
VecAssemblyBegin 3998 1.0 1.2290e+01 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0
VecAssemblyEnd 3998 1.0 1.1405e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 16002 1.0 9.0506e+00 1.4 0.00e+00 0.0 1.2e+06 6.4e+04 0.0e+00 0 0 77100 0 0 0 77100 0 0
VecScatterEnd 16002 1.0 4.8845e+01 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatMult 3998 1.0 1.3324e+02 1.1 6.05e+10 1.1 3.0e+05 1.1e+05 0.0e+00 4 37 19 43 0 4 37 19 43 0 10444
MatSolve 5997 1.0 1.9260e+02 1.4 8.84e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 53 0 0 0 6 53 0 0 0 10543
MatLUFactorNum 104 1.0 2.3135e+01 1.2 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6681
MatILUFactorSym 1 1.0 1.4099e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 1 1.0 4.5088e-02 2.6 7.67e+06 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3911
MatAssemblyBegin 105 1.0 2.4788e+0011.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 105 1.0 2.4778e+00 1.1 0.00e+00 0.0 1.5e+02 2.8e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 1.0 4.0531e-06 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 7.9679e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 105 1.0 9.6669e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1999 1.0 3.9941e+02 1.0 1.65e+11 1.1 3.0e+05 1.1e+05 1.0e+04 14100 19 43 23 14100 19 43 23 9505
PCSetUp 208 1.0 2.3286e+01 1.2 6.70e+09 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 6638
PCSetUpOnBlocks 1999 1.0 3.7027e-01 1.3 6.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4014
PCApply 5997 1.0 2.1975e+02 1.4 9.50e+10 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 9937
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 4032 4032 31782464 0
Vector Scatter 2010 15 3738624 0
Matrix 4 4 190398024 0
Distributed Mesh 2003 8 39680 0
Star Forest Bipartite Graph 4006 16 13696 0
Discrete System 2003 8 6784 0
Index Set 4013 4013 14715400 0
IS L to G Mapping 2003 8 2137148 0
Krylov Solver 2 2 2296 0
Preconditioner 2 2 1896 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 7.20024e-06
Average time for zero size MPI_Send(): 2.08616e-06
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------
Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------
Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl
-----------------------------------------
-------------- next part --------------
0.000000000000000E+000 0.600000000000000 17.5000000000000
120.000000000000 0.000000000000000E+000 0.250000000000000
1.00000000000000 0.400000000000000 0 -400000
AB,AA,BB -2.78150003711926 2.76500003633555
2.78150003711926 2.70650003355695
size_x,size_y,size_z 100 172 171
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 0 1 29
1 43 1 124700
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 1 30 58
1 43 124701 249400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 2 59 87
1 43 249401 374100
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 3 88 116
1 43 374101 498800
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 4 117 144
1 43 498801 619200
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 5 145 172
1 43 619201 739600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 6 1 29
44 86 739601 864300
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 7 30 58
44 86 864301 989000
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 8 59 87
44 86 989001 1113700
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 9 88 116
44 86 1113701 1238400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 10 117 144
44 86 1238401 1358800
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 11 145 172
44 86 1358801 1479200
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 12 1 29
87 129 1479201 1603900
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 13 30 58
87 129 1603901 1728600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 14 59 87
87 129 1728601 1853300
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 15 88 116
87 129 1853301 1978000
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 16 117 144
87 129 1978001 2098400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 17 145 172
87 129 2098401 2218800
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 18 1 29
130 171 2218801 2340600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 19 30 58
130 171 2340601 2462400
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 20 59 87
130 171 2462401 2584200
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 21 88 116
130 171 2584201 2706000
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 22 117 144
130 171 2706001 2823600
myid,jsta,jend,ksta,kend,ijk_sta,ijk_end 23 145 172
130 171 2823601 2941200
body_cg_ini 0.850000999999998 9.999999998273846E-007
6.95771875020604
3104 surfaces with wrong vertex ordering
Warning - length difference between element and cell
max_element_length,min_element_length,min_delta
7.847540176996057E-002 3.349995610000001E-002 3.500000000000000E-002
maximum ngh_surfaces and ngh_vertics are 28 12
minimum ngh_surfaces and ngh_vertics are 14 5
min IIB_cell_no 0
max IIB_cell_no 229
final initial IIB_cell_no 11450
min I_cell_no 0
max I_cell_no 200
final initial I_cell_no 10000
size(IIB_cell_u),size(I_cell_u),size(IIB_equal_cell_u),size(I_equal_cell_u)
11450 10000 11450 10000
IIB_I_cell_no_uvw_total1 2230 2227 2166 1930
1926 1847
1 0.01411765 0.30104754 0.32529731 1.15440698 -0.30539502E+03 -0.29715696E+02 0.29394159E+07
2 0.00973086 0.41244573 0.45086899 1.22116550 -0.34890134E+03 -0.25062690E+02 0.29392110E+07
3 0.00918177 0.45383616 0.51179402 1.27757073 -0.35811483E+03 -0.25027396E+02 0.29391677E+07
4 0.00885764 0.47398774 0.55169119 1.31019526 -0.36250500E+03 -0.25910050E+02 0.29391470E+07
5 0.00872241 0.48832538 0.57967282 1.32679047 -0.36545763E+03 -0.26947216E+02 0.29391325E+07
300 0.00163886 4.27898628 6.83028522 3.60837060 -0.19609891E+04 0.43984454E+05 0.29435194E+07
600 0.00160193 3.91014241 4.97460210 5.10461274 -0.61092521E+03 0.18910563E+05 0.29467790E+07
900 0.00150521 3.27352854 5.85427996 4.49166453 -0.89281765E+04 -0.12171584E+05 0.29507471E+07
1200 0.00165280 3.05922213 7.37243530 5.16434634 -0.10954640E+05 0.22049957E+05 0.29575213E+07
1500 0.00153718 3.54908044 5.42918256 4.84940953 -0.16430153E+05 0.24407130E+05 0.29608940E+07
1800 0.00155455 3.30956962 8.35799538 4.50638757 -0.20003619E+05 -0.20349497E+05 0.29676102E+07
escape_time reached, so abort
cd_cl_cs_mom_implicit1
-1.29348921431473 -2.44525665200003 -0.238725356553914
0.644444280391413 -3.056662699041206E-002 -2.91791118488116
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./a.out on a petsc-3.6.3_static_rel named n12-09 with 24 processors, by wtay Sat Feb 27 16:58:01 2016
Using Petsc Release Version 3.6.3, Dec, 03, 2015
Max Max/Min Avg Total
Time (sec): 5.791e+03 1.00001 5.791e+03
Objects: 2.008e+04 1.00000 2.008e+04
Flops: 3.129e+11 1.06806 3.066e+11 7.360e+12
Flops/sec: 5.402e+07 1.06807 5.295e+07 1.271e+09
MPI Messages: 8.298e+04 1.89703 6.585e+04 1.580e+06
MPI Message Lengths: 6.456e+09 2.05684 7.780e+04 1.229e+11
MPI Reductions: 4.427e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 5.7911e+03 100.0% 7.3595e+12 100.0% 1.580e+06 100.0% 7.780e+04 100.0% 4.427e+04 100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 3998 1.0 1.1437e+02 2.3 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 617
VecDotNorm2 1999 1.0 1.0442e+02 2.6 2.99e+09 1.1 0.0e+00 0.0e+00 2.0e+03 1 1 0 0 5 1 1 0 0 5 676
VecNorm 3998 1.0 8.5426e+01 2.2 2.99e+09 1.1 0.0e+00 0.0e+00 4.0e+03 1 1 0 0 9 1 1 0 0 9 826
VecCopy 3998 1.0 7.3321e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 12002 1.0 1.2399e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPBYCZ 3998 1.0 1.8118e+01 1.4 5.98e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 7788
VecWAXPY 3998 1.0 1.6979e+01 1.3 2.99e+09 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 4155
VecAssemblyBegin 3998 1.0 4.1001e+01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+04 0 0 0 0 27 0 0 0 0 27 0
VecAssemblyEnd 3998 1.0 1.4657e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 16002 1.0 1.9519e+01 1.5 0.00e+00 0.0 1.2e+06 1.0e+05 0.0e+00 0 0 77100 0 0 0 77100 0 0
VecScatterEnd 16002 1.0 1.3223e+02 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatMult 3998 1.0 3.0904e+02 1.3 1.15e+11 1.1 3.0e+05 1.7e+05 0.0e+00 5 37 19 43 0 5 37 19 43 0 8700
MatSolve 5997 1.0 3.9285e+02 1.4 1.67e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 54 0 0 0 6 54 0 0 0 10040
MatLUFactorNum 104 1.0 4.2097e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 7190
MatILUFactorSym 1 1.0 2.9875e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 1 1.0 1.3492e-01 3.3 1.45e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2525
MatAssemblyBegin 105 1.0 5.9000e+00 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+02 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 105 1.0 4.7665e+00 1.1 0.00e+00 0.0 1.5e+02 4.3e+04 8.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 1.0 3.6001e-0518.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.6249e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 105 1.0 2.7945e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1999 1.0 9.1973e+02 1.0 3.13e+11 1.1 3.0e+05 1.7e+05 1.0e+04 16100 19 43 23 16100 19 43 23 8001
PCSetUp 208 1.0 4.2401e+01 1.2 1.30e+10 1.1 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 7138
PCSetUpOnBlocks 1999 1.0 7.2389e-01 1.2 1.25e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4020
PCApply 5997 1.0 4.4054e+02 1.3 1.80e+11 1.1 0.0e+00 0.0e+00 0.0e+00 6 58 0 0 0 6 58 0 0 0 9634
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 4032 4032 53827712 0
Vector Scatter 2010 15 7012720 0
Matrix 4 4 359683260 0
Distributed Mesh 2003 8 39680 0
Star Forest Bipartite Graph 4006 16 13696 0
Discrete System 2003 8 6784 0
Index Set 4013 4013 25819112 0
IS L to G Mapping 2003 8 3919440 0
Krylov Solver 2 2 2296 0
Preconditioner 2 2 1896 0
Viewer 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 7.20024e-06
Average time for zero size MPI_Send(): 1.83781e-06
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --with-blas-lapack-dir=/opt/ud/intel_xe_2013sp1/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.6.3_static_rel --known-mpi-shared=0 --with-shared-libraries=0 --with-fortran-interfaces=1
-----------------------------------------
Libraries compiled on Thu Jan 7 04:05:35 2016 on hpc12
Machine characteristics: Linux-3.10.0-123.20.1.el7.x86_64-x86_64-with-centos-7.1.1503-Core
Using PETSc directory: /home/wtay/Codes/petsc-3.6.3
Using PETSc arch: petsc-3.6.3_static_rel
-----------------------------------------
Using C compiler: /opt/ud/openmpi-1.8.8/bin/mpicc -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/ud/openmpi-1.8.8/bin/mpif90 -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/include -I/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/include -I/home/wtay/Lib/petsc-3.6.3_static_rel/include -I/opt/ud/openmpi-1.8.8/include
-----------------------------------------
Using C linker: /opt/ud/openmpi-1.8.8/bin/mpicc
Using Fortran linker: /opt/ud/openmpi-1.8.8/bin/mpif90
Using libraries: -Wl,-rpath,/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -L/home/wtay/Codes/petsc-3.6.3/petsc-3.6.3_static_rel/lib -lpetsc -Wl,-rpath,/home/wtay/Lib/petsc-3.6.3_static_rel/lib -L/home/wtay/Lib/petsc-3.6.3_static_rel/lib -lHYPRE -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -lmpi_cxx -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -Wl,-rpath,/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -L/opt/ud/intel_xe_2013sp1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -lX11 -lhwloc -lssl -lcrypto -lmpi_usempi -lmpi_mpifh -lifport -lifcore -lm -lmpi_cxx -ldl -L/opt/ud/openmpi-1.8.8/lib -lmpi -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -Wl,-rpath,/opt/ud/openmpi-1.8.8/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -L/opt/ud/openmpi-1.8.8/lib -L/opt/ud/intel_xe_2013sp1/composer_xe_2013_sp1.2.144/compiler/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.3 -ldl
-----------------------------------------