0 SNES Function norm 1.030411923746e+00 
    0 KSP Residual norm 2.680798806895e+02 
    1 KSP Residual norm 5.116282449173e-02 
    2 KSP Residual norm 3.930093324656e-04 
    3 KSP Residual norm 3.107389602319e-06 
    4 KSP Residual norm 3.953598065105e-08 
  1 SNES Function norm 3.234968282944e-05 
    0 KSP Residual norm 2.088846781943e+01 
    1 KSP Residual norm 4.567926799785e-06 
    2 KSP Residual norm 3.671104671206e-09 
  2 SNES Function norm 2.483905947937e-07 
    0 KSP Residual norm 1.711385367248e-01 
    1 KSP Residual norm 2.599374047625e-08 
    2 KSP Residual norm 3.187383894150e-11 
  3 SNES Function norm 2.340679954558e-11 
SNES Object: 16 MPI processes
  type: newtonls
  maximum iterations=50, maximum function evaluations=10000
  tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
  total number of linear solver iterations=8
  total number of function evaluations=4
  norm schedule ALWAYS
  SNESLineSearch Object:   16 MPI processes
    type: bt
      interpolation: cubic
      alpha=1.000000e-04
    maxstep=1.000000e+08, minlambda=1.000000e-12
    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
    maximum iterations=40
  KSP Object:   16 MPI processes
    type: gmres
      GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
      GMRES: happy breakdown tolerance 1e-30
    maximum iterations=10000, initial guess is zero
    tolerances:  relative=1e-09, absolute=1e-50, divergence=10000
    left preconditioning
    using PRECONDITIONED norm type for convergence test
  PC Object:   16 MPI processes
    type: mg
      MG: type is MULTIPLICATIVE, levels=2 cycles=v
        Cycles per PCApply=1
        Not using Galerkin computed coarse grid matrices
    Coarse grid solver -- level -------------------------------
      KSP Object:      (mg_coarse_)       16 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_coarse_)       16 MPI processes
        type: redundant
          Redundant preconditioner: First (color=0) of 16 PCs follows
        KSP Object:        (mg_coarse_redundant_)         1 MPI processes
          type: preonly
          maximum iterations=10000, initial guess is zero
          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
          left preconditioning
          using NONE norm type for convergence test
        PC Object:        (mg_coarse_redundant_)         1 MPI processes
          type: lu
            LU: out-of-place factorization
            tolerance for zero pivot 2.22045e-14
            using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
            matrix ordering: nd
            factor fill ratio given 5, needed 13.7197
              Factored matrix follows:
                Mat Object:                 1 MPI processes
                  type: seqaij
                  rows=1640961, cols=1640961
                  package used to perform factorization: petsc
                  total: nonzeros=1.12497e+08, allocated nonzeros=1.12497e+08
                  total number of mallocs used during MatSetValues calls =0
                    not using I-node routines
          linear system matrix = precond matrix:
          Mat Object:           1 MPI processes
            type: seqaij
            rows=1640961, cols=1640961
            total: nonzeros=8.19968e+06, allocated nonzeros=8.19968e+06
            total number of mallocs used during MatSetValues calls =0
              not using I-node routines
        linear system matrix = precond matrix:
        Mat Object:         16 MPI processes
          type: mpiaij
          rows=1640961, cols=1640961
          total: nonzeros=8.19968e+06, allocated nonzeros=8.19968e+06
          total number of mallocs used during MatSetValues calls =0
    Down solver (pre-smoother) on level 1 -------------------------------
      KSP Object:      (mg_levels_1_)       16 MPI processes
        type: richardson
          Richardson: damping factor=1
        maximum iterations=2
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using nonzero initial guess
        using NONE norm type for convergence test
      PC Object:      (mg_levels_1_)       16 MPI processes
        type: sor
          SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
        linear system matrix = precond matrix:
        Mat Object:         16 MPI processes
          type: mpiaij
          rows=6558721, cols=6558721
          total: nonzeros=3.27834e+07, allocated nonzeros=3.27834e+07
          total number of mallocs used during MatSetValues calls =0
    Up solver (post-smoother) same as down solver (pre-smoother)
    linear system matrix = precond matrix:
    Mat Object:     16 MPI processes
      type: mpiaij
      rows=6558721, cols=6558721
      total: nonzeros=3.27834e+07, allocated nonzeros=3.27834e+07
      total number of mallocs used during MatSetValues calls =0
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex5 on a arch-linux2-c-opt named helios91 with 16 processors, by tnicolas Thu Oct 15 13:45:27 2015
Using Petsc Release Version 3.6.0, Jun, 09, 2015 

                         Max       Max/Min        Avg      Total 
Time (sec):           2.315e+02      1.00031   2.315e+02
Objects:              1.690e+02      1.00000   1.690e+02
Flops:                1.430e+11      1.00001   1.430e+11  2.288e+12
Flops/sec:            6.179e+08      1.00031   6.179e+08  9.886e+09
MPI Messages:         5.930e+02      1.54830   4.898e+02  7.836e+03
MPI Message Lengths:  1.585e+08      1.00470   3.231e+05  2.532e+09
MPI Reductions:       2.730e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.3146e+02 100.0%  2.2881e+12 100.0%  7.836e+03 100.0%  3.231e+05      100.0%  2.720e+02  99.6% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 2.3014e+02 1.0 1.43e+11 1.0 7.5e+03 3.4e+05 2.3e+02 99 100 96 100 83  99 100 96 100 83  9942
SNESFunctionEval       4 1.0 5.8257e-02 1.3 1.81e+07 1.0 1.9e+02 5.1e+03 0.0e+00  0  0  2  0  0   0  0  2  0  0  4954
SNESJacobianEval       6 1.0 5.7896e-01 1.0 0.00e+00 0.0 2.9e+02 3.8e+03 1.2e+01  0  0  4  0  4   0  0  4  0  4     0
SNESLineSearch         3 1.0 1.4320e-01 1.1 3.82e+07 1.0 2.9e+02 5.1e+03 1.2e+01  0  0  4  0  4   0  0  4  0  4  4259
VecDot                 3 1.0 2.9311e-02 2.0 2.47e+06 1.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  1   0  0  0  0  1  1343
VecMDot                8 1.0 1.5342e-01 1.7 1.31e+07 1.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  3   0  0  0  0  3  1368
VecNorm               18 1.0 1.2217e-01 1.7 1.48e+07 1.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  7   0  0  0  0  7  1933
VecScale              44 1.0 8.9462e-03 1.2 4.59e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  8178
VecCopy                9 1.0 1.8710e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                88 1.0 1.8393e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                3 1.0 5.5678e-03 1.3 2.47e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  7068
VecAYPX               11 1.0 2.3399e-02 2.4 4.52e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3083
VecWAXPY               3 1.0 1.1509e-02 1.3 1.23e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1710
VecMAXPY              11 1.0 3.7277e-02 1.3 1.97e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  8445
VecPointwiseMult       3 1.0 9.1434e-04 1.4 3.09e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  5384
VecScatterBegin      114 1.0 1.1982e-01 1.5 0.00e+00 0.0 6.6e+03 3.3e+05 0.0e+00  0  0 84 86  0   0  0 84 86  0     0
VecScatterEnd        114 1.0 2.5024e+00 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
VecReduceArith         6 1.0 5.5261e-03 1.3 4.93e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 14242
VecReduceComm          3 1.0 6.8271e-03 17.7 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  1   0  0  0  0  1     0
VecNormalize          11 1.0 9.2880e-02 2.0 1.36e+07 1.0 0.0e+00 0.0e+00 1.1e+01  0  0  0  0  4   0  0  0  0  4  2330
MatMult               22 1.0 3.0854e-01 1.5 8.13e+07 1.0 1.1e+03 5.1e+03 0.0e+00  0  0 13  0  0   0  0 13  0  0  4208
MatMultAdd            11 1.0 1.8070e+00 54.0 2.03e+07 1.0 3.6e+02 1.9e+03 0.0e+00  0  0  5  0  0   0  0  5  0  0   180
MatMultTranspose      15 1.0 1.6727e-01 2.8 2.77e+07 1.0 5.0e+02 1.9e+03 0.0e+00  0  0  6  0  0   0  0  6  0  0  2646
MatSolve              11 1.0 5.3905e+00 1.5 2.46e+09 1.0 0.0e+00 0.0e+00 0.0e+00  2  2  0  0  0   2  2  0  0  0  7292
MatSOR                22 1.0 3.9514e+01 13.0 2.50e+08 1.0 1.6e+03 5.1e+03 4.4e+01  9  0 20  0 16   9  0 20  0 16   101
MatLUFactorSym         1 1.0 3.3988e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatLUFactorNum         3 1.0 2.1303e+02 1.2 1.40e+11 1.0 0.0e+00 0.0e+00 0.0e+00 85 98  0  0  0  85 98  0  0  0 10522
MatCopy                2 1.0 1.3462e-01 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert             1 1.0 1.8156e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatResidual           11 1.0 2.1085e-01 2.3 4.52e+07 1.0 5.3e+02 5.1e+03 0.0e+00  0  0  7  0  0   0  0  7  0  0  3421
MatAssemblyBegin      10 1.0 2.3481e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  7   0  0  0  0  7     0
MatAssemblyEnd        10 1.0 1.8402e-01 1.2 0.00e+00 0.0 2.6e+02 8.4e+02 2.4e+01  0  0  3  0  9   0  0  3  0  9     0
MatGetRowIJ            1 1.0 1.1500e-01 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice       3 1.0 6.3075e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  3   0  0  0  0  3     0
MatGetOrdering         1 1.0 1.9039e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatView                5 1.7 2.0039e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  1   0  0  0  0  1     0
MatRedundantMat        3 1.0 9.4506e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  3   0  0  0  0  3     0
KSPGMRESOrthog         8 1.0 1.7433e-01 1.6 2.63e+07 1.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  3   0  0  0  0  3  2408
KSPSetUp              12 1.0 1.9831e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  4   0  0  0  0  4     0
KSPSolve               3 1.0 2.2951e+02 1.0 1.43e+11 1.0 7.1e+03 3.6e+05 2.0e+02 99 100 90 100 75  99 100 90 100 75  9967
PCSetUp                3 1.0 2.1978e+02 1.2 1.40e+11 1.0 1.2e+03 2.9e+05 1.1e+02 88 98 15 14 41  88 98 15 14 41 10199
PCApply               11 1.0 4.4227e+01 4.8 2.79e+09 1.0 5.5e+03 4.0e+05 4.4e+01 11  2 70 86 16  11  2 70 86 16  1010
------------------------------------------------------------------------------------------------------------------------
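
Note: the Total Mflop/s column in the table above can be cross-checked by hand. The sketch below is an illustration only, not part of the PETSc output. It assumes that every process performed the same number of flops (the Flops ratio column reads 1.0), so the sum over processes is taken as 16 times the per-process maximum. The function name total_mflops is hypothetical, and the inputs are the rounded values printed in the table, so agreement is only expected to within rounding.

```python
# Cross-check of the "Total Mflop/s" column using values copied from the table.
# Assumption: flops are balanced across ranks (ratio 1.0), so sum = nprocs * max.

NPROCS = 16  # from "16 MPI processes"


def total_mflops(max_flops_per_proc, max_time_sec, nprocs=NPROCS):
    """Return 1e-6 * (sum of flops over all processes) / (max time)."""
    flops_sum = nprocs * max_flops_per_proc
    return 1e-6 * flops_sum / max_time_sec


# (event name, max flops per process, max time in seconds, reported Mflop/s)
rows = [
    ("SNESSolve", 1.43e11, 2.3014e02, 9942),
    ("MatLUFactorNum", 1.40e11, 2.1303e02, 10522),
    ("MatSolve", 2.46e09, 5.3905e00, 7292),
]

for name, flops, seconds, reported in rows:
    estimate = total_mflops(flops, seconds)
    rel_diff = abs(estimate - reported) / reported
    print(f"{name:16s} estimate={estimate:10.1f} reported={reported:6d} rel_diff={rel_diff:.4f}")
```

The same rows also show where the time goes: MatLUFactorNum (the redundant LU factorization on the coarse level) accounts for 85% of the run time and 98% of the flops.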

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

                SNES     1              1         1332     0
      SNESLineSearch     1              1          864     0
              DMSNES     3              3         2072     0
              Vector    80             80    224470328     0
      Vector Scatter    10             10     11530944     0
              Matrix    13             13   1680657104     0
    Distributed Mesh     5              5        24416     0
Star Forest Bipartite Graph    10             10         8448     0
     Discrete System     5              5         4240     0
           Index Set    25             25     30882964     0
   IS L to G Mapping     4              4      4129280     0
       Krylov Solver     4              4        22016     0
     DMKSP interface     2              2         1296     0
      Preconditioner     4              4         3944     0
              Viewer     2              1          760     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 2.57492e-06
Average time for zero size MPI_Send(): 6.31809e-06
#PETSc Option Table entries:
-da_grid_x 21
-da_grid_y 21
-da_refine 7
-ksp_monitor
-ksp_rtol 1e-9
-log_summary
-mg_levels_ksp_type richardson
-pc_mg_levels 2
-pc_type mg
-snes_monitor
-snes_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --prefix=/csc/softs/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real --with-debugging=0 --with-x=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-fortran --known-mpi-shared-libraries=1 --with-scalar-type=real --with-precision=double --CFLAGS="-g -O3 -mavx -mkl" --CXXFLAGS="-g -O3 -mavx -mkl" --FFLAGS="-g -O3 -mavx -mkl"
-----------------------------------------
Libraries compiled on Mon Sep 28 20:22:47 2015 on helios85 
Machine characteristics: Linux-2.6.32-573.1.1.el6.Bull.80.x86_64-x86_64-with-redhat-6.4-Santiago
Using PETSc directory: /csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc -g -O3 -mavx -mkl -fPIC  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -g -O3 -mavx -mkl -fPIC   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-c-opt/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-c-opt/include -I/opt/mpi/bullxmpi/1.2.8.2/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-c-opt/lib -L/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-c-opt/lib -lpetsc -lhwloc -lxml2 -lssl -lcrypto -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -lmpi_f90 -lmpi_f77 -lm -lifport -lifcore -lm -lmpi_cxx -ldl -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -lmpi -lnuma -lrt -lnsl -lutil -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -limf -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 
-L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -ldl 
-----------------------------------------