[MOAB-dev] DMMoabLoadFromFile() - parallel performance issue

WARNER, JAMES E. (LARC-D309) james.e.warner at nasa.gov
Tue Dec 15 07:27:00 CST 2015


Good morning,

Thanks for the quick response.

We have seen this issue both on a Linux workstation without a parallel
filesystem and on a supercomputer with a Lustre parallel filesystem. Here
is a link to the specifications of the latter:
http://www.nas.nasa.gov/hecc/resources/pleiades.html; that is the machine
where we attempted the scalability test that failed on 500 procs.

By the way, we are building HDF5/MOAB via the PETSc install. Here are the
library versions / configuration details for the workstation build:

------------------------------------------------------------
compiler - comp-intel/2015.0.09
mpi        - mpich2-1.4

PETSc:
CFLAGS=-O2 CXXFLAGS=-O2 --with-debugging=0 --with-shared-libraries=0
--download-metis=1 --download-parmetis=1 --download-hdf5=1
--download-moab=1 --download-mumps=1 --download-scalapack
--with-blas-lapack-dir=/opt/intel15/mkl/lib/intel64/

HDF5:
./configure 
--prefix=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich
--libdir=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich/lib
CC="mpicc" CFLAGS="-O2 " --enable-parallel --enable-fortran FC="mpif90"
F9X="mpif90" F90="mpif90" --disable-shared

MOAB:
./configure 
--prefix=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich
CC="mpicc" CFLAGS="-O2 " CXX="mpicxx" CXXFLAGS="-O2    " F90="mpif90"
F90FLAGS=" -O3  " F77="mpif90" FFLAGS=" -O3  " FC="mpif90" FCLAGS=" -O3  "
--disable-shared --with-mpi
--with-hdf5=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich
--without-netcdf
------------------------------------------------------------



Here are the same details for the cluster build:


------------------------------------------------------------

Compiler - comp-intel/2015.3.187
mpi - mpi-sgi/mpt.2.12r26

petsc:
CFLAGS="-O2 -xAVX" CXXFLAGS="-O2 -xAVX" --with-debugging=0
--with-shared-libraries=0 --download-metis=1 --download-parmetis=1
--download-hdf5=1 --download-moab=1 --download-mumps=1
--download-scalapack
--with-blas-lapack-dir=/nasa/intel/Compiler/2015.3.187/composer_xe_2015.3.1
87/mkl/lib/intel64/

moab:
--prefix=/home6/jhochhal/src/petsc-3.5.4/intel-mpt CC="mpicc"
CFLAGS="-O2 -xAVX " CXX="mpicxx" CXXFLAGS="-O2 -xAVX    " F90="mpif90"
F90FLAGS=" -O3  " F77="mpif90" FFLAGS=" -O3  " FC="mpif90" FCLAGS=" -O3
" --disable-shared --with-mpi
--with-hdf5=/home6/jhochhal/src/petsc-3.5.4/intel-mpt --without-netcdf

hdf5:
--prefix=/home6/jhochhal/src/petsc-3.5.4/intel-mpt
--libdir=/home6/jhochhal/src/petsc-3.5.4/intel-mpt/lib CC="mpicc"
CFLAGS="-O2 -xAVX " --enable-parallel --enable-fortran FC="mpif90"
F9X="mpif90" F90="mpif90" --disable-shared
------------------------------------------------------------
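
As a quick sanity check that the optimized, MPI-enabled builds are the ones
actually being linked, we can compile something like the sketch below against
the same install and run it on one rank (the moab/MOABConfig.h header and
MOAB_VERSION_STRING macro are assumptions to verify against the installed
headers; the HDF5 calls and macros are the standard ones):

------------------------------------------------------------
/* Sketch: report the HDF5/MOAB versions actually linked and whether HDF5
 * was built with MPI support. */
#include <mpi.h>
#include <cstdio>
#include "H5public.h"        /* H5get_libversion() */
#include "H5pubconf.h"       /* H5_HAVE_PARALLEL (defined for parallel HDF5) */
#include "moab/MOABConfig.h" /* MOAB_VERSION_STRING -- assumed header/macro */

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    unsigned maj, min, rel;
    H5get_libversion(&maj, &min, &rel);

    if (rank == 0) {
#ifdef H5_HAVE_PARALLEL
        const char* par = "yes";
#else
        const char* par = "no";
#endif
        std::printf("HDF5 %u.%u.%u (parallel: %s), MOAB %s\n",
                    maj, min, rel, par, MOAB_VERSION_STRING);
    }

    MPI_Finalize();
    return 0;
}
------------------------------------------------------------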



Obviously we will not see scaling for the serial MOAB/HDF5 reader on the
machine without parallel I/O, but we expected CPU times on the order of
nprocs * time_serial at worst. Instead, we are seeing a slowdown of about
35x just going from NP=1 to NP=2.
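
For reference, the attached test case boils down to something like the sketch
below (the mesh file name and the MOAB read-options string here are
illustrative assumptions, not necessarily what the attached test uses; the
actual test goes through DMMoabLoadFromFile, which as you note just wraps
MOAB's load_file):

------------------------------------------------------------
/* Minimal parallel-load timing sketch (illustrative only; file name and
 * read options are assumptions). */
#include <mpi.h>
#include <cstdio>
#include "moab/Core.hpp"

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    moab::Core mb;
    const char* readopts =
        "PARALLEL=READ_PART;"            /* each rank reads only its part   */
        "PARTITION=PARALLEL_PARTITION;"  /* partition tag written by mbpart */
        "PARALLEL_RESOLVE_SHARED_ENTS";  /* resolve shared vertices/faces   */

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    moab::ErrorCode rval = mb.load_file("mesh_16k.h5m", 0, readopts);
    MPI_Barrier(MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    if (rank == 0)
        std::printf("load_file %s, time to read file: %g [sec.]\n",
                    (rval == moab::MB_SUCCESS) ? "succeeded" : "FAILED",
                    t1 - t0);

    MPI_Finalize();
    return 0;
}
------------------------------------------------------------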

Let me know if you have any comments on the above information, or any
performance results from running the attached test case on your end. Thanks!

Best,
Jim






On 12/14/15, 7:24 PM, "Vijay S. Mahadevan" <vijay.m at gmail.com> wrote:

>James,
>
>I haven't tested your case locally yet, but those numbers look
>suspicious. We have seen the HDF5 I/O slow down badly at up to 4000
>procs when loading mesh files that are over O(100) GB. Even there, the
>timing plateaus at around a couple of minutes, not 45 minutes. So this
>simple test certainly should not see such severe performance
>degradation.
>
>DMMoabLoadFromFile internally just calls moab->load_file, which is
>primarily a load factory that invokes ReadHDF5. So the bulk of the
>behavior comes down to ReadHDF5 and HDF5 itself. Here are some questions
>to help us better understand what you are doing:
>
>1) What version of HDF5 are you using, and how is it configured?
>Debug or optimized?
>2) Is MOAB configured in optimized mode?
>3) What compiler and MPI version are you using on your machine, so that
>we can tell whether it is a compiler-flag issue?
>4) What are your machine characteristics? Cluster or large-scale
>machine? GPFS, Lustre, or some other filesystem?
>5) Can you work with a MOAB branch? We have a PR that is currently
>under review, which should give you fine-grained profiling data
>during the read. Take a look at [1].
>
>Let us know the answers to these. Meanwhile, we will also try to use
>your test case to check the I/O performance and see whether your results
>are reproducible to some degree.
>
>Vijay
>
>[1]
>https://bitbucket.org/fathomteam/moab/pull-requests/170/genlargemesh-corrections/diff
>
>On Mon, Dec 14, 2015 at 6:25 PM, WARNER, JAMES E. (LARC-D309)
><james.e.warner at nasa.gov> wrote:
>> Hi Vijay & Iulian,
>>
>> Hope you are doing well! I have a question regarding some strange
>> behavior we're seeing with the DMMoabLoadFromFile() function...
>>
>> After doing some recent profiling of our MOAB-based finite element code,
>> we noticed that we are spending a disproportionate amount of CPU time in
>> the DMMoabLoadFromFile() function, which gets slower or at best stays
>> constant as we increase the number of processors. We also recently
>> attempted a scalability test with ~30M FEM nodes on 500 processors,
>> which hung in DMMoabLoadFromFile() for about 45 minutes before we killed
>> the job. We then re-ran the test on one processor and it completed
>> successfully in several seconds.
>>
>> To reproduce the problem we're seeing, we wrote a test case (attached
>> here) that simply loads a smaller mesh with approximately 16K nodes and
>> prints the run time. When I run the code on an increasing number of
>> processors, I get something like:
>>
>> NP=1: Time to read file: 0.0416839 [sec.]
>> NP=2: Time to read file: 1.42497 [sec.]
>> NP=4: Time to read file: 1.13678 [sec.]
>> NP=8: Time to read file: 1.0475 [sec.]
>> ...
>>
>> If it is relevant/helpful, we are using the mbpart tool to partition the
>> mesh. Do you have any ideas why we are not seeing scalability here? Any
>> thoughts/tips would be appreciated! Let me know if you would like any
>> more information.
>>
>> Thanks,
>> Jim
>>
>>
>>


