[MOAB-dev] DMMoabLoadFromFile() - parallel performance issue
Grindeanu, Iulian R.
iulian at mcs.anl.gov
Tue Dec 15 19:06:51 CST 2015
Hi Jim,
Thanks for your message.

The relatively poor scalability is due to the way the mesh was generated and ordered in the file. When reading in parallel, the mesh is cherry-picked from the file: each task tries to read only the elements it needs. The file IDs of the elements in each partition are very disjoint, with big gaps in "file id space", so each task has to read many small sub-sequences from the file. In serial there is a single sequence and no cherry-picking.

If you repartition your file using the -R option of mbpart, you will get better results, because the elements are reordered in the file relative to their partition (elements in partition 0 will be "close together" in the file too, not only in physical space). Something like this:
mbpart -z RCB 16 -R scaling_study.h5m reor.h5m
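For completeness, a minimal sketch of how the reordered file could then be read in parallel directly through MOAB (the option string is the usual MOAB parallel read recipe; the main()/error handling around it is just illustrative):

    #include <mpi.h>
    #include "moab/Core.hpp"

    int main(int argc, char** argv)
    {
      MPI_Init(&argc, &argv);
      moab::Core mb;
      // READ_PART: each task reads only the elements of its own partition
      // PARTITION=PARALLEL_PARTITION: use the partition sets written by mbpart
      // PARALLEL_RESOLVE_SHARED_ENTS: determine which entities are shared across tasks
      moab::ErrorCode rval = mb.load_file("reor.h5m", 0,
          "PARALLEL=READ_PART;PARTITION=PARALLEL_PARTITION;PARALLEL_RESOLVE_SHARED_ENTS");
      if (moab::MB_SUCCESS != rval)
        MPI_Abort(MPI_COMM_WORLD, 1);
      MPI_Finalize();
      return 0;
    }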
This is what I got with your initial file, scaling_study.h5m (times in seconds):

  NP    ReadHDF5     PARALLEL TOTAL
   1    0.179603     0.182674
   2    0.363472     0.483488
   4    0.310096     0.390116
   8    0.302726     0.366929

And with the reordered file, reor.h5m:

  NP    ReadHDF5     PARALLEL TOTAL
   1    0.0257461    0.0286331
   2    0.025044     0.138264
   4    0.0203938    0.099154
   8    0.0345509    0.103593
PARALLEL TOTAL includes the time needed to resolve the shared entities between tasks; ReadHDF5 is just the actual I/O read from disk.
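If you want to see that split from your own code, one rough way is to read without resolving shared entities and then invoke the resolution step explicitly. This is only a sketch: it assumes a 3D mesh, the reor.h5m file from above, the ParallelComm::resolve_shared_ents overload taking (set, resolve_dim, shared_dim), and that the explicitly created ParallelComm is the one picked up by the reader; it does not reproduce the exact timers MOAB prints.

    #include <cstdio>
    #include <mpi.h>
    #include "moab/Core.hpp"
    #include "moab/ParallelComm.hpp"

    int main(int argc, char** argv)
    {
      MPI_Init(&argc, &argv);
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      moab::Core mb;
      moab::ParallelComm pcomm(&mb, MPI_COMM_WORLD);

      // Phase 1: pure I/O -- each task reads its own elements (roughly what "ReadHDF5" measures)
      double t0 = MPI_Wtime();
      moab::ErrorCode rval = mb.load_file("reor.h5m", 0,
          "PARALLEL=READ_PART;PARTITION=PARALLEL_PARTITION");
      double t_read = MPI_Wtime() - t0;

      // Phase 2: resolve the entities shared between tasks
      // (the extra cost that shows up in "PARALLEL TOTAL")
      t0 = MPI_Wtime();
      if (moab::MB_SUCCESS == rval)
        rval = pcomm.resolve_shared_ents(0, 3, 2);  // assumed dims for a 3D mesh, sharing up to faces
      double t_resolve = MPI_Wtime() - t0;

      if (0 == rank && moab::MB_SUCCESS == rval)
        std::printf("read: %g s, resolve: %g s\n", t_read, t_resolve);

      MPI_Finalize();
      return 0;
    }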
________________________________________
From: moab-dev-bounces at mcs.anl.gov [moab-dev-bounces at mcs.anl.gov] on behalf of WARNER, JAMES E. (LARC-D309) [james.e.warner at nasa.gov]
Sent: Tuesday, December 15, 2015 7:27 AM
To: Vijay S. Mahadevan
Cc: moab-dev at mcs.anl.gov
Subject: Re: [MOAB-dev] DMMoabLoadFromFile() - parallel performance issue
Good morning,
Thanks for the quick response.
We have seen this issue both on a Linux workstation without a parallel
filesystem and on a supercomputer with a Lustre parallel filesystem. Here is
a link to the machine statistics of the latter:
http://www.nas.nasa.gov/hecc/resources/pleiades.html; that is where we
attempted the scalability test that failed on 500 procs.
By the way, we are building HDF5/MOAB via the PETSc install. Here are the
library versions / configuration details for the workstation build:
------------------------------------------------------------
compiler - comp-intel/2015.0.09
mpi - mpich2-1.4
PETSc:
CFLAGS=-O2 CXXFLAGS=-O2 --with-debugging=0 --with-shared-libraries=0
--download-metis=1 --download-parmetis=1 --download-hdf5=1
--download-moab=1 --download-mumps=1 --download-scalapack
--with-blas-lapack-dir=/opt/intel15/mkl/lib/intel64/
HDF5:
./configure
--prefix=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich
--libdir=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich/lib
CC="mpicc" CFLAGS="-O2 " --enable-parallel --enable-fortran FC="mpif90"
F9X="mpif90" F90="mpif90" --disable-shared
MOAB:
./configure
--prefix=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich
CC="mpicc" CFLAGS="-O2 " CXX="mpicxx" CXXFLAGS="-O2 " F90="mpif90"
F90FLAGS=" -O3 " F77="mpif90" FFLAGS=" -O3 " FC="mpif90" FCLAGS=" -O3 "
--disable-shared --with-mpi
--with-hdf5=/users0/jewarne1/src/petsc/build/petsc-3.5.4/intel-mpich
--without-netcdf
------------------------------------------------------------
Here are the same details for the cluster build:
------------------------------------------------------------
Compiler - comp-intel/2015.3.187
mpi - mpi-sgi/mpt.2.12r26
petsc:
CFLAGS="-O2 -xAVX" CXXFLAGS="-O2 -xAVX" --with-debugging=0
--with-shared-libraries=0 --download-metis=1 --download-parmetis=1
--download-hdf5=1 --download-moab=1 --download-mumps=1
--download-scalapack
--with-blas-lapack-dir=/nasa/intel/Compiler/2015.3.187/composer_xe_2015.3.187/mkl/lib/intel64/
moab:
--prefix=/home6/jhochhal/src/petsc-3.5.4/intel-mpt CC="mpicc"
CFLAGS="-O2 -xAVX " CXX="mpicxx" CXXFLAGS="-O2 -xAVX " F90="mpif90"
F90FLAGS=" -O3 " F77="mpif90" FFLAGS=" -O3 " FC="mpif90" FCLAGS=" -O3
" --disable-shared --with-mpi
--with-hdf5=/home6/jhochhal/src/petsc-3.5.4/intel-mpt --without-netcdf
hdf5:
--prefix=/home6/jhochhal/src/petsc-3.5.4/intel-mpt
--libdir=/home6/jhochhal/src/petsc-3.5.4/intel-mpt/lib CC="mpicc"
CFLAGS="-O2 -xAVX " --enable-parallel --enable-fortran FC="mpif90"
F9X="mpif90" F90="mpif90" --disable-shared
------------------------------------------------------------
Obviously we will not see scaling for the serial MOAB/HDF5 reader on the
machine without parallel I/O, but we expected to see CPU times on the
order of nprocs * time_serial. Instead we're seeing a slowdown of about 35x
from NP=1 to NP=2.
Let me know if you have any comments about the above information, or any
results from your own runs of the attached test case. Thanks!
Best,
Jim
On 12/14/15, 7:24 PM, "Vijay S. Mahadevan" <vijay.m at gmail.com> wrote:
>James,
>
>I haven't locally tested your test case yet, but those numbers look
>suspicious. We have seen HDF5 I/O slow down badly at up to 4000 procs
>when loading mesh files that are over O(100) GB. Even there, the plateau
>in timing is around a couple of minutes, not 45 minutes. So the simple
>test should certainly not see such bad performance degradation.
>
>DMMoabLoadFromFile internally just calls moab->load_file, which is
>primarily a load factory that invokes ReadHDF5, so the bulk of the
>behavior can be reduced to ReadHDF5 and HDF5 itself. Here are some
>questions to better understand what you are doing:
>
>1) What version of HDF5 are you using and how is it configured?
>Debug or optimized?
>2) Is MOAB configured in optimized mode?
>3) What compiler and MPI version are you using on your machine, so that
>we can better understand whether it's a compiler flag issue?
>4) What are your machine characteristics? Cluster or large-scale
>machine? GPFS, Lustre, or some other base filesystem?
>5) Can you work with a MOAB branch? We have a PR that is currently
>being reviewed, which should give you fine-grained profiling data
>during the read. Take a look at [1].
>
>Let us know some of these answers. Meanwhile, we will also try to use
>your test case to check the I/O performance and see if your results
>are replicable to some degree.
>
>Vijay
>
>[1]
>https://bitbucket.org/fathomteam/moab/pull-requests/170/genlargemesh-corrections/diff
>
>On Mon, Dec 14, 2015 at 6:25 PM, WARNER, JAMES E. (LARC-D309)
><james.e.warner at nasa.gov> wrote:
>> Hi Vijay & Iulian,
>>
>> Hope you are doing well! I have a question regarding some strange
>> behavior we're seeing with the DMMoabLoadFromFile() function...
>>
>> After doing some recent profiling of our MOAB-based finite element code,
>> we noticed that we are spending a disproportionate amount of CPU time
>> within the DMMoabLoadFromFile() function, which gets slower / remains
>> constant as we increase the number of processors. We also recently
>> attempted a scalability test with ~30M FEM nodes on 500 processors, which
>> hung in DMMoabLoadFromFile() for about 45 minutes before we killed the
>> job. We then re-ran the test on one processor and it made it through
>> successfully in several seconds.
>>
>> To reproduce the problem we're seeing, we wrote a test case (attached
>> here) that simply loads a smaller mesh with approximately 16K nodes and
>> prints the run time. When I run the code on an increasing number of
>> processors, I get something like:
>>
>> NP=1: Time to read file: 0.0416839 [sec.]
>> NP=2: Time to read file: 1.42497 [sec.]
>> NP=4: Time to read file: 1.13678 [sec.]
>> NP=8: Time to read file: 1.0475 [sec.]
>> ...
>>
>> If it is relevant/helpful, we are using the mbpart tool to partition the
>> mesh. Do you have any ideas why we are not seeing scalability here? Any
>> thoughts/tips would be appreciated! Let me know if you would like any
>> more information.
>>
>> Thanks,
>> Jim
>>
>>
>>
-------------- next part --------------
Attachment: globalID.png (image/png, 112406 bytes)
URL: <http://lists.mcs.anl.gov/pipermail/moab-dev/attachments/20151216/39cba5e4/attachment-0001.png>