[MOAB-dev] adding 'read and broadcast' to HDF5 reader

Mark Miller miller86 at llnl.gov
Fri Oct 19 15:30:46 CDT 2012


Not sure how much this helps, but the newest versions of the HDF5 library
support reading a file into memory (one I/O operation); proc 0 can then
broadcast that buffer (a single broadcast), and the other procs can 'open'
that buffer of bytes as an HDF5 file. So, in theory, with minimal changes to
MOAB, it's possible to 'spoof' MOAB into thinking each processor did the
read anyway. One problem: I think this feature works for whole files
only. So, if the tables MOAB needs to read this way are self-contained
in a single file, it could work. Otherwise, it's not much help...

This is the 'file image' feature of HDF5.

Mark

On Fri, 2012-10-19 at 15:16 -0500, Iulian Grindeanu wrote:
> Hello Rob,
> I think that change has to happen in src/parallel/ReadParallel.cpp
> I am not sure yet though, Tim would confirm that
> 
> Iulian
> 
> 
> ______________________________________________________________________
>         Tim knows all this but for the rest of the list, here's the
>         short story:
>         
>         MOAB's HDF5 reader and writer have a problem on BlueGene where it will
>         collectively read in initial conditions or write output, and run out
>         of memory.  This out-of-memory condition comes from MOAB doing all the
>         right things -- using HDF5, using collective I/O -- but the MPI-IO
>         library on Intrepid goes and consumes too much memory.
>         
>         I've got one approach to deal with the MPI-IO memory issue for writes.
>         This approach would sort of work for the reads, but what is really
>         needed is for rank 0 to do the read and broadcast the result to
>         everyone.
>         
>         So, I'm looking for a little help understanding MOAB's read side of
>         the code.  Conceptually, all processes read the table of entities.
>         
>         A fairly small 'mbconvert' job will run out of memory: 
>         
>         512 nodes, 2048 processors:
>         
>         ======
>         NODES=512
>         CORES=$(($NODES * 4))
>         cd /intrepid-fs0/users/robl/scratch/moab-test
>         
>         cqsub -t 15 -m vn -p SSSPP -e MPIRUN_LABEL=1:BG_COREDUMPONEXIT=1 \
>                 -n $NODES -c $CORES /home/robl/src/moab-svn/build/tools/mbconvert \
>                 -O CPUTIME -O PARALLEL_GHOSTS=3.0.1 -O PARALLEL=READ_PART \
>                 -O PARALLEL_RESOLVE_SHARED_ENTS -O PARTITION -t \
>                 -o CPUTIME -o PARALLEL=WRITE_PART /intrepid-fs0/users/tautges/persistent/meshes/2bricks/nogeom/64bricks_8mtet_ng_rib_${CORES}.h5m \
>                 /intrepid-fs0/users/robl/scratch/moab/8mtet_ng-${CORES}-out.h5m
>         ======
>         
>         I'm kind of stumbling around ReadHDF5::load_file and
>         ReadHDF5::load_file_partial trying to find a spot where a collection
>         of tags are read into memory.  I'd like to, instead of having all
>         processors do the read, have just one processor read and then send the
>         tag data to the other processors.
>         
>         First, do I remember the basic MOAB concept correctly: that early on
>         every process reads the exact same tables out of the (in this case
>         HDF5) file?
>         
>         If I want rank 0 to do all the work and send data to other ranks,
>         where's the best place to slip that in?  It's been a while since I did
>         anything non-trivial in C++, so some of these data structures are kind
>         of greek to me.
>         
>         thanks
>         ==rob
>         
>         -- 
>         Rob Latham
>         Mathematics and Computer Science Division
>         Argonne National Lab, IL USA
> 
> 
-- 
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
miller86 at llnl.gov      urgent: miller86 at pager.llnl.gov
T:8-6 (925)-423-5901    M/W/Th:7-12,2-7 (530)-753-8511


