[petsc-dev] DMplex reader / viewers
Matthew Knepley
knepley at gmail.com
Wed Jan 22 13:00:37 CST 2014
On Wed, Jan 22, 2014 at 12:59 PM, Tim Tautges (ANL) <tautges at mcs.anl.gov> wrote:
> That's funny, I was thinking the same about DMPlex. :)
>
Maybe you can think that on the moab-dev list.
Matt
> - tim
>
>
> On 01/22/2014 12:45 PM, Matthew Knepley wrote:
>
>> Tim,
>>
>> I do not consider MOAB a real alternative here.
>>
>> Matt
>>
>> On Wed, Jan 22, 2014 at 12:18 PM, Tim Tautges (ANL) <tautges at mcs.anl.gov> wrote:
>>
>>
>>
>> On 01/21/2014 05:58 PM, Gorman, Gerard J wrote:
>>
>>
>>
>> I am not sure if Exodus has a good solution here. As far as I understand, Exodus is inherently
>> sequential, even when implemented with HDF5 instead of netCDF. I would also worry about third-party
>> support for Exodus files using HDF5 as their storage format. Exodus has a parallel extension called
>> Nemesis, but I can't figure out how their concept of ghost points and cells works. The documentation
>> on this point is really unclear.
>>
>>
>>
>> I have to admit I was kind of hoping that the ExodusII folks would have come on a bit more on the
>> parallel I/O front (I’m assuming those guys also run large simulations…). That said, I see this as a
>> two-stage process: first integrate with DMPlex, as that should give the high-level abstraction for
>> read/write to file; secondly, extend the family of readers/writers. At least this way there will be
>> some agility and interoperability between different formats, and it will not be too disruptive to the
>> application codes when a different format is adopted.
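
For concreteness, a rough sketch of that first stage (reading an ExodusII file into a DMPlex and then distributing it) might look like the following, assuming PETSc was built with ExodusII support; the file name is made up, and the DMPlexDistribute() calling sequence has changed across PETSc releases, so treat this as illustrative rather than exact:

    #include <petscdmplex.h>

    int main(int argc, char **argv)
    {
      DM             dm, dmDist = NULL;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
      /* Read the ExodusII file; interpolate=PETSC_TRUE also creates faces/edges */
      ierr = DMPlexCreateExodusFromFile(PETSC_COMM_WORLD, "mesh.exo", PETSC_TRUE, &dm);CHKERRQ(ierr);
      /* Partition and migrate the serial mesh; overlap=0 means no ghost cells */
      ierr = DMPlexDistribute(dm, 0, NULL, &dmDist);CHKERRQ(ierr);
      if (dmDist) { ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist; }
      ierr = DMView(dm, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
      ierr = DMDestroy(&dm);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }
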
>>
>>
>> My impression is that the ExodusII people are working within the context of code frameworks more than
>> disk file formats to do this, e.g. in Trilinos and Sierra. I don't think the ExoII file format by itself
>> is very conducive to representing parallel decompositions, which is why Nemesis writes an annotation
>> (though I haven't followed ExoII developments closely since they went open source several years back).
>>
>>
>>
>>
>> b. Do "poor man" parallel I/O, where each CPU does its own I/O, and possibly create interface matching
>> files à la Nemesis or SILO. Maybe we can save enough information on the parallel layout in order to
>> easily write an un-partitioner as a post-processor.
>>
>>
>> I am pretty sure that if we are writing everything in slabs to an HDF5 container we do not have to
>> worry too much about the parallel layout, although some clear optimisations are possible. In the worst
>> case it is a three-stage process where we perform a parallel read of the connectivity, scatter/gather
>> for continuous numbering, parallel repartitioning and a subsequent parallel read of the remaining data.
>> Importantly, it is at least scalable.
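
As an illustration of the "parallel read of the connectivity in slabs" step, a minimal sketch using HDF5's MPI-IO driver could look like this; the file name, dataset name, and layout are invented for the example, and it assumes a parallel HDF5 build:

    #include <mpi.h>
    #include <hdf5.h>
    #include <vector>

    int main(int argc, char **argv)
    {
      int rank, size;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      /* Open the file collectively through the MPI-IO driver */
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
      hid_t file = H5Fopen("mesh.h5", H5F_ACC_RDONLY, fapl);
      hid_t dset = H5Dopen2(file, "/connectivity", H5P_DEFAULT);

      /* Dataset is (num_cells x nodes_per_cell); split the rows into contiguous slabs */
      hid_t   fspace = H5Dget_space(dset);
      hsize_t dims[2];
      H5Sget_simple_extent_dims(fspace, dims, NULL);
      hsize_t chunk = dims[0] / size, rem = dims[0] % size;
      hsize_t start[2] = { rank * chunk + (rank < (int)rem ? (hsize_t)rank : rem), 0 };
      hsize_t count[2] = { chunk + (rank < (int)rem ? 1 : 0), dims[1] };
      H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
      hid_t mspace = H5Screate_simple(2, count, NULL);

      /* Collective read of this rank's slab of the connectivity */
      std::vector<long long> conn(count[0] * count[1]);
      hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
      H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
      H5Dread(dset, H5T_NATIVE_LLONG, mspace, fspace, dxpl, conn.data());

      /* ... renumber, repartition, and migrate conn here ... */

      H5Pclose(dxpl); H5Sclose(mspace); H5Sclose(fspace);
      H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
      MPI_Finalize();
      return 0;
    }
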
>>
>>
>> We've seen fragmentation with unstructured meshes being a problem
>> too, and you won't escape that even with
>> renumbering (though reading then migrating would address that, at the
>> cost of some additional communication and
>> possibly reading to figure out where things need to go).
>>
>>
>>
>>
>> Depending on the degree of direct interaction/automation in those interactions between the mesh and
>> PETSc, there are other options as well. One that we're developing, based on the MOAB library, can
>> read/write (in serial) ExodusII, and also supports parallel read/write using its own HDF5-based format.
>> Parallel I/O robustness has been iffy above ~16k procs and 32M-64M hex/tet elements, but for smaller
>> problems it should work. We're in the process of developing direct support for going between a mesh
>> defined with fields (called tags in MOAB) and PETSc vectors. MOAB has pretty solid support for things
>> like computing sharing and ghosting between procs and exchanging/reducing field values on those
>> entities. Viz is supported either by compiling a VTK/ParaView plugin that pulls the mesh/fields through
>> MOAB or by translating to VTK (also supported directly from MOAB); VisIt also has a plugin you can
>> enable. See http://trac.mcs.anl.gov/projects/ITAPS/wiki/MOAB for details of MOAB; the PETSc integration
>> stuff is on a Bitbucket branch of PETSc owned by Vijay Mahadevan.
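
For anyone curious what the MOAB side of this looks like, here is a small sketch of a parallel read with ghost-cell creation and tag exchange. The option string follows MOAB's documented parallel read options, but the file name, the "temperature" tag, and the one-layer ghosting choice are purely illustrative:

    #include <moab/Core.hpp>
    #include <moab/ParallelComm.hpp>
    #include <mpi.h>
    #include <iostream>

    int main(int argc, char **argv)
    {
      MPI_Init(&argc, &argv);
      moab::Core mb;

      // Read each rank's part, resolve shared entities, and build 1 layer of
      // ghost cells (3d elements, bridged across vertices).
      const char *opts = "PARALLEL=READ_PART;PARTITION=PARALLEL_PARTITION;"
                         "PARALLEL_RESOLVE_SHARED_ENTS;PARALLEL_GHOSTS=3.0.1";
      moab::ErrorCode rval = mb.load_file("mesh.h5m", 0, opts);
      if (rval != moab::MB_SUCCESS) { std::cerr << "load failed\n"; MPI_Abort(MPI_COMM_WORLD, 1); }

      // Exchange a cell field stored as a (hypothetical) dense tag so ghost
      // cells get up-to-date values from their owners.
      moab::ParallelComm *pcomm = moab::ParallelComm::get_pcomm(&mb, 0);
      moab::Range cells;
      mb.get_entities_by_dimension(0, 3, cells);
      moab::Tag temp_tag;
      rval = mb.tag_get_handle("temperature", 1, moab::MB_TYPE_DOUBLE, temp_tag,
                               moab::MB_TAG_DENSE | moab::MB_TAG_CREAT);
      if (rval == moab::MB_SUCCESS && pcomm) pcomm->exchange_tags(temp_tag, cells);

      MPI_Finalize();
      return 0;
    }
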
>>
>>
>> Another reason this could be of great interest is that MOAB supports (according to the docs) geometric
>> topology, which could be used when adapting the mesh on a curved surface, for example - another item on
>> my wish list.
>>
>>
>> Yes, MOAB's file format can handle definitions of groupings (and
>> relations between the groups) necessary to
>> represent geometric topology. What you use for the shape evaluation
>> of those geometric surfaces is another question
>> though. If you're wanting to do that using a reconstructed
>> continuous patch, MOAB has some code to do that too,
>> though it's not as good as the latest stuff in that area (from Jiao
>> at Stony Brook).
>>
>>
>>
>> Is it integrated with PETSc via the plex, or does this essentially replace the functionality of the plex?
>>
>>
>> It's not via plex, but I'm pretty sure all the mesh-related
>> functionality available through plex is available
>> through different API functions in MOAB.
>>
>>
>> Why does it break down for more than 16k procs?
>>
>>
>> It's a combination of things:
>> - maximizing generality means we're using more than just 2 or 3 tables, because in addition to nodes and
>> elements we need groupings, whose membership determines whether a group is resident on a given
>> processor, etc., and that strongly affects scaling
>> - that same generality causes us to hit an MPI-IO bug on IBM (though we haven't checked on Q yet to see
>> if that's been addressed; it might have been); we've worked with the ANL I/O guys off and on on this,
>> and hope to get back to that on Q soon
>> - we do single-file parallel I/O, without any 2-phase (communicate down to I/O nodes, then do I/O), and
>> that hits HDF5 pretty hard; we're working with the HDF Group to explore that (see the
>> collective-buffering sketch below)
>>
>> We haven't done any benchmarking on a Lustre system yet, but I expect that to do worse than IBM, because
>> of the many-tables thing (my impression is that Lustre doesn't handle frequent metadata reads well).
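
The "2-phase" step referred to above is what ROMIO calls collective buffering; a minimal sketch of requesting it through MPI-IO hints on HDF5's file-access property list is below. The hint names are ROMIO-specific, the value of 8 aggregators is a placeholder to be tuned per machine, and other MPI-IO stacks may silently ignore these hints:

    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char **argv)
    {
      MPI_Init(&argc, &argv);

      /* ROMIO hints: aggregate I/O on a few ranks before touching the filesystem */
      MPI_Info info;
      MPI_Info_create(&info);
      MPI_Info_set(info, "romio_cb_write", "enable");
      MPI_Info_set(info, "romio_cb_read",  "enable");
      MPI_Info_set(info, "cb_nodes",       "8");   /* number of aggregator ranks */

      /* Hand the hints to HDF5's MPI-IO file driver */
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);
      hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

      /* ... create datasets and write them with H5FD_MPIO_COLLECTIVE transfer lists ... */

      H5Fclose(file);
      H5Pclose(fapl);
      MPI_Info_free(&info);
      MPI_Finalize();
      return 0;
    }
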
>>
>>
>> is it just a case that Lustre gets hammered? What magic sauce is used by high-order FEM codes such as
>> Nek5000 that can run on ~1M cores?
>>
>>
>> Those codes go for a much more restricted I/O data case, which allows them to specialize and do their
>> own implementation of parallel I/O. So, Nek5000 has its own implementation of poor man's parallel I/O:
>> they repeat vertices in the file that are logically the same (shared between hexes), and they don't
>> really do subsets. I think that's great to do if you have to, but I'm still hoping for more support in
>> that direction from general libraries.
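
A bare-bones version of that "poor man's parallel" pattern (also option b earlier in the thread), where every rank writes its own file and shared vertices are simply duplicated rather than reconciled, could look like the following; the file naming and the local arrays are invented for the example:

    #include <mpi.h>
    #include <cstdio>
    #include <vector>

    int main(int argc, char **argv)
    {
      MPI_Init(&argc, &argv);
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      // Pretend these were produced by the local mesh/partition
      std::vector<double>    coords = { /* x0,y0,z0, x1,y1,z1, ... */ };
      std::vector<long long> conn   = { /* local vertex indices per cell */ };

      // One file per rank, no coordination with other processes
      char fname[64];
      std::snprintf(fname, sizeof fname, "mesh.%06d.dat", rank);
      std::FILE *f = std::fopen(fname, "wb");
      if (f) {
        long long ncoord = (long long)coords.size(), nconn = (long long)conn.size();
        std::fwrite(&ncoord, sizeof ncoord, 1, f);
        std::fwrite(coords.data(), sizeof(double), coords.size(), f);
        std::fwrite(&nconn, sizeof nconn, 1, f);
        std::fwrite(conn.data(), sizeof(long long), conn.size(), f);
        std::fclose(f);
      }
      // A later "un-partitioner" would read all the per-rank files and merge
      // duplicated vertices using whatever parallel-layout info was saved.
      MPI_Finalize();
      return 0;
    }
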
>>
>>
>>
>> libMesh also maintains its own DMlibmesh, but I'm not sure how solid their support for large meshes /
>> parallel I/O is (though I know they've been working on it recently).
>>
>>
>>
>> Are there any other formats that we should be considering? It’s a few years since I tried playing about
>> with CGNS - at the time its parallel I/O was non-existent and I have not seen it being pushed since.
>> XDMF looks interesting as it is essentially some XML metadata and an HDF5 bucket. Is anyone championing
>> this?
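
Since XDMF came up: the "XML metadata plus HDF5 bucket" arrangement is literally a small XML file whose DataItem entries point into HDF5 datasets. A toy sketch that writes such a wrapper is below; the dataset names, sizes, and the tetrahedral topology are made up, and the heavy data would be written to mesh.h5 separately:

    #include <cstdio>

    int main()
    {
      const long long ncells = 1000, nverts = 400;  // illustrative sizes
      std::FILE *f = std::fopen("mesh.xmf", "w");
      if (!f) return 1;
      std::fprintf(f,
        "<?xml version=\"1.0\" ?>\n"
        "<Xdmf Version=\"2.0\">\n"
        " <Domain>\n"
        "  <Grid Name=\"mesh\" GridType=\"Uniform\">\n"
        "   <Topology TopologyType=\"Tetrahedron\" NumberOfElements=\"%lld\">\n"
        "    <DataItem Dimensions=\"%lld 4\" Format=\"HDF\">mesh.h5:/connectivity</DataItem>\n"
        "   </Topology>\n"
        "   <Geometry GeometryType=\"XYZ\">\n"
        "    <DataItem Dimensions=\"%lld 3\" Format=\"HDF\">mesh.h5:/coordinates</DataItem>\n"
        "   </Geometry>\n"
        "   <Attribute Name=\"temperature\" AttributeType=\"Scalar\" Center=\"Node\">\n"
        "    <DataItem Dimensions=\"%lld\" Format=\"HDF\">mesh.h5:/temperature</DataItem>\n"
        "   </Attribute>\n"
        "  </Grid>\n"
        " </Domain>\n"
        "</Xdmf>\n",
        ncells, ncells, nverts, nverts);
      std::fclose(f);
      return 0;
    }
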
>>
>>
>> Don't know about XDMF. I know there's been a bunch of work on SILO
>> and its parallel performance fairly recently (3
>> or 4 yrs ago) and it's used heavily inside LLNL.
>>
>> - tim
>>
>> Cheers Gerard
>>
>>
>> --
>> ================================================================
>> "You will keep in perfect peace him whose mind is
>> steadfast, because he trusts in you." Isaiah 26:3
>>
>> Tim Tautges            Argonne National Laboratory
>> (tautges at mcs.anl.gov) (telecommuting from UW-Madison)
>> phone (gvoice): (608) 354-1459   1500 Engineering Dr.
>> fax: (608) 263-4499              Madison, WI 53706
>>
>>
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any
>> results to which their experiments lead.
>> -- Norbert Wiener
>>
>
> --
> ================================================================
> "You will keep in perfect peace him whose mind is
> steadfast, because he trusts in you." Isaiah 26:3
>
> Tim Tautges Argonne National Laboratory
> (tautges at mcs.anl.gov) (telecommuting from UW-Madison)
> phone (gvoice): (608) 354-1459 1500 Engineering Dr.
> fax: (608) 263-4499 Madison, WI 53706
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener