[petsc-dev] DMplex reader / viewers

Matthew Knepley knepley at gmail.com
Wed Jan 22 12:45:14 CST 2014


Tim,

  I do not consider MOAB a real alternative here.

     Matt

On Wed, Jan 22, 2014 at 12:18 PM, Tim Tautges (ANL) <tautges at mcs.anl.gov> wrote:

>
>
> On 01/21/2014 05:58 PM, Gorman, Gerard J wrote:
>
>>
>>
>>>> I am not sure if Exodus has a good solution here. As far as I
>>>> understand, Exodus is inherently sequential, even when implemented with
>>>> HDF5 instead of netCDF. I would also worry about third-party support for
>>>> Exodus files using HDF5 as their storage format. Exodus has a parallel
>>>> extension called Nemesis, but I can't figure out how their concept of
>>>> ghost points and cells works. The documentation on this point is really
>>>> unclear.
>>>>
>>>>
>>
>> I have to admit I was kind of hoping that the ExodusII folks would have
>> come along a bit more on the parallel I/O front (I’m assuming those guys
>> also run large simulations…). That said, I see this as a two-stage
>> process: first integrate with DMPlex, as that should give the high-level
>> abstraction for reading/writing to file; secondly, extend the family of
>> readers/writers. At least this way there will be some agility and
>> interoperability between different formats, and it will not be too
>> disruptive to the application codes when a different format is adopted.
>>
>>
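To make the "DMPlex as the high-level abstraction" point concrete, here is a minimal sketch (a fragment, assuming an already-created DMPlex dm, with placeholder file names and no error checking; exact call signatures vary between PETSc versions): the application calls the same DMView() regardless of format, and the on-disk format is chosen by the viewer, so adding a new reader/writer should not require touching application code.

#include <petscdmplex.h>
#include <petscviewer.h>
#include <petscviewerhdf5.h>

/* Write the same DMPlex mesh through two different viewers.  The on-disk
   format is a property of the viewer, not of the application code. */
static void WriteMesh(DM dm)
{
  PetscViewer h5, vtk;

  /* HDF5 container (parallel if PETSc/HDF5 were built with MPI support) */
  PetscViewerHDF5Open(PetscObjectComm((PetscObject)dm), "mesh.h5",
                      FILE_MODE_WRITE, &h5);
  DMView(dm, h5);
  PetscViewerDestroy(&h5);

  /* VTK output of the same mesh, e.g. for quick visualization */
  PetscViewerVTKOpen(PetscObjectComm((PetscObject)dm), "mesh.vtu",
                     FILE_MODE_WRITE, &vtk);
  DMView(dm, vtk);
  PetscViewerDestroy(&vtk);
}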
> My impression is that the ExodusII people are working within the context
> of code frameworks more than disk file formats to do this, e.g. in Trilinos
> and Sierra.  I don't think the ExoII file format by itself is very
> conducive to representing a parallel decomposition, which is why Nemesis
> writes an annotation (though I haven't followed ExoII developments closely
> since they went open source several years back).
>
>
>
>>
>>>> b. Do "poor man's" parallel I/O where each CPU does its own I/O, and
>>>> possibly create interface-matching files à la Nemesis or SILO. Maybe we
>>>> can save enough information on the parallel layout in order to easily
>>>> write an un-partitioner as a post-processor.
>>>>
>>>
>> I am pretty sure that if we are writing everything in slabs to an HDF5
>> container, we do not have to worry too much about the parallel layout,
>> although some clear optimisations are possible. In the worst case it is a
>> three-stage process where we perform a parallel read of the connectivity,
>> a scatter/gather to get a contiguous numbering, parallel repartitioning,
>> and a subsequent parallel read of the remaining data. Importantly, it is
>> at least scalable.
>>
>>
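As a concrete (if simplified) illustration of that flow with DMPlex: a collective DMLoad() from an HDF5 container followed by DMPlexDistribute() to repartition and migrate the mesh. The file name is a placeholder, error checking is omitted, and the DMPlexDistribute() signature has changed across PETSc releases, so treat this as a sketch rather than working code for any particular version.

#include <petscdmplex.h>
#include <petscviewerhdf5.h>

int main(int argc, char **argv)
{
  DM          dm, dmDist = NULL;
  PetscViewer viewer;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Stage 1: collective read of the mesh (connectivity/coordinates) from an
     HDF5 container; the initial distribution just follows file order. */
  PetscViewerHDF5Open(PETSC_COMM_WORLD, "mesh.h5", FILE_MODE_READ, &viewer);
  DMCreate(PETSC_COMM_WORLD, &dm);
  DMSetType(dm, DMPLEX);
  DMLoad(dm, viewer);
  PetscViewerDestroy(&viewer);

  /* Stages 2-3: repartition with the configured partitioner and migrate
     cells/vertices to their new owners; remaining field data can then be
     read in parallel against the new layout. */
  DMPlexDistribute(dm, 0, NULL, &dmDist);
  if (dmDist) { DMDestroy(&dm); dm = dmDist; }

  DMDestroy(&dm);
  PetscFinalize();
  return 0;
}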
> We've seen fragmentation be a problem with unstructured meshes too, and
> you won't escape that even with renumbering (though reading and then
> migrating would address it, at the cost of some additional communication
> and possibly extra reading to figure out where things need to go).
>
>
>
>>>>
>>> Depending on the degree of direct interaction/automation in those
>>> interactions between the mesh and PETSc, there are other options as
>>> well.  One that we're developing, based on the MOAB library, can
>>> read/write ExodusII (in serial), and also supports parallel read/write
>>> using its own HDF5-based format.  Parallel I/O robustness has been iffy
>>> above ~16k procs and 32M-64M hex/tet elements, but for smaller problems
>>> it should work.  We're in the process of developing direct support for
>>> going between a mesh defined with fields (called tags in MOAB) and PETSc
>>> vectors.  MOAB has pretty solid support for things like computing
>>> sharing and ghosting between procs and exchanging/reducing field values
>>> on those entities.  Viz is supported either by compiling a VTK/ParaView
>>> plugin that pulls the mesh/fields through MOAB or by translating to VTK
>>> (also supported directly from MOAB); VisIt also has a plugin you can
>>> enable.  See http://trac.mcs.anl.gov/projects/ITAPS/wiki/MOAB for
>>> details of MOAB; the PETSc integration stuff is on a Bitbucket branch of
>>> PETSc owned by Vijay Mahadevan.
>>>
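For readers who have not used MOAB, the workflow described above looks roughly like the sketch below: a parallel read of MOAB's HDF5 (.h5m) format with shared-entity resolution and one layer of ghosts, then a ghost exchange of a tag ("field"). The file name, tag name, and option string are illustrative only, and the exact option keywords and ParallelComm signatures should be checked against the MOAB documentation for your version.

// Sketch only: parallel read of an HDF5 (.h5m) mesh with MOAB, resolving
// shared entities and one layer of ghost elements, then exchanging a tag on
// ghosted cells.  File name, tag name, and option string are illustrative.
#include "moab/Core.hpp"
#include "moab/ParallelComm.hpp"
#include <mpi.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  {
    moab::Core mb;
    const char *opts =
      "PARALLEL=READ_PART;PARTITION=PARALLEL_PARTITION;"
      "PARALLEL_RESOLVE_SHARED_ENTS;PARALLEL_GHOSTS=3.0.1";
    if (mb.load_file("mesh.h5m", 0, opts) != moab::MB_SUCCESS)
      MPI_Abort(MPI_COMM_WORLD, 1);

    // A cell-centered field ("tag" in MOAB terms), created if absent.
    moab::Tag field;
    double def = 0.0;
    mb.tag_get_handle("pressure", 1, moab::MB_TYPE_DOUBLE, field,
                      moab::MB_TAG_DENSE | moab::MB_TAG_CREAT, &def);

    moab::Range cells;
    mb.get_entities_by_dimension(0, 3, cells);  // all 3D cells, incl. ghosts

    // ParallelComm was set up by the parallel reader; use it to pull tag
    // values for ghost cells from their owning processors.
    moab::ParallelComm *pcomm = moab::ParallelComm::get_pcomm(&mb, 0);
    if (pcomm) pcomm->exchange_tags(field, cells);
  }
  MPI_Finalize();
  return 0;
}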
>>
>> Another reason this could be of great interest is that MOAB supports
>> (according to the docs) geometric topology, which could be used when
>> adapting the mesh on a curved surface, for example - another item on my
>> wish list.
>>
>
> Yes, MOAB's file format can handle definitions of groupings (and relations
> between the groups) necessary to represent geometric topology.  What you
> use for the shape evaluation of those geometric surfaces is another
> question though.  If you're wanting to do that using a reconstructed
> continuous patch, MOAB has some code to do that too, though it's not as
> good as the latest stuff in that area (from Jiao at Stony Brook).
>
>
>
>> Is it integrated into PETSc via the plex, or does this essentially
>> replace the functionality of the plex?
>>
>
> It's not via plex, but I'm pretty sure all the mesh-related functionality
> available through plex is available through different API functions in MOAB.
>
>
>> Why does it break down for more than 16k procs?
>>
>
> It's a combination of things:
> - maximizing generality means we're using more than just 2 or 3 tables,
> because in addition to nodes and elements we need groupings, whose
> membership determines whether a group is resident on a given processor,
> etc., and that strongly affects scaling
> - that same generality causes us to hit an MPI-IO bug on IBM (though we
> haven't checked on Q yet to see if that's been addressed; it might have
> been); we've worked with the ANL I/O guys off and on on this, and hope to
> get back to it on Q soon
> - we do single-file parallel I/O, without any two-phase aggregation
> (communicate down to I/O nodes, then do I/O), and that hits HDF5 pretty
> hard; we're working with the HDF Group to explore that (see the sketch
> below)
>
> We haven't done any benchmarking on a Lustre system yet, but I expect that
> to do worse than on IBM, because of the many-tables issue (my impression
> is that Lustre doesn't handle frequent metadata reads well).
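For context, the "two-phase" mentioned above is MPI-IO collective buffering: data is aggregated on a subset of ranks before it hits the file system. The sketch below shows one way to request collective transfers and collective-buffering hints through HDF5's MPI-IO driver; it is not how MOAB does its I/O, the hint names are ROMIO-specific, and their effect depends on the MPI stack, the file system, and a parallel HDF5 build.

// Sketch: open an HDF5 file for parallel I/O with MPI-IO collective
// buffering ("two-phase") hints.  Requires HDF5 built with --enable-parallel.
// Dataset creation and the per-rank hyperslab selection are elided.
#include <hdf5.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  MPI_Info info;
  MPI_Info_create(&info);
  MPI_Info_set(info, "romio_cb_write", "enable");    // force two-phase writes
  MPI_Info_set(info, "cb_buffer_size", "16777216");  // 16 MB aggregation buffer

  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);      // use the MPI-IO driver
  hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

  // Request collective data transfers for subsequent H5Dwrite() calls.
  hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
  H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

  /* ... create dataspaces, select per-rank hyperslabs, H5Dwrite(..., dxpl, buf) ... */

  H5Pclose(dxpl);
  H5Fclose(file);
  H5Pclose(fapl);
  MPI_Info_free(&info);
  MPI_Finalize();
  return 0;
}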
>
>
>> Is it just a case that Lustre gets hammered? What magic sauce is used by
>> high-order FEM codes such as Nek5000 that can run on ~1M cores?
>>
>>
> Those codes go for a much more restricted I/O use case, which allows them
> to specialize and do their own implementation of parallel I/O.  So Nek5000
> has its own implementation of poor man's parallel I/O: it repeats vertices
> in the file that are logically the same (shared between hexes), and it
> doesn't really do subsets.  I think that's great to do if you have to, but
> I'm still hoping for more support in that direction from general libraries.
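To illustrate the pattern (a generic sketch, not Nek5000's actual writer): each rank writes its own file containing only its local elements, with shared vertices duplicated, so no communication or global numbering is needed at write time.

// File-per-rank ("poor man's parallel") output: each rank dumps its local
// elements with corner vertices duplicated, so nothing is shared at write
// time.  Element type and file layout here are placeholders.
#include <mpi.h>
#include <cstdio>
#include <vector>

struct Hex { double coords[8][3]; };   // 8 duplicated corner vertices per hex

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  std::vector<Hex> local;              // filled by the application

  char name[64];
  std::snprintf(name, sizeof(name), "mesh.%06d.bin", rank);
  if (FILE *f = std::fopen(name, "wb")) {
    unsigned long n = local.size();
    std::fwrite(&n, sizeof(n), 1, f);                      // local element count
    std::fwrite(local.data(), sizeof(Hex), local.size(), f);
    std::fclose(f);
  }

  MPI_Finalize();
  return 0;
}

The price is one file per rank and redundant vertex data, and any downstream tool has to understand the per-rank layout, which is exactly where more support from general libraries would help.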
>
>
>
>>> libMesh also maintains its own DMlibmesh, but I'm not sure how solid
>>> their support for large meshes / parallel I/O is (though I know they've
>>> been working on it recently).
>>>
>>>
>>
>> Are there any other formats that we should be considering? It’s a few
>> years since I tried playing about with CGNS - at the time its parallel
>> I/O was non-existent and I have not seen it being pushed since. XDMF
>> looks interesting as it is essentially some XML metadata plus an HDF5
>> bucket. Is anyone championing this?
>>
>>
> I don't know about XDMF.  I know there's been a bunch of work on SILO and
> its parallel performance fairly recently (3 or 4 years ago), and it's used
> heavily inside LLNL.
>
> - tim
>
>> Cheers, Gerard
>>
>>
> --
> ================================================================
> "You will keep in perfect peace him whose mind is
>   steadfast, because he trusts in you."               Isaiah 26:3
>
>              Tim Tautges            Argonne National Laboratory
>          (tautges at mcs.anl.gov)      (telecommuting from UW-Madison)
>  phone (gvoice): (608) 354-1459      1500 Engineering Dr.
>             fax: (608) 263-4499      Madison, WI 53706
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener