<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jan 22, 2014 at 12:59 PM, Tim Tautges (ANL) <span dir="ltr"><<a href="mailto:tautges@mcs.anl.gov" target="_blank">tautges@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">That's funny, I was thinking the same about DMPlex. :)<br></blockquote><div><br></div><div>Maybe you can think that on the moab-dev list.</div>
<div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
> - tim
>
> On 01/22/2014 12:45 PM, Matthew Knepley wrote:
>
>> Tim,
>>
>> I do not consider MOAB a real alternative here.
>>
>>    Matt
>>
>> On Wed, Jan 22, 2014 at 12:18 PM, Tim Tautges (ANL)
>> <tautges@mcs.anl.gov> wrote:
>>
>>> On 01/21/2014 05:58 PM, Gorman, Gerard J wrote:
>>>
>>>>> I am not sure if Exodus has a good solution here. As far as I
>>>>> understand, Exodus is inherently sequential, even when implemented on
>>>>> top of HDF5 instead of netCDF. I would also worry about third-party
>>>>> support for Exodus files using HDF5 as their storage format. Exodus
>>>>> has a parallel extension called Nemesis, but I can't figure out how
>>>>> their concept of ghost points and cells works. The documentation on
>>>>> this point is really unclear.
>>>>
>>>> I have to admit I was kind of hoping that the ExodusII folks would
>>>> have come along a bit more on the parallel I/O front (I'm assuming
>>>> those guys also run large simulations...). That said, I see this as a
>>>> two-stage process: first, integrate with DMPlex, as that should give
>>>> the high-level abstraction for read/write to file; secondly, extend
>>>> the family of readers/writers. At least this way there will be some
>>>> agility and interoperability between different formats, and it will
>>>> not be too disruptive to the application codes when a different format
>>>> is adopted.
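
(For concreteness, a minimal sketch of the first stage Gerard describes:
read an ExodusII mesh into DMPlex on one rank, then partition and
distribute it. The file name is made up, error checking is elided, and
the DMPlexDistribute signature is the one in recent PETSc releases; PETSc
must be configured with ExodusII support.)

    #include <petscdmplex.h>

    int main(int argc, char **argv)
    {
      DM dm, dmDist = NULL;

      PetscInitialize(&argc, &argv, NULL, NULL);
      /* Read the ExodusII file; interpolate=PETSC_TRUE also builds
         the intermediate faces and edges. */
      DMPlexCreateExodusFromFile(PETSC_COMM_WORLD, "mesh.exo", PETSC_TRUE, &dm);
      /* Partition and migrate the mesh; overlap = 0 means no ghost cells. */
      DMPlexDistribute(dm, 0, NULL, &dmDist);
      if (dmDist) { DMDestroy(&dm); dm = dmDist; }
      /* ... attach a PetscSection, create vectors, solve, write ... */
      DMDestroy(&dm);
      PetscFinalize();
      return 0;
    }
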
>>>
>>> My impression is that the ExodusII people are working within the
>>> context of code frameworks more than disk file formats to do this,
>>> e.g. in Trilinos and Sierra. I don't think the ExoII file format by
>>> itself is very conducive to representing parallel decompositions,
>>> which is why Nemesis writes an annotation (though I haven't followed
>>> ExoII developments closely since they went open source several years
>>> back).
>>>
>>>>> b. Do "poor man's" parallel I/O, where each CPU does its own I/O,
>>>>> and possibly create interface-matching files à la Nemesis or SILO.
>>>>> Maybe we can save enough information on the parallel layout in order
>>>>> to easily write an un-partitioner as a post-processor.
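
(The "each CPU does its own I/O" pattern is easy to sketch: one serial
HDF5 file per rank, with the interface matching left to a post-processor.
The file and dataset names below are illustrative assumptions.)

    #include <hdf5.h>
    #include <mpi.h>
    #include <stdio.h>

    /* Write this rank's coordinate block (npts x 3 doubles) to its own file. */
    void write_rank_file(MPI_Comm comm, const double *coords, hsize_t npts)
    {
      int  rank;
      char fname[64];

      MPI_Comm_rank(comm, &rank);
      snprintf(fname, sizeof fname, "mesh.%05d.h5", rank); /* one file per rank */

      hid_t   file    = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
      hsize_t dims[2] = { npts, 3 };
      hid_t   space   = H5Screate_simple(2, dims, NULL);
      hid_t   dset    = H5Dcreate(file, "/coordinates", H5T_NATIVE_DOUBLE, space,
                                  H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
      H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, coords);
      H5Dclose(dset); H5Sclose(space); H5Fclose(file);
    }

An un-partitioner then only needs each rank's global element range plus
the shared-node matching to stitch the pieces back together.
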
>>>>
>>>> I am pretty sure that if we are writing everything in slabs to an
>>>> HDF5 container we do not have to worry too much about the parallel
>>>> layout, although some clear optimisations are possible. In the worst
>>>> case it is a three-stage process where we perform a parallel read of
>>>> the connectivity, a scatter/gather for continuous numbering, a
>>>> parallel repartitioning, and a subsequent parallel read of the
>>>> remaining data. Importantly, it is at least scalable.
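
(A sketch of the first stage of that worst case: every rank collectively
reads a contiguous slab of a shared connectivity dataset. The dataset
name, integer type, and even split are illustrative assumptions.)

    #include <hdf5.h>
    #include <mpi.h>
    #include <stdlib.h>

    /* Collectively read this rank's slab of /connectivity (nelem x nvert). */
    long *read_connectivity_slab(MPI_Comm comm, const char *fname,
                                 hsize_t *nlocal, hsize_t *nvert)
    {
      int rank, nprocs;
      MPI_Comm_rank(comm, &rank);
      MPI_Comm_size(comm, &nprocs);

      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);        /* MPI-IO driver */
      hid_t file = H5Fopen(fname, H5F_ACC_RDONLY, fapl);
      hid_t dset = H5Dopen(file, "/connectivity", H5P_DEFAULT);

      hid_t   fspace = H5Dget_space(dset);
      hsize_t dims[2];
      H5Sget_simple_extent_dims(fspace, dims, NULL);

      /* Near-equal contiguous slabs of the element range. */
      hsize_t chunk = (dims[0] + nprocs - 1) / nprocs;
      hsize_t lo = rank * chunk, hi = lo + chunk;
      if (lo > dims[0]) lo = dims[0];
      if (hi > dims[0]) hi = dims[0];
      hsize_t start[2] = { lo, 0 }, count[2] = { hi - lo, dims[1] };
      H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
      hid_t mspace = H5Screate_simple(2, count, NULL);

      long *conn = malloc(count[0] * count[1] * sizeof *conn);
      hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
      H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);       /* collective read */
      H5Dread(dset, H5T_NATIVE_LONG, mspace, fspace, dxpl, conn);

      *nlocal = count[0]; *nvert = dims[1];
      H5Pclose(dxpl); H5Sclose(mspace); H5Sclose(fspace);
      H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
      return conn;  /* renumber, repartition, then read the remaining data */
    }
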
>>>
>>> We've seen fragmentation with unstructured meshes being a problem too,
>>> and you won't escape that even with renumbering (though reading and
>>> then migrating would address it, at the cost of some additional
>>> communication and possibly some reading to figure out where things
>>> need to go).
>>>
>>> Depending on the degree of direct interaction/automation in those
>>> interactions between the mesh and PETSc, there are other options as
>>> well. One that we're developing, based on the MOAB library, can
>>> read/write ExodusII (in serial), and also supports parallel read/write
>>> using its own HDF5-based format. Parallel I/O robustness has been iffy
>>> above ~16k procs and 32M-64M hex/tet elements, but for smaller
>>> problems it should work. We're in the process of developing direct
>>> support for going between a mesh defined with fields (called tags in
>>> MOAB) and PETSc vectors. MOAB has pretty solid support for things like
>>> computing sharing and ghosting between procs and exchanging/reducing
>>> field values on those entities. Viz is supported either by compiling a
>>> VTK/ParaView plugin that pulls the mesh/fields through MOAB or by
>>> translating to VTK (also supported directly from MOAB); VisIt also has
>>> a plugin you can enable. See
>>> http://trac.mcs.anl.gov/projects/ITAPS/wiki/MOAB for details of MOAB;
>>> the PETSc integration stuff is on a bitbucket branch of petsc owned by
>>> Vijay Mahadevan.
>>>
>>>> Another reason this could be of great interest is that MOAB supports
>>>> (according to the docs) geometric topology, which could be used when
>>>> adapting the mesh on a curved surface, for example - another item on
>>>> my wish list.
>>>>
>>> Yes, MOAB's file format can handle definitions of the groupings (and
>>> relations between the groups) necessary to represent geometric
>>> topology. What you use for the shape evaluation of those geometric
>>> surfaces is another question, though. If you're wanting to do that
>>> using a reconstructed continuous patch, MOAB has some code to do that
>>> too, though it's not as good as the latest stuff in that area (from
>>> Jiao at Stony Brook).
>>>
>>>> Is it integrated with PETSc via the plex, or does this essentially
>>>> replace the functionality of the plex?
>>>>
>>> It's not via plex, but I'm pretty sure all the mesh-related
>>> functionality available through plex is available through different
>>> API functions in MOAB.
>>>
>>>> Why does it break down for more than 16k procs?
>>>>
>>> It's a combination of things:
>>>
>>> - Maximizing generality means we're using more than just 2 or 3
>>>   tables, because in addition to nodes and elements we need groupings,
>>>   whose membership determines whether a group is resident on a given
>>>   processor, etc., and that strongly affects scaling.
>>>
>>> - That same generality causes us to hit an MPI-IO bug on IBM (though
>>>   we haven't checked on Q yet to see if that's been addressed; it
>>>   might have been). We've worked with the ANL I/O guys off and on on
>>>   this, and hope to get back to it on Q soon.
>>>
>>> - We do single-file parallel I/O, without any two-phase step
>>>   (communicate down to the I/O nodes, then do the I/O), and that hits
>>>   HDF5 pretty hard; we're working with The HDF Group to explore that.
>>>
>>> We haven't done any benchmarking on a Lustre system yet, but I expect
>>> that to do worse than IBM, because of the many-tables thing (my
>>> impression is that Lustre doesn't handle frequent metadata reads
>>> well).
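
(On the two-phase point: ROMIO-based MPI-IO stacks can be asked to do the
two-phase aggregation themselves via info hints, which HDF5 forwards from
its file-access property list. A minimal sketch; the hint names are
ROMIO-specific and the values are purely illustrative.)

    #include <hdf5.h>
    #include <mpi.h>

    /* Open a file with collective-buffering (two-phase) hints enabled. */
    hid_t open_with_two_phase(MPI_Comm comm, const char *fname)
    {
      MPI_Info info;
      MPI_Info_create(&info);
      MPI_Info_set(info, "romio_cb_write", "enable"); /* force two-phase writes */
      MPI_Info_set(info, "cb_nodes", "8");            /* number of aggregators  */

      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, comm, info);
      hid_t file = H5Fopen(fname, H5F_ACC_RDWR, fapl);
      H5Pclose(fapl);
      MPI_Info_free(&info);
      return file;
    }
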
>>>> Is it just a case that Lustre gets hammered? What magic sauce is
>>>> used by high-order FEM codes such as Nek5000 that can run on ~1M
>>>> cores?
>>>>
>>> Those codes go for a much more restricted I/O data case, which allows
>>> them to specialize and do their own implementation of parallel I/O.
>>> So, Nek5000 has its own implementation of poor man's parallel: they
>>> repeat vertices in the file that are logically the same (shared
>>> between hexes), and they don't really do subsets. I think that's
>>> great to do if you have to, but I'm still hoping for more support in
>>> that direction from general libraries.
>>>
>>> libmesh also maintains its own DMlibmesh, but I'm not sure how solid
>>> their support for large mesh / parallel I/O is (though I know they've
>>> been working on it recently).
>>>
>>>> Are there any other formats that we should be considering? It's a
>>>> few years since I tried playing about with CGNS - at the time its
>>>> parallel I/O was non-existent, and I have not seen it being pushed
>>>> since. XDMF looks interesting, as it is essentially some XML metadata
>>>> plus an HDF5 bucket. Is anyone championing this?
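
(Gerard's description of XDMF matches its design: a small XML document
whose DataItem elements point at arrays in an HDF5 container, readable by
both ParaView and VisIt. A minimal, made-up example for a tetrahedral
mesh; all names and sizes are illustrative.)

    <?xml version="1.0"?>
    <Xdmf Version="2.0">
      <Domain>
        <Grid Name="mesh" GridType="Uniform">
          <Topology TopologyType="Tetrahedron" NumberOfElements="1000">
            <DataItem Dimensions="1000 4" NumberType="Int" Format="HDF">
              mesh.h5:/connectivity
            </DataItem>
          </Topology>
          <Geometry GeometryType="XYZ">
            <DataItem Dimensions="600 3" NumberType="Float" Precision="8"
                      Format="HDF">
              mesh.h5:/coordinates
            </DataItem>
          </Geometry>
          <Attribute Name="temperature" AttributeType="Scalar" Center="Node">
            <DataItem Dimensions="600" NumberType="Float" Precision="8"
                      Format="HDF">
              mesh.h5:/temperature
            </DataItem>
          </Attribute>
        </Grid>
      </Domain>
    </Xdmf>
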
>>> Don't know about XDMF. I know there's been a bunch of work on SILO
>>> and its parallel performance fairly recently (3 or 4 yrs ago), and
>>> it's used heavily inside LLNL.
>>>
>>> - tim
>>>
>>>> Cheers Gerard
>>>
>>> --
>>> ================================================================
>>> "You will keep in perfect peace him whose mind is
>>>   steadfast, because he trusts in you."  Isaiah 26:3
>>>
>>> Tim Tautges            Argonne National Laboratory
>>> (tautges@mcs.anl.gov)  (telecommuting from UW-Madison)
>>> phone (gvoice): (608) 354-1459      1500 Engineering Dr.
>>> fax: (608) 263-4499                 Madison, WI 53706
>>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>>   -- Norbert Wiener
>>
> --
> ================================================================
> "You will keep in perfect peace him whose mind is
>   steadfast, because he trusts in you."  Isaiah 26:3
>
> Tim Tautges            Argonne National Laboratory
> (tautges@mcs.anl.gov)  (telecommuting from UW-Madison)
> phone (gvoice): (608) 354-1459      1500 Engineering Dr.
> fax: (608) 263-4499                 Madison, WI 53706

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
   -- Norbert Wiener