[petsc-dev] DMplex reader / viewers

Tue Jan 21 11:56:35 CST 2014

On Jan 21, 2014, at 10:23 AM, Gorman, Gerard J <g.gorman at imperial.ac.uk> wrote:

> Hi Blaise
> 
> This is pretty much music to my ears. Interoperability is a major headache: most file formats are rubbish (e.g. they cannot support boundaries, have limited metadata, etc.), and most lack reasonable parallel IO solutions (i.e. something other than 1 file per process)… and so on. At the moment a few groups here at IC (this mostly means Michael Lange ;) are integrating DMPlex with a range of FEM codes to try to rationalise this. There are certainly some features missing (we’ve bumped into most of those you’ve listed below) but all indications are that DMPlex has the right design to be expanded to cover these use cases.
> 
> On 21 Jan 2014, at 15:30, Blaise A Bourdin <bourdin at lsu.edu> wrote:
> 
>> Hi,
>> 
>> It looks like DMplex is steadily gaining maturity but I/O is lagging behind. As far as I understand, right now, PETSc can _read_ a mesh in exodus format, and write binary VTS format, but many issues remain, IMHO:
>>  - The exodus reader relies on a hard-coded nodeset named “marker”. Generating such a nodeset is not trivial
>>    (at least not for complex meshes generated with Cubit / Trelis).
>>  - Reading from or writing to exodus files is not supported.
> 
> So we *really* want this as well for the purpose of dumping results and checkpointing. Trying to shoehorn high order and DG data into VTK is complete crap. I love VTK because it's got loads of functionality and it is easy to throw stuff together, but we are getting ourselves into the position that we are just layering hack after hack to deal with the fact that we cannot write all required data into a VTK file. These days VTK/ParaView has its own ExodusII reader so we have a route to nice seamless integration.
> 
> 
>>  - The VTS viewer only allows reading and writing _all_ fields in a DM. This may be overkill if one only 
>>    wants to read boundary values, for instance.
> 
> or only writing out prognostic fields for example. 
> 
>>  - The VTS viewer loses all information on exodus nodesets and cell sets. These may have some significance
>>    and may be required to exploit the output of a computation.
> 
> Right - this includes boundary labels, and it is just a fluff to have to write this out into VTK. You would have to write a separate vtu or something resulting in more messiness (and I already have enough problems on Lustre from having too many files).
> 
> 
>>  - VTS seems to have a concept of “blocks”. My understanding is that the parallel VTS viewer uses blocks to
>>    save subdomains, and that continuity of piecewise linear fields across subdomain boundaries is lost. 
>>    It is not entirely clear to me if with this layout, it would be possible to reopen a file with a 
>>    different processor count.
> 
> I think you just do not want to go there… For a start your vtk file would not be a valid checkpoint to consider restarting on a different number of processes. And secondly, it would just be a mess to program.
> 
>> 
>> I can dedicate some resources to improving DMplex I/O. Perhaps we can start a discussion by listing the desired features such readers / writers should have. I will pitch in by listing what matters to me:
> 
> Keep talking…we have also an FTE working on this currently but this is a long wish list and a lot of effort is required if this is to be done within a reasonable time frame. It would be great if more people were working on this.
I have a postdoc here who will devote some of his time to this task.

> 
>>  - A well documented and adopted file format that most post-processors / visualization tools can use
> 
> ExodusII appears to be the current favoured option.
Sadly yes… But SILO may be a close second and has a more modern interface. 

> 
>>  - Ability to read / write individual fields
>>  - Preserve _all_ information from the exodus file (node / side / cell sets), do not lose continuity of fields
>>    across subdomain boundaries.
>>  - Ability to reopen file on a different cpu count
> 
> So this is where we need to have parallel IO support. Quoting from PETSc’s exodus.py:
> """
> # ExodusII does not call HDF5 directly, but it does call nc_def_var_deflate(), which is only
> # part of libnetcdf when built using --enable-netcdf-4.  Currently --download-netcdf (netcdf.py)
> # sets --enable-netcdf-4 only when HDF5 is enabled.
> """
> So, there may be some rebuilding required to ensure that all the dependencies are built properly but it’s clearly there.

I am not sure if Exodus has a good solution here. As far as I understand, exodus is inherently sequential, even when implemented with HDF5 instead of netcdf. I would also worry about third party support for exodus files using HDF5 as their storage format.
Exodus has a parallel extension called nemesis, but I can’t figure out how their concept of ghost points and cells works. The documentation on this point is really unclear.

Considering the dismal state of parallel FE formats and libraries, it seems to me that one needs to choose between two options:

 a. scatter back to a single I/O node and use sequential I/O using the ordering of the original (exodus) mesh. This allows reading and writing on an arbitrary number of processors, but has potential memory footprint and performance issues. How large a mesh can we reasonably expect to be able to handle this way? 
 b. Do “poor man’s” parallel I/O where each CPU does its own I/O, and possibly create interface matching files à la nemesis or SILO. Maybe we can save enough information on the parallel layout to easily write an un-partitioner as a post-processor.
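
To make the data movement in option (a) concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: ranks are simulated with lists instead of MPI, each rank is assumed to know the original-mesh index of every point it holds, and the names gather_to_io_rank, scatter_from_io_rank and local_to_global are hypothetical, not PETSc or exodus calls.

```python
# Sketch of option (a): collect every rank's data on one I/O rank and keep
# it in the ordering of the original mesh. Ranks are simulated with plain
# lists; a real code would use MPI gather/scatter (assumption, not shown).

def gather_to_io_rank(local_values, local_to_global, n_global):
    """Assemble one global array from per-rank pieces.

    local_values[r][i]    value rank r holds for its i-th local point
    local_to_global[r][i] index of that point in the original mesh
    Points shared by several ranks must carry identical values.
    """
    global_values = [None] * n_global
    for rank, (vals, l2g) in enumerate(zip(local_values, local_to_global)):
        for v, g in zip(vals, l2g):
            if global_values[g] is not None and global_values[g] != v:
                raise ValueError(f"rank {rank} disagrees on point {g}")
            global_values[g] = v
    missing = [g for g, v in enumerate(global_values) if v is None]
    if missing:
        raise ValueError(f"points not held by any rank: {missing}")
    return global_values


def scatter_from_io_rank(global_values, local_to_global):
    """Inverse step for reading: give each rank the points it asks for."""
    return [[global_values[g] for g in l2g] for l2g in local_to_global]
```

Because the assembled array is indexed by original-mesh numbering, it can be scattered to any partition, including one with a different number of ranks. The cost is that the whole field (n_global entries) sits in the memory of one node, which is the footprint concern raised in (a).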

Unless one can come up with a better solution than a or b, I’d like to start by writing a very simple ASCII viewer demonstrating the communication involved, then modify it to use exodus, SILO or HDF5 format.
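
A minimal sketch of what such an ASCII viewer could look like for option (b), together with the un-partitioner post-processing step. This is an assumption-laden illustration, not an existing PETSc viewer: the file layout (one "global_index value" pair per line), the file naming, and the function names write_rank_file and unpartition are all made up here, and the per-rank calls would each run on their own process in a real code.

```python
import os


def write_rank_file(directory, basename, rank, local_to_global, values):
    """Each rank writes its own ASCII file, one 'global_index value' per line.

    Storing the original-mesh index next to each value is the parallel
    layout information that makes un-partitioning possible later.
    """
    path = os.path.join(directory, f"{basename}.{rank:04d}.txt")
    with open(path, "w") as f:
        for g, v in zip(local_to_global, values):
            f.write(f"{g} {v!r}\n")
    return path


def unpartition(paths, n_global):
    """Post-processor: merge per-rank files back into original-mesh ordering.

    A point written by several ranks (an interface point) is overwritten by
    whichever file is read last; the values are assumed to agree.
    """
    merged = [None] * n_global
    for path in paths:
        with open(path) as f:
            for line in f:
                g_str, v_str = line.split()
                merged[int(g_str)] = float(v_str)
    return merged
```

With this layout the number of files grows with the number of ranks, which is the drawback already mentioned for parallel file systems, but no single process ever holds the full field while writing.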

Blaise

-- 
Department of Mathematics and Center for Computation & Technology
Louisiana State University, Baton Rouge, LA 70803, USA
Tel. +1 (225) 578 1612, Fax  +1 (225) 578 4276 http://www.math.lsu.edu/~bourdin