Proposal for Requiring Save/Restore of Sets and Tags

Thu Sep 30 00:27:24 CDT 2010

Resending (orig. sent May 21, 2009) as per Seegyoung's inquiry regarding an implementation's
responsibilities for what information to store persistently, as well as recommendations on how

Authors:
--------
Mark Miller <miller86 at llnl.gov>
Jason Kraftcheck <kraftche at cae.wisc.edu>

Proposal:
---------

iMesh implementations shall be required to support saving and loading of set
and tag data via the persistent storage methods, iMesh_save() and
iMesh_load(), defined in the iMesh interface.  Implementations are free to
choose how they meet this requirement.

Rationale:
----------

Sets and tags are part of the iMesh interface and data model.  An
implementation of the iMesh API is a component that maintains the
necessary data structures to implement that data model.  Given an API
that is fundamentally an interface for interacting with this data
structure and that provides functions to load and store it, an
implementation of that API is incomplete if it cannot restore the
complete data model that it is intended to maintain.

Further, the inability to save and restore the entire data model is a
substantial practical barrier to interoperability and interchangeability
of implementations.  Interoperability of persistently stored data between
implementations is not achievable without imposing this requirement.

Practical considerations:
-------------------------

While a common file format for all implementations may be impractical, the
ability to store and subsequently retrieve a complete iMesh instance to/from
persistent storage is a minimum requirement.  An implementation may
maintain additional data beyond the iMesh data model for which a common
file format may not be sufficient.  Further, the data model does not
entirely constrain the data representation.  Implementations may implement
the complete data model while using incompatible internal representations
of the data (e.g., defining elements by their vertices versus by sub-elements
of one lower topological dimension).
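
To illustrate the example in the preceding parenthesis, the sketch below holds
one triangle in both forms and derives the vertex-based form from the
edge-based one.  All names and the tuple layout are hypothetical and chosen
only for illustration; no iMesh implementation is claimed to store elements
this way.

```python
# Vertex-based representation: the triangle lists its three vertices directly.
triangle_by_vertices = (0, 1, 2)

# Sub-element representation: the triangle lists three edges (entities of one
# lower topological dimension); each edge in turn lists its two vertices.
edges = {"e0": (0, 1), "e1": (1, 2), "e2": (2, 0)}
triangle_by_edges = ("e0", "e1", "e2")


def vertices_from_edges(element, edge_table):
    """Derive an ordered vertex list from a closed loop of edges.

    Assumes the edges are listed in loop order; an edge may be stored in
    either direction.
    """
    first, second = edge_table[element[0]], edge_table[element[1]]
    # Start at the endpoint of the first edge that the second edge does not share.
    start = first[0] if first[1] in second else first[1]
    ordered = [start]
    for name in element[:-1]:
        u, v = edge_table[name]
        # Step across the edge to its other endpoint.
        ordered.append(v if u == ordered[-1] else u)
    return tuple(ordered)


derived = vertices_from_edges(triangle_by_edges, edges)
```

Both forms describe the same element, but a file written in one form cannot be
read directly by an implementation that expects the other without a conversion
step of this kind.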

In choosing a file format, there are several complicating factors to
consider:
  o Tags may be placed on entity sets as well as entities.
  o Entity sets may contain entity sets as well as entities.
  o The entity set containment graph may contain cycles.
  o The entity set parent/child graph may contain cycles.
  o Preserving valid references in tag data of type iBase_ENTITY_HANDLE.
  o Scalability for large numbers of entity sets (> 10^6).
  o Time/Storage performance issues associated with these factors.
  o Opportunities for cross-exploitation of existing formats/tools.
  o Partial load/store.
A key issue is that storing iMesh data requires the ability to store BOTH
meshes and mesh-related data AND potentially very large graphs representing
the entity sets, their tags, and their relationships.
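
As a minimal sketch of how two of these factors (cyclic set graphs and
handle-valued tags) might be handled, the Python fragment below assigns each
set a file-local ID before writing, so that references are stored as IDs
rather than followed recursively.  The EntitySet class, the save_sets() and
load_sets() functions, and the JSON layout are hypothetical; they are not part
of iMesh or of any implementation's file format.  The sketch also assumes that
every referenced set appears in the list being saved.

```python
import json


class EntitySet:
    """Hypothetical in-memory stand-in for an entity set (not an iMesh type)."""

    def __init__(self):
        self.contained = []   # entity sets contained in this set
        self.children = []    # parent/child links, independent of containment
        self.tags = {}        # tag name -> int, str, or EntitySet (handle-valued)


def save_sets(sets):
    """Serialize sets to JSON, replacing every in-memory reference by a file ID."""
    # Assign IDs up front so that cycles never trigger recursion.
    file_id = {id(s): i for i, s in enumerate(sets)}

    def encode_tag(value):
        if isinstance(value, EntitySet):
            return {"handle": file_id[id(value)]}
        return {"value": value}

    records = [
        {
            "contained": [file_id[id(c)] for c in s.contained],
            "children": [file_id[id(c)] for c in s.children],
            "tags": {name: encode_tag(v) for name, v in s.tags.items()},
        }
        for s in sets
    ]
    return json.dumps(records)


def load_sets(text):
    """Rebuild the sets in two passes: create all objects, then resolve IDs."""
    records = json.loads(text)
    sets = [EntitySet() for _ in records]
    for s, rec in zip(sets, records):
        s.contained = [sets[i] for i in rec["contained"]]
        s.children = [sets[i] for i in rec["children"]]
        for name, enc in rec["tags"].items():
            s.tags[name] = sets[enc["handle"]] if "handle" in enc else enc["value"]
    return sets


# Two sets that contain each other, plus a handle-valued tag.
a, b = EntitySet(), EntitySet()
a.contained.append(b)
b.contained.append(a)
a.children.append(b)
a.tags["partner"] = b
a.tags["material_id"] = 7

restored_a, restored_b = load_sets(save_sets([a, b]))
```

Because in-memory handles are generally not stable across runs, the file holds
only the IDs; the loader creates all sets first and resolves the IDs afterward,
which is what allows cycles to be restored.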

Relevant existing technology:
-----------------------------

Existing technologies for persistent storage of the iMesh data model fall
into two basic categories: those designed to represent meshes (in
particular, unstructured meshes) and those designed for general-purpose I/O.
Examples of the former include VTK and ExodusII.  Examples of the latter
include HDF5, NetCDF and XML.  Many other examples exist in each category.

We are not aware of any existing technology that addresses ALL of the
practical considerations sufficiently.  This leaves two options: extending
a mesh-based technology to support the whole of the iMesh data model, or
defining an iMesh-specific solution on top of a general-purpose technology.

MOAB's implementation for saving/loading an iMesh instance to HDF5 is an
example of this latter approach.  It is fully developed and addresses the
most significant of the practical considerations.  Nonetheless, we recognize
that MOAB's is only one mature example of such an approach.  Implementors are
free to use MOAB's HDF5 solution or to develop some other solution of their
own choosing.