[MOAB-dev] r1297 - in MOAB/trunk: . parallel test/h5file

tautges at mcs.anl.gov tautges at mcs.anl.gov
Wed Oct 3 15:28:43 CDT 2007


Author: tautges
Date: 2007-10-03 15:28:42 -0500 (Wed, 03 Oct 2007)
New Revision: 1297

Added:
   MOAB/trunk/MBHandleUtils.cpp
   MOAB/trunk/MBHandleUtils.hpp
   MOAB/trunk/mbparallelcomm_test.cpp
   MOAB/trunk/parallel/
   MOAB/trunk/parallel/MBParallelComm.cpp
   MOAB/trunk/parallel/MBParallelComm.hpp
   MOAB/trunk/parallel/MBProcConfig.cpp
   MOAB/trunk/parallel/MBProcConfig.hpp
   MOAB/trunk/parallel/Makefile.am
   MOAB/trunk/parallel/ReadParallel.cpp
   MOAB/trunk/parallel/ReadParallel.hpp
   MOAB/trunk/parallel/WriteHDF5Parallel.cpp
   MOAB/trunk/parallel/WriteHDF5Parallel.hpp
   MOAB/trunk/parallel/crystal.c
   MOAB/trunk/parallel/crystal.h
   MOAB/trunk/parallel/errmem.c
   MOAB/trunk/parallel/errmem.h
   MOAB/trunk/parallel/fcrystal.c
   MOAB/trunk/parallel/gs.c
   MOAB/trunk/parallel/gs.h
   MOAB/trunk/parallel/minmax.h
   MOAB/trunk/parallel/sort.c
   MOAB/trunk/parallel/sort.h
   MOAB/trunk/parallel/sort_imp.c
   MOAB/trunk/parallel/transfer.c
   MOAB/trunk/parallel/transfer.h
   MOAB/trunk/parallel/tuple_list.c
   MOAB/trunk/parallel/tuple_list.h
   MOAB/trunk/parallel/types.h
Removed:
   MOAB/trunk/MBParallelComm.cpp
   MOAB/trunk/MBParallelComm.hpp
   MOAB/trunk/MBProcConfig.cpp
   MOAB/trunk/MBProcConfig.hpp
   MOAB/trunk/ReadParallel.cpp
   MOAB/trunk/ReadParallel.hpp
   MOAB/trunk/WriteHDF5Parallel.cpp
   MOAB/trunk/WriteHDF5Parallel.hpp
Modified:
   MOAB/trunk/EntitySequenceManager.cpp
   MOAB/trunk/EntitySequenceManager.hpp
   MOAB/trunk/MBCore.cpp
   MOAB/trunk/MBCore.hpp
   MOAB/trunk/MBInterface.hpp
   MOAB/trunk/MBParallelConventions.h
   MOAB/trunk/MBRange.cpp
   MOAB/trunk/MBRange.hpp
   MOAB/trunk/MBReadUtil.cpp
   MOAB/trunk/MBReadUtil.hpp
   MOAB/trunk/MBReadUtilIface.hpp
   MOAB/trunk/MBReaderIface.hpp
   MOAB/trunk/MBSkinner.cpp
   MOAB/trunk/MBSkinner.hpp
   MOAB/trunk/MBTest.cpp
   MOAB/trunk/Makefile.am
   MOAB/trunk/ReadNCDF.cpp
   MOAB/trunk/configure.in
   MOAB/trunk/internals_test.cpp
   MOAB/trunk/test/h5file/Makefile.am
   MOAB/trunk/test/h5file/parallel.cpp
Log:
Added code to resolve shared entities between processors.  It works by
matching global ids of vertices on the skin of each processor's mesh.
Sharing is stored on vertex tags for now (eventually, interface sets
will be set up too).
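
A minimal sketch of the matching idea (not the MBParallelComm
implementation itself): collect the global ids of this processor's
skin vertices, which can then be exchanged and compared against other
processors' lists.  The skin_verts range is assumed to come from
MBSkinner.

  #include "MBInterface.hpp"
  #include "MBRange.hpp"
  #include "MBTagConventions.hpp"
  #include <vector>

  // Gather global ids for the vertices on this processor's skin;
  // matching ids across processors identify shared vertices.
  MBErrorCode skin_vertex_gids(MBInterface *mb, const MBRange &skin_verts,
                               std::vector<int> &gids)
  {
    MBTag gid_tag;
    MBErrorCode rval = mb->tag_get_handle(GLOBAL_ID_TAG_NAME, gid_tag);
    if (MB_SUCCESS != rval) return rval;

    gids.resize(skin_verts.size());
    return mb->tag_get_data(gid_tag, skin_verts, &gids[0]);
  }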

The parallel shared entities functionality relies extensively on code
from ANL's Nek CFD code, used here with permission.  In particular,
this code contains a "crystal router", described in Fox's parallel
programming book, and gather/scatter functionality (with operations)
based on it.  This functionality will probably be useful for other
parallel capabilities in MOAB.

MBCore, MBInterface: 
- added functions to create several vertices at once; the
implementation is still based on MBReadUtilIface, but applications now
have an easier time creating many vertices in one call (see the
example after this list).
- added a create_vertices variant to MBInterface that takes a
processor id.
- removed MBProcConfig from the public interface, replacing it with
individual proc_rank and proc_size functions.  MBProcConfig has been
moved down to the new parallel subdirectory.
- added MBHandleUtils for things that used to be done with handles in
MBProcConfig, like getting the processor rank from a handle.
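
Example use of the new bulk creation call (serial instance, using the
MBCore(rank, size) constructor):

  #include "MBCore.hpp"
  #include "MBRange.hpp"

  int main()
  {
    MBCore mb(0, 1);                    // serial instance: rank 0 of 1
    double coords[] = { 0., 0., 0.,     // vertex 0
                        1., 0., 0.,     // vertex 1
                        0., 1., 0. };   // vertex 2
    MBRange verts;
    MBErrorCode rval = mb.create_vertices(coords, 3, verts);
    return (MB_SUCCESS == rval && 3 == verts.size()) ? 0 : 1;
  }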

WriteHDF5Parallel, others: moved many of the parallel classes down to
the "parallel" subdirectory, so the mainline code doesn't need to know
about parallel constructs.

MBSkinner: added a variant of find_skin that gets the skin of a
prescribed dimension and returns it in a single range instead of
separate forward and reverse ranges (sketch below).
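
A hypothetical call to the new variant; the exact argument list is an
assumption here (see MBSkinner.hpp for the real signature):

  #include "MBSkinner.hpp"
  #include "MBRange.hpp"

  // Get the 2D skin of a range of 3D elements in a single output range.
  // Assumed argument order: input elements, skin dimension, output range.
  MBErrorCode get_skin_faces(MBInterface *mb, const MBRange &elems,
                             MBRange &skin_faces)
  {
    MBSkinner skinner(mb);
    return skinner.find_skin(elems, 2, skin_faces);
  }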

MBHandleUtils: new class for handle manipulation (example below).
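
For example, building a handle for a given owner and decoding it again
(methods as declared in MBHandleUtils.hpp below):

  #include "MBHandleUtils.hpp"

  int main()
  {
    MBHandleUtils hu(1, 4);   // this instance is rank 1 of 4 processors
    // Build a vertex handle owned by processor 2 with local id 10 ...
    MBEntityHandle h = hu.create_handle(MBVERTEX, 10, 2);
    // ... and decode the owning rank and id back out of it.
    unsigned owner = hu.rank_from_handle(h);   // == 2
    MBEntityID id = hu.id_from_handle(h);      // == 10
    return (2 == owner && 10 == id) ? 0 : 1;
  }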

Makefile.am, configure.in: added the parallel subdirectory.

ReadNCDF.cpp: corrected a few error messages I noticed.

MBParallelConventions.h: new file defining conventional tag names for
parallel data.

mbparallelcomm_test.cpp: test for the parallel communication
functionality.

MBReadUtilIface.hpp, MBReadUtil.cpp: added a new
gather_related_entities function, originally developed for
ReadParallel but useful to other readers as well.

MBReadUtil.cpp, MBTest.cpp, EntitySequenceManager: updated to use the
new rank function, handle utils, etc.

MBRange: renamed the subset function to the more descriptive
subset_by_type (example below).
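
For example (where entities is any MBRange):

  // Pull just the vertices out of a mixed range (formerly MBRange::subset).
  MBRange verts = entities.subset_by_type(MBVERTEX);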

MBCore: put ReadParallel inside #ifdef MPI so it doesn't get compiled
into serial builds.  Also changed to use the new rank function, handle
utils, etc.




Modified: MOAB/trunk/EntitySequenceManager.cpp
===================================================================
--- MOAB/trunk/EntitySequenceManager.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/EntitySequenceManager.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -30,12 +30,13 @@
 #include "ScdElementSeq.hpp"
 #include "ScdVertexSeq.hpp"
 #include "MeshSetSequence.hpp"
+#include "MBHandleUtils.hpp"
 #include <assert.h>
 #include <algorithm>
 
 
-EntitySequenceManager::EntitySequenceManager( const MBProcConfig& proc_info )
-  : procInfo( proc_info )
+EntitySequenceManager::EntitySequenceManager( const MBHandleUtils &handle_utils)
+  : handleUtils(handle_utils)
 {
   memset(mLastAccessed, 0, MBMAXTYPE*sizeof(void*));
 }
@@ -95,7 +96,7 @@
   }
   
     // get a start handle
-  MBErrorCode rval = get_start_handle(hint_start_id, procInfo.rank(), type, num_ent, start_handle);
+  MBErrorCode rval = get_start_handle(hint_start_id, handleUtils.proc_rank(), type, num_ent, start_handle);
   if (MB_SUCCESS != rval) return rval;
   
   if (MBVERTEX == type)
@@ -171,9 +172,9 @@
 
   // Create handles from input parameters
   int dum = 0;
-  start_hint_handle = CREATE_HANDLE(type, procInfo.id(hint_start, proc), dum);
+  start_hint_handle = CREATE_HANDLE(type, handleUtils.create_id(hint_start, proc), dum);
   MBEntityHandle end_hint_handle = start_hint_handle + num_ent - 1;
-  MBEntityHandle last_handle = CREATE_HANDLE( type, procInfo.last_id(proc), dum );
+  MBEntityHandle last_handle = CREATE_HANDLE( type, handleUtils.last_id(proc), dum );
   
   // Check if the handle type can accomodate the requested number of handles
   if (end_hint_handle > last_handle)
@@ -207,8 +208,8 @@
   int dum = 0;
 
     // get first and largest possible handle for specified proc and type
-  MBEntityHandle last_handle = CREATE_HANDLE( type, procInfo.last_id(proc), dum );
-  handle_out = CREATE_HANDLE( type, procInfo.first_id(proc), dum );
+  MBEntityHandle last_handle = CREATE_HANDLE( type, handleUtils.last_id(proc), dum );
+  handle_out = CREATE_HANDLE( type, handleUtils.first_id(proc), dum );
 
     // check that handle space is large enough to accomodate requested
     // number of entities
@@ -229,7 +230,7 @@
     // If previous sequence is for previous processor, then there
     // are currently no handles allocated for this type and proc.
     // Return the first handle.
-  if (procInfo.rank(iter->second->get_end_handle()) < (unsigned)proc)
+  if (handleUtils.rank_from_handle(iter->second->get_end_handle()) < (unsigned)proc)
     return MB_SUCCESS;
 
     // Otherwise try the handle after those currently allocated
@@ -247,8 +248,8 @@
     MBEntityHandle last_start = iter->second->get_start_handle();
       // If this is the first sequence for this processor
     if (iter == mSequenceMap[type].begin() ||
-        procInfo.rank((--iter)->second->get_end_handle()) < (unsigned)proc) {
-      MBEntityHandle first_handle = CREATE_HANDLE( type, procInfo.first_id(proc), dum );
+        handleUtils.rank_from_handle((--iter)->second->get_end_handle()) < (unsigned)proc) {
+      MBEntityHandle first_handle = CREATE_HANDLE( type, handleUtils.first_id(proc), dum );
       if (first_handle - last_start > (MBEntityHandle)num_ent) {
         handle_out = last_start - num_ent;
         return MB_SUCCESS;
@@ -291,7 +292,7 @@
   // see if there is an existing sequence that can take this new vertex
   SeqMap& seq_map = mPartlyFullSequenceMap[MBVERTEX];
   for (SeqMap::iterator i = seq_map.begin(); i != seq_map.end(); ++i) {
-    if (procInfo.rank(i->second->get_start_handle()) == processor_id) {
+    if (handleUtils.rank_from_handle(i->second->get_start_handle()) == processor_id) {
       seq = static_cast<VertexEntitySequence*>(i->second);
       break;
     }
@@ -326,7 +327,7 @@
   for (SeqMap::iterator i = seq_map.begin(); i != seq_map.end(); ++i) {
     ElementEntitySequence* tseq = reinterpret_cast<ElementEntitySequence*>(i->second);
     if (tseq->nodes_per_element() == connlen && 
-        procInfo.rank( tseq->get_start_handle() ) == processor_id) {
+        handleUtils.rank_from_handle( tseq->get_start_handle() ) == processor_id) {
       seq = tseq;
       break;
     }
@@ -364,7 +365,7 @@
   // see if there is an existing sequence that can take this new vertex
   SeqMap& seq_map = mPartlyFullSequenceMap[MBENTITYSET];
   for (SeqMap::iterator i = seq_map.begin(); i != seq_map.end(); ++i) {
-    if (procInfo.rank(i->second->get_start_handle()) == proc_id) {
+    if (handleUtils.rank_from_handle(i->second->get_start_handle()) == proc_id) {
       seq = static_cast<MeshSetSequence*>(i->second);
       break;
     }
@@ -585,7 +586,7 @@
 
 int main()
 {
-  EntitySequenceManager manager( MBProcConfig(0,1) );
+  EntitySequenceManager manager( MBHandleUtils(0,1) );
   MBEntitySequence* seq;
   
   // create some sequences

Modified: MOAB/trunk/EntitySequenceManager.hpp
===================================================================
--- MOAB/trunk/EntitySequenceManager.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/EntitySequenceManager.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -42,7 +42,7 @@
 #endif
 
 #include "MBForward.hpp"
-#include "MBProcConfig.hpp"
+#include "MBHandleUtils.hpp"
 #include <map>
 
 class MBEntitySequence;
@@ -54,7 +54,7 @@
 public:
 
   //! constructor
-  EntitySequenceManager( const MBProcConfig& proc_info );
+  EntitySequenceManager( const MBHandleUtils &handle_utils);
 
   //! destructor
   ~EntitySequenceManager();
@@ -171,7 +171,7 @@
                                  MBEntityID num_ent,
                                  MBEntityHandle& start_handle );
   
-  const MBProcConfig procInfo;
+  const MBHandleUtils handleUtils;
 };
 
 #endif

Modified: MOAB/trunk/MBCore.cpp
===================================================================
--- MOAB/trunk/MBCore.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBCore.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -39,7 +39,13 @@
 #include "MBReaderWriterSet.hpp"
 #include "MBReaderIface.hpp"
 #include "MBWriterIface.hpp"
+#include "MBHandleUtils.hpp"
+
+#ifdef MPI
+#include "mpi.h"
+#include "MBParallelComm.hpp"
 #include "ReadParallel.hpp"
+#endif
 
 #ifdef HDF5_FILE
 #  include "WriteHDF5.hpp"
@@ -51,14 +57,10 @@
 #  include "WriteVtk.hpp"
    typedef WriteVtk DefaultWriter;
 #endif
-#ifdef USE_MPI
-#include "mpi.h"
-#endif
 #include "MBTagConventions.hpp"
 #include "ExoIIUtil.hpp"
 #include "EntitySequence.hpp"
 #include "FileOptions.hpp"
-#include "MBParallelComm.hpp"
 #ifdef LINUX
 # include <dlfcn.h>
 # include <dirent.h>
@@ -99,7 +101,7 @@
 
 //! Constructor
 MBCore::MBCore( int rank, int num_procs ) 
-  : procInfo( rank, num_procs )
+    : handleUtils(rank, num_procs)
 {
 #ifdef XPCOM_MB
   NS_INIT_ISUPPORTS();
@@ -140,7 +142,7 @@
   if (!tagServer)
     return MB_MEMORY_ALLOCATION_FAILED;
   
-  sequenceManager = new EntitySequenceManager( procInfo );
+  sequenceManager = new EntitySequenceManager( handleUtils );
   if (!sequenceManager)
     return MB_MEMORY_ALLOCATION_FAILED;
 
@@ -359,8 +361,14 @@
   std::string parallel_opt;
   rval = opts.get_option( "PARALLEL", parallel_opt);
   if (MB_SUCCESS == rval && !parallel_opt.empty()) {
+#ifdef MPI    
     return ReadParallel(this).load_file(file_name, file_set, opts,
                                         block_id_list, num_blocks);
+#else
+    mError->set_last_error( "PARALLEL option not valid, this instance"
+                            " compiled for serial execution.\n" );
+    return MB_NOT_IMPLEMENTED;
+#endif
   }
 
     // otherwise try using the file extension to select a reader
@@ -482,7 +490,7 @@
   
   if (sequenceManager)
     delete sequenceManager;
-  sequenceManager = new EntitySequenceManager( procInfo );
+  sequenceManager = new EntitySequenceManager( handleUtils );
 
   return result;
 }
@@ -1654,7 +1662,8 @@
                                    const int num_nodes, 
                                    MBEntityHandle &handle)
 {
-  return create_element( type, procInfo.rank(), connectivity, num_nodes, handle );
+  return create_element( type, handleUtils.proc_rank(), 
+                         connectivity, num_nodes, handle );
 }
 
 MBErrorCode MBCore::create_element( const MBEntityType type,
@@ -1667,7 +1676,7 @@
   if(num_nodes < MBCN::VerticesPerEntity(type))
     return MB_FAILURE;
   
-  if (processor_id >= procInfo.size())
+  if (processor_id >= handleUtils.proc_size())
     return MB_INDEX_OUT_OF_RANGE;
   
   MBErrorCode status = sequence_manager()->create_element(type, processor_id, connectivity, num_nodes, handle);
@@ -1680,18 +1689,57 @@
 //! creates a vertex based on coordinates, returns a handle and error code
 MBErrorCode MBCore::create_vertex(const double coords[3], MBEntityHandle &handle )
 {
-  return create_vertex( procInfo.rank(), coords, handle );
+  return create_vertex( handleUtils.proc_rank(), coords, handle );
 }
 
 MBErrorCode MBCore::create_vertex( const unsigned processor_id, const double* coords, MBEntityHandle& handle )
 {
-  if (processor_id >= procInfo.size())
+  if (processor_id >= handleUtils.proc_size())
     return MB_INDEX_OUT_OF_RANGE;
     
     // get an available vertex handle
   return sequence_manager()->create_vertex( processor_id, coords, handle );
 }
 
+MBErrorCode MBCore::create_vertices(const double *coordinates, 
+                                    const int nverts,
+                                    MBRange &entity_handles ) 
+{
+  return create_vertices(handleUtils.proc_rank(), coordinates,
+                         nverts, entity_handles);
+}
+
+MBErrorCode MBCore::create_vertices(const unsigned processor_id,
+                                    const double *coordinates, 
+                                    const int nverts,
+                                    MBRange &entity_handles ) 
+{
+    // Create vertices
+  MBReadUtilIface *read_iface;
+  MBErrorCode result = 
+    this->query_interface("MBReadUtilIface", 
+                          reinterpret_cast<void**>(&read_iface));
+  if (MB_SUCCESS != result) return result;
+  
+  std::vector<double*> arrays;
+  MBEntityHandle start_handle_out = 0;
+  result = read_iface->get_node_arrays( 3, nverts, MB_START_ID, 
+                                        processor_id,
+                                        start_handle_out, arrays);
+  if (MB_SUCCESS != result) return result;
+  for (int i = 0; i < nverts; i++) {
+    arrays[0][i] = coordinates[3*i];
+    arrays[1][i] = coordinates[3*i+1];
+    arrays[2][i] = coordinates[3*i+2];
+  }
+
+  entity_handles.clear();
+  entity_handles.insert(start_handle_out, start_handle_out+nverts-1);
+  
+  return MB_SUCCESS;
+}
+
+
 //! merges two  entities
 MBErrorCode MBCore::merge_entities( MBEntityHandle entity_to_keep, 
                                       MBEntityHandle entity_to_remove,
@@ -2150,7 +2198,7 @@
                                    int ,
                                    int start_proc)
 {
-  if (-1 == start_proc) start_proc = procInfo.rank();
+  if (-1 == start_proc) start_proc = handleUtils.proc_rank();
   return sequence_manager()->create_mesh_set( start_proc, options, ms_handle );
 }
 
@@ -2897,3 +2945,23 @@
                          tag_array,         num_tags,
                          tag_storage,       amortized_tag_storage );
 }
+
+    //! Return the rank of this processor
+const int MBCore::proc_rank() const 
+{
+  return handleUtils.proc_rank();
+}
+
+    //! Return the number of processors
+const int MBCore::proc_size() const 
+{
+  return handleUtils.proc_size();
+}
+
+    //! Return the utility for dealing with entity handles
+const MBHandleUtils &MBCore::handle_utils() const 
+{
+  return handleUtils;
+}
+
+

Modified: MOAB/trunk/MBCore.hpp
===================================================================
--- MOAB/trunk/MBCore.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBCore.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -17,7 +17,7 @@
 #define MB_IMPL_GENERAL_HPP
 
 #include "MBInterface.hpp"
-#include "MBProcConfig.hpp"
+#include "MBHandleUtils.hpp"
 #include <map>
 
 class MBWriteUtil;
@@ -27,6 +27,7 @@
 class TagServer;
 class MBError;
 class MBReaderWriterSet;
+class MBHandleUtils;
 
 #ifdef XPCOM_MB
 
@@ -457,6 +458,28 @@
                                        const double coordinates[3], 
                                        MBEntityHandle &entity_handle );
 
+    //! Create a set of vertices with the specified coordinates
+    /**
+       \param coordinates Array that has 3*n doubles in it.
+       \param nverts Number of vertices to create
+       \param entity_handles MBRange passed back with new vertex handles
+    */
+  virtual MBErrorCode create_vertices(const double *coordinates, 
+                                      const int nverts,
+                                      MBRange &entity_handles );
+
+    //! Create a set of vertices with the specified coordinates and proc id
+    /**
+       \param processor_id Processor id for these vertices
+       \param coordinates Array that has 3*n doubles in it.
+       \param nverts Number of vertices to create
+       \param entity_handles MBRange passed back with new vertex handles
+    */
+  virtual MBErrorCode create_vertices(const unsigned processor_id,
+                                      const double *coordinates, 
+                                      const int nverts,
+                                      MBRange &entity_handles );
+
       //! merges two entities
     virtual MBErrorCode merge_entities(MBEntityHandle entity_to_keep, 
                                         MBEntityHandle entity_to_remove,
@@ -947,9 +970,15 @@
                              unsigned long* amortized_tag_storage = 0 );
                                      
   
-  virtual const MBProcConfig& proc_config() const 
-    { return procInfo; }
+    //! Return the rank of this processor
+  virtual const int proc_rank() const;
 
+    //! Return the number of processors
+  virtual const int proc_size() const;
+
+    //! Return the utility for dealing with entity handles
+  virtual const MBHandleUtils &handle_utils() const;
+
 private:
 
   void estimated_memory_use_internal( const MBRange* ents,
@@ -964,8 +993,6 @@
                             unsigned long* tag_storage,
                             unsigned long* amortized_tag_storage );
 
-  const MBProcConfig procInfo;
-
     //! database init and de-init routines
   MBErrorCode initialize();
   void deinitialize();
@@ -986,6 +1013,9 @@
     //! the overall geometric dimension of this mesh
   int geometricDimension;
 
+    //! utility for dealing with handles, proc's, etc.
+  MBHandleUtils handleUtils;
+
   MBTag materialTag;
   MBTag neumannBCTag;
   MBTag dirichletBCTag;

Added: MOAB/trunk/MBHandleUtils.cpp
===================================================================
--- MOAB/trunk/MBHandleUtils.cpp	                        (rev 0)
+++ MOAB/trunk/MBHandleUtils.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,110 @@
+#include "MBHandleUtils.hpp"
+#include "MBInternals.hpp"
+#include "assert.h"
+
+MBHandleUtils::MBHandleUtils(int proc_rank, int proc_size) : 
+    procRank((int) proc_rank),
+    procSize((int) proc_size),
+    procWidth( ceil_log_2(procSize)),
+    idWidth( MB_ID_WIDTH - procWidth ),
+    idMask( MB_ID_MASK >> procWidth ),
+    procMask( ~(MB_TYPE_MASK|idMask) )
+{
+  assert(0 <= proc_rank && 0 <= proc_size);
+}
+
+MBEntityHandle MBHandleUtils::create_handle( MBEntityType type, 
+                                             MBEntityID sub_id, 
+                                             unsigned proc ) const
+{
+  int err;
+  return CREATE_HANDLE( type, create_id( sub_id, proc ), err );
+}
+
+MBRange MBHandleUtils::subset_by_proc( unsigned proc, 
+                                       const MBRange &range) const
+{
+  int junk;
+  MBRange result;
+  MBRange::iterator insert_pos = result.begin();
+  MBRange::const_pair_iterator iter;
+  MBEntityHandle s, e;
+  
+  for (iter = range.const_pair_begin(); iter != range.const_pair_end(); ++iter)
+  {
+    const MBEntityType beg_type = TYPE_FROM_HANDLE(iter->first),
+                       end_type = TYPE_FROM_HANDLE(iter->second);
+    const unsigned beg_rank = rank_from_handle(iter->first), 
+      end_rank = rank_from_handle(iter->second);
+    
+    if (beg_type != end_type) {
+      if (beg_rank <= proc) {
+        s = beg_rank == proc ? iter->first : 
+            CREATE_HANDLE( beg_type,    create_id(0,proc), junk );
+        e = CREATE_HANDLE( beg_type, last_id(proc), junk );
+        insert_pos = result.insert( insert_pos, s, e );
+      }
+      MBEntityType t = beg_type;
+      for (++t; t != end_type; ++t) {
+        s = CREATE_HANDLE( t,    create_id(0,proc), junk );
+        e = CREATE_HANDLE( t, last_id(proc), junk );
+        insert_pos = result.insert( insert_pos, s, e );
+      }
+      if (end_rank >= proc) {
+        e = end_rank == proc ? iter->second :
+            CREATE_HANDLE( end_type, last_id(proc), junk );
+        s = CREATE_HANDLE( end_type,    create_id(0,proc), junk );
+        insert_pos = result.insert( insert_pos, s, e );
+      }
+    }
+    else if (beg_rank <= proc && end_rank >= proc) {
+      s = (beg_rank == proc) ? iter->first  : 
+        CREATE_HANDLE( beg_type, create_id(0,proc), junk );
+      e = (end_rank == proc) ? iter->second : 
+        CREATE_HANDLE( beg_type, last_id(proc), junk );
+      insert_pos = result.insert( insert_pos, s, e );
+    }
+  }
+  
+  return result;
+}
+
+MBRange::const_iterator MBHandleUtils::lower_bound( MBEntityType type, 
+                                                    unsigned proc, 
+                                                    const MBRange &range) const
+{
+  int err;
+  MBEntityHandle h = CREATE_HANDLE( type, create_id(0,proc), err );
+  return err ? range.end() : 
+    MBRange::lower_bound(range.begin(), range.end(), h);
+}
+
+MBRange::const_iterator MBHandleUtils::upper_bound( MBEntityType type, 
+                                                    unsigned proc, 
+                                                    const MBRange &range) const
+{
+  int err;
+  MBEntityHandle h = CREATE_HANDLE( type, last_id(proc), err );
+  return err ? range.end() : 
+    MBRange::upper_bound(range.begin(), range.end(), h);
+}
+
+std::pair<MBRange::const_iterator,MBRange::const_iterator>
+MBHandleUtils::equal_range( MBEntityType type, unsigned proc, 
+                            const MBRange &range) const
+{
+  std::pair<MBRange::const_iterator, MBRange::const_iterator> iters;
+  int err;
+  MBEntityHandle h;
+
+  h = CREATE_HANDLE( type, create_id(0,proc), err );
+  iters.first = err ? range.end() : 
+    MBRange::lower_bound(range.begin(), range.end(), h);  
+  
+  h = CREATE_HANDLE( type, last_id(proc), err );
+  iters.second = err ? range.end() : 
+    MBRange::upper_bound( iters.first, range.end(), h );
+  
+  return iters;
+}
+

Added: MOAB/trunk/MBHandleUtils.hpp
===================================================================
--- MOAB/trunk/MBHandleUtils.hpp	                        (rev 0)
+++ MOAB/trunk/MBHandleUtils.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,90 @@
+#ifndef MBHANDLEUTILS_HPP
+#define MBHANDLEUTILS_HPP
+
+#include "MBInterface.hpp"
+#include "MBRange.hpp"
+
+class MBHandleUtils 
+{
+public:
+  MBHandleUtils(int proc_rank, int proc_size);
+
+    //! Get processor rank
+  unsigned proc_rank() const {return procRank;}
+  
+    //! Get processor size
+  unsigned proc_size() const {return procSize;}
+      
+    //! Get CPU number from handle
+  unsigned rank_from_handle( MBEntityHandle handle ) const
+    { return (handle & procMask) >> idWidth; }
+      
+    //! Get CPU number from ID
+  unsigned rank_from_id( MBEntityID id ) const
+    { return id >> idWidth; }
+      
+    //! Get maximum entity ID that can be stored in a
+    //! a handle, allowing for the processor number
+  MBEntityID max_id() const
+    { return idMask; }
+      
+    //! Create the ID portion of a handle by combining
+    //! an actual ID and a processor number
+  MBEntityID create_id( MBEntityID sub_id, unsigned proc ) const
+    { return ((MBEntityHandle)proc << idWidth) | (MBEntityHandle)sub_id; }
+      
+    //! Extract non-rank portion of entity ID from handle
+  MBEntityID id_from_handle( MBEntityHandle h ) const
+    { return h & idMask; }
+      
+  MBEntityID first_id( unsigned proc ) const
+    { return create_id( 1, proc ); }
+    
+  MBEntityID last_id( unsigned proc ) const
+    { return create_id( max_id(), proc ); }
+      
+    //! Create an entity handle given type, rank, and id
+  MBEntityHandle create_handle( MBEntityType type, 
+                                MBEntityID sub_id, 
+                                unsigned proc ) const;
+
+    //! return a subset with corresponding proc values in handles
+  MBRange subset_by_proc( unsigned proc, 
+                          const MBRange &range) const;
+  
+    //! return a lower bound for handles with corresponding proc values
+  MBRange::const_iterator lower_bound( MBEntityType type, 
+                                       unsigned proc, 
+                                       const MBRange &range) const;
+  
+    //! return an upper bound for handles with corresponding proc values
+  MBRange::const_iterator upper_bound( MBEntityType type, 
+                                       unsigned proc, 
+                                       const MBRange &range) const;
+  
+    //! return an equal range for handles with corresponding proc values
+  std::pair<MBRange::const_iterator,MBRange::const_iterator>
+  equal_range( MBEntityType type, unsigned proc, 
+               const MBRange &range) const;
+  
+private:
+/** Calculate ceiling of log 2 of a positive integer */
+  static unsigned ceil_log_2( unsigned n );
+
+  unsigned procRank;    //!< ID of this processor
+  unsigned procSize;    //!< Total number of processors
+  unsigned procWidth;   //!< Number of bits in handle for processor ID
+  unsigned idWidth;     //!< Number of bits in handle for entity ID
+  MBEntityHandle idMask;
+  MBEntityHandle procMask;
+  
+};
+
+inline unsigned MBHandleUtils::ceil_log_2( unsigned n )
+{
+  unsigned result;
+  for (result = 0; n > (((MBEntityHandle)1)<<result); ++result);
+  return result;
+}
+
+#endif

Modified: MOAB/trunk/MBInterface.hpp
===================================================================
--- MOAB/trunk/MBInterface.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBInterface.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -60,6 +60,7 @@
                                          0xbd, 0xf6, 0xc3, 0x4e, 0xf7, 0x1f, 0x5a, 0x52 );
 
 
+class MBHandleUtils;
 
 #if defined(XPCOM_MB)
 class NS_NO_VTABLE MBInterface : public nsISupports {
@@ -775,6 +776,39 @@
   virtual MBErrorCode create_vertex(const double coordinates[3], 
                                     MBEntityHandle &entity_handle ) = 0;
 
+      /**\brief Create vertex given CPU ID and coordinates.
+       *
+       * Create a vertex with the specified processor ID
+       *\param processor_id The ID of the CPU on owning the element
+       *\param coordinates The vertex coordinates
+       *\param entity_handle Output handle value.
+       */
+    virtual MBErrorCode create_vertex( const unsigned processor_id,
+                                       const double coordinates[3], 
+                                       MBEntityHandle &entity_handle ) = 0;
+
+    //! Create a set of vertices with the specified coordinates
+    /**
+       \param coordinates Array that has 3*n doubles in it.
+       \param nverts Number of vertices to create
+       \param entity_handles MBRange passed back with new vertex handles
+    */
+  virtual MBErrorCode create_vertices(const double *coordinates, 
+                                      const int nverts,
+                                      MBRange &entity_handles ) = 0;
+
+    //! Create a set of vertices with the specified coordinates and proc id
+    /**
+       \param processor_id Processor id for these vertices
+       \param coordinates Array that has 3*n doubles in it.
+       \param nverts Number of vertices to create
+       \param entity_handles MBRange passed back with new vertex handles
+    */
+  virtual MBErrorCode create_vertices(const unsigned processor_id,
+                                      const double *coordinates, 
+                                      const int nverts,
+                                      MBRange &entity_handles ) = 0;
+
     //! Merge two entities into a single entity
     /** Merge two entities into a single entities, with <em>entity_to_keep</em> receiving
         adjacencies that were on <em>entity_to_remove</em>.
@@ -1486,10 +1520,14 @@
                              unsigned long* amortized_tag_storage = 0 ) = 0;
                                      
   
-    //! Return the MBProcConfig object for this instance, contains the rank and
-    //! size if running in parallel
-  virtual const MBProcConfig& proc_config() const = 0;
+    //! Return the rank of this processor
+  virtual const int proc_rank() const = 0;
 
+    //! Return the number of processors
+  virtual const int proc_size() const = 0;
+
+    //! Return the utility for dealing with entity handles
+  virtual const MBHandleUtils &handle_utils() const = 0;
 };
 
 //! predicate for STL algorithms.  Returns true if the entity handle is

Deleted: MOAB/trunk/MBParallelComm.cpp
===================================================================
--- MOAB/trunk/MBParallelComm.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBParallelComm.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,1071 +0,0 @@
-#include "MBInterface.hpp"
-#include "MBParallelComm.hpp"
-#include "MBWriteUtilIface.hpp"
-#include "MBReadUtilIface.hpp"
-#include "EntitySequenceManager.hpp"
-#include "EntitySequence.hpp"
-#include "TagServer.hpp"
-#include "MBTagConventions.hpp"
-
-#include <assert.h>
-
-#ifdef USE_MPI
-#include "mpi.h"
-#endif
-
-#define INITIAL_BUFF_SIZE 1024
-
-#define PACK_INT(buff, int_val) {int tmp_val = int_val; PACK_INTS(buff, &tmp_val, 1);}
-
-#define PACK_INTS(buff, int_val, num) {memcpy(buff, int_val, num*sizeof(int)); buff += num*sizeof(int);}
-
-#define PACK_DBL(buff, dbl_val, num) {memcpy(buff, dbl_val, num*sizeof(double)); buff += num*sizeof(double);}
-
-#define PACK_EH(buff, eh_val, num) {memcpy(buff, eh_val, num*sizeof(MBEntityHandle)); buff += num*sizeof(MBEntityHandle);}
-
-#define PACK_CHAR_64(buff, char_val) {strcpy((char*)buff, char_val); buff += 64;}
-
-#define PACK_VOID(buff, val, num) {memcpy(buff, val, num); buff += num;}
-
-#define PACK_RANGE(buff, rng) {int num_subs = num_subranges(rng); PACK_INTS(buff, &num_subs, 1); \
-          for (MBRange::const_pair_iterator cit = rng.const_pair_begin(); cit != rng.const_pair_end(); cit++) { \
-            MBEntityHandle eh = (*cit).first; PACK_EH(buff_ptr, &eh, 1); \
-            eh = (*cit).second; PACK_EH(buff_ptr, &eh, 1);}}
-
-#define UNPACK_INT(buff, int_val) {UNPACK_INTS(buff, &int_val, 1);}
-
-#define UNPACK_INTS(buff, int_val, num) {memcpy(int_val, buff, num*sizeof(int)); buff += num*sizeof(int);}
-
-#define UNPACK_DBL(buff, dbl_val, num) {memcpy(dbl_val, buff, num*sizeof(double)); buff += num*sizeof(double);}
-
-#define UNPACK_EH(buff, eh_val, num) {memcpy(eh_val, buff, num*sizeof(MBEntityHandle)); buff += num*sizeof(MBEntityHandle);}
-
-#define UNPACK_CHAR_64(buff, char_val) {strcpy(char_val, (char*)buff); buff += 64;}
-
-#define UNPACK_VOID(buff, val, num) {memcpy(val, buff, num); buff += num;}
-
-#define UNPACK_RANGE(buff, rng) {int num_subs; UNPACK_INTS(buff, &num_subs, 1); MBEntityHandle _eh[2]; \
-          for (int i = 0; i < num_subs; i++) { UNPACK_EH(buff_ptr, _eh, 2); rng.insert(_eh[0], _eh[1]);}}
-
-#define RR if (MB_SUCCESS != result) return result
-
-MBParallelComm::MBParallelComm(MBInterface *impl, TagServer *tag_server, 
-                               EntitySequenceManager *sequence_manager) 
-    : mbImpl(impl), procInfo(impl->proc_config()), tagServer(tag_server), sequenceManager(sequence_manager)
-{
-  myBuffer.reserve(INITIAL_BUFF_SIZE);
-}
-
-MBParallelComm::MBParallelComm(MBInterface *impl, TagServer *tag_server, 
-                               EntitySequenceManager *sequence_manager,
-                               std::vector<unsigned char> &tmp_buff) 
-    : mbImpl(impl), procInfo(impl->proc_config()), tagServer(tag_server), sequenceManager(sequence_manager)
-{
-  myBuffer.swap(tmp_buff);
-}
-
-//! assign a global id space, for largest-dimension or all entities (and
-//! in either case for vertices too)
-MBErrorCode MBParallelComm::assign_global_ids(const int dimension, 
-                                              const int start_id,
-                                              const bool largest_dim_only) 
-{
-  MBRange entities[4];
-  int local_num_elements[4];
-  MBErrorCode result;
-  for (int dim = 0; dim <= dimension; dim++) {
-    if (dim == 0 || !largest_dim_only || dim == dimension) {
-      result = mbImpl->get_entities_by_dimension(0, dim, entities[dim]); RR;
-    }
-
-      // need to filter out non-locally-owned entities!!!
-    MBRange dum_range;
-    for (MBRange::iterator rit = entities[dim].begin(); rit != entities[dim].end(); rit++)
-      if (procInfo.rank(*rit) != procInfo.rank()) dum_range.insert(*rit);
-    entities[dim] = entities[dim].subtract(dum_range);
-    
-    local_num_elements[dim] = entities[dim].size();
-  }
-  
-    // communicate numbers
-  std::vector<int> num_elements(procInfo.size()*4);
-#ifdef USE_MPI
-  if (procInfo.size() > 1) {
-    int retval = MPI_Alltoall(local_num_elements, 4, MPI_INTEGER,
-                              &num_elements[0], procInfo.size()*4, 
-                              MPI_INTEGER, MPI_COMM_WORLD);
-    if (0 != retval) return MB_FAILURE;
-  }
-  else
-#endif
-    for (int dim = 0; dim < 4; dim++) num_elements[dim] = local_num_elements[dim];
-  
-    // my entities start at one greater than total_elems[d]
-  int total_elems[4] = {start_id, start_id, start_id, start_id};
-  
-  for (unsigned int proc = 0; proc < procInfo.rank(); proc++) {
-    for (int dim = 0; dim < 4; dim++) total_elems[dim] += num_elements[4*proc + dim];
-  }
-  
-    //.assign global ids now
-  MBTag gid_tag;
-  int zero = 0;
-  result = mbImpl->tag_create(GLOBAL_ID_TAG_NAME, 1, MB_TAG_DENSE, MB_TYPE_INTEGER, gid_tag,
-                              &zero, true);
-  if (MB_SUCCESS != result && MB_ALREADY_ALLOCATED != result) return result;
-  
-  for (int dim = 0; dim < 4; dim++) {
-    if (entities[dim].empty()) continue;
-    num_elements.reserve(entities[dim].size());
-    int i = 0;
-    for (MBRange::iterator rit = entities[dim].begin(); rit != entities[dim].end(); rit++)
-      num_elements[i++] = total_elems[dim]++;
-    
-    result = mbImpl->tag_set_data(gid_tag, entities[dim], &num_elements[0]); RR;
-  }
-  
-  return MB_SUCCESS;
-}
-  
-MBErrorCode MBParallelComm::communicate_entities(const int from_proc, const int to_proc,
-                                                 MBRange &entities,
-                                                 const bool adjacencies,
-                                                 const bool tags) 
-{
-#ifndef USE_MPI
-  return MB_FAILURE;
-#else
-  
-  MBErrorCode result = MB_SUCCESS;
-  
-    // if I'm the from, do the packing and sending
-  if ((int)procInfo.rank() == from_proc) {
-    allRanges.clear();
-    vertsPerEntity.clear();
-    setRange.clear();
-    setRanges.clear();
-    allTags.clear();
-    setSizes.clear();
-    optionsVec.clear();
-    setPcs.clear();
-
-    MBRange whole_range;
-
-    int buff_size;
-    
-    result = pack_buffer(entities, adjacencies, tags, true, whole_range, buff_size); RR;
-
-      // if the message is large, send a first message to tell how large
-    if (INITIAL_BUFF_SIZE < buff_size) {
-      int tmp_buff_size = -buff_size;
-      MPI_Request send_req;
-      int success = MPI_Isend(&tmp_buff_size, sizeof(int), MPI_UNSIGNED_CHAR, to_proc, 
-                              0, MPI_COMM_WORLD, &send_req);
-      if (!success) return MB_FAILURE;
-    }
-    
-      // allocate space in the buffer
-    myBuffer.reserve(buff_size);
-
-      // pack the actual buffer
-    int actual_buff_size;
-    result = pack_buffer(entities, adjacencies, tags, false, whole_range, actual_buff_size); RR;
-    
-      // send it
-    MPI_Request send_req;
-    int success = MPI_Isend(&myBuffer[0], actual_buff_size, MPI_UNSIGNED_CHAR, to_proc, 
-                            0, MPI_COMM_WORLD, &send_req);
-    if (!success) return MB_FAILURE;
-  }
-  else if ((int)procInfo.rank() == to_proc) {
-    int buff_size;
-    
-      // get how much to allocate
-    MPI_Status status;
-    int success = MPI_Recv(&myBuffer[0], myBuffer.size(), MPI_UNSIGNED_CHAR, from_proc, 
-                           MPI_ANY_TAG, MPI_COMM_WORLD, &status);
-    int num_recd;
-    success = MPI_Get_count(&status, MPI_UNSIGNED_CHAR, &num_recd);
-    
-    if (sizeof(int) == num_recd && 0 > *((int*)&myBuffer[0])) {
-        // this was just the size of the next message; prepare buffer then receive that message
-      buff_size = myBuffer[0];
-      myBuffer.reserve(buff_size);
-    
-      // receive the real message
-      success = MPI_Recv(&myBuffer[0], buff_size, MPI_UNSIGNED_CHAR, from_proc, 
-                         MPI_ANY_TAG, MPI_COMM_WORLD, &status);
-    }
-    
-      // unpack the buffer
-    result = unpack_buffer(entities); RR;
-  }
-  
-  return result;
-
-#endif
-}
-
-MBErrorCode MBParallelComm::broadcast_entities( const int from_proc,
-                                                MBRange &entities,
-                                                const bool adjacencies,
-                                                const bool tags) 
-{
-#ifndef USE_MPI
-  return MB_FAILURE;
-#else
-  
-  MBErrorCode result = MB_SUCCESS;
-  int success;
-  MBRange whole_range;
-  int buff_size;
-  
-  allRanges.clear();
-  vertsPerEntity.clear();
-  setRange.clear();
-  setRanges.clear();
-  allTags.clear();
-  setSizes.clear();
-  optionsVec.clear();
-  setPcs.clear();
-
-  if ((int)procInfo.rank() == from_proc) {
-    result = pack_buffer( entities, adjacencies, tags, true, whole_range, buff_size ); RR;
-  }
-
-  success = MPI_Bcast( &buff_size, 1, MPI_INT, from_proc, MPI_COMM_WORLD );
-  if (MPI_SUCCESS != success)
-    return MB_FAILURE;
-  
-  if (!buff_size) // no data
-    return MB_SUCCESS;
-  
-  myBuffer.reserve( buff_size );
-  
-  if ((int)procInfo.rank() == from_proc) {
-    int actual_buffer_size;
-    result = pack_buffer( entities, adjacencies, tags, false, whole_range, actual_buffer_size ); RR;
-  }
-
-  success = MPI_Bcast( &myBuffer[0], buff_size, MPI_UNSIGNED_CHAR, from_proc, MPI_COMM_WORLD );
-  if (MPI_SUCCESS != success)
-    return MB_FAILURE;
-  
-  if ((int)procInfo.rank() != from_proc) {
-    result = unpack_buffer( entities ); RR;
-  }
-
-  return MB_SUCCESS;
-#endif
-}
-
-MBErrorCode MBParallelComm::pack_buffer(MBRange &entities, 
-                                        const bool adjacencies,
-                                        const bool tags,
-                                        const bool just_count,
-                                        MBRange &whole_range,
-                                        int &buff_size) 
-{
-    // pack the buffer with the entity ranges, adjacencies, and tags sections
-  MBErrorCode result;
-
-  buff_size = 0;
-  MBRange::const_iterator rit;
-  unsigned char *buff_ptr = NULL;
-  if (!just_count) buff_ptr = &myBuffer[0];
-  
-    // entities
-  result = pack_entities(entities, rit, whole_range, buff_ptr, buff_size, just_count); RR;
-  
-    // sets
-  int tmp_size;
-  result = pack_sets(entities, rit, whole_range, buff_ptr, tmp_size, just_count); RR;
-  buff_size += tmp_size;
-  
-    // adjacencies
-  if (adjacencies) {
-    result = pack_adjacencies(entities, rit, whole_range, buff_ptr, tmp_size, just_count); RR;
-    buff_size += tmp_size;
-  }
-    
-    // tags
-  if (tags) {
-    result = pack_tags(entities, rit, whole_range, buff_ptr, tmp_size, just_count); RR;
-    buff_size += tmp_size;
-  }
-
-  return result;
-}
- 
-MBErrorCode MBParallelComm::unpack_buffer(MBRange &entities) 
-{
-  if (myBuffer.capacity() == 0) return MB_FAILURE;
-  
-  unsigned char *buff_ptr = &myBuffer[0];
-  MBErrorCode result = unpack_entities(buff_ptr, entities); RR;
-  result = unpack_sets(buff_ptr, entities); RR;
-  result = unpack_tags(buff_ptr, entities); RR;
-  
-  return MB_SUCCESS;
-}
-
-int MBParallelComm::num_subranges(const MBRange &this_range)
-{
-    // ok, have all the ranges we'll pack; count the subranges
-  int num_sub_ranges = 0;
-  for (MBRange::const_pair_iterator pit = this_range.const_pair_begin(); 
-       pit != this_range.const_pair_end(); pit++)
-    num_sub_ranges++;
-
-  return num_sub_ranges;
-}
-
-MBErrorCode MBParallelComm::pack_entities(MBRange &entities,
-                                          MBRange::const_iterator &start_rit,
-                                          MBRange &whole_range,
-                                          unsigned char *&buff_ptr,
-                                          int &count,
-                                          const bool just_count) 
-{
-  count = 0;
-  unsigned char *orig_buff_ptr = buff_ptr;
-  MBErrorCode result;
-  MBWriteUtilIface *wu = NULL;
-  if (!just_count) {
-    result = mbImpl->query_interface(std::string("MBWriteUtilIface"), reinterpret_cast<void**>(&wu)); RR;
-  }
-  
-    // pack vertices
-  if (just_count) {
-    entTypes.push_back(MBVERTEX);
-    vertsPerEntity.push_back(1);
-    allRanges.push_back(entities.subset(MBVERTEX));
-  }
-  else {
-    PACK_INT(buff_ptr, MBVERTEX);
-    PACK_RANGE(buff_ptr, allRanges[0]);
-    int num_verts = allRanges[0].size();
-    std::vector<double*> coords(3);
-    for (int i = 0; i < 3; i++)
-      coords[i] = reinterpret_cast<double*>(buff_ptr + i * num_verts * sizeof(double));
-
-    assert(NULL != wu);
-    
-    result = wu->get_node_arrays(3, num_verts, allRanges[0], 0, 0, coords); RR;
-
-    buff_ptr += 3 * num_verts * sizeof(double);
-
-    whole_range = allRanges[0];
-  }
-
-    // place an iterator at the first non-vertex entity
-  if (!allRanges[0].empty()) {
-    start_rit = entities.find(*allRanges[0].rbegin());
-    start_rit++;
-  }
-  else {
-    start_rit = entities.begin();
-  }
-  
-  MBRange::const_iterator end_rit = start_rit;
-  if (allRanges[0].size() == entities.size()) return MB_SUCCESS;
-
-  std::vector<MBRange>::iterator allr_it = allRanges.begin();
-  
-    // pack entities
-  if (just_count) {    
-
-      // get all ranges of entities that have different #'s of vertices or different types
-    while (end_rit != entities.end() && TYPE_FROM_HANDLE(*start_rit) != MBENTITYSET) {
-
-        // get the sequence holding this entity
-      MBEntitySequence *seq;
-      ElementEntitySequence *eseq;
-      result = sequenceManager->find(*start_rit, seq); RR;
-      if (NULL == seq) return MB_FAILURE;
-      eseq = dynamic_cast<ElementEntitySequence*>(seq);
-
-        // if type and nodes per element change, start a new range
-      if (eseq->get_type() != *entTypes.rbegin() || (int) eseq->nodes_per_element() != *vertsPerEntity.rbegin()) {
-        entTypes.push_back(eseq->get_type());
-        vertsPerEntity.push_back(eseq->nodes_per_element());
-        allRanges.push_back(MBRange());
-        allr_it++;
-      }
-    
-        // get position in entities list one past end of this sequence
-      end_rit = entities.lower_bound(start_rit, entities.end(), eseq->get_end_handle()+1);
-
-        // put these entities in the last range
-      eseq->get_entities(*allRanges.rbegin());
-      whole_range.merge(*allRanges.rbegin());
-      
-        // now start where we last left off
-      start_rit = end_rit;
-    }
-
-      // update vertex range and count those data, now that we know which entities get communicated
-    result = mbImpl->get_adjacencies(whole_range, 0, false, allRanges[0], MBInterface::UNION); RR;
-    whole_range.merge(allRanges[0]);
-    count += 3 * sizeof(double) * allRanges[0].size();
-    
-      // space for the ranges
-    std::vector<MBRange>::iterator vit = allRanges.begin();
-    std::vector<int>::iterator iit = vertsPerEntity.begin();
-    std::vector<MBEntityType>::iterator eit = entTypes.begin();
-    for (; vit != allRanges.end(); vit++, iit++, eit++) {
-        // subranges of entities
-      count += 2*sizeof(MBEntityHandle)*num_subranges(*vit);
-        // connectivity of subrange
-      if (iit != vertsPerEntity.begin()) {
-        if (*eit != MBPOLYGON && *eit != MBPOLYHEDRON) 
-            // for non-poly's: #verts/ent * #ents * sizeof handle
-          count += *iit * (*vit).size() * sizeof(MBEntityHandle);
-          // for poly's:  length of conn list * handle size + #ents * int size (for offsets)
-        else count += *iit * sizeof(MBEntityHandle) + (*vit).size() * sizeof(int);
-      }
-    }
-      //                                num_verts per subrange    ent type in subrange
-    count += (vertsPerEntity.size() + 1) * (sizeof(int) + sizeof(MBEntityType));
-
-      // extra entity type at end
-    count += sizeof(int);
-  }
-  else {
-      // for each range beyond the first
-    allr_it++;
-    std::vector<int>::iterator nv_it = vertsPerEntity.begin();
-    std::vector<MBEntityType>::iterator et_it = entTypes.begin();
-    nv_it++; et_it++;
-    
-    for (; allr_it != allRanges.end(); allr_it++, nv_it++, et_it++) {
-        // pack the entity type
-      PACK_INT(buff_ptr, *et_it);
-      
-        // pack the range
-      PACK_RANGE(buff_ptr, (*allr_it));
-
-        // pack the nodes per entity
-      PACK_INT(buff_ptr, *nv_it);
-      
-        // pack the connectivity
-      const MBEntityHandle *connect;
-      int num_connect;
-      if (*et_it == MBPOLYGON || *et_it == MBPOLYHEDRON) {
-        std::vector<int> num_connects;
-        for (MBRange::const_iterator rit = allr_it->begin(); rit != allr_it->end(); rit++) {
-          result = mbImpl->get_connectivity(*rit, connect, num_connect); RR;
-          num_connects.push_back(num_connect);
-          PACK_EH(buff_ptr, &connect[0], num_connect);
-        }
-        PACK_INTS(buff_ptr, &num_connects[0], num_connects.size());
-      }
-      else {
-        for (MBRange::const_iterator rit = allr_it->begin(); rit != allr_it->end(); rit++) {
-          result = mbImpl->get_connectivity(*rit, connect, num_connect); RR;
-          assert(num_connect == *nv_it);
-          PACK_EH(buff_ptr, &connect[0], num_connect);
-        }
-      }
-
-      whole_range.merge(*allr_it);
-    }
-
-      // pack MBMAXTYPE to indicate end of ranges
-    PACK_INT(buff_ptr, MBMAXTYPE);
-
-    count = buff_ptr - orig_buff_ptr;
-  }
-  
-  return MB_SUCCESS;
-}
-
-MBErrorCode MBParallelComm::unpack_entities(unsigned char *&buff_ptr,
-                                            MBRange &entities) 
-{
-  MBErrorCode result;
-  bool done = false;
-  MBReadUtilIface *ru = NULL;
-  result = mbImpl->query_interface(std::string("MBReadUtilIface"), reinterpret_cast<void**>(&ru)); RR;
-  
-  while (!done) {
-    MBEntityType this_type;
-    UNPACK_INT(buff_ptr, this_type);
-    assert(this_type >= MBVERTEX && 
-           (this_type == MBMAXTYPE || this_type < MBENTITYSET));
-
-      // MBMAXTYPE signifies end of entities data
-    if (MBMAXTYPE == this_type) break;
-    
-      // get the range
-    MBRange this_range;
-    UNPACK_RANGE(buff_ptr, this_range);
-    
-    if (MBVERTEX == this_type) {
-        // unpack coords
-      int num_verts = this_range.size();
-      std::vector<double*> coords(3*num_verts);
-      for (MBRange::const_pair_iterator pit = this_range.const_pair_begin(); 
-           pit != this_range.const_pair_end(); pit++) {
-          // allocate handles
-        int start_id = procInfo.id((*pit).first);
-        int start_proc = procInfo.rank((*pit).first);
-        MBEntityHandle actual_start;
-        int tmp_num_verts = (*pit).second - (*pit).first + 1;
-        result = ru->get_node_arrays(3, tmp_num_verts, start_id, start_proc, actual_start,
-                                     coords); RR;
-        if (actual_start != (*pit).first)
-          return MB_FAILURE;
-
-        entities.insert((*pit).first, (*pit).second);
-        
-          // unpack the buffer data directly into coords
-        for (int i = 0; i < 3; i++) 
-          memcpy(coords[i], buff_ptr+i*num_verts*sizeof(double), 
-                 tmp_num_verts*sizeof(double));
-
-        buff_ptr += tmp_num_verts * sizeof(double);
-      }
-
-        // increment the buffer ptr beyond the y and z coords
-      buff_ptr += 2 * num_verts * sizeof(double);
-    }
-
-    else {
-      
-      int verts_per_entity;
-      
-        // unpack the nodes per entity
-      UNPACK_INT(buff_ptr, verts_per_entity);
-      
-        // unpack the connectivity
-      for (MBRange::const_pair_iterator pit = this_range.const_pair_begin(); 
-           pit != this_range.const_pair_end(); pit++) {
-          // allocate handles, connect arrays
-        int start_id = procInfo.id((*pit).first);
-        int start_proc = procInfo.rank((*pit).first);
-        MBEntityHandle actual_start;
-        int num_elems = (*pit).second - (*pit).first + 1;
-        MBEntityHandle *connect;
-        int *connect_offsets;
-        if (this_type == MBPOLYGON || this_type == MBPOLYHEDRON)
-          result = ru->get_poly_element_array(num_elems, verts_per_entity, this_type,
-                                              start_id, start_proc, actual_start,
-                                              connect_offsets, connect); RR;
-        else
-          result = ru->get_element_array(num_elems, verts_per_entity, this_type,
-                                         start_id, start_proc, actual_start,
-                                         connect); RR;
-
-          // copy connect arrays
-        if (this_type != MBPOLYGON && this_type != MBPOLYHEDRON) {
-          UNPACK_EH(buff_ptr, connect, num_elems * verts_per_entity);
-        }
-        else {
-          UNPACK_EH(buff_ptr, connect, verts_per_entity);
-          assert(NULL != connect_offsets);
-            // and the offsets
-          UNPACK_INTS(buff_ptr, connect_offsets, num_elems);
-        }
-
-        entities.insert((*pit).first, (*pit).second);
-      }
-      
-    }
-  }
-  
-  return MB_SUCCESS;
-}
-
-MBErrorCode MBParallelComm::pack_sets(MBRange &entities,
-                                      MBRange::const_iterator &start_rit,
-                                      MBRange &whole_range,
-                                      unsigned char *&buff_ptr,
-                                      int &count,
-                                      const bool just_count)
-{
-  
-    // now the sets; assume any sets the application wants to pass are in the entities list
-  count = 0;
-  unsigned char *orig_buff_ptr = buff_ptr;
-  MBErrorCode result;
-
-  if (just_count) {
-    for (; start_rit != entities.end(); start_rit++) {
-      setRange.insert(*start_rit);
-      count += sizeof(MBEntityHandle);
-    
-      unsigned int options;
-      result = mbImpl->get_meshset_options(*start_rit, options); RR;
-      optionsVec.push_back(options);
-      count += sizeof(unsigned int);
-    
-      if (options & MESHSET_SET) {
-          // range-based set; count the subranges
-        setRanges.push_back(MBRange());
-        result = mbImpl->get_entities_by_handle(*start_rit, *setRanges.rbegin()); RR;
-        count += 2 * sizeof(MBEntityHandle) * num_subranges(*setRanges.rbegin()) + sizeof(int);
-      }
-      else if (options & MESHSET_ORDERED) {
-          // just get the number of entities in the set
-        int num_ents;
-        result = mbImpl->get_number_entities_by_handle(*start_rit, num_ents); RR;
-        count += sizeof(int);
-        
-        setSizes.push_back(num_ents);
-        count += sizeof(MBEntityHandle) * num_ents + sizeof(int);
-      }
-      whole_range.insert(*start_rit);
-
-        // get numbers of parents/children
-      int num_par, num_ch;
-      result = mbImpl->num_child_meshsets(*start_rit, &num_ch); RR;
-      result = mbImpl->num_parent_meshsets(*start_rit, &num_par); RR;
-      count += 2*sizeof(int) + (num_par + num_ch) * sizeof(MBEntityHandle);
-    
-    }
-  }
-  else {
-    
-    std::vector<unsigned int>::const_iterator opt_it = optionsVec.begin();
-    std::vector<MBRange>::const_iterator rit = setRanges.begin();
-    std::vector<int>::const_iterator mem_it = setSizes.begin();
-    static std::vector<MBEntityHandle> members;
-
-      // set handle range
-    PACK_RANGE(buff_ptr, setRange);
-
-    for (MBRange::const_iterator set_it = setRange.begin(); set_it != setRange.end(); 
-         set_it++, opt_it++) {
-        // option value
-      PACK_VOID(buff_ptr, &(*opt_it), sizeof(unsigned int));
-      
-      if ((*opt_it) & MESHSET_SET) {
-          // pack entities as a range
-        PACK_RANGE(buff_ptr, (*rit));
-        rit++;
-      }
-      else if ((*opt_it) & MESHSET_ORDERED) {
-          // pack entities as vector, with length
-        PACK_INT(buff_ptr, *mem_it);
-        members.clear();
-        result = mbImpl->get_entities_by_handle(*set_it, members); RR;
-        PACK_EH(buff_ptr, &members[0], *mem_it);
-        mem_it++;
-      }
-      
-        // pack parents
-      members.clear();
-      result = mbImpl->get_parent_meshsets(*set_it, members); RR;
-      PACK_INT(buff_ptr, members.size());
-      if (!members.empty()) {
-        PACK_EH(buff_ptr, &members[0], members.size());
-      }
-      
-        // pack children
-      members.clear();
-      result = mbImpl->get_child_meshsets(*set_it, members); RR;
-      PACK_INT(buff_ptr, members.size());
-      if (!members.empty()) {
-        PACK_EH(buff_ptr, &members[0], members.size());
-      }
-      
-    }
-    
-    count = buff_ptr - orig_buff_ptr;
-  }
-  
-  return MB_SUCCESS;
-}
-
-MBErrorCode MBParallelComm::unpack_sets(unsigned char *&buff_ptr,
-                                        MBRange &entities)
-{
-  
-    // now the sets; assume any sets the application wants to pass are in the entities list
-  MBErrorCode result;
-
-  std::vector<unsigned int>::const_iterator opt_it = optionsVec.begin();
-  std::vector<MBRange>::const_iterator rit = setRanges.begin();
-  std::vector<int>::const_iterator mem_it = setSizes.begin();
-
-  MBRange set_handles;
-  UNPACK_RANGE(buff_ptr, set_handles);
-  std::vector<MBEntityHandle> members;
-  
-  for (MBRange::const_iterator rit = set_handles.begin(); rit != set_handles.end(); rit++) {
-    
-      // option value
-    unsigned int opt;
-    UNPACK_VOID(buff_ptr, &opt, sizeof(unsigned int));
-      
-      // create the set
-    MBEntityHandle set_handle;
-    result = mbImpl->create_meshset(opt, set_handle, procInfo.id(*rit), procInfo.rank(*rit)); RR;
-    if (set_handle != *rit)
-      return MB_FAILURE;
-
-    int num_ents;
-    if (opt & MESHSET_SET) {
-        // unpack entities as a range
-      MBRange set_range;
-      UNPACK_RANGE(buff_ptr, set_range);
-      result = mbImpl->add_entities(*rit, set_range); RR;
-    }
-    else if (opt & MESHSET_ORDERED) {
-        // unpack entities as vector, with length
-      UNPACK_INT(buff_ptr, num_ents);
-      members.reserve(num_ents);
-      UNPACK_EH(buff_ptr, &members[0], num_ents);
-      result = mbImpl->add_entities(*rit, &members[0], num_ents); RR;
-    }
-      
-      // unpack parents/children
-    UNPACK_INT(buff_ptr, num_ents);
-    members.reserve(num_ents);
-    UNPACK_EH(buff_ptr, &members[0], num_ents);
-    for (int i = 0; i < num_ents; i++) {
-      result = mbImpl->add_parent_meshset(*rit, members[i]); RR;
-    }
-    UNPACK_INT(buff_ptr, num_ents);
-    members.reserve(num_ents);
-    UNPACK_EH(buff_ptr, &members[0], num_ents);
-    for (int i = 0; i < num_ents; i++) {
-      result = mbImpl->add_child_meshset(*rit, members[i]); RR;
-    }
-  }
-  
-  return MB_SUCCESS;
-}
-
-MBErrorCode MBParallelComm::pack_adjacencies(MBRange &entities,
-                                             MBRange::const_iterator &start_rit,
-                                             MBRange &whole_range,
-                                             unsigned char *&buff_ptr,
-                                             int &count,
-                                             const bool just_count)
-{
-  return MB_FAILURE;
-}
-
-MBErrorCode MBParallelComm::unpack_adjacencies(unsigned char *&buff_ptr,
-                                               MBRange &entities)
-{
-  return MB_FAILURE;
-}
-
-MBErrorCode MBParallelComm::pack_tags(MBRange &entities,
-                                      MBRange::const_iterator &start_rit,
-                                      MBRange &whole_range,
-                                      unsigned char *&buff_ptr,
-                                      int &count,
-                                      const bool just_count)
-{
-    // tags
-    // get all the tags
-    // for dense tags, compute size assuming all entities have that tag
-    // for sparse tags, get number of entities w/ that tag to compute size
-
-  count = 0;
-  unsigned char *orig_buff_ptr = buff_ptr;
-  MBErrorCode result;
-  int whole_size = whole_range.size();
-
-  if (just_count) {
-
-    std::vector<MBTag> all_tags;
-    result = tagServer->get_tags(all_tags); RR;
-
-    for (std::vector<MBTag>::iterator tag_it = all_tags.begin(); tag_it != all_tags.end(); tag_it++) {
-      const TagInfo *tinfo = tagServer->get_tag_info(*tag_it);
-      int this_count = 0;
-      MBRange tmp_range;
-      if (PROP_FROM_TAG_HANDLE(*tag_it) == MB_TAG_DENSE) {
-        this_count += whole_size * tinfo->get_size();
-      }
-      else {
-        result = tagServer->get_entities(*tag_it, MBMAXTYPE, tmp_range); RR;
-        tmp_range = tmp_range.intersect(whole_range);
-        if (!tmp_range.empty()) this_count = tmp_range.size() * tinfo->get_size();
-      }
-
-      if (0 == this_count) continue;
-
-        // ok, we'll be sending this tag
-
-        // tag handle
-      allTags.push_back(*tag_it);
-      count += sizeof(MBTag);
-      
-        // default value
-      count += sizeof(int);
-      if (NULL != tinfo->default_value()) count += tinfo->get_size();
-      
-        // size, data type
-      count += sizeof(int);
-      
-        // data type
-      count += sizeof(MBDataType);
-
-        // name
-      count += 64;
-
-      if (!tmp_range.empty()) {
-        tagRanges.push_back(tmp_range);
-          // range of tag
-        count += sizeof(int) + 2 * num_subranges(tmp_range) * sizeof(MBEntityHandle);
-      }
-      
-          // tag data values for range or vector
-      count += this_count;
-    }
-
-      // number of tags
-    count += sizeof(int);
-  }
-
-  else {
-    static std::vector<int> tag_data;
-    std::vector<MBRange>::const_iterator tr_it = tagRanges.begin();
-
-    PACK_INT(buff_ptr, allTags.size());
-    
-    for (std::vector<MBTag>::const_iterator tag_it = allTags.begin(); tag_it != allTags.end(); tag_it++) {
-
-      const TagInfo *tinfo = tagServer->get_tag_info(*tag_it);
-
-        // tag handle
-      PACK_EH(buff_ptr, &(*tag_it), 1);
-      
-        // size, data type
-      PACK_INT(buff_ptr, tinfo->get_size());
-      PACK_INT(buff_ptr, tinfo->get_data_type());
-      
-        // default value
-      if (NULL == tinfo->default_value()) {
-        PACK_INT(buff_ptr, 0);
-      }
-      else {
-        PACK_INT(buff_ptr, 1);
-        PACK_VOID(buff_ptr, tinfo->default_value(), tinfo->get_size());
-      }
-      
-        // name
-      PACK_CHAR_64(buff_ptr, tinfo->get_name().c_str());
-      
-      if (PROP_FROM_TAG_HANDLE(*tag_it) == MB_TAG_DENSE) {
-        tag_data.reserve((whole_size+1) * tinfo->get_size() / sizeof(int));
-        result = mbImpl->tag_get_data(*tag_it, whole_range, &tag_data[0]);
-        PACK_VOID(buff_ptr, &tag_data[0], whole_size*tinfo->get_size());
-      }
-      else {
-        tag_data.reserve((tr_it->size()+1) * tinfo->get_size() / sizeof(int));
-        result = mbImpl->tag_get_data(*tag_it, *tr_it, &tag_data[0]); RR;
-        PACK_RANGE(buff_ptr, (*tr_it));
-        PACK_VOID(buff_ptr, &tag_data[0], tr_it->size()*tinfo->get_size());
-        tr_it++;
-      }
-      
-    }
-
-    count = buff_ptr - orig_buff_ptr;
-  }
-  
-  return MB_SUCCESS;
-}
-
-MBErrorCode MBParallelComm::unpack_tags(unsigned char *&buff_ptr,
-                                        MBRange &entities)
-{
-    // tags
-    // get all the tags
-    // for dense tags, compute size assuming all entities have that tag
-    // for sparse tags, get number of entities w/ that tag to compute size
-
-  MBErrorCode result;
-  
-  int num_tags;
-  UNPACK_INT(buff_ptr, num_tags);
-  std::vector<int> tag_data;
-
-  for (int i = 0; i < num_tags; i++) {
-    
-        // tag handle
-    MBTag tag_handle;
-    UNPACK_EH(buff_ptr, &tag_handle, 1);
-
-      // size, data type
-    int tag_size, tag_data_type;
-    UNPACK_INT(buff_ptr, tag_size);
-    UNPACK_INT(buff_ptr, tag_data_type);
-      
-      // default value
-    int has_def_value;
-    UNPACK_INT(buff_ptr, has_def_value);
-    void *def_val_ptr = NULL;
-    if (1 == has_def_value) {
-      def_val_ptr = buff_ptr;
-      buff_ptr += tag_size;
-    }
-    
-      // name
-    char *tag_name = reinterpret_cast<char *>(buff_ptr);
-    buff_ptr += 64;
-
-      // create the tag
-    MBTagType tag_type;
-    result = mbImpl->tag_get_type(tag_handle, tag_type); RR;
-
-    result = mbImpl->tag_create(tag_name, tag_size, tag_type, (MBDataType) tag_data_type, tag_handle,
-                                def_val_ptr);
-    if (MB_ALREADY_ALLOCATED == result) {
-        // already allocated tag, check to make sure it's the same size, type, etc.
-      const TagInfo *tag_info = tagServer->get_tag_info(tag_name);
-      if (tag_size != tag_info->get_size() ||
-          tag_data_type != tag_info->get_data_type() ||
-          (def_val_ptr && !tag_info->default_value() ||
-           !def_val_ptr && tag_info->default_value()))
-        return MB_FAILURE;
-      MBTagType this_type;
-      result = mbImpl->tag_get_type(tag_handle, this_type);
-      if (MB_SUCCESS != result || this_type != tag_type) return MB_FAILURE;
-    }
-    else if (MB_SUCCESS != result) return result;
-    
-      // set the tag data
-    if (PROP_FROM_TAG_HANDLE(tag_handle) == MB_TAG_DENSE) {
-      if (NULL != def_val_ptr && tag_data_type != MB_TYPE_OPAQUE) {
-          // only set the tags whose values aren't the default value; only works
-          // if it's a known type
-        MBRange::iterator start_rit = entities.begin(), end_rit = start_rit;
-        MBRange set_ents;
-        while (end_rit != entities.end()) {
-          while (start_rit != entities.end() &&
-                 ((tag_data_type == MB_TYPE_INTEGER && *((int*)def_val_ptr) == *((int*)buff_ptr)) ||
-                  (tag_data_type == MB_TYPE_DOUBLE && *((double*)def_val_ptr) == *((double*)buff_ptr)) ||
-                  (tag_data_type == MB_TYPE_HANDLE && *((MBEntityHandle*)def_val_ptr) == *((MBEntityHandle*)buff_ptr)))) {
-            start_rit++;
-            buff_ptr += tag_size;
-          }
-          end_rit = start_rit;
-          void *end_ptr = buff_ptr;
-          while (start_rit != entities.end() && end_rit != entities.end() &&
-                 ((tag_data_type == MB_TYPE_INTEGER && *((int*)def_val_ptr) == *((int*)end_ptr)) ||
-                  (tag_data_type == MB_TYPE_DOUBLE && *((double*)def_val_ptr) == *((double*)end_ptr)) ||
-                  (tag_data_type == MB_TYPE_HANDLE && *((MBEntityHandle*)def_val_ptr) == *((MBEntityHandle*)end_ptr)))) {
-            set_ents.insert(*end_rit);
-            end_rit++;
-            buff_ptr += tag_size;
-          }
-          
-          if (!set_ents.empty()) {
-            result = mbImpl->tag_set_data(tag_handle, set_ents, buff_ptr); RR;
-          }
-          if (start_rit != entities.end()) {
-            end_rit++;
-            start_rit = end_rit;
-            buff_ptr += tag_size;
-          }
-        }
-      }
-      else {
-        result = mbImpl->tag_set_data(tag_handle, entities, buff_ptr); RR;
-        buff_ptr += entities.size() * tag_size;
-      }
-    }
-    else {
-      MBRange tag_range;
-      UNPACK_RANGE(buff_ptr, tag_range);
-      result = mbImpl->tag_set_data(tag_handle, tag_range, buff_ptr); RR;
-      buff_ptr += tag_range.size() * tag_size;
-    }
-  }
-  
-  return MB_SUCCESS;
-}
-
-bool MBParallelComm::buffer_size(const unsigned int new_size) 
-{
-  unsigned int old_size = myBuffer.size();
-  myBuffer.reserve(new_size);
-  return (new_size == old_size);
-}
-
-void MBParallelComm::take_buffer(std::vector<unsigned char> &new_buffer) 
-{
-  new_buffer.swap(myBuffer);
-}
-
-#ifdef TEST_PARALLELCOMM
-
-#include <iostream>
-
-#include "MBCore.hpp"
-#include "MBParallelComm.hpp"
-#include "MBRange.hpp"
-
-int main(int argc, char* argv[])
-{
-
-    // Check command line arg
-  if (argc < 2)
-  {
-    std::cout << "Usage: " << argv[0] << " <mesh_file_name>" << std::endl;
-    exit(1);
-  }
-
-  const char* file = argv[1];
-  MBCore *my_impl = new MBCore(0, 2);
-  MBInterface* mbImpl = my_impl;
-
-    // create a communicator class, which will start mpi too
-  MBParallelComm pcomm(mbImpl, my_impl->tag_server(), my_impl->sequence_manager());
-  MBErrorCode result;
-
-    // load the mesh
-  result = mbImpl->load_mesh(file, 0, 0);
-  if (MB_SUCCESS != result) return result;
-
-    // get the mesh
-  MBRange all_mesh, whole_range;
-  result = mbImpl->get_entities_by_dimension(0, 3, all_mesh);
-  if (MB_SUCCESS != result) return result;
-    
-  int buff_size;
-  result = pcomm.pack_buffer(all_mesh, false, true, true, whole_range, buff_size); RR;
-
-    // allocate space in the buffer
-  pcomm.buffer_size(buff_size);
-
-    // pack the actual buffer
-  int actual_buff_size;
-  result = pcomm.pack_buffer(whole_range, false, true, false, all_mesh, actual_buff_size); RR;
-
-    // list the entities that got packed
-  std::cout << "ENTITIES PACKED:" << std::endl;
-  mbImpl->list_entities(all_mesh);
-
-    // get the buffer
-  std::vector<unsigned char> tmp_buffer;
-  pcomm.take_buffer(tmp_buffer);
-    
-    // stop and restart MOAB
-  delete mbImpl;
-  my_impl = new MBCore(1, 2);
-  mbImpl = my_impl;
-    
-    // create a new communicator class, using our old buffer
-  MBParallelComm pcomm2(mbImpl, my_impl->tag_server(), my_impl->sequence_manager(),
-                        tmp_buffer);
-
-    // unpack the results
-  all_mesh.clear();
-  result = pcomm2.unpack_buffer(all_mesh); RR;
-  std::cout << "ENTITIES UNPACKED:" << std::endl;
-  mbImpl->list_entities(all_mesh);
-  
-  std::cout << "Success, processor " << mbImpl->proc_rank() << "." << std::endl;
-  
-  return 1;
-}
-#endif
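
The pack_sets and pack_tags routines above, like the rest of the deleted
MBParallelComm.cpp, stream everything through a raw unsigned char buffer via
the PACK_INT/PACK_EH/PACK_VOID/PACK_RANGE macros (defined earlier in that
file, outside this excerpt), using one just_count pass to size the buffer and
a second pass to fill it, as the TEST_PARALLELCOMM driver above shows.  The
following standalone sketch illustrates only the buffer idiom those macros
implement; the helper names here are illustrative, not the actual macro
definitions:

  #include <cassert>
  #include <cstring>
  #include <vector>

  // Illustrative stand-ins for the PACK_*/UNPACK_* macros: copy raw bytes
  // into the buffer and advance the cursor, then read them back in the same
  // order they were written.
  static void pack_int(unsigned char *&buff, int value)
  {
    std::memcpy(buff, &value, sizeof(int));
    buff += sizeof(int);
  }

  static void unpack_int(unsigned char *&buff, int &value)
  {
    std::memcpy(&value, buff, sizeof(int));
    buff += sizeof(int);
  }

  int main()
  {
    std::vector<unsigned char> buffer(2 * sizeof(int));

      // pack pass
    unsigned char *cursor = &buffer[0];
    pack_int(cursor, 42);
    pack_int(cursor, 7);

      // unpack pass, consuming the buffer in the same order it was packed
    int a, b;
    cursor = &buffer[0];
    unpack_int(cursor, a);
    unpack_int(cursor, b);
    assert(42 == a && 7 == b);
    return 0;
  }

Because packing and unpacking must agree on ordering, every pack_* routine
above has a matching unpack_* that consumes the buffer in exactly the same
sequence.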

Deleted: MOAB/trunk/MBParallelComm.hpp
===================================================================
--- MOAB/trunk/MBParallelComm.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBParallelComm.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,173 +0,0 @@
-/**
- * MOAB, a Mesh-Oriented datABase, is a software component for creating,
- * storing and accessing finite element mesh data.
- * 
- * Copyright 2004 Sandia Corporation.  Under the terms of Contract
- * DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government
- * retains certain rights in this software.
- * 
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- * 
- */
-
-/**
- * \class MBParallelComm
- * \brief Parallel communications in MOAB
- * \author Tim Tautges
- *
- *  This class implements methods to communicate mesh between processors
- *
- */
-
-#ifndef MB_PARALLEL_COMM_HPP
-#define MB_PARALLEL_COMM_HPP
-
-#include "MBForward.hpp"
-#include "MBRange.hpp"
-#include "MBProcConfig.hpp"
-
-class TagServer;
-class EntitySequenceManager;
-
-class MBParallelComm 
-{
-public:
-
-    //! constructor
-  MBParallelComm(MBInterface *impl, TagServer *tag_server, 
-                 EntitySequenceManager *sequence_manager);
-
-    //! constructor taking packed buffer, for testing
-  MBParallelComm(MBInterface *impl, TagServer *tag_server, 
-                 EntitySequenceManager *sequence_manager,
-                 std::vector<unsigned char> &tmp_buff);
-
-    //! assign a global id space, for largest-dimension or all entities (and
-    //! in either case for vertices too)
-  MBErrorCode assign_global_ids(const int dimension,
-                                const int start_id = 1,
-                                const bool largest_dim_only = true);
-
-    //! communicate entities from/to this range
-  MBErrorCode communicate_entities(const int from_proc, const int to_proc,
-                                   MBRange &entities,
-                                   const bool adjacencies = false,
-                                   const bool tags = true);
-  
-  MBErrorCode broadcast_entities( const int from_proc,
-                                  MBRange& entities,
-                                  const bool adjacencies = false,
-                                  const bool tags = true );
-  
-    //! pack a buffer (stored in this class instance) with ALL data for these entities
-  MBErrorCode pack_buffer(MBRange &entities, 
-                          const bool adjacencies,
-                          const bool tags,
-                          const bool just_count,
-                          MBRange &whole_range,
-                          int &buff_size);
-  
-    //! unpack a buffer; assume information is already in myBuffer
-  MBErrorCode unpack_buffer(MBRange &entities);
-
-    //! set the buffer size; return true if size actually changed
-  bool buffer_size(const unsigned int new_size);
-
-    //! take the buffer from this instance; switches with vector passed in
-  void take_buffer(std::vector<unsigned char> &new_buff);
-
-private:
-
-  int num_subranges(const MBRange &this_range);
-  
-  MBErrorCode pack_entities(MBRange &entities,
-                            MBRange::const_iterator &start_rit,
-                            MBRange &whole_range,
-                            unsigned char *&buff_ptr,
-                            int &count,
-                            const bool just_count);
-  
-  MBErrorCode unpack_entities(unsigned char *&buff_ptr,
-                              MBRange &entities);
-  
-  MBErrorCode pack_sets(MBRange &entities,
-                        MBRange::const_iterator &start_rit,
-                        MBRange &whole_range,
-                        unsigned char *&buff_ptr,
-                        int &count,
-                        const bool just_count);
-  
-  MBErrorCode unpack_sets(unsigned char *&buff_ptr,
-                          MBRange &entities);
-  
-  MBErrorCode pack_adjacencies(MBRange &entities,
-                               MBRange::const_iterator &start_rit,
-                               MBRange &whole_range,
-                               unsigned char *&buff_ptr,
-                               int &count,
-                               const bool just_count);
-
-  MBErrorCode unpack_adjacencies(unsigned char *&buff_ptr,
-                                 MBRange &entities);
-  
-  MBErrorCode pack_tags(MBRange &entities,
-                        MBRange::const_iterator &start_rit,
-                        MBRange &whole_range,
-                        unsigned char *&buff_ptr,
-                        int &count,
-                        const bool just_count);
-
-  MBErrorCode unpack_tags(unsigned char *&buff_ptr,
-                          MBRange &entities);
-  
-
-    //! MB interface associated with this writer
-  MBInterface *mbImpl;
-  
-    //! Processor information
-  const MBProcConfig procInfo;
-  
-    //! Tag server, so we can get more info about tags
-  TagServer *tagServer;
-  
-    //! Sequence manager, to get more efficient access to entities
-  EntitySequenceManager *sequenceManager;
-  
-    //! data buffer used to communicate
-  std::vector<unsigned char> myBuffer;
-
-    //! types of ranges to be communicated
-  std::vector<MBEntityType> entTypes;
-
-    //! ranges to be communicated
-  std::vector<MBRange> allRanges;
-  
-    //! vertices per entity in ranges
-  std::vector<int> vertsPerEntity;
-
-    //! sets to be communicated
-  MBRange setRange;
-  
-    //! ranges from sets to be communicated
-  std::vector<MBRange> setRanges;
-  
-    //! sizes of vector-based sets to be communicated
-  std::vector<int> setSizes;
-
-    //! tags to be communicated
-  std::vector<MBTag> allTags;
-
-    //! ranges from sparse tags to be communicated
-  std::vector<MBRange> tagRanges;
-
-    //! vector of set options for transferred sets
-  std::vector<unsigned int> optionsVec;
-  
-    //! numbers of parents/children for transferred sets
-  std::vector<int> setPcs;
-};
-
-#endif
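
For callers that do not want to manage buffers by hand, the deleted header
also declares communicate_entities and broadcast_entities; per the build
changes below, the class is presumably now compiled from the new parallel/
subdirectory.  A hedged usage sketch, assuming the moved copy keeps the same
public interface shown above and that rank and size come from an
already-initialized MPI runtime:

  #include "MBCore.hpp"
  #include "MBParallelComm.hpp"
  #include "MBRange.hpp"

  // Hypothetical sketch: read a mesh on processor 0 and broadcast its 3D
  // entities (with tags, without adjacencies -- the declared defaults) to
  // the other processors.
  MBErrorCode broadcast_mesh(int rank, int size, const char *filename)
  {
    MBCore core(rank, size);
    MBInterface *mb = &core;
    MBParallelComm pcomm(mb, core.tag_server(), core.sequence_manager());

    MBRange ents;
    if (0 == rank) {
      MBErrorCode result = mb->load_mesh(filename, 0, 0);
      if (MB_SUCCESS != result) return result;
      result = mb->get_entities_by_dimension(0, 3, ents);
      if (MB_SUCCESS != result) return result;
    }

    return pcomm.broadcast_entities(0, ents);
  }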

Modified: MOAB/trunk/MBParallelConventions.h
===================================================================
--- MOAB/trunk/MBParallelConventions.h	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBParallelConventions.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,6 +1,12 @@
 #ifndef MB_PARALLEL_CONVENTIONS_H
 #define MB_PARALLEL_CONVENTIONS_H
 
+/** Tag conventions for naming parallel things.  Note this header
+ * file belongs in the main MOAB directory because even serial
+ * applications (e.g. partitioners) may write tags for use in
+ * parallel applications.
+ */
+
 /** \brief Meshset tag name for interfaces between processors
  *
  * Meshset containing the interface between two processors.
@@ -38,4 +44,18 @@
  */
 #define PARALLEL_PARTITION_TAG_NAME "PARALLEL_PARTITION"
  
+/** \brief Tag storing which other processor a given entity is shared with
+ *
+ * This single-valued tag implies an entity is shared with one other proc
+ */
+#define PARALLEL_SHARED_PROC_TAG_NAME "PARALLEL_SHARED_PROC"
+ 
+/** \brief Tag storing which other processorS a given entity is shared with
+ *
+ * This multiple-valued tag implies an entity is shared with multiple
+ * other processors.  The length of this tag is application-dependent,
+ * depending on the maximum number of processors that share an entity.
+ */
+#define PARALLEL_SHARED_PROCS_TAG_NAME "PARALLEL_SHARED_PROCS"
+ 
 #endif
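
The two new tag names describe which remote processor(s) an entity is shared
with.  Below is a hedged sketch of reading the single-proc variant back out
of a MOAB instance; the tag_get_handle and get_entities_by_type_and_tag calls,
and the assumption that the tag holds one integer rank per shared vertex, are
assumptions on my part rather than anything this header prescribes:

  #include "MBInterface.hpp"
  #include "MBRange.hpp"
  #include "MBParallelConventions.h"
  #include <iostream>
  #include <vector>

  // Hedged sketch: print the remote rank each shared vertex is shared with,
  // assuming sharing was stored as an integer tag on those vertices.
  MBErrorCode print_shared_vertices(MBInterface *mb)
  {
    MBTag shared_tag;
    MBErrorCode result = mb->tag_get_handle(PARALLEL_SHARED_PROC_TAG_NAME,
                                            shared_tag);
    if (MB_SUCCESS != result) return result;     // sharing not resolved yet

      // only vertices that actually carry the tag
    MBRange shared_verts;
    result = mb->get_entities_by_type_and_tag(0, MBVERTEX, &shared_tag, NULL,
                                              1, shared_verts);
    if (MB_SUCCESS != result || shared_verts.empty()) return result;

    std::vector<int> procs(shared_verts.size());
    result = mb->tag_get_data(shared_tag, shared_verts, &procs[0]);
    if (MB_SUCCESS != result) return result;

    unsigned i = 0;
    for (MBRange::const_iterator vit = shared_verts.begin();
         vit != shared_verts.end(); vit++, i++)
      std::cout << "Vertex " << *vit << " shared with proc " << procs[i]
                << std::endl;
    return MB_SUCCESS;
  }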

Deleted: MOAB/trunk/MBProcConfig.cpp
===================================================================
--- MOAB/trunk/MBProcConfig.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBProcConfig.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,122 +0,0 @@
-/**
- * MOAB, a Mesh-Oriented datABase, is a software component for creating,
- * storing and accessing finite element mesh data.
- * 
- * Copyright 2004 Sandia Corporation.  Under the terms of Contract
- * DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government
- * retains certain rights in this software.
- * 
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- * 
- */
-
-#include "MBProcConfig.hpp"
-#include "MBInternals.hpp"
-
-
-/** Calculate ceiling of log 2 of a positive integer */
-static unsigned ceil_log_2( unsigned n )
-{
-  unsigned result;
-  for (result = 0; n > (((MBEntityHandle)1)<<result); ++result);
-  return result;
-}
-
-//! Constructor
-MBProcConfig::MBProcConfig( unsigned rank, unsigned size ) 
-  : procRank( rank ),
-    procSize( size ),
-    procWidth( ceil_log_2( size ) ),
-    idWidth( MB_ID_WIDTH - procWidth ),
-    idMask( MB_ID_MASK >> procWidth ),
-    procMask( ~(MB_TYPE_MASK|idMask) )
-{}
-
-MBEntityHandle MBProcConfig::handle( MBEntityType type, 
-                                     MBEntityID sub_id, 
-                                     unsigned proc ) const
-{
-  int err;
-  return CREATE_HANDLE( type, id( sub_id, proc ), err );
-}
-
-MBRange::const_iterator 
-MBProcConfig::lower_bound( MBEntityType type, unsigned proc, const MBRange& range ) const
-{
-  int err;
-  MBEntityHandle h = CREATE_HANDLE( type, id(0,proc), err );
-  return err ? range.end() : MBRange::lower_bound(range.begin(), range.end(), h);
-}
-
-MBRange::const_iterator
-MBProcConfig::upper_bound( MBEntityType type, unsigned proc, const MBRange& range ) const
-{
-  int err;
-  MBEntityHandle h = CREATE_HANDLE( type, last_id(proc), err );
-  return err ? range.end() : MBRange::upper_bound(range.begin(), range.end(), h);
-}
-
-std::pair<MBRange::const_iterator, MBRange::const_iterator>
-MBProcConfig::equal_range( MBEntityType type, unsigned proc, const MBRange& range ) const
-{
-  std::pair<MBRange::const_iterator, MBRange::const_iterator> iters;
-  int err;
-  MBEntityHandle h;
-
-  h = CREATE_HANDLE( type, id(0,proc), err );
-  iters.first = err ? range.end() : MBRange::lower_bound(range.begin(), range.end(), h);  
-  
-  h = CREATE_HANDLE( type, last_id(proc), err );
-  iters.second = err ? range.end() : MBRange::upper_bound( iters.first, range.end(), h );
-  
-  return iters;
-}
-
-MBRange MBProcConfig::subset( unsigned proc, const MBRange& range ) const
-{
-  int junk;
-  MBRange result;
-  MBRange::iterator insert_pos = result.begin();
-  MBRange::const_pair_iterator iter;
-  MBEntityHandle s, e;
-  
-  for (iter = range.const_pair_begin(); iter != range.const_pair_end(); ++iter)
-  {
-    const MBEntityType beg_type = TYPE_FROM_HANDLE(iter->first),
-                       end_type = TYPE_FROM_HANDLE(iter->second);
-    const unsigned beg_rank = rank(iter->first), end_rank = rank(iter->second);
-    
-    if (beg_type != end_type) {
-      if (beg_rank <= proc) {
-        s = beg_rank == proc ? iter->first : 
-            CREATE_HANDLE( beg_type,    id(0,proc), junk );
-        e = CREATE_HANDLE( beg_type, last_id(proc), junk );
-        insert_pos = result.insert( insert_pos, s, e );
-      }
-      MBEntityType t = beg_type;
-      for (++t; t != end_type; ++t) {
-        s = CREATE_HANDLE( t,    id(0,proc), junk );
-        e = CREATE_HANDLE( t, last_id(proc), junk );
-        insert_pos = result.insert( insert_pos, s, e );
-      }
-      if (end_rank >= proc) {
-        e = end_rank == proc ? iter->second :
-            CREATE_HANDLE( end_type, last_id(proc), junk );
-        s = CREATE_HANDLE( end_type,    id(0,proc), junk );
-        insert_pos = result.insert( insert_pos, s, e );
-      }
-    }
-    else if (beg_rank <= proc && end_rank >= proc) {
-      s = (beg_rank == proc) ? iter->first  : CREATE_HANDLE( beg_type,    id(0,proc), junk );
-      e = (end_rank == proc) ? iter->second : CREATE_HANDLE( beg_type, last_id(proc), junk );
-      insert_pos = result.insert( insert_pos, s, e );
-    }
-  }
-  
-  return result;
-}
-
-
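
The deleted constructor above (whose handle-layout logic now lives in the new
MBHandleUtils, judging by the rest of this commit) derives the bit layout from
the processor count: the processor field gets ceil(log2(size)) bits and the
entity-ID field gives up the same number.  A minimal standalone sketch of just
that arithmetic, using a plain unsigned shift instead of the MBEntityHandle
cast in the original:

  #include <cassert>

  // Same computation as the deleted static ceil_log_2(): the smallest
  // 'result' such that n <= 2^result.
  static unsigned ceil_log_2(unsigned n)
  {
    unsigned result;
    for (result = 0; n > (1u << result); ++result);
    return result;
  }

  int main()
  {
      // e.g. a 5-processor run reserves 3 bits of each handle for the owning
      // processor, so the entity-ID width shrinks by 3 bits
    assert(0 == ceil_log_2(1));
    assert(1 == ceil_log_2(2));
    assert(3 == ceil_log_2(5));
    assert(3 == ceil_log_2(8));
    return 0;
  }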

Deleted: MOAB/trunk/MBProcConfig.hpp
===================================================================
--- MOAB/trunk/MBProcConfig.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBProcConfig.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,89 +0,0 @@
-/**
- * MOAB, a Mesh-Oriented datABase, is a software component for creating,
- * storing and accessing finite element mesh data.
- * 
- * Copyright 2004 Sandia Corporation.  Under the terms of Contract
- * DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government
- * retains certain rights in this software.
- * 
- * This library is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public
- * License as published by the Free Software Foundation; either
- * version 2.1 of the License, or (at your option) any later version.
- * 
- */
-
-#ifndef MB_PROC_CONFIG_HPP
-#define MB_PROC_CONFIG_HPP
-
-#include "MBTypes.h"
-#include "MBRange.hpp"
-
-/**\brief Multi-CPU information for parallel MOAB */
-class MBProcConfig {
-  public:
-
-    MBProcConfig( unsigned rank, unsigned size );
-    
-      //! Get the current processor number
-    unsigned rank() const 
-      { return procRank; }
-      
-      //! Get the number of processors
-    unsigned size() const 
-      { return procSize; }
-      
-      //! Get CPU number from handle
-    unsigned rank( MBEntityHandle handle ) const
-      { return (handle & procMask) >> idWidth; }
-      
-      //! Get CPU number from ID
-    unsigned rank_from_id( MBEntityID id ) const
-      { return id >> idWidth; }
-      
-      //! Get maximum entity ID that can be stored in
-      //! a handle, allowing for the processor number
-    MBEntityID max_id() const
-      { return idMask; }
-      
-      //! Create the ID portion of a handle by combining
-      //! an actual ID and a processor number
-    MBEntityID id( MBEntityID sub_id, unsigned proc ) const
-      { return ((MBEntityHandle)proc << idWidth) | (MBEntityHandle)sub_id; }
-      
-      //! Extract non-rank portion of entity ID from handle
-    MBEntityID id( MBEntityHandle h ) const
-      { return h & idMask; }
-      
-    MBEntityID first_id( unsigned proc ) const
-      { return id( 1, proc ); }
-    
-    MBEntityID last_id( unsigned proc ) const
-      { return id( max_id(), proc ); }
-      
-      //! Create an entity handle given type, rank, and id
-    MBEntityHandle handle( MBEntityType type, 
-                           MBEntityID sub_id, 
-                           unsigned proc ) const;
-                           
-
-    MBRange::const_iterator lower_bound( MBEntityType type, unsigned proc, const MBRange& ) const;
-    MBRange::const_iterator upper_bound( MBEntityType type, unsigned proc, const MBRange& ) const;
-    
-    std::pair<MBRange::const_iterator,MBRange::const_iterator>
-    equal_range( MBEntityType type, unsigned proc, const MBRange& ) const;
-    
-      //! get subset of range by processor type
-    MBRange subset( unsigned proc, const MBRange& ) const;
-    
-  private:
-  
-    unsigned procRank;    //!< ID of this processor
-    unsigned procSize;    //!< Total number of processors
-    unsigned procWidth;   //!< Number of bits in handle for processor ID
-    unsigned idWidth;     //!< Number of bits in handle for entity ID
-    MBEntityHandle idMask;
-    MBEntityHandle procMask;
-};
-
-#endif

Modified: MOAB/trunk/MBRange.cpp
===================================================================
--- MOAB/trunk/MBRange.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBRange.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -658,6 +658,7 @@
   MBEntityHandle handle = CREATE_HANDLE( type, 0, err );
   return err ? end() : lower_bound( begin(), end(), handle );
 }
+
 MBRange::const_iterator MBRange::upper_bound( MBEntityType type ) const
 {
     // if (type+1) overflows, err will be true and we return end().
@@ -665,6 +666,7 @@
   MBEntityHandle handle = CREATE_HANDLE( type + 1, 0, err );
   return err ? end() : lower_bound( begin(), end(), handle );
 }
+
 std::pair<MBRange::const_iterator, MBRange::const_iterator>
 MBRange::equal_range( MBEntityType type ) const
 {
@@ -775,14 +777,13 @@
 }
 
     //! return a subset of this range, by type
-MBRange MBRange::subset(const MBEntityType t) 
+MBRange MBRange::subset_by_type(const MBEntityType t) 
 {
   MBRange result;
   result.merge( lower_bound(t), upper_bound(t) );
   return result;
 }
 
-
 bool operator==( const MBRange& r1, const MBRange& r2 )
 {
   MBRange::const_pair_iterator i1, i2;

Modified: MOAB/trunk/MBRange.hpp
===================================================================
--- MOAB/trunk/MBRange.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBRange.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -314,7 +314,7 @@
   void sanity_check() const;
 
     //! return a subset of this range, by type
-  MBRange subset(MBEntityType t);
+  MBRange subset_by_type(MBEntityType t);
   
   struct PairNode : public std::pair<MBEntityHandle,MBEntityHandle>
   {
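
Renaming subset() to subset_by_type() keeps it clearly distinct from the new
per-processor subsetting (subset_by_proc, exercised in the MBTest.cpp changes
below).  Callers change mechanically; a minimal example of the new spelling:

  #include "MBRange.hpp"

  // Pull just the quadrilaterals out of a mixed-type range; before this
  // commit this was spelled all_ents.subset(MBQUAD).
  MBRange quads_only(MBRange &all_ents)
  {
    return all_ents.subset_by_type(MBQUAD);
  }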

Modified: MOAB/trunk/MBReadUtil.cpp
===================================================================
--- MOAB/trunk/MBReadUtil.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBReadUtil.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -28,6 +28,7 @@
 #include "EntitySequenceManager.hpp"
 #include "PolyEntitySequence.hpp"
 
+#define RR if (MB_SUCCESS != result) return result
 
 MBReadUtil::MBReadUtil(MBCore* mdb, MBError* error_handler) 
     : MBReadUtilIface(), mMB(mdb), mError(error_handler)
@@ -35,7 +36,7 @@
 }
 
 unsigned  MBReadUtil::parallel_rank() const
-  { return mMB->proc_config().rank(); }
+  { return mMB->proc_rank(); }
 
 MBErrorCode MBReadUtil::get_node_arrays(
     const int /*num_arrays*/,
@@ -50,14 +51,16 @@
   MBEntitySequence* seq = 0;
 
   MBEntityHandle preferred_start_handle;
-  static int err;
-  preferred_start_handle = CREATE_HANDLE(MBVERTEX, mMB->proc_config().id(preferred_start_id, 
-                                         preferred_start_proc), err);
+  preferred_start_handle = 
+    mMB->handle_utils().create_handle(MBVERTEX, 
+                                      preferred_start_id, 
+                                      preferred_start_proc);
  
   // create an entity sequence for these nodes 
   error = mMB->sequence_manager()->create_entity_sequence(
-      MBVERTEX, num_nodes, 0, preferred_start_handle, preferred_start_proc, actual_start_handle,
-      seq);
+    MBVERTEX, num_nodes, 0, preferred_start_handle, 
+    preferred_start_proc, actual_start_handle,
+    seq);
 
   if(error != MB_SUCCESS)
     return error;
@@ -194,4 +197,59 @@
   return result;
 }
 
+MBErrorCode MBReadUtil::gather_related_ents(MBRange &partition,
+                                            MBRange &related_ents,
+                                            MBRange *all_sets) 
+{
+    // first, related ents includes the partition itself
+  related_ents.merge(partition);
+  
+    // loop over any sets, getting contained ents
+  std::pair<MBRange::const_iterator, MBRange::const_iterator> pair_it =
+    partition.equal_range(MBENTITYSET);
 
+  MBErrorCode result;
+  for (MBRange::const_iterator rit = pair_it.first; 
+       rit != pair_it.second; rit++) {
+    MBErrorCode tmp_result = 
+      mMB->get_entities_by_handle(*rit, related_ents, 
+                                  MBInterface::UNION);
+    if (MB_SUCCESS != tmp_result) result = tmp_result;
+  }
+  RR;
+
+    // gather adjacent ents of lower dimension
+  MBRange tmp_ents;
+  for (int dim = 2; dim >= 0; dim--) {
+    MBEntityType lower_type = MBCN::TypeDimensionMap[dim+1].first,
+      upper_type = MBCN::TypeDimensionMap[3].second;
+    
+    MBRange::const_iterator bit = related_ents.lower_bound(lower_type),
+      eit = related_ents.upper_bound(upper_type);
+    MBRange from_ents;
+    from_ents.merge(bit, eit);
+    tmp_ents.clear();
+    MBErrorCode tmp_result = mMB->get_adjacencies(from_ents, dim, false, 
+                                                  tmp_ents, 
+                                                  MBInterface::UNION);
+    if (MB_SUCCESS != tmp_result) result = tmp_result;
+    else related_ents.merge(tmp_ents);
+  }
+  RR;
+  
+    // get related sets
+  MBRange tmp_ents3;
+  if (!all_sets) all_sets = &tmp_ents3;
+  result = mMB->get_entities_by_type(0, MBENTITYSET, *all_sets);
+  for (MBRange::iterator rit = all_sets->begin(); 
+       rit != all_sets->end(); rit++) {
+    tmp_ents.clear();
+    result = mMB->get_entities_by_handle(*rit, tmp_ents, true); RR;
+    MBRange tmp_ents2 = tmp_ents.intersect(related_ents);
+    
+      // if the intersection is not empty, set is related
+    if (!tmp_ents2.empty()) related_ents.insert(*rit);
+  }
+
+  return MB_SUCCESS;
+}
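
The new gather_related_ents starts from a partition (for example, entity sets
carrying the PARALLEL_PARTITION tag) and pulls in contained entities,
lower-dimensional adjacencies, and any sets whose contents intersect the
result.  A hedged usage sketch follows; the query_interface call for obtaining
the read utility is an assumption about the existing MBInterface API, while
the gather_related_ents call itself follows the signature added above:

  #include "MBInterface.hpp"
  #include "MBRange.hpp"
  #include "MBReadUtilIface.hpp"

  // Hedged sketch: given the sets this processor owns, collect everything it
  // needs to keep locally (contained entities, lower-dimensional
  // adjacencies, and related sets).
  MBErrorCode gather_local_mesh(MBInterface *mb, MBRange &my_partition_sets,
                                MBRange &local_ents)
  {
      // assumption: the read utility is obtained through query_interface
    void *ptr = NULL;
    MBErrorCode result = mb->query_interface("MBReadUtilIface", &ptr);
    if (MB_SUCCESS != result || NULL == ptr) return MB_FAILURE;
    MBReadUtilIface *read_iface = static_cast<MBReadUtilIface*>(ptr);

      // pass NULL for all_sets since we do not need the full set list back
    return read_iface->gather_related_ents(my_partition_sets, local_ents,
                                           NULL);
  }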

Modified: MOAB/trunk/MBReadUtil.hpp
===================================================================
--- MOAB/trunk/MBReadUtil.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBReadUtil.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -64,6 +64,19 @@
       MBEntityHandle*& array
       );
 
+    /**
+     *\brief Gather entities related to those in the partition
+     * Gather entities related to those in the input partition.  Related
+     * means down-adjacent to, contained in, etc.
+     * \param partition Entities for which to gather related entities
+     * \param related_ents Related entities
+     * \param all_sets If non-NULL, all sets in mesh instance are returned
+     * in the pointed-to range
+     */
+  MBErrorCode gather_related_ents(MBRange &partition,
+                                  MBRange &related_ents,
+                                  MBRange *all_sets);
+  
   /** Allocate storage for poly (polygon or polyhedron elements) 
    * 
    * Allocate storage for poly (polygon or polyhedron elements) and

Modified: MOAB/trunk/MBReadUtilIface.hpp
===================================================================
--- MOAB/trunk/MBReadUtilIface.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBReadUtilIface.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -21,6 +21,8 @@
 #include <vector>
 #include "MBTypes.h"
 
+class MBRange;
+
 //! Interface implemented in MOAB which provides memory for mesh reading utilities
 class MB_DLL_EXPORT MBReadUtilIface
 {
@@ -75,6 +77,19 @@
     MBEntityHandle*& array
     ) = 0;
 
+    /**
+     *\brief Gather entities related to those in the partition
+     * Gather entities related to those in the input partition.  Related
+     * means down-adjacent to, contained in, etc.
+     * \param partition Entities for which to gather related entities
+     * \param related_ents Related entities
+     * \param all_sets If non-NULL, all sets in mesh instance are returned
+     * in the pointed-to range
+     */
+  virtual MBErrorCode gather_related_ents(MBRange &partition,
+                                          MBRange &related_ents,
+                                          MBRange *all_sets) = 0;
+  
   /** Allocate storage for poly (polygon or polyhedron elements) 
    * 
    * Allocate storage for poly (polygon or polyhedron elements) and

Modified: MOAB/trunk/MBReaderIface.hpp
===================================================================
--- MOAB/trunk/MBReaderIface.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBReaderIface.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -51,6 +51,7 @@
                                    const FileOptions& opts,
                                    const int* material_set_list,
                                    const int material_set_list_len ) = 0;
+
 };
 
 #endif

Modified: MOAB/trunk/MBSkinner.cpp
===================================================================
--- MOAB/trunk/MBSkinner.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBSkinner.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -852,3 +852,21 @@
   return false;
 }
 
+  // get skin entities of prescribed dimension
+MBErrorCode MBSkinner::find_skin(const MBRange &entities,
+                                 int dim,
+                                 MBRange &skin_entities) 
+{
+  if (MBCN::Dimension(TYPE_FROM_HANDLE(*entities.begin())) !=
+      MBCN::Dimension(TYPE_FROM_HANDLE(*entities.rbegin())))
+    return MB_FAILURE;
+  
+  MBRange tmp_skin_for, tmp_skin_rev;
+  MBErrorCode result = find_skin(entities, tmp_skin_for, tmp_skin_rev);
+  if (MB_SUCCESS != result) return result;
+  
+  tmp_skin_for.merge(tmp_skin_rev);
+  result = thisMB->get_adjacencies(tmp_skin_for, dim, true, skin_entities);
+  return result;
+}
+
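
This overload builds the skin at a caller-chosen dimension by reusing the
existing two-range find_skin and then asking for adjacencies of that
dimension.  A short usage sketch (the MBSkinner constructor taking an
MBInterface* is assumed from the existing class, not shown in this diff):

  #include "MBInterface.hpp"
  #include "MBRange.hpp"
  #include "MBSkinner.hpp"

  // Sketch: get the boundary (skin) vertices of this processor's 3D elements.
  MBErrorCode skin_vertices(MBInterface *mb, MBRange &skin_verts)
  {
    MBRange elems;
    MBErrorCode result = mb->get_entities_by_dimension(0, 3, elems);
    if (MB_SUCCESS != result || elems.empty()) return result;

    MBSkinner skinner(mb);   // assumed constructor: MBSkinner(MBInterface*)
    return skinner.find_skin(elems, 0, skin_verts);
  }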

Modified: MOAB/trunk/MBSkinner.hpp
===================================================================
--- MOAB/trunk/MBSkinner.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBSkinner.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -49,6 +49,11 @@
                         MBRange &forward_lower_entities,
                         MBRange &reverse_lower_entities);
 
+    // get skin entities of prescribed dimension
+  MBErrorCode find_skin(const MBRange &entities,
+                        int dim,
+                        MBRange &skin_entities);
+
   MBErrorCode classify_2d_boundary( const MBRange &boundary,
                                      const MBRange &bar_elements,
                                      MBEntityHandle boundary_edges,

Modified: MOAB/trunk/MBTest.cpp
===================================================================
--- MOAB/trunk/MBTest.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/MBTest.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -40,6 +40,7 @@
 #include "MBCN.hpp"
 #include "MBOrientedBox.hpp"
 #include "MBCartVect.hpp"
+#include "MBHandleUtils.hpp"
 
 #ifndef IS_BUILDING_MB
 #define IS_BUILDING_MB
@@ -4663,10 +4664,10 @@
   return MB_SUCCESS;
 }
 
-MBErrorCode mb_range_seq_intersect_test( MBInterface* ) 
+MBErrorCode mb_range_seq_intersect_test( MBInterface*) 
 {
   MBErrorCode rval;
-  EntitySequenceManager sequences( MBProcConfig( 0, 1 ) );
+  EntitySequenceManager sequences(MBHandleUtils( 0, 1 ));
   MBRangeSeqIntersectIter iter( &sequences );
   MBRange range;
 
@@ -4812,7 +4813,7 @@
    
     // Iterate over two subsets of the quad sequence
     
-  MBRange quads = range.subset( MBQUAD );
+  MBRange quads = range.subset_by_type( MBQUAD );
   MBEntityHandle removed = qs1->get_start_handle() + nq1/2;
   if (quads.erase( removed ) == quads.end())
     return MB_FAILURE;
@@ -5255,49 +5256,49 @@
   MBRange range;
   MBRange::iterator i1, i2;
   std::pair<MBRange::iterator,MBRange::iterator> ip;
-  MBProcConfig procInfo( 1, 5 ); // second of five CPUs
+  MBHandleUtils handle_utils( 1, 5 ); // second of five CPUs
   
     // make a range containing everything
-  range.insert( 1, procInfo.handle( MBENTITYSET, procInfo.max_id(), procInfo.size()-1 ) );
+  range.insert( 1, handle_utils.create_handle( MBENTITYSET, handle_utils.max_id(), handle_utils.proc_size()-1 ) );
   
     // check type/proc subset stuff for each processor for each entity type
-  for (unsigned proc = 0; proc < procInfo.size(); ++proc) {
+  for (unsigned proc = 0; proc < handle_utils.proc_size(); ++proc) {
     MBRange proc_range;
     for (MBEntityType t = MBVERTEX; t < MBMAXTYPE; ++t) {
-      i1 = procInfo.lower_bound( t, proc, range );
+      i1 = handle_utils.lower_bound( t, proc, range );
       ASSERT_NOT_EQUAL( i1, range.end() );
       ASSERT_EQUAL( TYPE_FROM_HANDLE( *i1 ), t );
-      ASSERT_EQUAL( procInfo.rank( *i1 ), proc );
+      ASSERT_EQUAL( handle_utils.rank_from_handle( *i1 ), proc );
       if (!proc && t == MBVERTEX) 
-        ASSERT_EQUAL( procInfo.id( *i1 ), (MBEntityID)1 );
+        ASSERT_EQUAL( handle_utils.id_from_handle( *i1 ), (MBEntityID)1 );
       else
-        ASSERT_EQUAL( procInfo.id( *i1 ), (MBEntityID)0 );
+        ASSERT_EQUAL( handle_utils.id_from_handle( *i1 ), (MBEntityID)0 );
       
-      i2 = procInfo.upper_bound( t, proc, range );
+      i2 = handle_utils.upper_bound( t, proc, range );
         // Because the number of processors is not a power
         // of two, we've inserted handles w/ invalid processor
         // IDs into the range for all types except MBENTITYSET
-        //if (proc + 1 == procInfo.size()) {
+        //if (proc + 1 == handle_utils.proc_size()) {
         //  if (t == MBENTITYSET)
         //    ASSERT_EQUAL( i2, range.end() );
         //  else {
         //    ASSERT_NOT_EQUAL( i2, range.end() );
         //    MBEntityType n = t; ++n;
         //    ASSERT_EQUAL( TYPE_FROM_HANDLE( *i2 ), n );
-        //    ASSERT_EQUAL( procInfo.rank( *i2 ), (unsigned)0 );
-        //    ASSERT_EQUAL( procInfo.id( *i2 ), (MBEntityHandle)0 );
+        //    ASSERT_EQUAL( handle_utils.rank_from_handle( *i2 ), (unsigned)0 );
+        //    ASSERT_EQUAL( handle_utils.id_from_handle( *i2 ), (MBEntityHandle)0 );
         //  }
         //}
-      if (proc + 1 == procInfo.size() && t == MBENTITYSET)
+      if (proc + 1 == handle_utils.proc_size() && t == MBENTITYSET)
         ASSERT_EQUAL( i2, range.end() );
       else {
         ASSERT_NOT_EQUAL( i2, range.end() );
         ASSERT_EQUAL( TYPE_FROM_HANDLE( *i2 ), t );
-        ASSERT_EQUAL( procInfo.rank( *i2 ), proc+1 );
-        ASSERT_EQUAL( procInfo.id( *i2 ), (MBEntityID)0 );
+        ASSERT_EQUAL( handle_utils.rank_from_handle( *i2 ), proc+1 );
+        ASSERT_EQUAL( handle_utils.id_from_handle( *i2 ), (MBEntityID)0 );
       }
       
-      ip = procInfo.equal_range( t, proc, range );
+      ip = handle_utils.equal_range( t, proc, range );
       ASSERT_EQUAL( i1, ip.first );
       ASSERT_EQUAL( i2, ip.second );
     
@@ -5305,166 +5306,166 @@
     }
     
       // get subset by processor ID and check results
-    MBRange proc_range2 = procInfo.subset( proc, range );
+    MBRange proc_range2 = handle_utils.subset_by_proc( proc, range );
     ASSERT_EQUAL( proc_range, proc_range2 );
   }
   
     // make a range containing some entities, but not all handles
   range.clear();
-  range.insert( procInfo.handle( MBVERTEX, 100, 0 ), 
-                procInfo.handle( MBVERTEX, 110, 0 ) );
-  range.insert( procInfo.handle( MBVERTEX, 100, 2 ),
-                procInfo.handle( MBVERTEX, 110, 2 ) );
-  range.insert( procInfo.handle( MBENTITYSET, 5, 0 ),
-                procInfo.handle( MBENTITYSET, 9, 0 ) );
-  range.insert( procInfo.handle( MBENTITYSET, 1, 3 ),
-                procInfo.handle( MBENTITYSET, 1, 3 ) );
+  range.insert( handle_utils.create_handle( MBVERTEX, 100, 0 ), 
+                handle_utils.create_handle( MBVERTEX, 110, 0 ) );
+  range.insert( handle_utils.create_handle( MBVERTEX, 100, 2 ),
+                handle_utils.create_handle( MBVERTEX, 110, 2 ) );
+  range.insert( handle_utils.create_handle( MBENTITYSET, 5, 0 ),
+                handle_utils.create_handle( MBENTITYSET, 9, 0 ) );
+  range.insert( handle_utils.create_handle( MBENTITYSET, 1, 3 ),
+                handle_utils.create_handle( MBENTITYSET, 1, 3 ) );
   
     // test lower_bound
 
-  i1 = procInfo.lower_bound( MBVERTEX, 0, range );
+  i1 = handle_utils.lower_bound( MBVERTEX, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBVERTEX, 100, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBVERTEX, 100, 0 ) );
 
-  i1 = procInfo.lower_bound( MBVERTEX, 1, range );
+  i1 = handle_utils.lower_bound( MBVERTEX, 1, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBVERTEX, 100, 2 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBVERTEX, 100, 2 ) );
 
-  i1 = procInfo.lower_bound( MBVERTEX, 2, range );
+  i1 = handle_utils.lower_bound( MBVERTEX, 2, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBVERTEX, 100, 2 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBVERTEX, 100, 2 ) );
 
-  i1 = procInfo.lower_bound( MBVERTEX, 3, range );
+  i1 = handle_utils.lower_bound( MBVERTEX, 3, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.lower_bound( MBEDGE, 0, range );
+  i1 = handle_utils.lower_bound( MBEDGE, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.lower_bound( MBTRI, 0, range );
+  i1 = handle_utils.lower_bound( MBTRI, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.lower_bound( MBENTITYSET, 0, range );
+  i1 = handle_utils.lower_bound( MBENTITYSET, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.lower_bound( MBENTITYSET, 1, range );
+  i1 = handle_utils.lower_bound( MBENTITYSET, 1, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 1, 3 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 1, 3 ) );
 
-  i1 = procInfo.lower_bound( MBENTITYSET, 3, range );
+  i1 = handle_utils.lower_bound( MBENTITYSET, 3, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 1, 3 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 1, 3 ) );
 
-  i1 = procInfo.lower_bound( MBENTITYSET, 4, range );
+  i1 = handle_utils.lower_bound( MBENTITYSET, 4, range );
   ASSERT_EQUAL( i1, range.end() );
 
     // test upper_bound  
-  i1 = procInfo.upper_bound( MBVERTEX, 0, range );
+  i1 = handle_utils.upper_bound( MBVERTEX, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBVERTEX, 100, 2 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBVERTEX, 100, 2 ) );
 
-  i1 = procInfo.upper_bound( MBVERTEX, 1, range );
+  i1 = handle_utils.upper_bound( MBVERTEX, 1, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBVERTEX, 100, 2 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBVERTEX, 100, 2 ) );
 
-  i1 = procInfo.upper_bound( MBVERTEX, 2, range );
+  i1 = handle_utils.upper_bound( MBVERTEX, 2, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.upper_bound( MBVERTEX, 3, range );
+  i1 = handle_utils.upper_bound( MBVERTEX, 3, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.upper_bound( MBEDGE, 0, range );
+  i1 = handle_utils.upper_bound( MBEDGE, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.upper_bound( MBTRI, 0, range );
+  i1 = handle_utils.upper_bound( MBTRI, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 5, 0 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 5, 0 ) );
 
-  i1 = procInfo.upper_bound( MBENTITYSET, 0, range );
+  i1 = handle_utils.upper_bound( MBENTITYSET, 0, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 1, 3 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 1, 3 ) );
 
-  i1 = procInfo.upper_bound( MBENTITYSET, 1, range );
+  i1 = handle_utils.upper_bound( MBENTITYSET, 1, range );
   ASSERT_NOT_EQUAL( i1, range.end() );
-  ASSERT_EQUAL( *i1, procInfo.handle( MBENTITYSET, 1, 3 ) );
+  ASSERT_EQUAL( *i1, handle_utils.create_handle( MBENTITYSET, 1, 3 ) );
 
-  i1 = procInfo.upper_bound( MBENTITYSET, 3, range );
+  i1 = handle_utils.upper_bound( MBENTITYSET, 3, range );
   ASSERT_EQUAL( i1, range.end() );
 
-  i1 = procInfo.upper_bound( MBENTITYSET, 4, range );
+  i1 = handle_utils.upper_bound( MBENTITYSET, 4, range );
   ASSERT_EQUAL( i1, range.end() );
   
     // test equal_range
-  ip = procInfo.equal_range( MBVERTEX, 0, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBVERTEX, 0, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBVERTEX, 0, range ), ip.second );
+  ip = handle_utils.equal_range( MBVERTEX, 0, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBVERTEX, 0, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBVERTEX, 0, range ), ip.second );
 
-  ip = procInfo.equal_range( MBVERTEX, 1, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBVERTEX, 1, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBVERTEX, 1, range ), ip.second );
+  ip = handle_utils.equal_range( MBVERTEX, 1, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBVERTEX, 1, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBVERTEX, 1, range ), ip.second );
 
-  ip = procInfo.equal_range( MBVERTEX, 2, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBVERTEX, 2, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBVERTEX, 2, range ), ip.second );
+  ip = handle_utils.equal_range( MBVERTEX, 2, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBVERTEX, 2, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBVERTEX, 2, range ), ip.second );
 
-  ip = procInfo.equal_range( MBVERTEX, 3, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBVERTEX, 3, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBVERTEX, 3, range ), ip.second );
+  ip = handle_utils.equal_range( MBVERTEX, 3, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBVERTEX, 3, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBVERTEX, 3, range ), ip.second );
 
-  ip = procInfo.equal_range( MBEDGE, 0, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBEDGE, 0, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBEDGE, 0, range ), ip.second );
+  ip = handle_utils.equal_range( MBEDGE, 0, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBEDGE, 0, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBEDGE, 0, range ), ip.second );
 
-  ip = procInfo.equal_range( MBTRI, 0, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBTRI, 0, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBTRI, 0, range ), ip.second );
+  ip = handle_utils.equal_range( MBTRI, 0, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBTRI, 0, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBTRI, 0, range ), ip.second );
 
-  ip = procInfo.equal_range( MBENTITYSET, 0, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBENTITYSET, 0, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBENTITYSET, 0, range ), ip.second );
+  ip = handle_utils.equal_range( MBENTITYSET, 0, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBENTITYSET, 0, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBENTITYSET, 0, range ), ip.second );
 
-  ip = procInfo.equal_range( MBENTITYSET, 1, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBENTITYSET, 1, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBENTITYSET, 1, range ), ip.second );
+  ip = handle_utils.equal_range( MBENTITYSET, 1, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBENTITYSET, 1, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBENTITYSET, 1, range ), ip.second );
 
-  ip = procInfo.equal_range( MBENTITYSET, 3, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBENTITYSET, 3, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBENTITYSET, 3, range ), ip.second );
+  ip = handle_utils.equal_range( MBENTITYSET, 3, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBENTITYSET, 3, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBENTITYSET, 3, range ), ip.second );
 
-  ip = procInfo.equal_range( MBENTITYSET, 4, range );
-  ASSERT_EQUAL( procInfo.lower_bound( MBENTITYSET, 4, range ), ip.first );
-  ASSERT_EQUAL( procInfo.upper_bound( MBENTITYSET, 4, range ), ip.second );
+  ip = handle_utils.equal_range( MBENTITYSET, 4, range );
+  ASSERT_EQUAL( handle_utils.lower_bound( MBENTITYSET, 4, range ), ip.first );
+  ASSERT_EQUAL( handle_utils.upper_bound( MBENTITYSET, 4, range ), ip.second );
   
   MBRange sub, expected;
   
   sub.clear(); expected.clear();
-  sub = procInfo.subset( 0, range );
-  expected.insert( procInfo.handle( MBVERTEX, 100, 0 ), 
-                   procInfo.handle( MBVERTEX, 110, 0 ) );
-  expected.insert( procInfo.handle( MBENTITYSET, 5, 0 ),
-                   procInfo.handle( MBENTITYSET, 9, 0 ) );
+  sub = handle_utils.subset_by_proc( 0, range );
+  expected.insert( handle_utils.create_handle( MBVERTEX, 100, 0 ), 
+                   handle_utils.create_handle( MBVERTEX, 110, 0 ) );
+  expected.insert( handle_utils.create_handle( MBENTITYSET, 5, 0 ),
+                   handle_utils.create_handle( MBENTITYSET, 9, 0 ) );
   ASSERT_EQUAL( sub, expected );
   
   sub.clear();
-  sub = procInfo.subset( 1, range );
+  sub = handle_utils.subset_by_proc( 1, range );
   ASSERT_EQUAL( sub.empty(), true );
   
   sub.clear(); expected.clear();
-  sub = procInfo.subset( 2, range );
-  expected.insert( procInfo.handle( MBVERTEX, 100, 2 ),
-                   procInfo.handle( MBVERTEX, 110, 2 ) );
+  sub = handle_utils.subset_by_proc( 2, range );
+  expected.insert( handle_utils.create_handle( MBVERTEX, 100, 2 ),
+                   handle_utils.create_handle( MBVERTEX, 110, 2 ) );
   ASSERT_EQUAL( sub, expected );
   
   sub.clear(); expected.clear();
-  sub = procInfo.subset( 3, range );
-  expected.insert( procInfo.handle( MBENTITYSET, 1, 3 ),
-                   procInfo.handle( MBENTITYSET, 1, 3 ) );
+  sub = handle_utils.subset_by_proc( 3, range );
+  expected.insert( handle_utils.create_handle( MBENTITYSET, 1, 3 ),
+                   handle_utils.create_handle( MBENTITYSET, 1, 3 ) );
   ASSERT_EQUAL( sub, expected );
  
   return MB_SUCCESS;
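
The reworked test drives the new MBHandleUtils class through the same
lower_bound/upper_bound/equal_range/subset checks that previously targeted
MBProcConfig.  A minimal round-trip sketch using only calls that appear in
the test above:

  #include "MBHandleUtils.hpp"
  #include "MBTypes.h"
  #include <cassert>

  int main()
  {
      // second of five processors, matching the test above
    MBHandleUtils handle_utils(1, 5);

      // encode type, entity ID, and owning processor into one handle, then
      // decode the pieces again
    MBEntityHandle h = handle_utils.create_handle(MBVERTEX, 100, 2);
    assert((MBEntityID)100 == handle_utils.id_from_handle(h));
    assert(2 == handle_utils.rank_from_handle(h));
    assert(5 == handle_utils.proc_size());
    return 0;
  }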

Modified: MOAB/trunk/Makefile.am
===================================================================
--- MOAB/trunk/Makefile.am	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/Makefile.am	2007-10-03 20:28:42 UTC (rev 1297)
@@ -6,6 +6,9 @@
 if HDF5_FILE
   SUBDIRS += mhdf
 endif
+if PARALLEL
+  SUBDIRS += parallel
+endif
 SUBDIRS += . test tools doc doxygen
 
 
@@ -23,7 +26,7 @@
         obb_test \
 	vtk_test \
 	adaptive_kd_tree_tests \
-	file_options_test 
+	file_options_test
 #                 merge_test \         # input files no longer exist?
 #                 test_tag_server \    # fails
 
@@ -31,6 +34,10 @@
 		 kd_tree_tool \
 		 kd_tree_time
          
+if PARALLEL 
+  check_PROGRAMS += mbparallelcomm_test
+endif
+
 #noinst_PROGRAMS = scdseq_timing
 
 
@@ -51,20 +58,17 @@
                      WriteNCDF.cpp WriteNCDF.hpp \
                      WriteSLAC.cpp WriteSLAC.hpp 
 endif
-if PARALLEL_HDF5
-  MOAB_PARALLEL_SRCS = WriteHDF5Parallel.cpp
-  MOAB_PARALLEL_HDRS = WriteHDF5.hpp WriteHDF5Parallel.hpp
-else
-  MOAB_PARALLEL_SRCS = WriteHDF5.hpp
-  MOAB_PARALLEL_HDRS =
-endif
+libMOAB_la_LIBADD = 
 if HDF5_FILE
-  libMOAB_la_LIBADD = $(top_builddir)/mhdf/libmhdf.la
+  libMOAB_la_LIBADD += $(top_builddir)/mhdf/libmhdf.la
   INCLUDES += -I$(srcdir)/mhdf/include
-  MOAB_EXTRA_SRCS += ReadHDF5.cpp ReadHDF5.hpp WriteHDF5.cpp $(MOAB_PARALLEL_SRCS)
-  MOAB_EXTRA_HDRS += $(MOAB_PARALLEL_HDRS)
+  MOAB_EXTRA_SRCS += ReadHDF5.cpp ReadHDF5.hpp WriteHDF5.cpp WriteHDF5.hpp
 endif
 
+if PARALLEL
+  libMOAB_la_LIBADD += $(top_builddir)/parallel/libMOABpar.la
+  INCLUDES += -I$(srcdir)/parallel
+endif
 
 # Automake doesn't seem to have a directory defined for
 # platform-dependent data (or include) files. So put 
@@ -115,6 +119,7 @@
   MBCN.cpp \
   MBCNArrays.hpp \
   MBCartVect.cpp \
+  MBHandleUtils.cpp \
   MBMatrix3.cpp \
   MBMatrix3.hpp \
   MBCore.cpp \
@@ -126,8 +131,6 @@
   MBMeshSet.hpp \
   MBOrientedBox.cpp \
   MBOrientedBoxTreeTool.cpp \
-  MBParallelComm.cpp \
-  MBProcConfig.cpp \
   MBRange.cpp \
   MBRangeSeqIntersectIter.cpp \
   MBRangeSeqIntersectIter.hpp \
@@ -145,8 +148,6 @@
   PolyEntitySequence.hpp \
   ReadGmsh.cpp \
   ReadGmsh.hpp \
-  ReadParallel.hpp \
-  ReadParallel.cpp \
   ReadSTL.cpp \
   ReadSTL.hpp \
   ReadVtk.cpp \
@@ -194,12 +195,11 @@
   MBError.hpp \
   MBForward.hpp \
   MBGeomUtil.hpp \
+  MBHandleUtils.cpp \
   MBInterface.hpp \
   MBOrientedBox.hpp \
   MBOrientedBoxTreeTool.hpp \
-  MBParallelComm.hpp \
   MBParallelConventions.h \
-  MBProcConfig.hpp \
   MBRange.hpp \
   MBReadUtilIface.hpp \
   MBReaderIface.hpp \
@@ -298,6 +298,12 @@
 kd_tree_time_LDADD = $(top_builddir)/libMOAB.la
 kd_tree_time_DEPENDENCIES = $(kd_tree_time_LDADD)
 
+if PARALLEL
+mbparallelcomm_test_SOURCES = mbparallelcomm_test.cpp
+mbparallelcomm_test_LDADD = $(top_builddir)/libMOAB.la
+mbparallelcomm_test_DEPENDENCIES = $(mbparallelcomm_test_LDADD)
+endif
+
 file_options_test_SOURCES = FileOptions.cpp 
 file_options_test_CPPFLAGS = -DTEST
 

Modified: MOAB/trunk/ReadNCDF.cpp
===================================================================
--- MOAB/trunk/ReadNCDF.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/ReadNCDF.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -940,12 +940,12 @@
 
   NcVar *temp_var = ncFile->get_var("elem_map");
   if (NULL == temp_var || !temp_var->is_valid()) {
-    readMeshIface->report_error("MBCN:: Problem getting element number map variable.");
+    readMeshIface->report_error("ReadNCDF:: Problem getting element number map variable.");
     return MB_FAILURE;
   }
   NcBool status = temp_var->get(ptr, numberElements_loading);
   if (0 == status) {
-    readMeshIface->report_error("MBCN:: Problem getting element number map data.");
+    readMeshIface->report_error("ReadNCDF:: Problem getting element number map data.");
     delete [] ptr;
     return MB_FAILURE;
   }

Deleted: MOAB/trunk/ReadParallel.cpp
===================================================================
--- MOAB/trunk/ReadParallel.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/ReadParallel.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,236 +0,0 @@
-#include "ReadParallel.hpp"
-#include "MBCore.hpp"
-#include "MBProcConfig.hpp"
-#include "FileOptions.hpp"
-#include "MBError.hpp"
-#include "MBReaderWriterSet.hpp"
-#include "MBParallelComm.hpp"
-#include "MBCN.hpp"
-
-#define RR if (MB_SUCCESS != result) return result
-
-MBErrorCode ReadParallel::load_file(const char *file_name,
-                                    MBEntityHandle& file_set,
-                                    const FileOptions &opts,
-                                    const int* material_set_list,
-                                    const int num_material_sets ) 
-{
-  MBError *merror = ((MBCore*)mbImpl)->get_error_handler();
-
-  MBCore *impl = dynamic_cast<MBCore*>(mbImpl);
-  
-    // Get parallel settings
-  int parallel_mode;
-  const char* parallel_opts[] = { "NONE", "BCAST", "BCAST_DELETE", "SCATTER", 
-                                  "FORMAT", 0 };
-  enum ParallelOpts {POPT_NONE=0, POPT_BCAST, POPT_BCAST_DELETE, POPT_SCATTER,
-                     POPT_FORMAT, POPT_LAST};
-      
-  MBErrorCode rval = opts.match_option( "PARALLEL", parallel_opts, 
-                                        parallel_mode );
-  if (MB_FAILURE == rval) {
-    merror->set_last_error( "Unexpected value for 'PARALLEL' option\n" );
-    return MB_FAILURE;
-  }
-  else if (MB_ENTITY_NOT_FOUND == rval) {
-    parallel_mode = 0;
-  }
-    // Get partition setting
-  std::string partition_tag;
-  rval = opts.get_option("PARTITION", partition_tag);
-  if (MB_ENTITY_NOT_FOUND == rval || partition_tag.empty())
-    partition_tag += "PARTITION";
-
-    // get MPI IO processor rank
-  int reader_rank;
-  rval = opts.get_int_option( "MPI_IO_RANK", reader_rank );
-  if (MB_ENTITY_NOT_FOUND == rval)
-    reader_rank = 0;
-  else if (MB_SUCCESS != rval) {
-    merror->set_last_error( "Unexpected value for 'MPI_IO_RANK' option\n" );
-    return MB_FAILURE;
-  }
-  
-    // now that we've parsed all the parallel options, return
-    // failure for most of them because we haven't implemented 
-    // most of them yet.
-  if (parallel_mode == POPT_FORMAT) {
-    merror->set_last_error( "Access to format-specific parallel read not implemented.\n");
-    return MB_NOT_IMPLEMENTED;
-  }
-
-  if (parallel_mode == POPT_SCATTER) {
-    merror->set_last_error( "Partitioning for PARALLEL=SCATTER not supported yet.\n");
-    return MB_NOT_IMPLEMENTED;
-  }
-  
-  if (parallel_mode != POPT_SCATTER || 
-      reader_rank == (int)(mbImpl->proc_config().rank())) {
-      // Try using the file extension to select a reader
-    const MBReaderWriterSet* set = impl->reader_writer_set();
-    MBReaderIface* reader = set->get_file_extension_reader( file_name );
-    if (reader)
-    { 
-      rval = reader->load_file( file_name, file_set, opts, 
-                                material_set_list, num_material_sets );
-      delete reader;
-    }
-    else
-    {  
-        // Try all the readers
-      MBReaderWriterSet::iterator iter;
-      for (iter = set->begin(); iter != set->end(); ++iter)
-      {
-        MBReaderIface* reader = iter->make_reader( mbImpl );
-        if (NULL != reader)
-        {
-          rval = reader->load_file( file_name, file_set, opts, 
-                                    material_set_list, num_material_sets );
-          delete reader;
-          if (MB_SUCCESS == rval)
-            break;
-        }
-      }
-    }
-  }
-  else {
-    rval = MB_SUCCESS;
-  }
-  
-  if (parallel_mode == POPT_BCAST ||
-      parallel_mode == POPT_BCAST_DELETE) {
-    MBRange entities; 
-    if (MB_SUCCESS == rval && 
-        reader_rank == (int)(mbImpl->proc_config().rank())) {
-      rval = mbImpl->get_entities_by_handle( file_set, entities );
-      if (MB_SUCCESS != rval)
-        entities.clear();
-    }
-    
-    MBParallelComm tool( mbImpl, impl->tag_server(), impl->sequence_manager());
-    MBErrorCode tmp_rval = tool.broadcast_entities( reader_rank, entities );
-    if (MB_SUCCESS != rval && mbImpl->proc_config().size() != 1)
-      tmp_rval = rval;
-    else if (MB_SUCCESS != rval) rval = MB_SUCCESS;
-      
-    if (MB_SUCCESS == rval && 
-        reader_rank != (int)(mbImpl->proc_config().rank())) {
-      rval = mbImpl->create_meshset( MESHSET_SET, file_set );
-      if (MB_SUCCESS == rval) {
-        rval = mbImpl->add_entities( file_set, entities );
-        if (MB_SUCCESS != rval) {
-          mbImpl->delete_entities( &file_set, 1 );
-          file_set = 0;
-        }
-      }
-    }
-
-    if (parallel_mode == POPT_BCAST_DELETE)
-      rval = delete_nonlocal_entities(partition_tag, file_set);
-    
-  }
-  
-  return rval;
-}
-
-MBErrorCode ReadParallel::delete_nonlocal_entities(std::string &partition_name,
-                                                   MBEntityHandle file_set) 
-{
-  MBErrorCode result;
-  MBError *merror = ((MBCore*)mbImpl)->get_error_handler();
-  
-    // get entities in this partition
-  int my_rank = (int)mbImpl->proc_config().rank();
-  if (my_rank == 0 && mbImpl->proc_config().size() == 1) my_rank = 1;
-  int *my_rank_ptr = &my_rank;
-  MBTag partition_tag;
-  
-  result = mbImpl->tag_get_handle(partition_name.c_str(), partition_tag);
-  if (MB_TAG_NOT_FOUND == result) {
-    merror->set_last_error( "Couldn't find partition tag\n");
-    return result;
-  }
-  else if (MB_SUCCESS != result) return result;
-    
-  MBRange partition_sets;
-  result = mbImpl->get_entities_by_type_and_tag(file_set, MBENTITYSET,
-                                                &partition_tag, 
-                                                (const void* const *) &my_rank_ptr, 
-                                                1, partition_sets); RR;
-  if (MB_SUCCESS != result || partition_sets.empty()) return result;
-  
-  MBRange file_ents, partition_ents, exist_ents, all_ents;
-
-    // get ents in the partition
-  for (MBRange::iterator rit = partition_sets.begin(); 
-       rit != partition_sets.end(); rit++) {
-    result = mbImpl->get_entities_by_handle(*rit, partition_ents, 
-                                            MBInterface::UNION); RR;
-  }
-
-    // get pre-existing ents, which are all entities minus file ents
-  result = mbImpl->get_entities_by_handle(0, all_ents); RR;
-  result = mbImpl->get_entities_by_handle(file_set, file_ents); RR;
-  exist_ents = all_ents.subtract(file_ents);
-
-    // merge partition ents into pre-existing entities
-  exist_ents.merge(partition_ents);
-  
-    // gather adjacent ents of lower dimension and add to existing ents
-  MBRange tmp_ents;
-  for (int dim = 2; dim >= 0; dim--) {
-    MBEntityType lower_type = MBCN::TypeDimensionMap[dim+1].first,
-      upper_type = MBCN::TypeDimensionMap[3].second;
-    
-    MBRange::const_iterator bit = exist_ents.lower_bound(lower_type),
-      eit = exist_ents.upper_bound(upper_type);
-    MBRange from_ents;
-    from_ents.merge(bit, eit);
-    tmp_ents.clear();
-    result = mbImpl->get_adjacencies(from_ents, dim, false, tmp_ents, 
-                                     MBInterface::UNION); RR;
-    exist_ents.merge(tmp_ents);
-  }
-  
-    // subtract from all ents to get deletable ents
-  all_ents = all_ents.subtract(exist_ents);
-  
-    // go through the sets to see which ones we should keep
-  MBRange all_sets, deletable_sets;
-  result = mbImpl->get_entities_by_type(0, MBENTITYSET, all_sets);
-  for (MBRange::iterator rit = all_sets.begin(); rit != all_sets.end(); rit++) {
-    tmp_ents.clear();
-    result = mbImpl->get_entities_by_handle(*rit, tmp_ents, true); RR;
-    MBRange tmp_ents2 = tmp_ents.intersect(exist_ents);
-    
-      // if the intersection is empty, set is deletable
-    if (tmp_ents2.empty()) deletable_sets.insert(*rit);
-    
-    else if (tmp_ents.size() > tmp_ents2.size()) {
-        // more elements in set or contained sets than we're keeping; delete 
-        // the difference from just this set, to remove entities to be deleted below
-        // it's ok if entity isn't contained, doesn't generate an error
-      tmp_ents = tmp_ents.subtract(tmp_ents2);
-      result = mbImpl->remove_entities(*rit, tmp_ents); RR;
-    }
-  }
-
-    // take the deletable sets out of other sets so we don't end up
-    // with stale set handles
-  for (MBRange::iterator rit = all_sets.begin(); rit != all_sets.end(); rit++) {
-    if (deletable_sets.find(*rit) == deletable_sets.end()) {
-      result = mbImpl->remove_entities(*rit, deletable_sets); RR;
-    }
-  }
-
-    // remove sets from all_ents, since they're dealt with separately
-  all_ents = all_ents.subtract(all_sets);
-  
-    // now delete sets first, then ents
-  result = mbImpl->delete_entities(deletable_sets); RR;
-  result = mbImpl->delete_entities(all_ents); RR;
-  
-  result = ((MBCore*)mbImpl)->check_adjacencies();
-  
-  return result;
-}

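For reference, the BCAST/BCAST_DELETE logic deleted above (and moved to
parallel/ReadParallel.cpp) is driven entirely by the option strings it
parses: PARALLEL, PARTITION, and MPI_IO_RANK.  A minimal usage sketch
(illustrative only, not part of this commit; it assumes FileOptions accepts
a semicolon-separated option string and that MPI is already initialized):

#include "MBCore.hpp"
#include "ReadParallel.hpp"
#include "FileOptions.hpp"

// Hypothetical helper: the MPI_IO_RANK processor reads the file, the mesh
// is broadcast to all processors, and each processor then deletes the
// entities outside the sets tagged PARTITION=<its rank>
// (the PARALLEL=BCAST_DELETE path above).
MBErrorCode read_partitioned( MBInterface* mb, const char* filename,
                              MBEntityHandle& file_set )
{
  FileOptions opts( "PARALLEL=BCAST_DELETE;PARTITION=PARTITION;MPI_IO_RANK=0" );
  ReadParallel reader( mb );
  return reader.load_file( filename, file_set, opts, NULL, 0 );
}
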
Deleted: MOAB/trunk/ReadParallel.hpp
===================================================================
--- MOAB/trunk/ReadParallel.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/ReadParallel.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,38 +0,0 @@
-#ifndef READ_PARALLEL_HPP
-#define READ_PARALLEL_HPP
-
-#include "MBForward.hpp"
-#include "MBReaderIface.hpp"
-
-class MBReadUtilIface;
-
-class ReadParallel : public MBReaderIface
-{
-   
-public:
-
-  static MBReaderIface* factory( MBInterface* );
-
-    //! load a file
-  MBErrorCode load_file(const char *file_name,
-                        MBEntityHandle& file_set,
-                        const FileOptions &opts,
-                        const int* material_set_list,
-                        const int num_material_sets );
-  
-    //! Constructor
-  ReadParallel(MBInterface* impl = NULL) {mbImpl = impl;};
-
-   //! Destructor
-  virtual ~ReadParallel() {}
-
-protected:
-
-private:
-  MBInterface *mbImpl;
-  
-  MBErrorCode delete_nonlocal_entities(std::string &partition_name,
-                                       MBEntityHandle file_set);
-};
-
-#endif

Deleted: MOAB/trunk/WriteHDF5Parallel.cpp
===================================================================
--- MOAB/trunk/WriteHDF5Parallel.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/WriteHDF5Parallel.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,2132 +0,0 @@
-
-#undef DEBUG
-
-#ifdef DEBUG
-#  include <stdio.h>
-#  include <stdarg.h>
-#endif
-
-#ifndef HDF5_FILE
-#  error Attempt to compile WriteHDF5Parallel with HDF5 support disabled
-#endif
-
-#include <stdlib.h>
-#include <string.h>
-
-#include <vector>
-#include <set>
-#include <map>
-#include <utility>
-
-#include <mpi.h>
-
-#include <H5Tpublic.h>
-#include <H5Ppublic.h>
-#include <H5FDmpi.h>
-#include <H5FDmpio.h>
-
-#include "mhdf.h"
-
-#include "MBInterface.hpp"
-#include "MBInternals.hpp"
-#include "MBTagConventions.hpp"
-#include "MBParallelConventions.h"
-#include "MBCN.hpp"
-#include "MBWriteUtilIface.hpp"
-
-#include "WriteHDF5Parallel.hpp"
-
-
-#ifdef DEBUG
-#  define START_SERIAL                     \
-     for (int _x = 0; _x < numProc; ++_x) {\
-       MPI_Barrier( MPI_COMM_WORLD );      \
-       if (_x != myRank) continue     
-#  define END_SERIAL                       \
-     }                                     \
-     MPI_Barrier( MPI_COMM_WORLD )
-#else
-#  define START_SERIAL
-#  define END_SERIAL
-#endif
-
-
-#define DEBUG_OUT_STREAM stdout
-
-#ifndef DEBUG
-static void printdebug( const char*, ... ) {}
-#else
-static void printdebug( const char* fmt, ... )
-{
-  int rank;
-  MPI_Comm_rank( MPI_COMM_WORLD, &rank );
-  fprintf( DEBUG_OUT_STREAM, "[%d] ", rank );
-  va_list args;
-  va_start( args, fmt );
-  vfprintf( DEBUG_OUT_STREAM, fmt, args );
-  va_end( args );
-  fflush( DEBUG_OUT_STREAM );
-}
-#endif
-
-
-#ifdef NDEBUG
-#  define assert(A)
-#else
-#  define assert(A) if (!(A)) do_assert(__FILE__, __LINE__, #A)
-   static void do_assert( const char* file, int line, const char* condstr )
-   {
-     int rank;
-     MPI_Comm_rank( MPI_COMM_WORLD, &rank );
-     fprintf( DEBUG_OUT_STREAM, "[%d] Assert(%s) failed at %s:%d\n", rank, condstr, file, line );
-     fflush( DEBUG_OUT_STREAM );
-     abort();
-   }
-#endif
-
-
-#ifndef DEBUG
-void WriteHDF5Parallel::printrange( MBRange& ) {}
-#else
-void WriteHDF5Parallel::printrange( MBRange& r )
-{
-  int rank;
-  MPI_Comm_rank( MPI_COMM_WORLD, &rank );
-  MBEntityType type = MBMAXTYPE;
-  for (MBRange::const_pair_iterator i = r.pair_begin(); i != r.pair_end(); ++i)
-  {
-    MBEntityHandle a, b;
-    a = (*i).first;
-    b = (*i).second;
-    MBEntityType mytype = iFace->type_from_handle(a);
-    if (mytype != type)
-    {
-      type = mytype;
-      fprintf(DEBUG_OUT_STREAM, "%s[%d]  %s", type == MBMAXTYPE ? "" : "\n", rank, MBCN::EntityTypeName( type ) );
-    }
-    unsigned long id1 = iFace->id_from_handle( a );
-    unsigned long id2 = iFace->id_from_handle( b );
-    if (id1 == id2)
-      fprintf(DEBUG_OUT_STREAM, " %lu", id1 );
-    else
-      fprintf(DEBUG_OUT_STREAM, " %lu-%lu", id1, id2 );
-  }
-  fprintf(DEBUG_OUT_STREAM, "\n");
-  fflush( DEBUG_OUT_STREAM );
-}
-#endif
-
-
-#ifndef DEBUG
-static void print_type_sets( MBInterface* , int , int , MBRange& ) {}
-#else
-static void print_type_sets( MBInterface* iFace, int myRank, int numProc, MBRange& sets )
-{
-  MBTag gid, did, bid, sid, nid, iid;
-  iFace->tag_get_handle( GLOBAL_ID_TAG_NAME, gid ); 
-  iFace->tag_get_handle( GEOM_DIMENSION_TAG_NAME, did );
-  iFace->tag_get_handle( MATERIAL_SET_TAG_NAME, bid );
-  iFace->tag_get_handle( DIRICHLET_SET_TAG_NAME, nid );
-  iFace->tag_get_handle( NEUMANN_SET_TAG_NAME, sid );
-  iFace->tag_get_handle( PARALLEL_INTERFACE_TAG_NAME, iid );
-  MBRange typesets[10];
-  const char* typenames[] = {"Block", "Sideset", "NodeSet", "Vertex", "Curve", "Surface", "Volume", "Body", "Interfaces", "Other"};
-  for (MBRange::iterator riter = sets.begin(); riter != sets.end(); ++riter)
-  {
-    unsigned dim, id, proc[2], oldsize;
-    if (MB_SUCCESS == iFace->tag_get_data(bid, &*riter, 1, &id)) 
-      dim = 0;
-    else if (MB_SUCCESS == iFace->tag_get_data(sid, &*riter, 1, &id))
-      dim = 1;
-    else if (MB_SUCCESS == iFace->tag_get_data(nid, &*riter, 1, &id))
-      dim = 2;
-    else if (MB_SUCCESS == iFace->tag_get_data(did, &*riter, 1, &dim)) {
-      id = 0;
-      iFace->tag_get_data(gid, &*riter, 1, &id);
-      dim += 3;
-    }
-    else if (MB_SUCCESS == iFace->tag_get_data(iid, &*riter, 1, proc)) {
-      assert(proc[0] == (unsigned)myRank || proc[1] == (unsigned)myRank);
-      id = proc[proc[0] == (unsigned)myRank];
-      dim = 8;
-    }
-    else {
-      id = *riter;
-      dim = 9;
-    }
-
-    oldsize = typesets[dim].size();
-    typesets[dim].insert( id );
-    assert( typesets[dim].size() - oldsize == 1 );  
-  }
-  for (int ii = 0; ii < 10; ++ii)
-  {
-    char num[16];
-    std::string line(typenames[ii]);
-    if (typesets[ii].empty())
-      continue;
-    sprintf(num, "(%u):", typesets[ii].size());
-    line += num;
-    for (MBRange::const_pair_iterator piter = typesets[ii].pair_begin();
-         piter != typesets[ii].pair_end(); ++piter)
-    {
-      sprintf(num," %d", (*piter).first);
-      line += num;
-      if ((*piter).first != (*piter).second) {
-        sprintf(num,"-%d", (*piter).second);
-        line += num;
-      }
-    }
-
-    printdebug ("%s\n", line.c_str());
-  }
-  printdebug("Total: %u\n", sets.size());
-}
-#endif
-
-
-void range_remove( MBRange& from, const MBRange& removed )
-{
-  
-/* The following should be more efficient, but isn't due
-   to the inefficient implementation of MBRange::erase(iter,iter)
-  MBRange::const_iterator s, e, n = from.begin();
-  for (MBRange::const_pair_iterator p = removed.pair_begin();
-       p != removed.pair_end(); ++p)
-  {
-    e = s = MBRange::lower_bound(n, from.end(), (*p).first);
-    e = MBRange::lower_bound(s, from.end(), (*p).second);
-    if (e != from.end() && *e == (*p).second)
-      ++e;
-    n = from.erase( s, e );
-  }
-*/
-
-  if (removed.size())
-  {
-    MBRange tmp = from.subtract(removed);
-    from.swap( tmp );
-  }
-}
-
-void WriteHDF5Parallel::MultiProcSetTags::add( const std::string& name )
-  { list.push_back( Data(name) ); }
-
-void WriteHDF5Parallel::MultiProcSetTags::add( const std::string& filter, 
-                                               const std::string& data )
-  { list.push_back( Data(filter,data) ); }
-
-void WriteHDF5Parallel::MultiProcSetTags::add( const std::string& filter, 
-                                               int filterval,
-                                               const std::string& data )
-  { list.push_back( Data(filter,data,filterval) ); }
-
-
-WriteHDF5Parallel::WriteHDF5Parallel( MBInterface* iface )
-  : WriteHDF5(iface)
-{
-  multiProcSetTags.add(  MATERIAL_SET_TAG_NAME );
-  multiProcSetTags.add( DIRICHLET_SET_TAG_NAME );
-  multiProcSetTags.add(   NEUMANN_SET_TAG_NAME );
-  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 0, GLOBAL_ID_TAG_NAME );
-  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 1, GLOBAL_ID_TAG_NAME );
-  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 2, GLOBAL_ID_TAG_NAME );
-  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 3, GLOBAL_ID_TAG_NAME );
-}
-
-WriteHDF5Parallel::WriteHDF5Parallel( MBInterface* iface,
-                                      const std::vector<std::string>& tag_names )
-  : WriteHDF5(iface)
-{
-  for(std::vector<std::string>::const_iterator i = tag_names.begin();
-      i != tag_names.end(); ++i)
-    multiProcSetTags.add( *i );
-}
-
-WriteHDF5Parallel::WriteHDF5Parallel( MBInterface* iface,
-                                      const MultiProcSetTags& set_tags )
-  : WriteHDF5(iface), multiProcSetTags(set_tags)
-{}
-
-// The parent WriteHDF5 class has ExportSet structs that are
-// populated with the entities to be written, grouped by type
-// (and for elements, connectivity length).  This function:
-//  o determines which entities are to be written by a remote processor
-//  o removes those entities from the ExportSet structs in WriteMesh
-//  o puts them in the 'remoteMesh' array of MBRanges in this class
-//  o sets their file Id to '1'
-MBErrorCode WriteHDF5Parallel::gather_interface_meshes()
-{
-  MBRange range;
-  MBErrorCode result;
-  MBTag iface_tag, geom_tag;
-  int i, proc_pair[2];
-  
-  START_SERIAL;
-  printdebug( "Pre-interface mesh:\n");
-  printrange(nodeSet.range);
-  for (std::list<ExportSet>::iterator eiter = exportList.begin();
-           eiter != exportList.end(); ++eiter )
-    printrange(eiter->range);
-  printrange(setSet.range);
-  
-    // Allocate space for remote mesh data
-  remoteMesh.resize( numProc );
-  
-    // Get tag handles
-  result = iFace->tag_get_handle( PARALLEL_INTERFACE_TAG_NAME, iface_tag );
-  if (MB_SUCCESS != result) return result;
-  result = iFace->tag_get_handle( PARALLEL_GEOM_TOPO_TAG_NAME, geom_tag );
-  if (MB_SUCCESS != result) return result;
-  
-  
-    // Get interface mesh sets
-  result = iFace->get_entities_by_type_and_tag( 0,
-                                                MBENTITYSET,
-                                                &iface_tag,
-                                                0,
-                                                1,
-                                                range );
-  if (MB_SUCCESS != result) return result;
-  
-  
-    // Populate lists of interface mesh entities
-  for (MBRange::iterator iiter = range.begin(); iiter != range.end(); ++iiter)
-  {
-    result = iFace->tag_get_data( iface_tag, &*iiter, 1, proc_pair );
-    if (MB_SUCCESS != result) return result;
-    const int remote_proc = proc_pair[0];
-    
-      // Get list of all entities in interface and 
-      // the subset of that list that are meshsets.
-    MBRange entities, sets;
-    result = iFace->get_entities_by_handle( *iiter, entities );
-    if (MB_SUCCESS != result) return result;
-    result = iFace->get_entities_by_type( *iiter, MBENTITYSET, sets );
-    if (MB_SUCCESS != result) return result;
-
-      // Put any non-meshset entities in the list directly.
-    //range_remove( entities, sets ); //not necessary, get_entities_by_handle doesn't return sets
-    remoteMesh[remote_proc].merge( entities );
-    //remoteMesh[remote_proc].insert( *iiter );
-    
-    for (MBRange::iterator siter = sets.begin(); siter != sets.end(); ++siter)
-    {
-        // For current parallel meshing code, root processor owns
-        // all curve and geometric vertex meshes.  
-      int dimension;
-      result = iFace->tag_get_data( geom_tag, &*siter, 1, &dimension );
-      if (result == MB_SUCCESS && dimension < 2)
-        continue;
-        
-        // Put entities in list for appropriate processor.
-      //remoteMesh[remote_proc].insert( *siter );
-      entities.clear();
-      result = iFace->get_entities_by_handle( *siter, entities );
-      if (MB_SUCCESS != result) return result;
-      remoteMesh[remote_proc].merge( entities );
-    }
-  }
-  
-    // For current parallel meshing code, root processor owns
-    // all curve and geometric vertex meshes.  Find them and
-    // allocate them appropriately.
-  MBRange curves_and_verts;
-  MBTag tags[] = { geom_tag, geom_tag };
-  int value_ints[] = { 0, 1 };
-  const void* values[] = {value_ints, value_ints + 1};
-  result = iFace->get_entities_by_type_and_tag( 0, MBENTITYSET,
-                                                tags, values, 2,
-                                                curves_and_verts, 
-                                                MBInterface::UNION );
-                                                assert(MB_SUCCESS == result);
-  MBRange edges, nodes;
-  for (MBRange::iterator riter = curves_and_verts.begin();
-       riter != curves_and_verts.end(); ++riter)
-  {
-    result = iFace->get_entities_by_type( *riter, MBVERTEX, nodes ); assert(MB_SUCCESS == result);
-    result = iFace->get_entities_by_type( *riter, MBEDGE, edges ); assert(MB_SUCCESS == result);
-  }
-  std::list<ExportSet>::iterator eiter = exportList.begin();
-  for ( ; eiter != exportList.end() && eiter->type != MBEDGE; ++eiter );
-  
-  remoteMesh[0].merge( nodes );
-  remoteMesh[0].merge( edges );
-  //remoteMesh[0].merge( curves_and_verts );
-  if (myRank == 0)
-  {
-    nodeSet.range.merge( nodes );
-    //setSet.range.merge(curves_and_verts);
-    eiter->range.merge( edges );
-  } 
-  edges.merge(nodes);
-  //edges.merge(curves_and_verts);
-  for (i = 1; i < numProc; i++)
-  {
-    MBRange diff = edges.intersect( remoteMesh[i] );
-    range_remove(remoteMesh[i], diff);
-  }
-  
-  
-  
-    // For all remote mesh entities, remove them from the
-    // lists of local mesh to be exported and give them a 
-    // junk file Id of 1.  Need to specify a file ID greater
-    // than zero so the code that gathers adjacencies and 
-    // such doesn't think that the entities aren't being
-    // exported.
-  for (i = 0; i < numProc; i++)
-  {
-    if (i == myRank) continue;
-    
-    MBRange& range = remoteMesh[i];
-    
-    range_remove( nodeSet.range, range );
-    //range_remove( setSet.range, range );
-    for (std::list<ExportSet>::iterator eiter = exportList.begin();
-         eiter != exportList.end(); ++eiter )
-      range_remove( eiter->range, range );
-    
-    int id = 1;
-    for (MBRange::iterator riter = remoteMesh[i].begin(); 
-         riter != remoteMesh[i].end() && iFace->type_from_handle(*riter) != MBENTITYSET; 
-         ++riter)
-    {
-      result = iFace->tag_set_data( idTag, &*riter, 1, &id );
-      if (MB_SUCCESS != result) return result;
-    }
-  }
-  
-    // print some debug output summarizing what we've accomplished
-  
-  printdebug("Remote mesh:\n");
-  for (int ii = 0; ii < numProc; ++ii)
-  {
-    printdebug("  proc %d : %d\n", ii, remoteMesh[ii].size());
-    printrange( remoteMesh[ii] );
-  }
-
-  printdebug( "Post-interface mesh:\n");
-  printrange(nodeSet.range);
-  for (std::list<ExportSet>::iterator eiter = exportList.begin();
-           eiter != exportList.end(); ++eiter )
-    printrange(eiter->range);
-  printrange(setSet.range);
-
-  END_SERIAL;
-  
-  return MB_SUCCESS;
-}
-
-
-
-MBErrorCode WriteHDF5Parallel::create_file( const char* filename,
-                                            bool overwrite,
-                                            std::vector<std::string>& qa_records,
-                                            int dimension )
-{
-  MBErrorCode rval;
-  int result;
-  mhdf_Status status;
-    
-  result = MPI_Comm_rank( MPI_COMM_WORLD, &myRank );
-  assert(MPI_SUCCESS == result);
-  result = MPI_Comm_size( MPI_COMM_WORLD, &numProc );
-  assert(MPI_SUCCESS == result);
-  
-  rval = gather_interface_meshes();
-  if (MB_SUCCESS != rval) return rval;
-  
-    /**************** Create actual file and write meta info ***************/
-
-  if (myRank == 0)
-  {
-      // create the file
-    const char* type_names[MBMAXTYPE];
-    memset( type_names, 0, MBMAXTYPE * sizeof(char*) );
-    for (MBEntityType i = MBEDGE; i < MBENTITYSET; ++i)
-      type_names[i] = MBCN::EntityTypeName( i );
-   
-    filePtr = mhdf_createFile( filename, overwrite, type_names, MBMAXTYPE, &status );
-    if (!filePtr)
-    {
-      writeUtil->report_error( "%s\n", mhdf_message( &status ) );
-      return MB_FAILURE;
-    }
-    
-    rval = write_qa( qa_records );
-    if (MB_SUCCESS != rval) return rval;
-  }
-  
-  
-     /**************** Create node coordinate table ***************/
- 
-  rval = create_node_table( dimension );
-  if (MB_SUCCESS != rval) return rval;
-  
-  
-    /**************** Create element tables ***************/
-
-  rval = negotiate_type_list();
-  if (MB_SUCCESS != rval) return rval;
-  rval = create_element_tables();
-  if (MB_SUCCESS != rval) return rval;
-  
-
-    /**************** Communicate all remote IDs ***********************/
-  
-  rval = communicate_remote_ids( MBVERTEX );
-  for (std::list<ExportSet>::iterator ex_itor = exportList.begin(); 
-       ex_itor != exportList.end(); ++ex_itor)
-  {
-    rval = communicate_remote_ids( ex_itor->type );
-    assert(MB_SUCCESS == rval);
-  }
-  
-  
-    /**************** Create adjacency tables *********************/
-  
-  rval = create_adjacency_tables();
-  if (MB_SUCCESS != rval) return rval;
-  
-    /**************** Create meshset tables *********************/
-  
-  rval = create_meshset_tables();
-  if (MB_SUCCESS != rval) return rval;
-  
-  
-    /* Need to write tags for shared sets this proc is responsible for */
-  
-  MBRange parallel_sets;
-  for (std::list<ParallelSet>::const_iterator psiter = parallelSets.begin();
-       psiter != parallelSets.end(); ++psiter)
-    if (psiter->description)
-      parallel_sets.insert( psiter->handle );
-  
-  setSet.range.merge( parallel_sets );
-  rval = gather_tags();
-  if (MB_SUCCESS != rval)
-    return rval;
-  range_remove( setSet.range, parallel_sets );   
-  
-
-    /**************** Create tag data *********************/
-  
-  std::list<SparseTag>::iterator tag_iter;
-  sort_tags_by_name();
-  const int num_tags = tagList.size();
-  std::vector<int> tag_offsets(num_tags), tag_counts(num_tags);
-  std::vector<int>::iterator tag_off_iter = tag_counts.begin();
-  for (tag_iter = tagList.begin(); tag_iter != tagList.end(); ++tag_iter, ++tag_off_iter)
-    *tag_off_iter = tag_iter->range.size();
-  
-  printdebug("Exchanging tag data for %d tags.\n", num_tags);
-  std::vector<int> proc_tag_offsets(num_tags*numProc);
-  result = MPI_Gather( &tag_counts[0], num_tags, MPI_INT,
-                 &proc_tag_offsets[0], num_tags, MPI_INT,
-                       0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-  tag_iter = tagList.begin();
-  for (int i = 0; i < num_tags; ++i, ++tag_iter)
-  {
-    tag_counts[i] = 0;
-    int next_offset = 0;
-    for (int j = 0; j < numProc; j++)
-    {
-      int count = proc_tag_offsets[i + j*num_tags];
-      proc_tag_offsets[i + j*num_tags] = next_offset;
-      next_offset += count;
-      tag_counts[i] += count;
-    }
-
-    if (0 == myRank)
-    {
-      rval = create_tag( tag_iter->tag_id, next_offset );
-      assert(MB_SUCCESS == rval);
-      printdebug( "Creating table of size %d for tag 0x%lx\n", (int)next_offset, (unsigned long)tag_iter->tag_id);
-    }
-  }
-  
-  result = MPI_Bcast( &tag_counts[0], num_tags, MPI_INT, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-  result = MPI_Scatter( &proc_tag_offsets[0], num_tags, MPI_INT,
-                             &tag_offsets[0], num_tags, MPI_INT,
-                             0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-
-
-  tag_iter = tagList.begin();
-  for (int i = 0; i < num_tags; ++i, ++tag_iter)
-  {
-    tag_iter->offset = tag_offsets[i];
-    tag_iter->write = tag_counts[i] > 0;
-  }
-
-  #ifdef DEBUG
-  START_SERIAL;  
-  printdebug("Tags: %16s %8s %8s %8s\n", "Name", "Count", "Offset", "Handle");
-
-  tag_iter = tagList.begin();
-  for (int i = 0; i < num_tags; ++i, ++tag_iter)
-  {
-    std::string name;
-    iFace->tag_get_name( tag_iter->tag_id, name );
-    printdebug("      %16s %8d %8d %8lx\n", name.c_str(), tag_counts[i], tag_offsets[i], (unsigned long)tag_iter->tag_id );
-  }
-  END_SERIAL;  
-  #endif
-  
-  /************** Close serial file and reopen parallel *****************/
-  
-  if (0 == myRank)
-  {
-    mhdf_closeFile( filePtr, &status );
-  }
-  
-  unsigned long junk;
-  hid_t hdf_opt = H5Pcreate( H5P_FILE_ACCESS );
-  H5Pset_fapl_mpio( hdf_opt, MPI_COMM_WORLD, MPI_INFO_NULL );
-  filePtr = mhdf_openFileWithOpt( filename, 1, &junk, hdf_opt, &status );
-  if (!filePtr)
-  {
-    writeUtil->report_error( "%s\n", mhdf_message( &status ) );
-    return MB_FAILURE;
-  }
-  
-  
-  return MB_SUCCESS;
-}
-
-
-MBErrorCode WriteHDF5Parallel::create_node_table( int dimension )
-{
-  int result;
-  mhdf_Status status;
- 
-    // gather node counts for each processor
-  std::vector<int> node_counts(numProc);
-  int num_nodes = nodeSet.range.size();
-  result = MPI_Gather( &num_nodes, 1, MPI_INT, &node_counts[0], 1, MPI_INT, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // create node data in file
-  long first_id;
-  if (myRank == 0)
-  {
-    int total = 0;
-    for (int i = 0; i < numProc; i++)
-      total += node_counts[i];
-      
-    hid_t handle = mhdf_createNodeCoords( filePtr, dimension, total, &first_id, &status );
-    if (mhdf_isError( &status ))
-    {
-      writeUtil->report_error( "%s\n", mhdf_message( &status ) );
-      return MB_FAILURE;
-    }
-    mhdf_closeData( filePtr, handle, &status );
- }
-    
-    // send id offset to every proc
-  result = MPI_Bcast( &first_id, 1, MPI_LONG, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  nodeSet.first_id = (id_t)first_id;
-   
-      // calculate per-processor offsets
-  if (myRank == 0)
-  {
-    int prev_size = node_counts[0];
-    node_counts[0] = 0;
-    for (int i = 1; i < numProc; ++i)
-    {
-      int mysize = node_counts[i];
-      node_counts[i] = node_counts[i-1] + prev_size;
-      prev_size = mysize;
-    }
-  }
-  
-    // send each proc its offset in the node table
-  int offset;
-  result = MPI_Scatter( &node_counts[0], 1, MPI_INT, 
-                        &offset, 1, MPI_INT,
-                        0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  nodeSet.offset = offset;
-  
-  writeUtil->assign_ids( nodeSet.range, idTag, (id_t)(nodeSet.first_id + nodeSet.offset) );
-
-  return MB_SUCCESS;
-}
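
// Illustrative aside, not part of this file or commit: create_node_table
// above, and the create_*_tables functions that follow, all use the same
// pattern: gather per-processor counts to the root, convert them to an
// exclusive prefix sum there, and scatter each processor its starting
// offset into the shared table.  A standalone sketch of that pattern,
// assuming MPI_COMM_WORLD and an already-initialized MPI environment:

#include <mpi.h>
#include <vector>

long table_offset( long my_count, int rank, int nproc )
{
    // gather every processor's count onto the root
  std::vector<long> counts( nproc );
  MPI_Gather( &my_count, 1, MPI_LONG, &counts[0], 1, MPI_LONG,
              0, MPI_COMM_WORLD );

    // exclusive scan on the root: counts[i] becomes proc i's offset
  if (rank == 0)
  {
    long prev = counts[0];
    counts[0] = 0;
    for (int i = 1; i < nproc; ++i)
    {
      long tmp = counts[i];
      counts[i] = counts[i-1] + prev;
      prev = tmp;
    }
  }

    // hand each processor its own starting offset
  long offset = 0;
  MPI_Scatter( &counts[0], 1, MPI_LONG, &offset, 1, MPI_LONG,
               0, MPI_COMM_WORLD );
  return offset;
}

// (MPI-2's MPI_Exscan computes the same exclusive prefix sum in one call;
// the counts are gathered explicitly here, presumably because the root
// also needs the totals to create the tables in the file.)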
-
-
-
-struct elemtype {
-  int mbtype;
-  int numnode;
-  
-  elemtype( int vals[2] ) : mbtype(vals[0]), numnode(vals[1]) {}
-  elemtype( int t, int n ) : mbtype(t), numnode(n) {}
-  
-  bool operator==( const elemtype& other ) const
-  {
-    return mbtype == other.mbtype &&
-            (mbtype == MBPOLYGON ||
-             mbtype == MBPOLYHEDRON ||
-             mbtype == MBENTITYSET ||
-             numnode == other.numnode);
-  }
-  bool operator<( const elemtype& other ) const
-  {
-    if (mbtype > other.mbtype)
-      return false;
-   
-    return mbtype < other.mbtype ||
-           (mbtype != MBPOLYGON &&
-            mbtype != MBPOLYHEDRON &&
-            mbtype != MBENTITYSET &&
-            numnode < other.numnode);
-  }
-  bool operator!=( const elemtype& other ) const
-    { return !this->operator==(other); }
-};
-
-
-MBErrorCode WriteHDF5Parallel::negotiate_type_list()
-{
-  int result;
-  
-  exportList.sort();
-  
-    // Get number of types each processor has
-  int num_types = 2*exportList.size();
-  std::vector<int> counts(numProc);
-  result = MPI_Gather( &num_types, 1, MPI_INT, &counts[0], 1, MPI_INT, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Get list of types on this processor
-  std::vector<int> my_types(num_types);
-  std::vector<int>::iterator viter = my_types.begin();
-  for (std::list<ExportSet>::iterator eiter = exportList.begin();
-       eiter != exportList.end(); ++eiter)
-  {
-    *viter = eiter->type;      ++viter;
-    *viter = eiter->num_nodes; ++viter;
-  }
-
-  #ifdef DEBUG
-  START_SERIAL;
-  printdebug( "Local Element Types:\n");
-  viter = my_types.begin();
-  while (viter != my_types.end())
-  {
-    int type = *viter; ++viter;
-    int count = *viter; ++viter;
-    printdebug("  %s : %d\n", MBCN::EntityTypeName((MBEntityType)type), count);
-  }
-  END_SERIAL;
-  #endif
-
-    // Get list of types from each processor
-  std::vector<int> displs(numProc + 1);
-  displs[0] = 0;
-  for (int i = 1; i <= numProc; ++i)
-    displs[i] = displs[i-1] + counts[i-1];
-  int total = displs[numProc];
-  std::vector<int> alltypes(total);
-  result = MPI_Gatherv( &my_types[0], my_types.size(), MPI_INT,
-                        &alltypes[0], &counts[0], &displs[0], MPI_INT,
-                        0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Merge type lists
-  std::list<elemtype> type_list;
-  std::list<elemtype>::iterator liter;
-  for (int i = 0; i < numProc; ++i)
-  {
-    int* proc_type_list = &alltypes[displs[i]];
-    liter = type_list.begin();
-    for (int j = 0; j < counts[i]; j += 2)
-    {
-      elemtype type( &proc_type_list[j] );
-        // skip until insertion spot
-      for (; liter != type_list.end() && *liter < type; ++liter);
-      
-      if (liter == type_list.end() || *liter != type)
-        liter = type_list.insert( liter, type );
-    }
-  }
-  
-    // Send total number of types to each processor
-  total = type_list.size();
-  result = MPI_Bcast( &total, 1, MPI_INT, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Send list of types to each processor
-  std::vector<int> intlist(total * 2);
-  viter = intlist.begin();
-  for (liter = type_list.begin(); liter != type_list.end(); ++liter)
-  {
-    *viter = liter->mbtype;  ++viter;
-    *viter = liter->numnode; ++viter;
-  }
-  result = MPI_Bcast( &intlist[0], 2*total, MPI_INT, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-
-  #ifdef DEBUG
-  START_SERIAL;
-  printdebug( "Global Element Types:\n");
-  viter = intlist.begin();
-  while (viter != intlist.end())
-  {
-    int type = *viter; ++viter;
-    int count = *viter; ++viter;
-    printdebug("  %s : %d\n", MBCN::EntityTypeName((MBEntityType)type), count);
-  }
-  END_SERIAL;
-  #endif
-  
-    // Insert missing types into exportList, with an empty
-    // range of entities to export.
-  std::list<ExportSet>::iterator ex_iter = exportList.begin();
-  viter = intlist.begin();
-  for (int i = 0; i < total; ++i)
-  {
-    int mbtype = *viter; ++viter;
-    int numnode = *viter; ++viter;
-    while (ex_iter != exportList.end() && ex_iter->type < mbtype)
-      ++ex_iter;
-    
-    bool equal = ex_iter != exportList.end() && ex_iter->type == mbtype;
-    if (equal && mbtype != MBPOLYGON && mbtype != MBPOLYHEDRON)
-    {
-      while (ex_iter != exportList.end() && ex_iter->num_nodes < numnode)
-        ++ex_iter;
-        
-      equal = ex_iter != exportList.end() && ex_iter->num_nodes == numnode;
-    }
-    
-    if (!equal)
-    {
-      ExportSet insert;
-      insert.type = (MBEntityType)mbtype;
-      insert.num_nodes = numnode;
-      insert.first_id = 0;
-      insert.offset = 0;
-      insert.poly_offset = 0;
-      insert.adj_offset = 0;
-      ex_iter = exportList.insert( ex_iter, insert );
-    }
-  }
-  
-  return MB_SUCCESS;
-}
-
-MBErrorCode WriteHDF5Parallel::create_element_tables()
-{
-  int result;
-  MBErrorCode rval;
-  std::list<ExportSet>::iterator ex_iter;
-  std::vector<long>::iterator viter;
-  
-    // Get number of each element type from each processor
-  const int numtypes = exportList.size();
-  std::vector<long> my_counts(numtypes);
-  std::vector<long> counts(numtypes * numProc + numtypes);
-  viter = my_counts.begin();
-  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
-    { *viter = ex_iter->range.size(); ++viter; }
-  
-  result = MPI_Gather( &my_counts[0], numtypes, MPI_LONG,
-                       &counts[0],    numtypes, MPI_LONG, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Convert counts to offsets
-  for (int i = 0; i < numtypes; i++) 
-  {
-    long prev = 0;
-    for (int j = 0; j <= numProc; j++)
-    {
-      long tmp = counts[j*numtypes + i];
-      counts[j*numtypes+i] = prev;
-      prev += tmp;
-    }
-  }
-  
-    // Send offsets to each processor
-  result = MPI_Scatter( &counts[0],    numtypes, MPI_LONG,
-                        &my_counts[0], numtypes, MPI_LONG,
-                        0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Update stored offsets in ExportSets
-  viter = my_counts.begin();
-  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
-    ex_iter->offset = (id_t)*(viter++);
-  
-    // If polygons or polyhedra, calculate and send offsets for each
-  std::vector<int> perproc(numProc+1);
-  ExportSet *poly[] = {0,0};
-  int polycount[2];
-  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
-  {
-    if (ex_iter->type == MBPOLYGON)
-    {
-      assert(!poly[0]);
-      poly[0] = &*ex_iter;
-    }
-    else if(ex_iter->type == MBPOLYHEDRON)
-    {
-      assert(!poly[1]);
-      poly[1] = &*ex_iter;
-    }
-  }
-  for (int i = 0; i < 2; i++)
-  {
-    ExportSet* ppoly = poly[i];
-    if (!ppoly)
-      continue;
-  
-    int count;
-    rval = writeUtil->get_poly_array_size( ppoly->range.begin(),
-                                           ppoly->range.end(),
-                                           count );
-    assert(MB_SUCCESS == rval);
-    result = MPI_Gather( &count, 1, MPI_INT, &perproc[0], 1, MPI_INT, 0, MPI_COMM_WORLD );
-    assert(MPI_SUCCESS == result);
-    
-    int prev = 0;
-    for (int j = 1; j <= numProc; j++)
-    {
-      int tmp = perproc[j];
-      perproc[j] = prev;
-      prev += tmp;
-    }
-                                           
-    polycount[i] = perproc[numProc];
-    result = MPI_Scatter( &perproc[0], 1, MPI_INT, &count, 1, MPI_INT, 0, MPI_COMM_WORLD );
-    assert(MPI_SUCCESS == result);
-    ppoly->poly_offset = count;
-  }
-  
-    // Create element tables
-  std::vector<long> start_ids(numtypes);
-  if (myRank == 0)
-  {
-    viter = start_ids.begin();
-    long* citer = &counts[numtypes * numProc];
-    for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
-    {
-      switch(ex_iter->type) {
-      case MBPOLYGON:
-        rval = create_poly_tables( MBPOLYGON,
-                                   *citer,
-                                   polycount[0],
-                                   *viter );
-        break;
-      case MBPOLYHEDRON:
-        rval = create_poly_tables( MBPOLYHEDRON,
-                                   *citer,
-                                   polycount[1],
-                                   *viter );
-        break;
-      default:
-        rval = create_elem_tables( ex_iter->type,
-                                   ex_iter->num_nodes,
-                                   *citer,
-                                   *viter );
-      }
-      assert(MB_SUCCESS == rval);
-      ++citer;
-      ++viter;
-    }
-  }
-  
-    // send start IDs to each processor
-  result = MPI_Bcast( &start_ids[0], numtypes, MPI_LONG, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Assign IDs to local elements
-  viter = start_ids.begin();
-  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
-  {
-    ex_iter->first_id = *(viter++);
-    id_t myfirst = (id_t)(ex_iter->first_id + ex_iter->offset);
-    rval = writeUtil->assign_ids( ex_iter->range, idTag, myfirst );
-    assert(MB_SUCCESS == rval);
-  }
-  
-  return MB_SUCCESS;
-}
-  
-MBErrorCode WriteHDF5Parallel::create_adjacency_tables()
-{
-  MBErrorCode rval;
-  mhdf_Status status;
-  int i, j, result;
-#ifdef WRITE_NODE_ADJACENCIES  
-  const int numtypes = exportList.size()+1;
-#else
-  const int numtypes = exportList.size();
-#endif
-  std::vector<long>::iterator viter;
-  std::list<ExportSet>::iterator ex_iter;
-  std::vector<long> local(numtypes), all(numProc * numtypes + numtypes);
-  
-    // Get adjacency counts for local processor
-  viter = local.begin();
-  id_t num_adj;
-#ifdef WRITE_NODE_ADJACENCIES  
-  rval = count_adjacencies( nodeSet.range, num_adj );
-  assert (MB_SUCCESS == rval);
-  *viter = num_adj; ++viter;
-#endif
-
-  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
-  {
-    rval = count_adjacencies( ex_iter->range, num_adj );
-    assert (MB_SUCCESS == rval);
-    *viter = num_adj; ++viter;
-  }
-  
-    // Send local adjacency counts to root processor
-  result = MPI_Gather( &local[0], numtypes, MPI_LONG,
-                       &all[0],   numtypes, MPI_LONG, 
-                       0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Convert counts to offsets
-  for (i = 0; i < numtypes; i++) 
-  {
-    long prev = 0;
-    for (j = 0; j <= numProc; j++)
-    {
-      long tmp = all[j*numtypes + i];
-      all[j*numtypes+i] = prev;
-      prev += tmp;
-    }
-  }
-  
-    // For each element type for which there is no adjacency data,
-    // send -1 to all processors as the offset
-  for (i = 0; i < numtypes; ++i)
-    if (all[numtypes*numProc+i] == 0)
-      for (j = 0; j < numProc; ++j)
-        all[j*numtypes+i] = -1;
-  
-    // Send offsets back to each processor
-  result = MPI_Scatter( &all[0],   numtypes, MPI_LONG,
-                        &local[0], numtypes, MPI_LONG,
-                        0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Record the adjacency offset in each ExportSet
-  viter = local.begin();
-#ifdef WRITE_NODE_ADJACENCIES  
-  nodeSet.adj_offset = *viter; ++viter;
-#endif
-  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
-    { ex_iter->adj_offset = *viter; ++viter; }
-  
-    // Create data tables in file
-  if (myRank == 0)
-  {
-    viter = all.begin() + (numtypes * numProc);
-#ifdef WRITE_NODE_ADJACENCIES  
-    if (*viter) {
-      hid_t handle = mhdf_createAdjacency( filePtr, 
-                                           mhdf_node_type_handle(),
-                                           *viter,
-                                           &status );
-      if (mhdf_isError( &status ))
-      {
-        writeUtil->report_error( "%s\n", mhdf_message( &status ) );
-        return MB_FAILURE;
-      }
-      mhdf_closeData( filePtr, handle, &status );
-    }
-    ++viter;
-#endif
-    for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter, ++viter)
-    {
-      if (!*viter) 
-        continue;
-      
-      hid_t handle = mhdf_createAdjacency( filePtr,
-                                           ex_iter->name(),
-                                           *viter,
-                                           &status );
-      if (mhdf_isError( &status ))
-      {
-        writeUtil->report_error( "%s\n", mhdf_message( &status ) );
-        return MB_FAILURE;
-      }
-      mhdf_closeData( filePtr, handle, &status );
-    }
-  }
-
-  return MB_SUCCESS;
-}
-
-/*
-MBErrorCode WriteHDF5Parallel::get_interface_set_data( RemoteSetData& data,
-                                                       long& offset )
-{
-  const char* PROC_ID_TAG = "HDF5Writer_Rank";
-  MBTag iface_tag, proc_tag;
-  MBErrorCode rval;
-  
-  rval = iFace->tag_get_handle( PARALLEL_INTERFACE_TAG_NAME, iface_tag );
-  if (MB_SUCCESS != rval) return rval;
-  
-  rval = iFace->tag_get_handle( PROC_ID_TAG, proc_tag );
-  if (MB_SUCCESS == rval) 
-    iFace->tag_delete( proc_tag );
-  rval = iFace->tag_create( PROC_ID_TAG, sizeof(int), MB_TAG_DENSE, MB_TYPE_INTEGER, proc_tag, 0 );
-  if (MB_SUCCESS != rval) return rval;
-    
-  MBRange interface_sets, sets;
-  rval = iFace->get_entities_by_type_and_tag( 0, MBENTITYSET, &iface_tag, 0, 1, interface_sets );
-  if (MB_SUCCESS != rval) return rval;
-  
-  std::vector<int> list;
-  for (MBRange::iterator i = interface_sets.begin(); i != interface_sets.end(); ++i)
-  {
-    int proc_ids[2];
-    rval = iFace->tag_get_data( iface_tag, &*i, 1, proc_ids );
-    if (MB_SUCCESS != rval) return rval;
-    
-    sets.clear();
-    rval = iFace->get_entities_by_type( *i, MBENTITYSET, sets );
-    if (MB_SUCCESS != rval) return rval;
-  
-    list.clear();
-    list.resize( sets.size(), proc_ids[0] );
-    rval = iFace->tag_set_data( proc_tag, sets, &list[0] );
-    if (MB_SUCCESS != rval) return rval;
-  }
-  
-  return get_remote_set_data( PROC_ID_TAG, PARALLEL_GLOBAL_ID_TAG_NAME, data, offset );
-}
-*/
-  
-
-struct RemoteSetData {
-  MBTag data_tag, filter_tag;
-  int filter_value;
-  MBRange range;
-  std::vector<int> counts, displs, all_values, local_values;
-};
-
-MBErrorCode WriteHDF5Parallel::get_remote_set_data( 
-                        const WriteHDF5Parallel::MultiProcSetTags::Data& tags,
-                        RemoteSetData& data, long& offset )
-{
-  MBErrorCode rval;
-  int i, result;
-  MBRange::iterator riter;
-    
-  rval = iFace->tag_get_handle( tags.filterTag.c_str(), data.filter_tag );
-  if (rval != MB_SUCCESS) return rval;
-  if (tags.useFilterValue) 
-  {
-    i = 0;
-    iFace->tag_get_size( data.filter_tag, i );
-    if (i != sizeof(int)) {
-      fprintf(stderr, "Cannot use non-int tag data for filtering remote sets.\n" );
-      assert(0);
-      return MB_FAILURE;
-    }  
-    data.filter_value = tags.filterValue;
-  }
-  else
-  {
-    data.filter_value = 0;
-  }
-  
-  rval = iFace->tag_get_handle( tags.dataTag.c_str(), data.data_tag );
-  if (rval != MB_SUCCESS) return rval;
-  i = 0;
-  iFace->tag_get_size( data.data_tag, i );
-  if (i != sizeof(int)) {
-    fprintf(stderr, "Cannot use non-int tag data for matching remote sets.\n" );
-    assert(0);
-    return MB_FAILURE;
-  }  
-    
-
-  printdebug("Negotiating multi-proc meshsets for tag: \"%s\"\n", tags.filterTag.c_str());
-
-    // Get sets with tag, or leave range empty if the tag
-    // isn't defined on this processor.
-  if (rval != MB_TAG_NOT_FOUND)
-  {
-    MBTag handles[] = { data.filter_tag, data.data_tag };
-    const void* values[] = { tags.useFilterValue ? &tags.filterValue : 0, 0 };
-    rval = iFace->get_entities_by_type_and_tag( 0, 
-                                                MBENTITYSET, 
-                                                handles,
-                                                values,
-                                                2,
-                                                data.range );
-    if (rval != MB_SUCCESS) return rval;
-    data.range = data.range.intersect( setSet.range );
-    range_remove( setSet.range, data.range );
-  }
-  
-  printdebug("Found %d meshsets with \"%s\" tag.\n", data.range.size(), tags.filterTag.c_str() );
-
-    // Exchange number of sets with tag between all processors
-  data.counts.resize(numProc);
-  int count = data.range.size();
-  result = MPI_Allgather( &count,          1, MPI_INT, 
-                          &data.counts[0], 1, MPI_INT,
-                          MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-
-    // Exchange tag values for sets between all processors
-  data.displs.resize(numProc+1);
-  data.displs[0] = 0;
-  for (i = 1; i <= numProc; i++)
-    data.displs[i] = data.displs[i-1] + data.counts[i-1];
-  int total = data.displs[numProc];
-  data.all_values.resize(total);
-  data.local_values.resize(count);
-  rval = iFace->tag_get_data( data.data_tag, data.range, &data.local_values[0] );
-  assert( MB_SUCCESS == rval );
-  result = MPI_Allgatherv( &data.local_values[0], count, MPI_INT,
-                           &data.all_values[0], &data.counts[0], &data.displs[0], MPI_INT,
-                           MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-
-
-    // Remove from the list any sets that are unique to one processor
-  std::vector<int> sorted( data.all_values );
-  std::sort( sorted.begin(), sorted.end() );
-  int r = 0, w = 0;
-  for (i = 0; i < numProc; ++i)
-  {
-    const int start = w;
-    for (int j = 0; j < data.counts[i]; ++j)
-    {
-      std::vector<int>::iterator p 
-        = std::lower_bound( sorted.begin(), sorted.end(), data.all_values[r] );
-      ++p;
-      if (p != sorted.end() && *p == data.all_values[r])
-      {
-        data.all_values[w] = data.all_values[r];
-        ++w;
-      }
-      ++r;
-    }
-    data.counts[i] = w - start;
-  }
-  total = w;
-  data.all_values.resize( total );
-  r = w = 0;
-  for (i = 0; i < count; ++i)
-  {
-    std::vector<int>::iterator p 
-      = std::lower_bound( sorted.begin(), sorted.end(), data.local_values[r] );
-    ++p;
-    if (p != sorted.end() && *p == data.local_values[r])
-    {
-      data.local_values[w] = data.local_values[r];
-      ++w;
-    }
-    else
-    {
-      riter = data.range.begin();
-      riter += w;
-      setSet.range.insert( *riter );
-      data.range.erase( riter );
-    }
-    ++r;
-  }
-  count = data.range.size();
-  assert( count == data.counts[myRank] );
-  assert( count == w );
-  data.local_values.resize( count );
-  sorted.clear(); // release storage
-    // recalculate displacements
-  data.displs[0] = 0;
-  for (i = 1; i <= numProc; i++)
-    data.displs[i] = data.displs[i-1] + data.counts[i-1];
-  
-    // Find sets that span multiple processors and update appropriately.
-    // The first processor (sorted by MPI rank) that contains a given set
-    // will be responsible for writing the set description.  All multi-
-    // processor sets will be written at the beginning of the set tables.
-    // Processors will write set contents/children for a given set in
-    // the order of their MPI rank.
-    //
-    // Identify which meshsets will be managed by this processor and
-    // the corresponding offset in the set description table. 
-  std::map<int,int> val_id_map;
-  int cpu = 0;
-  for (i = 0; i < total; ++i)
-  {
-    if (data.displs[cpu+1] == i)
-      ++cpu;
-
-    int id = 0;
-    std::map<int,int>::iterator p = val_id_map.find( data.all_values[i] );
-    if (p == val_id_map.end())
-    {
-      id = (int)++offset;
-      val_id_map[data.all_values[i]] = id;
-      //const unsigned int values_offset = (unsigned)i - (unsigned)data.displs[myRank];
-      //if (values_offset < (unsigned)count)
-      //{
-      //  riter = data.range.begin();
-      //  riter += values_offset;
-      //  myParallelSets.insert( *riter );
-      //}
-    }
-    std::vector<int>::iterator loc 
-      = std::find( data.local_values.begin(), data.local_values.end(), data.all_values[i] );
-    if (loc != data.local_values.end()) 
-    {
-      riter = data.range.begin();
-      riter += loc - data.local_values.begin();
-      cpuParallelSets[cpu].insert( *riter );
-    }
-  }
-  riter = data.range.begin();
-  for (i = 0; i < count; ++i, ++riter)
-  {
-    std::map<int,int>::iterator p = val_id_map.find( data.local_values[i] );
-    assert( p != val_id_map.end() );
-    int id = p->second;
-    rval = iFace->tag_set_data( idTag, &*riter, 1, &id );
-    assert(MB_SUCCESS == rval);
-  }
-  
-  return MB_SUCCESS;
-}
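
// Illustrative aside, not part of this file or commit: the filtering loop
// above keeps a tag value only if it occurs at least twice in 'sorted',
// the sorted list of values gathered from all processors, i.e. (assuming
// each processor contributes a given value at most once) only if the set
// is shared by more than one processor.  The test reads:
//
//   std::vector<int>::iterator p =
//       std::lower_bound( sorted.begin(), sorted.end(), value );
//   ++p;                       // skip the first occurrence of 'value'
//   bool shared = (p != sorted.end() && *p == value);
//
// lower_bound points at the first occurrence of 'value', so the element
// just after it equals 'value' exactly when 'value' appears more than once.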
-
-
-MBErrorCode WriteHDF5Parallel::create_meshset_tables()
-{
-  MBErrorCode rval;
-  int result, i;
-  long total_offset = 0;
-  MBRange::const_iterator riter;
-
-  START_SERIAL;
-  print_type_sets( iFace, myRank, numProc, setSet.range );
-  END_SERIAL;
-
-    // Gather data about multi-processor meshsets - removes sets from setSet.range
-  cpuParallelSets.resize( numProc );
-  std::vector<RemoteSetData> remote_set_data( multiProcSetTags.list.size() );
-  for (i = 0; i< (int)multiProcSetTags.list.size(); i++)
-  {
-    rval = get_remote_set_data( multiProcSetTags.list[i],
-                                remote_set_data[i],
-                                total_offset ); assert(MB_SUCCESS == rval);
-  }
-  //rval = get_interface_set_data( remote_set_data[i], total_offset );
-  if (MB_SUCCESS != rval) return rval;
-
-  START_SERIAL;
-  printdebug("myLocalSets\n");
-  print_type_sets( iFace, myRank, numProc, setSet.range );
-  END_SERIAL;
-
-    // Gather counts of non-shared sets from each proc
-    // to determine total table size.
-  std::vector<long> set_offsets(numProc + 1);
-  long local_count = setSet.range.size();
-  result = MPI_Gather( &local_count,    1, MPI_LONG,
-                       &set_offsets[0], 1, MPI_LONG,
-                       0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  for (i = 0; i <= numProc; i++)
-  {
-    long tmp = set_offsets[i];
-    set_offsets[i] = total_offset;
-    total_offset += tmp;
-  }
-  
-    // Send each proc its offsets in the set description table.
-  long sets_offset;
-  result = MPI_Scatter( &set_offsets[0], 1, MPI_LONG,
-                        &sets_offset,    1, MPI_LONG, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  setSet.offset = (id_t)(sets_offset);
-
-    // Create the set description table
-  long total_count_and_start_id[2] = { set_offsets[numProc], 0 };
-  if (myRank == 0 && total_count_and_start_id[0] > 0)
-  {
-    rval = create_set_meta( (id_t)total_count_and_start_id[0], total_count_and_start_id[1] );
-    assert (MB_SUCCESS == rval);
-  }
-  
-    // Send totals to all procs.
-  result = MPI_Bcast( total_count_and_start_id, 2, MPI_LONG, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  setSet.first_id = total_count_and_start_id[1];
-  writeSets = total_count_and_start_id[0] > 0;
-
-  START_SERIAL;  
-  printdebug("Non-shared sets: %ld local, %ld global, offset = %ld, first_id = %ld\n",
-    local_count, total_count_and_start_id[0], sets_offset, total_count_and_start_id[1] );
-  printdebug("my Parallel Sets:\n");
-  print_type_sets(iFace, myRank, numProc, cpuParallelSets[myRank] );
-  END_SERIAL;
-  
-    // Not writing any sets??
-  if (!writeSets)
-    return MB_SUCCESS;
-  
-    // Assign set IDs
-  writeUtil->assign_ids( setSet.range, idTag, (id_t)(setSet.first_id + setSet.offset) );
-  for (i = 0; i < (int)remote_set_data.size(); ++i)
-    fix_remote_set_ids( remote_set_data[i], setSet.first_id );
-  
-    // Communicate sizes for remote sets
-  long data_offsets[3] = { 0, 0, 0 };
-  for (i = 0; i < (int)remote_set_data.size(); ++i)
-  {
-    rval = negotiate_remote_set_contents( remote_set_data[i], data_offsets ); 
-    assert(MB_SUCCESS == rval);
-  }
-  remote_set_data.clear();
-  
-    // Exchange IDs for remote/adjacent sets not shared between procs
-  //rval = communicate_remote_ids( MBENTITYSET ); assert(MB_SUCCESS == rval);
-  
-    // Communicate counts for local sets
-  long data_counts[3];
-  rval = count_set_size( setSet.range, rangeSets, data_counts[0], data_counts[1], data_counts[2] );
-  if (MB_SUCCESS != rval) return rval;
-  std::vector<long> set_counts(3*numProc);
-  result = MPI_Gather( data_counts,    3, MPI_LONG,
-                       &set_counts[0], 3, MPI_LONG,
-                       0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  for (i = 0; i < 3*numProc; ++i)
-  {
-    long tmp = set_counts[i];
-    set_counts[i] = data_offsets[i%3];
-    data_offsets[i%3] += tmp;
-  }
-  long all_counts[] = {data_offsets[0], data_offsets[1], data_offsets[2]};
-  result = MPI_Scatter( &set_counts[0], 3, MPI_LONG,
-                        data_offsets,   3, MPI_LONG,
-                        0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  setContentsOffset = data_offsets[0];
-  setChildrenOffset = data_offsets[1];
-  setParentsOffset = data_offsets[2];
-  
-    // Create set contents and set children tables
-  if (myRank == 0)
-  {
-    rval = create_set_tables( all_counts[0], all_counts[1], all_counts[2] );
-    if (MB_SUCCESS != rval) return rval;
-  }
-  
-    // Send totals to all processors
-  result = MPI_Bcast( all_counts, 3, MPI_LONG, 0, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  writeSetContents = all_counts[0] > 0;
-  writeSetChildren = all_counts[1] > 0;
-  writeSetParents  = all_counts[2] > 0;
-
-  START_SERIAL;  
-  printdebug("Non-shared set contents: %ld local, %ld global, offset = %ld\n",
-    data_counts[0], all_counts[0], data_offsets[0] );
-  printdebug("Non-shared set children: %ld local, %ld global, offset = %ld\n",
-    data_counts[1], all_counts[1], data_offsets[1] );
-  printdebug("Non-shared set parents: %ld local, %ld global, offset = %ld\n",
-    data_counts[2], all_counts[2], data_offsets[2] );
-  END_SERIAL;
-  
-  return MB_SUCCESS;
-}
-
-void WriteHDF5Parallel::remove_remote_entities( MBEntityHandle relative,
-                                                MBRange& range )
-{
-  MBRange result;
-  result.merge( range.intersect( nodeSet.range ) );
-  result.merge( range.intersect( setSet.range ) );  
-  for (std::list<ExportSet>::iterator eiter = exportList.begin();
-           eiter != exportList.end(); ++eiter )
-  {
-    result.merge( range.intersect( eiter->range ) );
-  }
-  //result.merge( range.intersect( myParallelSets ) );
-  MBRange sets;
-  int junk;
-  sets.merge( MBRange::lower_bound( range.begin(), range.end(), CREATE_HANDLE(MBENTITYSET, 0, junk )), range.end() );
-  remove_remote_sets( relative, sets );
-  result.merge( sets );
-  range.swap(result);
-}
-
-void WriteHDF5Parallel::remove_remote_sets( MBEntityHandle relative, 
-                                            MBRange& range )
-{
-  MBRange result( range.intersect( setSet.range ) );
-  //result.merge( range.intersect( myParallelSets ) );
-  MBRange remaining( range.subtract( result ) );
-  
-  for(MBRange::iterator i = remaining.begin(); i != remaining.end(); ++i)
-  {
-      // Look for the first CPU which knows about both sets.
-    int cpu;
-    for (cpu = 0; cpu < numProc; ++cpu)
-      if (cpuParallelSets[cpu].find(relative) != cpuParallelSets[cpu].end() &&
-          cpuParallelSets[cpu].find(*i) != cpuParallelSets[cpu].end())
-        break;
-      // If we didn't find one, it may indicate a bug.  However,
-      // it could also indicate that it is a link to some set that
-      // exists on this processor but is not being written, because
-      // the caller requested that some subset of the mesh be written.
-    //assert(cpu < numProc);
-      // If I'm the first processor that knows about both, I'll handle it.
-    if (cpu == myRank)
-      result.insert( *i );
-  }
-  
-  range.swap( result );
-}
-  
-  
-
-void WriteHDF5Parallel::remove_remote_entities( MBEntityHandle relative,
-                                                std::vector<MBEntityHandle>& vect )
-{
-  MBRange intrsct;
-  for (std::vector<MBEntityHandle>::const_iterator iter = vect.begin();
-       iter != vect.end(); ++iter)
-    intrsct.insert(*iter);
-  remove_remote_entities( relative, intrsct );
-  
-  unsigned int read, write;
-  for (read = write = 0; read < vect.size(); ++read)
-  {
-    if (intrsct.find(vect[read]) != intrsct.end())
-    {
-      if (read != write)
-        vect[write] = vect[read];
-      ++write;
-    }
-  }
-  if (write != vect.size())
-    vect.resize(write);
-}
-
-  
-
-void WriteHDF5Parallel::remove_remote_sets( MBEntityHandle relative,
-                                            std::vector<MBEntityHandle>& vect )
-{
-  MBRange intrsct;
-  for (std::vector<MBEntityHandle>::const_iterator iter = vect.begin();
-       iter != vect.end(); ++iter)
-    intrsct.insert(*iter);
-  remove_remote_sets( relative, intrsct );
-  
-  unsigned int read, write;
-  for (read = write = 0; read < vect.size(); ++read)
-  {
-    if (intrsct.find(vect[read]) != intrsct.end())
-    {
-      if (read != write)
-        vect[write] = vect[read];
-      ++write;
-    }
-  }
-  if (write != vect.size())
-    vect.resize(write);
-}
-
-// Given a RemoteSetData object describing the set information for a 
-// single tag (or tag pair), populate the list of parallel sets
-// (this->parallelSets) with the per-entityset data.
-MBErrorCode WriteHDF5Parallel::negotiate_remote_set_contents( RemoteSetData& data,
-                                                              long* offsets /* long[3] */ )
-{
-  unsigned i;
-  MBErrorCode rval;
-  MBRange::const_iterator riter;
-  int result;
-  const unsigned count = data.range.size();
-  const unsigned total = data.all_values.size();
-  std::vector<int>::iterator viter, viter2;
-
-    // Calculate counts for each meshset
-  std::vector<long> local_sizes(3*count);
-  std::vector<long>::iterator sizes_iter = local_sizes.begin();
-  MBRange tmp_range;
-  std::vector<MBEntityHandle> child_list;
-  for (riter = data.range.begin(); riter != data.range.end(); ++riter)
-  {
-      // Count contents
-    *sizes_iter = 0;
-    tmp_range.clear();
-    rval = iFace->get_entities_by_handle( *riter, tmp_range );
-    remove_remote_entities( *riter, tmp_range );
-    assert (MB_SUCCESS == rval);
-    for (MBRange::iterator iter = tmp_range.begin(); iter != tmp_range.end(); ++iter)
-    {
-      int id = 0;
-      rval = iFace->tag_get_data( idTag, &*iter, 1, &id );
-      if (rval != MB_TAG_NOT_FOUND && rval != MB_SUCCESS)
-        { assert(0); return MB_FAILURE; }
-      if (id > 0)
-        ++*sizes_iter;
-    }
-    ++sizes_iter;
-    
-      // Count children
-    *sizes_iter = 0;
-    child_list.clear();
-    rval = iFace->get_child_meshsets( *riter, child_list );
-    remove_remote_sets( *riter, child_list );
-    assert (MB_SUCCESS == rval);
-    for (std::vector<MBEntityHandle>::iterator iter = child_list.begin();
-         iter != child_list.end(); ++iter)
-    {
-      int id = 0;
-      rval = iFace->tag_get_data( idTag, &*iter, 1, &id );
-      if (rval != MB_TAG_NOT_FOUND && rval != MB_SUCCESS)
-        { assert(0); return MB_FAILURE; }
-      if (id > 0)
-        ++*sizes_iter;
-    }
-    ++sizes_iter;
-    
-      // Count parents
-    *sizes_iter = 0;
-    child_list.clear();
-    rval = iFace->get_parent_meshsets( *riter, child_list );
-    remove_remote_sets( *riter, child_list );
-    assert (MB_SUCCESS == rval);
-    for (std::vector<MBEntityHandle>::iterator iter = child_list.begin();
-         iter != child_list.end(); ++iter)
-    {
-      int id = 0;
-      rval = iFace->tag_get_data( idTag, &*iter, 1, &id );
-      if (rval != MB_TAG_NOT_FOUND && rval != MB_SUCCESS)
-        { assert(0); return MB_FAILURE; }
-      if (id > 0)
-        ++*sizes_iter;
-    }
-    ++sizes_iter;
-  }
-  
-    // Exchange sizes for sets between all processors.
-  std::vector<long> all_sizes(3*total);
-  std::vector<int> counts(numProc), displs(numProc);
-  for (i = 0; i < (unsigned)numProc; i++)
-    counts[i] = 3 * data.counts[i];
-  displs[0] = 0;
-  for (i = 1; i < (unsigned)numProc; i++)
-    displs[i] = displs[i-1] + counts[i-1];
-  result = MPI_Allgatherv( &local_sizes[0], 3*count, MPI_LONG,
-                           &all_sizes[0], &counts[0], &displs[0], MPI_LONG,
-                           MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-
-  
-    // Update information in-place in the array from the Allgatherv.
-    
-    // Change the corresponding sizes for the first instance of a tag
-    // value such that it ends up being the total size of the set.
-    // Change the size to -1 for the later instances of a tag value.
-    //
-    // For the sets that this processor has, update the offsets at
-    // which the set data is to be written.  Store the offset of the data
-    // on this processor for the set *relative* to the start of the
-    // data of *the set*.
-  std::vector<long> local_offsets(3*count);
-  std::map<int,int> tagsort;  // Map of {tag value, index of first set w/ value}
-  for (i = 0; i < total; ++i)
-  {
-    const std::map<int,int>::iterator p = tagsort.find( data.all_values[i] );
-    const unsigned r = (unsigned)(i - data.displs[myRank]);  // offset in "local" array
-    
-      // If this is the first instance of this tag value, 
-      // then the processor with this instance is responsible
-      // for writing the tag description
-    if ( p == tagsort.end() )  
-    {
-      tagsort[data.all_values[i]] = i;
-        // If within the range for this processor, save offsets
-      if (r < (unsigned)count) 
-      {
-        local_offsets[3*r] = local_offsets[3*r+1] = local_offsets[3*r+2] = 0;
-      }
-    }
-      // Otherwise update the total size in the table
-      // for the processor that is responsible for writing
-      // the data and mark the data for the current processor
-      // with a -1.
-    else 
-    {
-        // If within the range for this processor, save offsets
-      int j = p->second;
-      if (r < (unsigned)count) 
-      {
-          // the offset for this processor, from the start of the data
-          // for this group of sets, is the current total count for the
-          // group of sets.
-        local_offsets[3*r  ] = all_sizes[3*j  ];  // contents
-        local_offsets[3*r+1] = all_sizes[3*j+1];  // children
-        local_offsets[3*r+2] = all_sizes[3*j+2];  // parents
-      }
-      
-        // update the total count for the set in the first position in
-        // all_sizes at which the set occurs (the one corresponding to
-        // the processor that owns the set.)
-      all_sizes[3*j  ] += all_sizes[3*i  ]; // contents
-      all_sizes[3*j+1] += all_sizes[3*i+1]; // children
-      all_sizes[3*j+2] += all_sizes[3*i+2]; // parents
-        // set the size to -1 in the positions corresponding to non-owning processor
-      all_sizes[3*i  ] = all_sizes[3*i+1] = all_sizes[3*i+2] = -1;
-    }
-  }  
-    
-  
-    // Store the total size of each set (rather than the
-    // number of entities local to this processor) in the
-    // local_sizes array for each meshset.  Only need this
-    // for the sets this processor is writing the description
-    // for, but it's easier to get it for all of them.
-  sizes_iter = local_sizes.begin();
-  viter = data.local_values.begin();
-  for (riter = data.range.begin(); riter != data.range.end(); ++riter, ++viter)
-  {
-    const std::map<int,int>::iterator p = tagsort.find( *viter ); 
-    assert( p != tagsort.end() );
-    int j = 3 * p->second;
-    *sizes_iter = all_sizes[j  ]; ++sizes_iter;  // contents
-    *sizes_iter = all_sizes[j+1]; ++sizes_iter;  // children
-    *sizes_iter = all_sizes[j+2]; ++sizes_iter;  // parents
-  }
-  
-    // Now calculate the offset of the data for each (entire, parallel) set in
-    // the set contents, children and parents tables.  offsets is long[3], and
-    // is both input and output of this function.  We increment offsets by the
-    // total count (over all processors) for each set such that it contains
-    // the next open row in the table.  This will be passed back into this
-    // function for the next tag (or tag pair) such that ultimately it will
-    // contain the beginning of the non-shared set data in each of the three tables.
-    // all_sizes is re-used to store the global offset in each table for each 
-    // set with the tag.
-  for (i = 0; i < all_sizes.size(); ++i)
-  {
-    if (all_sizes[i] >= 0) // value is -1 (from above) if not this processor
-    {
-      int j = i % 3;              // contents, children or parents list ?
-      long tmp = offsets[j];      // save current, running offset
-      offsets[j] += all_sizes[i]; // next set's offset is current plus the size of this set
-      all_sizes[i] = tmp;         // size of this set is running offset.
-    }
-  }
-  
-    // Local offsets for this processor are stored as values relative to the
-    // start of each set's data.  Convert them to offsets relative to the
-    // start of all the set data.  Add the offset *from* the start of the set
-    // data (local_offsets) to the offset *of* the start of the set data 
-    // (stored in all_sizes in the previous loop) 
-  std::vector<long>::iterator offset_iter = local_offsets.begin();
-  viter = data.local_values.begin();
-  for (riter = data.range.begin(); riter != data.range.end(); ++riter, ++viter)
-  {
-    const std::map<int,int>::iterator p = tagsort.find( *viter );
-    assert( p != tagsort.end() );
-    int j = 3 * p->second;
-    *offset_iter += all_sizes[j  ]; ++offset_iter; // contents
-    *offset_iter += all_sizes[j+1]; ++offset_iter; // children
-    *offset_iter += all_sizes[j+2]; ++offset_iter; // parents
-  }
-
-#ifdef DEBUG  
-START_SERIAL; if (counts[myRank]) {
-std::string name1, name2;
-iFace->tag_get_name( data.data_tag, name1 );
-iFace->tag_get_name( data.filter_tag, name2 );
-printdebug("Remote set data\n" );
-printdebug("    %13s %13s owner local_offsets total_counts\n", name1.c_str(), name2.c_str());
-for (unsigned d = 0; d < (unsigned)counts[myRank]; ++d) {
-switch(d%3) {
-  case 0: // data/contents
-printdebug("   %13d %13d %5s %13d %12d\n", data.all_values[(d+displs[myRank])/3], 
- data.filter_value, 
- all_sizes[d+displs[myRank]] < 0 ? "no" : "yes", 
- local_offsets[d], local_sizes[d] );
-  break;
-  case 1: // children
-printdebug("                          (children) %13d %12d\n", local_offsets[d], local_sizes[d] );
-  break;
-  case 2: // parents
-printdebug("                           (parents) %13d %12d\n", local_offsets[d], local_sizes[d] );
-  break;
-} 
-}
-} 
-END_SERIAL;
-#endif
-  
-    // Store each parallel meshset in the list
-  sizes_iter = local_sizes.begin();
-  offset_iter = local_offsets.begin();
-  std::vector<long>::iterator all_iter = all_sizes.begin() + displs[myRank];
-  for (riter = data.range.begin(); riter != data.range.end(); ++riter)
-  {
-    ParallelSet info;
-    info.handle = *riter;
-    info.contentsOffset = *offset_iter; ++offset_iter;
-    info.childrenOffset = *offset_iter; ++offset_iter;
-    info.parentsOffset = *offset_iter; ++offset_iter;
-    info.contentsCount = *sizes_iter; ++sizes_iter;
-    info.childrenCount = *sizes_iter; ++sizes_iter;
-    info.parentsCount = *sizes_iter; ++sizes_iter;
-    info.description = *all_iter >= 0; all_iter += 3;
-    parallelSets.push_back( info );
-  }
-  
-  return MB_SUCCESS;
-}
-
-MBErrorCode WriteHDF5Parallel::fix_remote_set_ids( RemoteSetData& data, long first_id )
-{
-  const id_t id_diff = (id_t)(first_id - 1);
-  id_t file_id;
-  MBErrorCode rval;
-
-  for (MBRange::iterator iter = data.range.begin(); iter != data.range.end(); ++iter)
-  {
-    rval = iFace->tag_get_data( idTag, &*iter, 1, &file_id );
-    assert( MB_SUCCESS == rval );
-    file_id += id_diff;
-    rval = iFace->tag_set_data( idTag, &*iter, 1, &file_id );
-    assert( MB_SUCCESS == rval );
-  }
-  
-  return MB_SUCCESS;
-}   
-
-
-MBErrorCode WriteHDF5Parallel::write_shared_set_descriptions( hid_t table )
-{
-  const id_t start_id = setSet.first_id;
-  MBErrorCode rval;
-  mhdf_Status status;
-  
-  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
-        iter != parallelSets.end(); ++iter)
-  {
-    if (!iter->description)
-      continue;  // handled by a different processor
-    
-      // Get offset in table at which to write data
-    int file_id;
-    rval = iFace->tag_get_data( idTag, &(iter->handle), 1, &file_id );
-    file_id -= start_id;
-    
-      // Get flag data
-    unsigned int flags;
-    rval = iFace->get_meshset_options( iter->handle, flags );
-    assert( MB_SUCCESS == rval );
-      
-      // Write the data
-    long data[4] = { iter->contentsOffset + iter->contentsCount - 1, 
-                     iter->childrenOffset + iter->childrenCount - 1, 
-                     iter->parentsOffset  + iter->parentsCount  - 1,
-                     flags };
-    mhdf_writeSetMeta( table, file_id, 1, H5T_NATIVE_LONG, data, &status );
-    if (mhdf_isError(&status))
-      printdebug("Meshset %d : %s\n", ID_FROM_HANDLE(iter->handle), mhdf_message(&status));
-    assert( !mhdf_isError( &status ) );
-  }
-
-  return MB_SUCCESS;
-}
-    
-
-MBErrorCode WriteHDF5Parallel::write_shared_set_contents( hid_t table )
-{
-  MBErrorCode rval;
-  mhdf_Status status;
-  std::vector<MBEntityHandle> handle_list;
-  std::vector<id_t> id_list;
-  
-  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
-        iter != parallelSets.end(); ++iter)
-  {
-    handle_list.clear();
-    rval = iFace->get_entities_by_handle( iter->handle, handle_list );
-    assert( MB_SUCCESS == rval );
-    remove_remote_entities( iter->handle, handle_list );
-    
-    id_list.clear();
-    for (unsigned int i = 0; i < handle_list.size(); ++i)
-    {
-      int id;
-      rval = iFace->tag_get_data( idTag, &handle_list[i], 1, &id );
-      assert( MB_SUCCESS == rval );
-      if (id > 0)
-        id_list.push_back(id);
-    }
-    
-    if (id_list.empty())
-      continue;
-    
-    mhdf_writeSetData( table, 
-                       iter->contentsOffset, 
-                       id_list.size(),
-                       id_type,
-                       &id_list[0],
-                       &status );
-    assert(!mhdf_isError(&status));
-  }
-  
-  return MB_SUCCESS;
-}
-    
-
-MBErrorCode WriteHDF5Parallel::write_shared_set_children( hid_t table )
-{
-  MBErrorCode rval;
-  mhdf_Status status;
-  std::vector<MBEntityHandle> handle_list;
-  std::vector<id_t> id_list;
-  
-  printdebug("Writing %d parallel sets.\n", parallelSets.size());
-  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
-        iter != parallelSets.end(); ++iter)
-  {
-    handle_list.clear();
-    rval = iFace->get_child_meshsets( iter->handle, handle_list );
-    assert( MB_SUCCESS == rval );
-    remove_remote_sets( iter->handle, handle_list );
-    
-    id_list.clear();
-    for (unsigned int i = 0; i < handle_list.size(); ++i)
-    {
-      int id;
-      rval = iFace->tag_get_data( idTag, &handle_list[i], 1, &id );
-      assert( MB_SUCCESS == rval );
-      if (id > 0)
-        id_list.push_back(id);
-    }
-    
-    if (!id_list.empty())
-    {
-      mhdf_writeSetParentsChildren( table, 
-                                    iter->childrenOffset, 
-                                    id_list.size(),
-                                    id_type,
-                                    &id_list[0],
-                                    &status );
-      assert(!mhdf_isError(&status));
-    }
-  }
-
-  return MB_SUCCESS;
-}
-    
-
-MBErrorCode WriteHDF5Parallel::write_shared_set_parents( hid_t table )
-{
-  MBErrorCode rval;
-  mhdf_Status status;
-  std::vector<MBEntityHandle> handle_list;
-  std::vector<id_t> id_list;
-  
-  printdebug("Writing %d parallel sets.\n", parallelSets.size());
-  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
-        iter != parallelSets.end(); ++iter)
-  {
-    handle_list.clear();
-    rval = iFace->get_parent_meshsets( iter->handle, handle_list );
-    assert( MB_SUCCESS == rval );
-    remove_remote_sets( iter->handle, handle_list );
-    
-    id_list.clear();
-    for (unsigned int i = 0; i < handle_list.size(); ++i)
-    {
-      int id;
-      rval = iFace->tag_get_data( idTag, &handle_list[i], 1, &id );
-      assert( MB_SUCCESS == rval );
-      if (id > 0)
-        id_list.push_back(id);
-    }
-    
-    if (!id_list.empty())
-    {
-      mhdf_writeSetParentsChildren( table, 
-                                    iter->parentsOffset, 
-                                    id_list.size(),
-                                    id_type,
-                                    &id_list[0],
-                                    &status );
-      assert(!mhdf_isError(&status));
-    }
-  }
-
-  return MB_SUCCESS;
-}
-
-MBErrorCode WriteHDF5Parallel::write_finished()
-{
-  parallelSets.clear();
-  cpuParallelSets.clear();
-  //myParallelSets.clear();
-  return WriteHDF5::write_finished();
-}
-
-
-class TagNameCompare {
-  MBInterface* iFace;
-  std::string name1, name2;
-public:
-  TagNameCompare( MBInterface* iface ) : iFace(iface) {}
-  bool operator() (const WriteHDF5::SparseTag& t1, 
-                   const WriteHDF5::SparseTag& t2);
-};
-bool TagNameCompare::operator() (const WriteHDF5::SparseTag& t1, 
-                                 const WriteHDF5::SparseTag& t2)
-{
-  MBErrorCode rval;
-  rval = iFace->tag_get_name( t1.tag_id, name1 );
-  rval = iFace->tag_get_name( t2.tag_id, name2 );
-  return name1 < name2;
-}  
-
-void WriteHDF5Parallel::sort_tags_by_name( )
-{
-  tagList.sort( TagNameCompare( iFace ) );
-}
-
-
-MBErrorCode WriteHDF5Parallel::communicate_remote_ids( MBEntityType type )
-{
-  int result;
-  MBErrorCode rval;
-
-    // Get the export set for the specified type
-  ExportSet* export_set = 0;
-  if (type == MBVERTEX)
-    export_set = &nodeSet;
-  else if(type == MBENTITYSET)
-    export_set = &setSet;
-  else
-  {
-    for (std::list<ExportSet>::iterator esiter = exportList.begin();
-         esiter != exportList.end(); ++esiter)
-      if (esiter->type == type)
-      {
-        export_set = &*esiter;
-        break;
-      }
-  }
-  assert(export_set != NULL);
-  
-    // Get the ranges in the set
-  std::vector<unsigned long> myranges;
-  MBRange::const_pair_iterator p_iter = export_set->range.const_pair_begin();
-  const MBRange::const_pair_iterator p_end = export_set->range.const_pair_end();
-  for ( ; p_iter != p_end; ++p_iter)
-  {
-    myranges.push_back( (*p_iter).first );
-    myranges.push_back( (*p_iter).second );
-  }
-
-  START_SERIAL;
-  printdebug("%s ranges to communicate:\n", MBCN::EntityTypeName(type));
-  for (unsigned int xx = 0; xx != myranges.size(); xx+=2)
-    printdebug("  %lu - %lu\n", myranges[xx], myranges[xx+1] );
-  END_SERIAL;
-  
-    // Communicate the number of ranges and the start_id for
-    // each processor.
-  std::vector<int> counts(numProc), offsets(numProc), displs(numProc);
-  int mycount = myranges.size();
-  int mystart = export_set->first_id + export_set->offset;
-  result = MPI_Allgather( &mycount, 1, MPI_INT, &counts[0], 1, MPI_INT, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  result = MPI_Allgather( &mystart, 1, MPI_INT, &offsets[0], 1, MPI_INT, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-    // Communicate the ranges 
-  displs[0] = 0;
-  for (int i = 1; i < numProc; ++i)
-    displs[i] = displs[i-1] + counts[i-1];
-  std::vector<unsigned long> allranges( displs[numProc-1] + counts[numProc-1] );
-  result = MPI_Allgatherv( &myranges[0], myranges.size(), MPI_UNSIGNED_LONG,
-                           &allranges[0], &counts[0], &displs[0],
-                           MPI_UNSIGNED_LONG, MPI_COMM_WORLD );
-  assert(MPI_SUCCESS == result);
-  
-  MBTag global_id_tag;
-  rval = iFace->tag_get_handle( PARALLEL_GLOBAL_ID_TAG_NAME, global_id_tag );
-  assert(MB_SUCCESS == rval);
-  
-    // Set file IDs for each communicated entity
-    
-    // For each processor
-  for (int proc = 0; proc < numProc; ++proc)
-  {
-    if (proc == myRank)
-      continue;
-    
-      // Get data for corresponding processor
-    const int offset = offsets[proc];
-    const int count = counts[proc];
-    const unsigned long* const ranges = &allranges[displs[proc]];
-    
-      // For each geometry meshset in the interface
-    MBRange::iterator r_iter = MBRange::lower_bound( remoteMesh[proc].begin(),
-                                                     remoteMesh[proc].end(),
-                                                     CREATE_HANDLE(type,0,result) );
-    MBRange::iterator r_stop = MBRange::lower_bound( r_iter,
-                                                     remoteMesh[proc].end(),
-                                                     CREATE_HANDLE(type+1,0,result) );
-    for ( ; r_iter != r_stop; ++r_iter)
-    {
-      MBEntityHandle entity = *r_iter;
-
-        // Get handle on other processor
-      MBEntityHandle global;
-      rval = iFace->tag_get_data( global_id_tag, &entity, 1, &global );
-      assert(MB_SUCCESS == rval);
-
-        // Find corresponding fileid on other processor.
-        // This could potentially be n**2, but we will assume that
-        // the range list from each processor is short (typically 1).
-      int j, steps = 0;
-      unsigned long low, high;
-      for (j = 0; j < count; j += 2)
-      {
-        low = ranges[j];
-        high = ranges[j+1];
-        if (low <= global && high >= global)
-          break;
-        steps += (high - low) + 1;
-      }
-      if (j >= count) {
-      printdebug("*** handle = %u, type = %d, id = %d, proc = %d\n",
-      (unsigned)global, (int)(iFace->type_from_handle(global)), (int)(iFace->id_from_handle(global)), proc);
-      for (int ii = 0; ii < count; ii+=2) 
-      printdebug("***  %u to %u\n", (unsigned)ranges[ii], (unsigned)ranges[ii+1] );
-      MBRange junk; junk.insert( global );
-      print_type_sets( iFace, myRank, numProc, junk );
-      }
-      assert(j < count);
-      int fileid = offset + steps + (global - low);
-      rval = iFace->tag_set_data( idTag, &entity, 1, &fileid );
-      assert(MB_SUCCESS == rval);
-    } // for(r_iter->range)
-  } // for(each processor)
-  
-  return MB_SUCCESS;
-}

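The set-table code deleted above leans on one collective idiom throughout: gather per-processor counts to the root, convert them in place to an exclusive prefix sum, and scatter each processor's starting row back out.  A minimal standalone sketch of that idiom, with hypothetical names and no error handling (MPI_Exscan would compute the same offsets in a single call):

    #include <mpi.h>
    #include <vector>

    // Sketch only: turn a per-processor count into that processor's
    // starting offset in a table that all processors write jointly.
    long table_offset_for_local_count( long local_count, int nproc, int rank )
    {
      std::vector<long> counts( nproc + 1, 0 );
      MPI_Gather( &local_count, 1, MPI_LONG, &counts[0], 1, MPI_LONG,
                  0, MPI_COMM_WORLD );
      if (rank == 0) {                    // exclusive prefix sum on the root
        long running = 0;
        for (int p = 0; p <= nproc; ++p) {
          long tmp = counts[p];
          counts[p] = running;
          running += tmp;
        }                                 // counts[nproc] ends up as the global total
      }
      long my_offset = 0;
      MPI_Scatter( &counts[0], 1, MPI_LONG, &my_offset, 1, MPI_LONG,
                   0, MPI_COMM_WORLD );
      return my_offset;                   // first row this processor writes
    }
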
Deleted: MOAB/trunk/WriteHDF5Parallel.hpp
===================================================================
--- MOAB/trunk/WriteHDF5Parallel.hpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/WriteHDF5Parallel.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,250 +0,0 @@
-/** 
- * \class WriteHDF5Parallel
- * \brief Write MOAB HDF5 file in parallel.
- * \author Jason Kraftcheck
- * \date   22 July 2004
- */
-
-#ifndef WRITE_HDF5_PARALLEL_HPP
-#define WRITE_HDF5_PARALLEL_HPP
-
-#include "WriteHDF5.hpp"
-#include <mpi.h>
-
-struct RemoteSetData;
-
-class MB_DLL_EXPORT WriteHDF5Parallel : public WriteHDF5
-{
-  public:
-    
-      /** Constructor
-       *
-       * This constructor will automatically register the tags for
-       * material set (block), dirichlet set (nodeset), neumann set
-       * (sideset), and geometry grouping sets for use in identifying
-       * sets that are shared across multiple processors.  To explicitly
-       * disable this functionality, call one of the other constructors
-       * with an empty list of tags.
-       */
-    WriteHDF5Parallel( MBInterface* iface );
-     
-    
-      /** Constructor
-       *\param multiproc_set_tags Null-terminated list of strings.
-       *
-       * multiproc_set_tags is a null-terminated list of tag names.
-       * Each tag specified must have a native integer (int) data 
-       * type.  The tag data is used to identify meshsets that span
-       * multiple processors such that they are written as a single
-       * meshset in the resulting file.  
-       *
-       * NOTE: This list must be identical on all processors, including
-       *       the order!
-       */
-    WriteHDF5Parallel( MBInterface* iface,
-                       const std::vector<std::string>& multiproc_set_tags );
-    
-    /**\brief Define tags used to identify sets spanning multiple processors */
-    class MultiProcSetTags {
-      friend class WriteHDF5Parallel;
-      public:
-
-        /**Specify the name of a tag used to identify parallel entity sets.
-         * The tag must have a native integer (int) data type.  The value
-         * of the tag will be used to match sets on different processors.
-         */
-      void add( const std::string& name );
- 
-        /**Specify separate tags for identifying parallel entity sets and
-         * matching them across processors.
-         *\param filter_name The name of a tag used to identify parallel entity sets
-         *\param value_name  The name of a tag having a native integer (int) data
-         *                   type.  The value of this tag is used as an ID to match
-         *                   entity sets on different processors.
-         */
-      void add( const std::string& filter_name, const std::string& value_name );
- 
-        /**Specify separate tags for identifying parallel entity sets and
-         * matching them across processors.
-         *\param filter_name The name of a tag used to identify parallel entity sets.
-         *                   The data type of this tag must be a native integer (int).
-         *\param filter_value The value of the filter_name tag to use to identify
-         *                   parallel entity sets.
-         *\param value_name  The name of a tag having a native integer (int) data
-         *                   type.  The value of this tag is used as an ID to match
-         *                   entity sets on different processors.
-         */
-      void add( const std::string& filter_name, int filter_value, const std::string& value_name );
-      
-      private:
-      class Data;
-      std::vector<Data> list;
-    };
-     
-      /** Constructor
-       *\param multiproc_set_tags Data used to identify sets spanning multiple processors.
-       *                          NOTE:  This must be identical on all processors, including
-       *                          the order in which tags were added to the object!
-       */
-    WriteHDF5Parallel( MBInterface* iface, const MultiProcSetTags& multiproc_set_tags );
-      
-    
-  
-  protected:
-  
-      //! Called by normal (non-parallel) writer.  Sets up
-      //! necessary data for parallel write.
-    virtual MBErrorCode create_file( const char* filename,
-                                     bool overwrite,
-                                     std::vector<std::string>& qa_records,
-                                     int dimension = 3 );
-    
-      //! Figure out which local mesh is duplicated on
-      //! remote processors and which processor will write
-      //! that mesh.
-    MBErrorCode gather_interface_meshes();
-    
-      //! For entities that will be written by another 
-      //! processor, get the file Ids that will be assigned
-      //! to those so they can be referenced by
-      //! entities to be written on this processor.
-    MBErrorCode communicate_remote_ids(MBEntityType type);
-    
-      //! Sort the list of tag information in the parent
-      //! class by name so all procs have them in the same
-      //! order.
-    void sort_tags_by_name();
-    
-      //! Create the node table in the file.
-    MBErrorCode create_node_table( int dimension );
-    
-      //! Communicate with other processors to negotiate 
-      //! the types of elements that will be written
-      //! (the union of the types defined on each proc.)
-    MBErrorCode negotiate_type_list();
-    
-      //! Create tables to hold element connectivity
-    MBErrorCode create_element_tables();
-    
-      //! Create tables to hold element adjacencies.
-    MBErrorCode create_adjacency_tables();
-    
-      //! Identify and set up meshsets that span multiple
-      //! processors.
-      //!\param offsets Output array of three values.
-    MBErrorCode negotiate_shared_meshsets( long* offsets );
-    
-      //! Setup meshsets spanning multiple processors
-    MBErrorCode get_remote_set_data( const MultiProcSetTags::Data& tag,
-                                     RemoteSetData& data,
-                                     long& offset );
-                                     
-      //! Setup interface meshsets spanning multiple processors
-    MBErrorCode get_interface_set_data( RemoteSetData& data, long& offset );
-    
-      //! Determine offsets in contents and children tables for 
-      //! meshsets shared between processors.
-    MBErrorCode negotiate_remote_set_contents( RemoteSetData& data,
-                                               long* offsets );
-    
-      //! Create tables for mesh sets
-    MBErrorCode create_meshset_tables();
-    
-      //! Write tag descriptions and create tables to hold tag data.
-    MBErrorCode create_tag_tables();
-    
-      //! Mark multiple-processor meshsets with correct file Id
-      //! from the set description offset stored in that tag by
-      //! negotiate_shared_meshsets(..).
-    MBErrorCode fix_remote_set_ids( RemoteSetData& data, long first_id );
-      
-      //! Write set descriptions for multi-processor meshsets.
-      //! Virtual function called by non-parallel code after
-      //! the normal (single-processor) meshset descriptions have
-      //! been written.
-    MBErrorCode write_shared_set_descriptions( hid_t table );
-       
-      //! Write set contents for multi-processor meshsets.
-      //! Virtual function called by non-parallel code after
-      //! the normal (single-processor) meshset contents have
-      //! been written.
-    MBErrorCode write_shared_set_contents( hid_t table );
-       
-      //! Write set children for multi-processor meshsets.
-      //! Virtual function called by non-parallel code after
-      //! the normal (single-processor) meshset children have
-      //! been written.
-    MBErrorCode write_shared_set_children( hid_t table );
-       
-      //! Write set parents for multi-processor meshsets.
-      //! Virtual function called by non-parallel code after
-      //! the normal (single-processor) meshset parents have
-      //! been written.
-    MBErrorCode write_shared_set_parents( hid_t table );
-  
-      //! Virtual function overridden from WriteHDF5.  
-      //! Release memory by clearing member lists.
-    MBErrorCode write_finished();
-    
-      //! Remove any remote mesh entities from the passed range.
-    void remove_remote_entities( MBEntityHandle relative, MBRange& range );
-    void remove_remote_entities( MBEntityHandle relative, std::vector<MBEntityHandle>& vect );
-    void remove_remote_sets( MBEntityHandle relative, MBRange& range );
-    void remove_remote_sets( MBEntityHandle relative, std::vector<MBEntityHandle>& vect );
-    
-  private:
-    
-      //! MPI environment
-    int numProc, myRank;
-                                     
-      //! An array of interface mesh which is to be written by
-      //! remote processors.  Indexed by MPI rank (processor number).
-    std::vector<MBRange> remoteMesh;
-    
-      //! Tag names for identifying multi-processor meshsets
-    MultiProcSetTags multiProcSetTags;
-    
-      //! Struct describing a multi-processor meshset
-    struct ParallelSet {
-      MBEntityHandle handle;// set handle on this processor
-      long contentsOffset;  // offset in table at which to write set contents
-      long childrenOffset;  // offset in table at which to write set children
-      long parentsOffset;   // offset in table at which to write set parents
-      long contentsCount;   // total size of set contents (all processors)
-      long childrenCount;   // total number of set children (all processors)
-      long parentsCount;    // total number of set parents (all processors)
-      bool description;     // true if this processor 'owns' the set
-    };
-    
-      //! List of multi-processor meshsets
-    std::list<ParallelSet> parallelSets;
-    
-      //! Vector indexed by MPI rank, containing the list
-      //! of parallel sets that each processor knows about.
-    std::vector<MBRange> cpuParallelSets;
-    
-      //! List of parallel sets "owned" by this processor
-    //MBRange myParallelSets;
-    
-    void printrange( MBRange& );
-};
-
-
-
-class WriteHDF5Parallel::MultiProcSetTags::Data
-{
-  public:
-  Data( const std::string& name ) 
-   : filterTag(name), dataTag(name), useFilterValue(false) {}
-  Data( const std::string& fname, const std::string& dname )
-   : filterTag(fname), dataTag(dname), useFilterValue(false) {}
-  Data( const std::string& fname, const std::string& dname, int fval )
-   : filterTag(fname), dataTag(dname), filterValue(fval), useFilterValue(true) {}
-   
-  std::string filterTag;
-  std::string dataTag;
-  int filterValue;
-  bool useFilterValue;
-};
-
-#endif

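The MultiProcSetTags interface documented in the header above is driven entirely by tag names: the one-argument add() matches sets by a single integer tag, while the two-argument form filters on one tag and matches on another.  A hypothetical caller-side sketch of building the configuration the default constructor describes, assuming the standard tag-name macros from MBTagConventions.hpp:

    #include "WriteHDF5Parallel.hpp"
    #include "MBTagConventions.hpp"

    // Sketch only: identify parallel sets by the conventional MOAB set tags,
    // matching geometry sets across processors by their GLOBAL_ID value.
    WriteHDF5Parallel* make_parallel_writer( MBInterface* iface )
    {
      WriteHDF5Parallel::MultiProcSetTags tags;
      tags.add( MATERIAL_SET_TAG_NAME );                        // blocks
      tags.add( DIRICHLET_SET_TAG_NAME );                       // nodesets
      tags.add( NEUMANN_SET_TAG_NAME );                         // sidesets
      tags.add( GEOM_DIMENSION_TAG_NAME, GLOBAL_ID_TAG_NAME );  // geometry sets
      return new WriteHDF5Parallel( iface, tags );
    }
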
Modified: MOAB/trunk/configure.in
===================================================================
--- MOAB/trunk/configure.in	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/configure.in	2007-10-03 20:28:42 UTC (rev 1297)
@@ -754,6 +754,7 @@
 AC_CONFIG_FILES([Makefile 
                  moab.make 
                  testdir.h
+		 parallel/Makefile
                  mhdf/Makefile
                  test/Makefile
                  test/h5file/Makefile

Modified: MOAB/trunk/internals_test.cpp
===================================================================
--- MOAB/trunk/internals_test.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/internals_test.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -1,9 +1,9 @@
 #include "MBInternals.hpp"
-#include "MBProcConfig.hpp"
+#include "MBHandleUtils.hpp"
 #include <iostream>
 using namespace std;
 
-MBProcConfig procInfo(0,1);
+MBHandleUtils handleUtils(0,1);
 
 bool internal_assert( bool c ) { return !c; }
 
@@ -16,7 +16,7 @@
 bool handle_test( MBEntityType type, MBEntityID id, int proc, bool should_fail )
 {
   int err = 0;
-  MBEntityHandle handle = CREATE_HANDLE( type, procInfo.id(id, proc), err );
+  MBEntityHandle handle = CREATE_HANDLE( type, handleUtils.create_id(id, proc), err );
   if (should_fail) {
     handle_test_assert( err )
     return true;
@@ -26,10 +26,10 @@
   MBEntityType type_from_handle = TYPE_FROM_HANDLE(handle);
   handle_test_assert( type_from_handle == type )
   
-  MBEntityID id_from_handle = procInfo.id(handle);
+  MBEntityID id_from_handle = handleUtils.id_from_handle(handle);
   handle_test_assert( id_from_handle == id )
   
-  int proc_from_handle = procInfo.rank(handle);
+  int proc_from_handle = handleUtils.rank_from_handle(handle);
   handle_test_assert( proc_from_handle == proc )
   
   return true;
@@ -82,10 +82,10 @@
   
   for (int num_cpu = 0; num_cpu < num_cpus; ++num_cpu) {
     
-    procInfo = MBProcConfig( 0, cpus[num_cpu] );
+    handleUtils = MBHandleUtils( 0, cpus[num_cpu] );
     
     // init these after setting num_cpu, because max id depends on num cpu.
-    const MBEntityID ids[] = {0, 1, procInfo.max_id()/2, procInfo.max_id()-1, procInfo.max_id()};
+    const MBEntityID ids[] = {0, 1, handleUtils.max_id()/2, handleUtils.max_id()-1, handleUtils.max_id()};
     const MBTagId tids[] = {0, 1, MB_TAG_PROP_MASK/2, MB_TAG_PROP_MASK-1, MB_TAG_PROP_MASK};
     const int num_ids = sizeof(ids)/sizeof(ids[0]);
     const int num_tids = sizeof(tids)/sizeof(tids[0]);
@@ -114,7 +114,7 @@
   }
   
     // test some stuff that should fail
-  procInfo = MBProcConfig(0, 16);
+  handleUtils = MBHandleUtils(0, 16);
   ++tests;
   if (!handle_test( MBVERTEX, MB_END_ID+1, 0, true)) {
     cout << "Failed to catch ID overflow" << endl;

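The internals_test changes above amount to a round trip through the new MBHandleUtils calls: encode an (id, proc) pair into a handle, then recover the type, id, and owning rank.  A condensed sketch of that round trip, using only the calls that appear in the diff:

    #include "MBInternals.hpp"
    #include "MBHandleUtils.hpp"
    #include <cassert>

    // Sketch only: pack id 5 owned by proc 3 into a vertex handle and
    // unpack it again, assuming a 16-processor configuration.
    void handle_round_trip()
    {
      MBHandleUtils handle_utils( 0, 16 );      // this process is rank 0 of 16
      int err = 0;
      MBEntityHandle h =
        CREATE_HANDLE( MBVERTEX, handle_utils.create_id( 5, 3 ), err );
      assert( !err );
      assert( TYPE_FROM_HANDLE( h ) == MBVERTEX );
      assert( handle_utils.id_from_handle( h ) == 5 );
      assert( handle_utils.rank_from_handle( h ) == 3 );
    }
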
Added: MOAB/trunk/mbparallelcomm_test.cpp
===================================================================
--- MOAB/trunk/mbparallelcomm_test.cpp	                        (rev 0)
+++ MOAB/trunk/mbparallelcomm_test.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,132 @@
+/** test of MBParallelComm functionality
+ *
+ * To run:
+ *
+ * mpirun -np <#procs> mbparallelcomm_test
+ *
+ */
+
+#include "MBParallelComm.hpp"
+#include "MBParallelConventions.h"
+#include "MBTagConventions.hpp"
+#include "MBCore.hpp"
+#include "mpi.h"
+#include <iostream>
+
+#define ERROR(a, b) {std::cerr << a << std::endl; return b;}
+
+int main(int argc, char **argv) 
+{
+    // need to init MPI first, to tell how many procs and rank
+  int err = MPI_Init(&argc, &argv);
+
+  int nprocs, rank;
+  err = MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
+  err = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
+
+    // create MOAB instance based on that
+  MBInterface *mbImpl = new MBCore(rank, nprocs);
+  if (NULL == mbImpl) return 1;
+  
+  MBErrorCode result;
+
+    // each interior proc has a vector of N+M vertices, sharing
+    // M vertices each with lower- and upper-rank processors, except
+    // procs on the end
+
+    // get N, M from command line
+  int N, M;
+  if (argc < 3) {
+    std::cerr << "No arguments passed; assuming N=10, M=2." << std::endl;
+    N = 10;
+    M = 2;
+  }
+  else {
+    N = atoi(argv[1]);
+    M = atoi(argv[2]);
+  }
+  
+  int nverts = N + M;
+  if (0 == rank) nverts = N;
+
+    // create my vertices and give them the right global ids
+  MBRange my_verts;
+  std::vector<double> coords(3*(nverts));
+  std::fill(coords.begin(), coords.end(), 0.0);
+  result = mbImpl->create_vertices(&coords[0], nverts,
+                                   my_verts);
+  if (MB_SUCCESS != result)
+    ERROR("Failed to create vertices.", 1);
+  
+  std::vector<int> global_ids(N+M);
+  for (int i = 0; i < nverts; i++)
+    global_ids[i] = rank*N - (nverts-N) + i;
+  
+  int def_val = -1;
+  MBTag gid_tag;
+  result = mbImpl->tag_create(GLOBAL_ID_TAG_NAME, 1, MB_TAG_DENSE,
+                              MB_TYPE_INTEGER, gid_tag,
+                              &def_val, true);
+  if (MB_SUCCESS != result && MB_ALREADY_ALLOCATED != result) 
+    ERROR("Failed to create tag.", 1);
+  
+  result = mbImpl->tag_set_data(gid_tag, my_verts, &global_ids[0]);
+  if (MB_SUCCESS != result) ERROR("Failed to set global_id tag.", 1);
+  
+    // now figure out what's shared
+  MBParallelComm pcomm(mbImpl);
+  result = pcomm.resolve_shared_ents(my_verts, 0);
+  if (MB_SUCCESS != result) ERROR("Couldn't resolve shared entities.", 1);
+  
+    // check shared entities
+  MBTag sharedproc_tag, sharedprocs_tag;
+  result = mbImpl->tag_get_handle(PARALLEL_SHARED_PROC_TAG_NAME, 
+                                  sharedproc_tag);
+  if (MB_SUCCESS != result) ERROR("Shared processor tag not found.", 1);
+
+  result = mbImpl->tag_get_handle(PARALLEL_SHARED_PROCS_TAG_NAME, 
+                                  sharedprocs_tag);
+  if (MB_SUCCESS != result) 
+    ERROR("Shared processor*s* tag not found.", 1);
+  
+    // get the tag values
+#define MAX_SHARING_PROCS 10
+  std::vector<int> shared_proc_tags(MAX_SHARING_PROCS*my_verts.size());
+  result = mbImpl->tag_get_data(sharedproc_tag, my_verts, 
+                                &shared_proc_tags[0]);
+  if (MB_SUCCESS != result) ERROR("Problem getting shared proc tag.", 1);
+
+    // interior procs should have 2*M shared, bdy procs should have M shared
+  int nshared = 0;
+  for (unsigned int nv = 0; nv < my_verts.size(); nv++)
+    if (shared_proc_tags[2*nv] > -1) nshared++;
+  
+  if ((rank == 0 || rank == nprocs-1) && nshared != (unsigned int) M) {
+    std::cerr << "Didn't get correct number of shared vertices on "
+              << "processor " << rank << std::endl;
+    result = MB_FAILURE;
+  }
+  
+  else if ((rank != 0 && rank != nprocs-1) && nshared != (unsigned int) 2*M) 
+  {
+    std::cerr << "Didn't get correct number of shared vertices on "
+              << "processor " << rank << std::endl;
+    result = MB_FAILURE;
+  }
+
+    // now check sharedprocs; shouldn't be any 
+  MBErrorCode result2 = mbImpl->tag_get_data(sharedprocs_tag, my_verts, 
+                                             &shared_proc_tags[0]);
+  if (MB_SUCCESS == result2) {
+    std::cerr << "Shoudn't get shared proc*s* tag, but did on proc "
+              << rank << std::endl;
+    result = MB_FAILURE;
+  }
+
+  err = MPI_Finalize();
+
+  if (MB_SUCCESS == result)
+    std::cerr << "Proc " << rank << ": Success." << std::endl;
+    
+  return (MB_SUCCESS == result ? 0 : 1);
+}

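As its header comment notes, the test is launched with mpirun; N (vertices per processor) and M (vertices shared with each neighboring rank) may be given on the command line and default to 10 and 2.  Assuming a standard mpirun launcher, a typical run is

    mpirun -np 4 mbparallelcomm_test 10 2

after which interior ranks should report 2*M shared vertices and the two end ranks M.
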
Added: MOAB/trunk/parallel/MBParallelComm.cpp
===================================================================
--- MOAB/trunk/parallel/MBParallelComm.cpp	                        (rev 0)
+++ MOAB/trunk/parallel/MBParallelComm.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,1263 @@
+#include "MBInterface.hpp"
+#include "MBParallelComm.hpp"
+#include "MBWriteUtilIface.hpp"
+#include "MBReadUtilIface.hpp"
+#include "EntitySequenceManager.hpp"
+#include "EntitySequence.hpp"
+#include "TagServer.hpp"
+#include "MBTagConventions.hpp"
+#include "MBSkinner.hpp"
+#include "MBParallelConventions.h"
+#include "MBCore.hpp"
+
+extern "C" 
+{
+#include "gs.h"
+#include "tuple_list.h"
+}
+
+#include <assert.h>
+
+#ifdef USE_MPI
+#include "mpi.h"
+#endif
+
+#define INITIAL_BUFF_SIZE 1024
+
+#define PACK_INT(buff, int_val) {int tmp_val = int_val; PACK_INTS(buff, &tmp_val, 1);}
+
+#define PACK_INTS(buff, int_val, num) {memcpy(buff, int_val, num*sizeof(int)); buff += num*sizeof(int);}
+
+#define PACK_DBL(buff, dbl_val, num) {memcpy(buff, dbl_val, num*sizeof(double)); buff += num*sizeof(double);}
+
+#define PACK_EH(buff, eh_val, num) {memcpy(buff, eh_val, num*sizeof(MBEntityHandle)); buff += num*sizeof(MBEntityHandle);}
+
+#define PACK_CHAR_64(buff, char_val) {strcpy((char*)buff, char_val); buff += 64;}
+
+#define PACK_VOID(buff, val, num) {memcpy(buff, val, num); buff += num;}
+
+#define PACK_RANGE(buff, rng) {int num_subs = num_subranges(rng); PACK_INTS(buff, &num_subs, 1); \
+          for (MBRange::const_pair_iterator cit = rng.const_pair_begin(); cit != rng.const_pair_end(); cit++) { \
+            MBEntityHandle eh = (*cit).first; PACK_EH(buff, &eh, 1); \
+            eh = (*cit).second; PACK_EH(buff, &eh, 1);}}
+
+#define UNPACK_INT(buff, int_val) {UNPACK_INTS(buff, &int_val, 1);}
+
+#define UNPACK_INTS(buff, int_val, num) {memcpy(int_val, buff, num*sizeof(int)); buff += num*sizeof(int);}
+
+#define UNPACK_DBL(buff, dbl_val, num) {memcpy(dbl_val, buff, num*sizeof(double)); buff += num*sizeof(double);}
+
+#define UNPACK_EH(buff, eh_val, num) {memcpy(eh_val, buff, num*sizeof(MBEntityHandle)); buff += num*sizeof(MBEntityHandle);}
+
+#define UNPACK_CHAR_64(buff, char_val) {strcpy(char_val, (char*)buff); buff += 64;}
+
+#define UNPACK_VOID(buff, val, num) {memcpy(val, buff, num); buff += num;}
+
+#define UNPACK_RANGE(buff, rng) {int num_subs; UNPACK_INTS(buff, &num_subs, 1); MBEntityHandle _eh[2]; \
+          for (int i = 0; i < num_subs; i++) { UNPACK_EH(buff, _eh, 2); rng.insert(_eh[0], _eh[1]);}}
+
+#define RR if (MB_SUCCESS != result) return result
+
+MBParallelComm::MBParallelComm(MBInterface *impl, MPI_Comm comm) 
+    : mbImpl(impl), procConfig(comm)
+{
+  myBuffer.reserve(INITIAL_BUFF_SIZE);
+
+  tagServer = dynamic_cast<MBCore*>(mbImpl)->tag_server();
+  sequenceManager = dynamic_cast<MBCore*>(mbImpl)->sequence_manager();
+}
+
+MBParallelComm::MBParallelComm(MBInterface *impl,
+                               std::vector<unsigned char> &tmp_buff, 
+                               MPI_Comm comm) 
+    : mbImpl(impl), procConfig(comm)
+{
+  myBuffer.swap(tmp_buff);
+}
+
+//! assign a global id space, for largest-dimension or all entities (and
+//! in either case for vertices too)
+MBErrorCode MBParallelComm::assign_global_ids(const int dimension, 
+                                              const int start_id,
+                                              const bool largest_dim_only) 
+{
+  MBRange entities[4];
+  int local_num_elements[4];
+  MBErrorCode result;
+  for (int dim = 0; dim <= dimension; dim++) {
+    if (dim == 0 || !largest_dim_only || dim == dimension) {
+      result = mbImpl->get_entities_by_dimension(0, dim, entities[dim]); RR;
+    }
+
+      // need to filter out non-locally-owned entities!!!
+    MBRange dum_range;
+    for (MBRange::iterator rit = entities[dim].begin(); rit != entities[dim].end(); rit++)
+      if (mbImpl->handle_utils().rank_from_handle(*rit) != 
+          (unsigned int) mbImpl->proc_rank()) 
+        dum_range.insert(*rit);
+    entities[dim] = entities[dim].subtract(dum_range);
+    
+    local_num_elements[dim] = entities[dim].size();
+  }
+  
+    // communicate numbers
+  std::vector<int> num_elements(procConfig.proc_size()*4);
+#ifdef USE_MPI
+  if (procConfig.proc_size() > 1) {
+    int retval = MPI_Allgather(local_num_elements, 4, MPI_INT,
+                               &num_elements[0], 4, MPI_INT,
+                               procConfig.proc_comm());
+    if (0 != retval) return MB_FAILURE;
+  }
+  else
+#endif
+    for (int dim = 0; dim < 4; dim++) num_elements[dim] = local_num_elements[dim];
+  
+    // my entities start at one greater than total_elems[d]
+  int total_elems[4] = {start_id, start_id, start_id, start_id};
+  
+  for (unsigned int proc = 0; proc < procConfig.proc_rank(); proc++) {
+    for (int dim = 0; dim < 4; dim++) total_elems[dim] += num_elements[4*proc + dim];
+  }
+  
+    // assign global ids now
+  MBTag gid_tag;
+  int zero = 0;
+  result = mbImpl->tag_create(GLOBAL_ID_TAG_NAME, sizeof(int), 
+                              MB_TAG_DENSE, MB_TYPE_INTEGER, gid_tag,
+                              &zero, true);
+  if (MB_SUCCESS != result && MB_ALREADY_ALLOCATED != result) return result;
+  
+  for (int dim = 0; dim < 4; dim++) {
+    if (entities[dim].empty()) continue;
+    num_elements.resize(entities[dim].size());
+    int i = 0;
+    for (MBRange::iterator rit = entities[dim].begin(); rit != entities[dim].end(); rit++)
+      num_elements[i++] = total_elems[dim]++;
+    
+    result = mbImpl->tag_set_data(gid_tag, entities[dim], &num_elements[0]); RR;
+  }
+  
+  return MB_SUCCESS;
+}
+  
+MBErrorCode MBParallelComm::communicate_entities(const int from_proc, const int to_proc,
+                                                 MBRange &entities,
+                                                 const bool adjacencies,
+                                                 const bool tags) 
+{
+#ifndef USE_MPI
+  return MB_FAILURE;
+#else
+  
+  MBErrorCode result = MB_SUCCESS;
+  
+    // if I'm the from, do the packing and sending
+  if ((int)procConfig.proc_rank() == from_proc) {
+    allRanges.clear();
+    vertsPerEntity.clear();
+    setRange.clear();
+    setRanges.clear();
+    allTags.clear();
+    setSizes.clear();
+    optionsVec.clear();
+    setPcs.clear();
+
+    MBRange whole_range;
+
+    int buff_size;
+    
+    result = pack_buffer(entities, adjacencies, tags, true, whole_range, buff_size); RR;
+
+      // if the message is large, send a first message to tell how large
+    if (INITIAL_BUFF_SIZE < buff_size) {
+      int tmp_buff_size = -buff_size;
+      MPI_Request send_req;
+      int success = MPI_Isend(&tmp_buff_size, sizeof(int), MPI_UNSIGNED_CHAR, to_proc, 
+                              0, procConfig.proc_comm(), &send_req);
+      if (MPI_SUCCESS != success) return MB_FAILURE;
+    }
+    
+      // allocate space in the buffer
+    myBuffer.reserve(buff_size);
+
+      // pack the actual buffer
+    int actual_buff_size;
+    result = pack_buffer(entities, adjacencies, tags, false, whole_range, actual_buff_size); RR;
+    
+      // send it
+    MPI_Request send_req;
+    int success = MPI_Isend(&myBuffer[0], actual_buff_size, MPI_UNSIGNED_CHAR, to_proc, 
+                            0, procConfig.proc_comm(), &send_req);
+    if (MPI_SUCCESS != success) return MB_FAILURE;
+  }
+  else if ((int)procConfig.proc_rank() == to_proc) {
+    int buff_size;
+    
+      // get how much to allocate
+    MPI_Status status;
+    int success = MPI_Recv(&myBuffer[0], myBuffer.size(), MPI_UNSIGNED_CHAR, from_proc, 
+                           MPI_ANY_TAG, procConfig.proc_comm(), &status);
+    int num_recd;
+    success = MPI_Get_count(&status, MPI_UNSIGNED_CHAR, &num_recd);
+    
+    if (sizeof(int) == num_recd && 0 > *((int*)&myBuffer[0])) {
+        // this was just the size of the next message; prepare buffer then receive that message
+      buff_size = -(*((int*)&myBuffer[0]));
+      myBuffer.reserve(buff_size);
+    
+      // receive the real message
+      success = MPI_Recv(&myBuffer[0], buff_size, MPI_UNSIGNED_CHAR, from_proc, 
+                         MPI_ANY_TAG, procConfig.proc_comm(), &status);
+    }
+    
+      // unpack the buffer
+    result = unpack_buffer(entities); RR;
+  }
+  
+  return result;
+
+#endif
+}
+
+MBErrorCode MBParallelComm::broadcast_entities( const int from_proc,
+                                                MBRange &entities,
+                                                const bool adjacencies,
+                                                const bool tags) 
+{
+#ifndef USE_MPI
+  return MB_FAILURE;
+#else
+  
+  MBErrorCode result = MB_SUCCESS;
+  int success;
+  MBRange whole_range;
+  int buff_size;
+  
+  allRanges.clear();
+  vertsPerEntity.clear();
+  setRange.clear();
+  setRanges.clear();
+  allTags.clear();
+  setSizes.clear();
+  optionsVec.clear();
+  setPcs.clear();
+
+  if ((int)procConfig.proc_rank() == from_proc) {
+    result = pack_buffer( entities, adjacencies, tags, true, whole_range, buff_size ); RR;
+  }
+
+  success = MPI_Bcast( &buff_size, 1, MPI_INT, from_proc, procConfig.proc_comm() );
+  if (MPI_SUCCESS != success)
+    return MB_FAILURE;
+  
+  if (!buff_size) // no data
+    return MB_SUCCESS;
+  
+  myBuffer.reserve( buff_size );
+  
+  if ((int)procConfig.proc_rank() == from_proc) {
+    int actual_buffer_size;
+    result = pack_buffer( entities, adjacencies, tags, false, whole_range, actual_buffer_size ); RR;
+  }
+
+  success = MPI_Bcast( &myBuffer[0], buff_size, MPI_UNSIGNED_CHAR, from_proc, procConfig.proc_comm() );
+  if (MPI_SUCCESS != success)
+    return MB_FAILURE;
+  
+  if ((int)procConfig.proc_rank() != from_proc) {
+    result = unpack_buffer( entities ); RR;
+  }
+
+  return MB_SUCCESS;
+#endif
+}
+
+MBErrorCode MBParallelComm::pack_buffer(MBRange &entities, 
+                                        const bool adjacencies,
+                                        const bool tags,
+                                        const bool just_count,
+                                        MBRange &whole_range,
+                                        int &buff_size) 
+{
+    // pack the buffer with the entity ranges, adjacencies, and tags sections
+  MBErrorCode result;
+
+  buff_size = 0;
+  MBRange::const_iterator rit;
+  unsigned char *buff_ptr = NULL;
+  if (!just_count) buff_ptr = &myBuffer[0];
+  
+    // entities
+  result = pack_entities(entities, rit, whole_range, buff_ptr, buff_size, just_count); RR;
+  
+    // sets
+  int tmp_size;
+  result = pack_sets(entities, rit, whole_range, buff_ptr, tmp_size, just_count); RR;
+  buff_size += tmp_size;
+  
+    // adjacencies
+  if (adjacencies) {
+    result = pack_adjacencies(entities, rit, whole_range, buff_ptr, tmp_size, just_count); RR;
+    buff_size += tmp_size;
+  }
+    
+    // tags
+  if (tags) {
+    result = pack_tags(entities, rit, whole_range, buff_ptr, tmp_size, just_count); RR;
+    buff_size += tmp_size;
+  }
+
+  return result;
+}
+ 
+MBErrorCode MBParallelComm::unpack_buffer(MBRange &entities) 
+{
+  if (myBuffer.capacity() == 0) return MB_FAILURE;
+  
+  unsigned char *buff_ptr = &myBuffer[0];
+  MBErrorCode result = unpack_entities(buff_ptr, entities); RR;
+  result = unpack_sets(buff_ptr, entities); RR;
+  result = unpack_tags(buff_ptr, entities); RR;
+  
+  return MB_SUCCESS;
+}
+
+int MBParallelComm::num_subranges(const MBRange &this_range)
+{
+    // ok, have all the ranges we'll pack; count the subranges
+  int num_sub_ranges = 0;
+  for (MBRange::const_pair_iterator pit = this_range.const_pair_begin(); 
+       pit != this_range.const_pair_end(); pit++)
+    num_sub_ranges++;
+
+  return num_sub_ranges;
+}
+
+MBErrorCode MBParallelComm::pack_entities(MBRange &entities,
+                                          MBRange::const_iterator &start_rit,
+                                          MBRange &whole_range,
+                                          unsigned char *&buff_ptr,
+                                          int &count,
+                                          const bool just_count) 
+{
+  count = 0;
+  unsigned char *orig_buff_ptr = buff_ptr;
+  MBErrorCode result;
+  MBWriteUtilIface *wu = NULL;
+  if (!just_count) {
+    result = mbImpl->query_interface(std::string("MBWriteUtilIface"), reinterpret_cast<void**>(&wu)); RR;
+  }
+  
+    // pack vertices
+  if (just_count) {
+    entTypes.push_back(MBVERTEX);
+    vertsPerEntity.push_back(1);
+    allRanges.push_back(entities.subset_by_type(MBVERTEX));
+  }
+  else {
+    PACK_INT(buff_ptr, MBVERTEX);
+    PACK_RANGE(buff_ptr, allRanges[0]);
+    int num_verts = allRanges[0].size();
+    std::vector<double*> coords(3);
+    for (int i = 0; i < 3; i++)
+      coords[i] = reinterpret_cast<double*>(buff_ptr + i * num_verts * sizeof(double));
+
+    assert(NULL != wu);
+    
+    result = wu->get_node_arrays(3, num_verts, allRanges[0], 0, 0, coords); RR;
+
+    buff_ptr += 3 * num_verts * sizeof(double);
+
+    whole_range = allRanges[0];
+  }
+
+    // place an iterator at the first non-vertex entity
+  if (!allRanges[0].empty()) {
+    start_rit = entities.find(*allRanges[0].rbegin());
+    start_rit++;
+  }
+  else {
+    start_rit = entities.begin();
+  }
+  
+  MBRange::const_iterator end_rit = start_rit;
+  if (allRanges[0].size() == entities.size()) return MB_SUCCESS;
+
+  std::vector<MBRange>::iterator allr_it = allRanges.begin();
+  
+    // pack entities
+  if (just_count) {    
+
+      // get all ranges of entities that have different #'s of vertices or different types
+    while (end_rit != entities.end() && TYPE_FROM_HANDLE(*start_rit) != MBENTITYSET) {
+
+        // get the sequence holding this entity
+      MBEntitySequence *seq;
+      ElementEntitySequence *eseq;
+      result = sequenceManager->find(*start_rit, seq); RR;
+      if (NULL == seq) return MB_FAILURE;
+      eseq = dynamic_cast<ElementEntitySequence*>(seq);
+
+        // if type and nodes per element change, start a new range
+      if (eseq->get_type() != *entTypes.rbegin() || (int) eseq->nodes_per_element() != *vertsPerEntity.rbegin()) {
+        entTypes.push_back(eseq->get_type());
+        vertsPerEntity.push_back(eseq->nodes_per_element());
+        allRanges.push_back(MBRange());
+        allr_it++;
+      }
+    
+        // get position in entities list one past end of this sequence
+      end_rit = entities.lower_bound(start_rit, entities.end(), eseq->get_end_handle()+1);
+
+        // put these entities in the last range
+      eseq->get_entities(*allRanges.rbegin());
+      whole_range.merge(*allRanges.rbegin());
+      
+        // now start where we last left off
+      start_rit = end_rit;
+    }
+
+      // update the vertex range and count its data, now that we know which entities get communicated
+    result = mbImpl->get_adjacencies(whole_range, 0, false, allRanges[0], MBInterface::UNION); RR;
+    whole_range.merge(allRanges[0]);
+    count += 3 * sizeof(double) * allRanges[0].size();
+    
+      // space for the ranges
+    std::vector<MBRange>::iterator vit = allRanges.begin();
+    std::vector<int>::iterator iit = vertsPerEntity.begin();
+    std::vector<MBEntityType>::iterator eit = entTypes.begin();
+    for (; vit != allRanges.end(); vit++, iit++, eit++) {
+        // subranges of entities
+      count += 2*sizeof(MBEntityHandle)*num_subranges(*vit);
+        // connectivity of subrange
+      if (iit != vertsPerEntity.begin()) {
+        if (*eit != MBPOLYGON && *eit != MBPOLYHEDRON) 
+            // for non-poly's: #verts/ent * #ents * sizeof handle
+          count += *iit * (*vit).size() * sizeof(MBEntityHandle);
+          // for poly's:  length of conn list * handle size + #ents * int size (for offsets)
+        else count += *iit * sizeof(MBEntityHandle) + (*vit).size() * sizeof(int);
+      }
+    }
+      //                                num_verts per subrange    ent type in subrange
+    count += (vertsPerEntity.size() + 1) * (sizeof(int) + sizeof(MBEntityType));
+
+      // extra entity type at end
+    count += sizeof(int);
+  }
+  else {
+      // for each range beyond the first
+    allr_it++;
+    std::vector<int>::iterator nv_it = vertsPerEntity.begin();
+    std::vector<MBEntityType>::iterator et_it = entTypes.begin();
+    nv_it++; et_it++;
+    
+    for (; allr_it != allRanges.end(); allr_it++, nv_it++, et_it++) {
+        // pack the entity type
+      PACK_INT(buff_ptr, *et_it);
+      
+        // pack the range
+      PACK_RANGE(buff_ptr, (*allr_it));
+
+        // pack the nodes per entity
+      PACK_INT(buff_ptr, *nv_it);
+      
+        // pack the connectivity
+      const MBEntityHandle *connect;
+      int num_connect;
+      if (*et_it == MBPOLYGON || *et_it == MBPOLYHEDRON) {
+        std::vector<int> num_connects;
+        for (MBRange::const_iterator rit = allr_it->begin(); rit != allr_it->end(); rit++) {
+          result = mbImpl->get_connectivity(*rit, connect, num_connect); RR;
+          num_connects.push_back(num_connect);
+          PACK_EH(buff_ptr, &connect[0], num_connect);
+        }
+        PACK_INTS(buff_ptr, &num_connects[0], num_connects.size());
+      }
+      else {
+        for (MBRange::const_iterator rit = allr_it->begin(); rit != allr_it->end(); rit++) {
+          result = mbImpl->get_connectivity(*rit, connect, num_connect); RR;
+          assert(num_connect == *nv_it);
+          PACK_EH(buff_ptr, &connect[0], num_connect);
+        }
+      }
+
+      whole_range.merge(*allr_it);
+    }
+
+      // pack MBMAXTYPE to indicate end of ranges
+    PACK_INT(buff_ptr, MBMAXTYPE);
+
+    count = buff_ptr - orig_buff_ptr;
+  }
+  
+  return MB_SUCCESS;
+}
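+
+// Resulting entity-section layout (descriptive sketch of the code above):
+//
+//   [int: MBVERTEX][range: vertex handles]
+//   [doubles: x0..xn-1][doubles: y0..yn-1][doubles: z0..zn-1]
+//   then, per element subrange:
+//     [int: entity type][range: handles][int: verts per entity]
+//     [handles: connectivity]                              (non-poly)
+//     [handles: connectivity][ints: per-entity lengths]    (MBPOLYGON/MBPOLYHEDRON)
+//   terminated by [int: MBMAXTYPE]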
+
+MBErrorCode MBParallelComm::unpack_entities(unsigned char *&buff_ptr,
+                                            MBRange &entities) 
+{
+  MBErrorCode result;
+  bool done = false;
+  MBReadUtilIface *ru = NULL;
+  result = mbImpl->query_interface(std::string("MBReadUtilIface"), reinterpret_cast<void**>(&ru)); RR;
+  
+  while (!done) {
+    MBEntityType this_type;
+    UNPACK_INT(buff_ptr, this_type);
+    assert(this_type >= MBVERTEX && 
+           (this_type == MBMAXTYPE || this_type < MBENTITYSET));
+
+      // MBMAXTYPE signifies end of entities data
+    if (MBMAXTYPE == this_type) break;
+    
+      // get the range
+    MBRange this_range;
+    UNPACK_RANGE(buff_ptr, this_range);
+    
+    if (MBVERTEX == this_type) {
+        // unpack coords
+      int num_verts = this_range.size();
+      std::vector<double*> coords(3);
+      for (MBRange::const_pair_iterator pit = this_range.const_pair_begin(); 
+           pit != this_range.const_pair_end(); pit++) {
+          // allocate handles
+        int start_id = mbImpl->handle_utils().id_from_handle((*pit).first);
+        int start_proc = mbImpl->handle_utils().rank_from_handle((*pit).first);
+        MBEntityHandle actual_start;
+        int tmp_num_verts = (*pit).second - (*pit).first + 1;
+        result = ru->get_node_arrays(3, tmp_num_verts, start_id, start_proc, actual_start,
+                                     coords); RR;
+        if (actual_start != (*pit).first)
+          return MB_FAILURE;
+
+        entities.insert((*pit).first, (*pit).second);
+        
+          // unpack the buffer data directly into coords
+        for (int i = 0; i < 3; i++) 
+          memcpy(coords[i], buff_ptr+i*num_verts*sizeof(double), 
+                 tmp_num_verts*sizeof(double));
+
+        buff_ptr += tmp_num_verts * sizeof(double);
+      }
+
+        // increment the buffer ptr beyond the y and z coords
+      buff_ptr += 2 * num_verts * sizeof(double);
+    }
+
+    else {
+      
+      int verts_per_entity;
+      
+        // unpack the nodes per entity
+      UNPACK_INT(buff_ptr, verts_per_entity);
+      
+        // unpack the connectivity
+      for (MBRange::const_pair_iterator pit = this_range.const_pair_begin(); 
+           pit != this_range.const_pair_end(); pit++) {
+          // allocate handles, connect arrays
+        int start_id = mbImpl->handle_utils().id_from_handle((*pit).first);
+        int start_proc = mbImpl->handle_utils().rank_from_handle((*pit).first);
+        MBEntityHandle actual_start;
+        int num_elems = (*pit).second - (*pit).first + 1;
+        MBEntityHandle *connect;
+        int *connect_offsets;
+        if (this_type == MBPOLYGON || this_type == MBPOLYHEDRON)
+          result = ru->get_poly_element_array(num_elems, verts_per_entity, this_type,
+                                              start_id, start_proc, actual_start,
+                                              connect_offsets, connect); RR;
+        else
+          result = ru->get_element_array(num_elems, verts_per_entity, this_type,
+                                         start_id, start_proc, actual_start,
+                                         connect); RR;
+
+          // copy connect arrays
+        if (this_type != MBPOLYGON && this_type != MBPOLYHEDRON) {
+          UNPACK_EH(buff_ptr, connect, num_elems * verts_per_entity);
+        }
+        else {
+          UNPACK_EH(buff_ptr, connect, verts_per_entity);
+          assert(NULL != connect_offsets);
+            // and the offsets
+          UNPACK_INTS(buff_ptr, connect_offsets, num_elems);
+        }
+
+        entities.insert((*pit).first, (*pit).second);
+      }
+      
+    }
+  }
+  
+  return MB_SUCCESS;
+}
+
+MBErrorCode MBParallelComm::pack_sets(MBRange &entities,
+                                      MBRange::const_iterator &start_rit,
+                                      MBRange &whole_range,
+                                      unsigned char *&buff_ptr,
+                                      int &count,
+                                      const bool just_count)
+{
+  
+    // now the sets; assume any sets the application wants to pass are in the entities list
+  count = 0;
+  unsigned char *orig_buff_ptr = buff_ptr;
+  MBErrorCode result;
+
+  if (just_count) {
+    for (; start_rit != entities.end(); start_rit++) {
+      setRange.insert(*start_rit);
+      count += sizeof(MBEntityHandle);
+    
+      unsigned int options;
+      result = mbImpl->get_meshset_options(*start_rit, options); RR;
+      optionsVec.push_back(options);
+      count += sizeof(unsigned int);
+    
+      if (options & MESHSET_SET) {
+          // range-based set; count the subranges
+        setRanges.push_back(MBRange());
+        result = mbImpl->get_entities_by_handle(*start_rit, *setRanges.rbegin()); RR;
+        count += 2 * sizeof(MBEntityHandle) * num_subranges(*setRanges.rbegin()) + sizeof(int);
+      }
+      else if (options & MESHSET_ORDERED) {
+          // just get the number of entities in the set
+        int num_ents;
+        result = mbImpl->get_number_entities_by_handle(*start_rit, num_ents); RR;
+        count += sizeof(int);
+        
+        setSizes.push_back(num_ents);
+        count += sizeof(MBEntityHandle) * num_ents + sizeof(int);
+      }
+      whole_range.insert(*start_rit);
+
+        // get numbers of parents/children
+      int num_par, num_ch;
+      result = mbImpl->num_child_meshsets(*start_rit, &num_ch); RR;
+      result = mbImpl->num_parent_meshsets(*start_rit, &num_par); RR;
+      count += 2*sizeof(int) + (num_par + num_ch) * sizeof(MBEntityHandle);
+    
+    }
+  }
+  else {
+    
+    std::vector<unsigned int>::const_iterator opt_it = optionsVec.begin();
+    std::vector<MBRange>::const_iterator rit = setRanges.begin();
+    std::vector<int>::const_iterator mem_it = setSizes.begin();
+    static std::vector<MBEntityHandle> members;
+
+      // set handle range
+    PACK_RANGE(buff_ptr, setRange);
+
+    for (MBRange::const_iterator set_it = setRange.begin(); set_it != setRange.end(); 
+         set_it++, opt_it++) {
+        // option value
+      PACK_VOID(buff_ptr, &(*opt_it), sizeof(unsigned int));
+      
+      if ((*opt_it) & MESHSET_SET) {
+          // pack entities as a range
+        PACK_RANGE(buff_ptr, (*rit));
+        rit++;
+      }
+      else if ((*opt_it) & MESHSET_ORDERED) {
+          // pack entities as vector, with length
+        PACK_INT(buff_ptr, *mem_it);
+        members.clear();
+        result = mbImpl->get_entities_by_handle(*set_it, members); RR;
+        PACK_EH(buff_ptr, &members[0], *mem_it);
+        mem_it++;
+      }
+      
+        // pack parents
+      members.clear();
+      result = mbImpl->get_parent_meshsets(*set_it, members); RR;
+      PACK_INT(buff_ptr, members.size());
+      if (!members.empty()) {
+        PACK_EH(buff_ptr, &members[0], members.size());
+      }
+      
+        // pack children
+      members.clear();
+      result = mbImpl->get_child_meshsets(*set_it, members); RR;
+      PACK_INT(buff_ptr, members.size());
+      if (!members.empty()) {
+        PACK_EH(buff_ptr, &members[0], members.size());
+      }
+      
+    }
+    
+    count = buff_ptr - orig_buff_ptr;
+  }
+  
+  return MB_SUCCESS;
+}
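+
+// Resulting set-section layout (descriptive sketch of the code above):
+//
+//   [range: set handles]
+//   then, per set:
+//     [unsigned: set options]
+//     [range: contents]                    (MESHSET_SET)
+//     [int: n][n handles: contents]        (MESHSET_ORDERED)
+//     [int: #parents][parent handles]
+//     [int: #children][child handles]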
+
+MBErrorCode MBParallelComm::unpack_sets(unsigned char *&buff_ptr,
+                                        MBRange &entities)
+{
+  
+    // unpack the sets; each set's contents, parents, and children follow its options word
+  MBErrorCode result;
+
+  MBRange set_handles;
+  UNPACK_RANGE(buff_ptr, set_handles);
+  std::vector<MBEntityHandle> members;
+  
+  for (MBRange::const_iterator rit = set_handles.begin(); rit != set_handles.end(); rit++) {
+    
+      // option value
+    unsigned int opt;
+    UNPACK_VOID(buff_ptr, &opt, sizeof(unsigned int));
+      
+      // create the set
+    MBEntityHandle set_handle;
+    result = mbImpl->create_meshset(opt, set_handle, 
+                                    mbImpl->handle_utils().id_from_handle(*rit), 
+                                    mbImpl->handle_utils().rank_from_handle(*rit)); RR;
+    if (set_handle != *rit)
+      return MB_FAILURE;
+
+    int num_ents;
+    if (opt & MESHSET_SET) {
+        // unpack entities as a range
+      MBRange set_range;
+      UNPACK_RANGE(buff_ptr, set_range);
+      result = mbImpl->add_entities(*rit, set_range); RR;
+    }
+    else if (opt & MESHSET_ORDERED) {
+        // unpack entities as vector, with length
+      UNPACK_INT(buff_ptr, num_ents);
+      members.resize(num_ents);
+      UNPACK_EH(buff_ptr, &members[0], num_ents);
+      result = mbImpl->add_entities(*rit, &members[0], num_ents); RR;
+    }
+      
+      // unpack parents/children
+    UNPACK_INT(buff_ptr, num_ents);
+    members.resize(num_ents);
+    UNPACK_EH(buff_ptr, &members[0], num_ents);
+    for (int i = 0; i < num_ents; i++) {
+      result = mbImpl->add_parent_meshset(*rit, members[i]); RR;
+    }
+    UNPACK_INT(buff_ptr, num_ents);
+    members.resize(num_ents);
+    UNPACK_EH(buff_ptr, &members[0], num_ents);
+    for (int i = 0; i < num_ents; i++) {
+      result = mbImpl->add_child_meshset(*rit, members[i]); RR;
+    }
+  }
+  
+  return MB_SUCCESS;
+}
+
+MBErrorCode MBParallelComm::pack_adjacencies(MBRange &entities,
+                                             MBRange::const_iterator &start_rit,
+                                             MBRange &whole_range,
+                                             unsigned char *&buff_ptr,
+                                             int &count,
+                                             const bool just_count)
+{
+  return MB_FAILURE;
+}
+
+MBErrorCode MBParallelComm::unpack_adjacencies(unsigned char *&buff_ptr,
+                                               MBRange &entities)
+{
+  return MB_FAILURE;
+}
+
+MBErrorCode MBParallelComm::pack_tags(MBRange &entities,
+                                      MBRange::const_iterator &start_rit,
+                                      MBRange &whole_range,
+                                      unsigned char *&buff_ptr,
+                                      int &count,
+                                      const bool just_count)
+{
+    // tags
+    // get all the tags
+    // for dense tags, compute size assuming all entities have that tag
+    // for sparse tags, get number of entities w/ that tag to compute size
+
+  count = 0;
+  unsigned char *orig_buff_ptr = buff_ptr;
+  MBErrorCode result;
+  int whole_size = whole_range.size();
+
+  if (just_count) {
+
+    std::vector<MBTag> all_tags;
+    result = tagServer->get_tags(all_tags); RR;
+
+    for (std::vector<MBTag>::iterator tag_it = all_tags.begin(); tag_it != all_tags.end(); tag_it++) {
+      const TagInfo *tinfo = tagServer->get_tag_info(*tag_it);
+      int this_count = 0;
+      MBRange tmp_range;
+      if (PROP_FROM_TAG_HANDLE(*tag_it) == MB_TAG_DENSE) {
+        this_count += whole_size * tinfo->get_size();
+      }
+      else {
+        result = tagServer->get_entities(*tag_it, MBMAXTYPE, tmp_range); RR;
+        tmp_range = tmp_range.intersect(whole_range);
+        if (!tmp_range.empty()) this_count = tmp_range.size() * tinfo->get_size();
+      }
+
+      if (0 == this_count) continue;
+
+        // ok, we'll be sending this tag
+
+        // tag handle
+      allTags.push_back(*tag_it);
+      count += sizeof(MBTag);
+      
+        // default value
+      count += sizeof(int);
+      if (NULL != tinfo->default_value()) count += tinfo->get_size();
+      
+        // size
+      count += sizeof(int);
+      
+        // data type
+      count += sizeof(MBDataType);
+
+        // name
+      count += 64;
+
+      if (!tmp_range.empty()) {
+        tagRanges.push_back(tmp_range);
+          // range of tag
+        count += sizeof(int) + 2 * num_subranges(tmp_range) * sizeof(MBEntityHandle);
+      }
+      
+          // tag data values for range or vector
+      count += this_count;
+    }
+
+      // number of tags
+    count += sizeof(int);
+  }
+
+  else {
+    static std::vector<int> tag_data;
+    std::vector<MBRange>::const_iterator tr_it = tagRanges.begin();
+
+    PACK_INT(buff_ptr, allTags.size());
+    
+    for (std::vector<MBTag>::const_iterator tag_it = allTags.begin(); tag_it != allTags.end(); tag_it++) {
+
+      const TagInfo *tinfo = tagServer->get_tag_info(*tag_it);
+
+        // tag handle
+      PACK_EH(buff_ptr, &(*tag_it), 1);
+      
+        // size, data type
+      PACK_INT(buff_ptr, tinfo->get_size());
+      PACK_INT(buff_ptr, tinfo->get_data_type());
+      
+        // default value
+      if (NULL == tinfo->default_value()) {
+        PACK_INT(buff_ptr, 0);
+      }
+      else {
+        PACK_INT(buff_ptr, 1);
+        PACK_VOID(buff_ptr, tinfo->default_value(), tinfo->get_size());
+      }
+      
+        // name
+      PACK_CHAR_64(buff_ptr, tinfo->get_name().c_str());
+      
+      if (PROP_FROM_TAG_HANDLE(*tag_it) == MB_TAG_DENSE) {
+        tag_data.resize((whole_size+1) * tinfo->get_size() / sizeof(int));
+        result = mbImpl->tag_get_data(*tag_it, whole_range, &tag_data[0]); RR;
+        PACK_VOID(buff_ptr, &tag_data[0], whole_size*tinfo->get_size());
+      }
+      else {
+        tag_data.resize((tr_it->size()+1) * tinfo->get_size() / sizeof(int));
+        result = mbImpl->tag_get_data(*tag_it, *tr_it, &tag_data[0]); RR;
+        PACK_RANGE(buff_ptr, (*tr_it));
+        PACK_VOID(buff_ptr, &tag_data[0], tr_it->size()*tinfo->get_size());
+        tr_it++;
+      }
+      
+    }
+
+    count = buff_ptr - orig_buff_ptr;
+  }
+  
+  return MB_SUCCESS;
+}
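+
+// Resulting tag-section layout (descriptive sketch of the code above):
+//
+//   [int: #tags]
+//   then, per tag:
+//     [handle: tag handle][int: size][int: data type]
+//     [int: has default (0/1)][default value, if any]
+//     [64 chars: tag name]
+//     [tag values over whole_range]               (dense tags)
+//     [range: tagged entities][tag values]        (sparse tags)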
+
+MBErrorCode MBParallelComm::unpack_tags(unsigned char *&buff_ptr,
+                                        MBRange &entities)
+{
+    // unpack the tags; create each tag if it doesn't already exist,
+    // then set its data on the unpacked entities
+
+  MBErrorCode result;
+  
+  int num_tags;
+  UNPACK_INT(buff_ptr, num_tags);
+  std::vector<int> tag_data;
+
+  for (int i = 0; i < num_tags; i++) {
+    
+        // tag handle
+    MBTag tag_handle;
+    UNPACK_EH(buff_ptr, &tag_handle, 1);
+
+      // size, data type
+    int tag_size, tag_data_type;
+    UNPACK_INT(buff_ptr, tag_size);
+    UNPACK_INT(buff_ptr, tag_data_type);
+      
+      // default value
+    int has_def_value;
+    UNPACK_INT(buff_ptr, has_def_value);
+    void *def_val_ptr = NULL;
+    if (1 == has_def_value) {
+      def_val_ptr = buff_ptr;
+      buff_ptr += tag_size;
+    }
+    
+      // name
+    char *tag_name = reinterpret_cast<char *>(buff_ptr);
+    buff_ptr += 64;
+
+      // create the tag
+    MBTagType tag_type;
+    result = mbImpl->tag_get_type(tag_handle, tag_type); RR;
+
+    result = mbImpl->tag_create(tag_name, tag_size, tag_type, (MBDataType) tag_data_type, tag_handle,
+                                def_val_ptr);
+    if (MB_ALREADY_ALLOCATED == result) {
+        // already allocated tag, check to make sure it's the same size, type, etc.
+      const TagInfo *tag_info = tagServer->get_tag_info(tag_name);
+      if (tag_size != tag_info->get_size() ||
+          tag_data_type != tag_info->get_data_type() ||
+          ((def_val_ptr && !tag_info->default_value()) ||
+           (!def_val_ptr && tag_info->default_value())))
+        return MB_FAILURE;
+      MBTagType this_type;
+      result = mbImpl->tag_get_type(tag_handle, this_type);
+      if (MB_SUCCESS != result || this_type != tag_type) return MB_FAILURE;
+    }
+    else if (MB_SUCCESS != result) return result;
+    
+      // set the tag data
+    if (PROP_FROM_TAG_HANDLE(tag_handle) == MB_TAG_DENSE) {
+      if (NULL != def_val_ptr && tag_data_type != MB_TYPE_OPAQUE) {
+          // only set the tags whose values aren't the default value; only works
+          // if it's a known type
+        MBRange::iterator start_rit = entities.begin(), end_rit = start_rit;
+        MBRange set_ents;
+        while (start_rit != entities.end()) {
+            // skip entities whose packed value matches the default
+          while (start_rit != entities.end() &&
+                 ((tag_data_type == MB_TYPE_INTEGER && *((int*)def_val_ptr) == *((int*)buff_ptr)) ||
+                  (tag_data_type == MB_TYPE_DOUBLE && *((double*)def_val_ptr) == *((double*)buff_ptr)) ||
+                  (tag_data_type == MB_TYPE_HANDLE && *((MBEntityHandle*)def_val_ptr) == *((MBEntityHandle*)buff_ptr)))) {
+            start_rit++;
+            buff_ptr += tag_size;
+          }
+
+            // gather the following run of entities with non-default values
+          end_rit = start_rit;
+          unsigned char *run_start = buff_ptr;
+          set_ents.clear();
+          while (end_rit != entities.end() &&
+                 !((tag_data_type == MB_TYPE_INTEGER && *((int*)def_val_ptr) == *((int*)buff_ptr)) ||
+                   (tag_data_type == MB_TYPE_DOUBLE && *((double*)def_val_ptr) == *((double*)buff_ptr)) ||
+                   (tag_data_type == MB_TYPE_HANDLE && *((MBEntityHandle*)def_val_ptr) == *((MBEntityHandle*)buff_ptr)))) {
+            set_ents.insert(*end_rit);
+            end_rit++;
+            buff_ptr += tag_size;
+          }
+
+            // set the non-default values, starting from the beginning of the run
+          if (!set_ents.empty()) {
+            result = mbImpl->tag_set_data(tag_handle, set_ents, run_start); RR;
+          }
+          start_rit = end_rit;
+        }
+      }
+      else {
+        result = mbImpl->tag_set_data(tag_handle, entities, buff_ptr); RR;
+        buff_ptr += entities.size() * tag_size;
+      }
+    }
+    else {
+      MBRange tag_range;
+      UNPACK_RANGE(buff_ptr, tag_range);
+      result = mbImpl->tag_set_data(tag_handle, tag_range, buff_ptr); RR;
+      buff_ptr += tag_range.size() * tag_size;
+    }
+  }
+  
+  return MB_SUCCESS;
+}
+
+bool MBParallelComm::buffer_size(const unsigned int new_size) 
+{
+  unsigned int old_size = myBuffer.size();
+  myBuffer.reserve(new_size);
+  return (new_size == old_size);
+}
+
+void MBParallelComm::take_buffer(std::vector<unsigned char> &new_buffer) 
+{
+  new_buffer.swap(myBuffer);
+}
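+
+// Illustrative pairing of these two helpers (local variable names are
+// assumed): size the buffer before the packing pass, then hand the
+// packed bytes off without copying.
+//
+//   pcomm.buffer_size(buff_size);
+//   ... pack ...
+//   std::vector<unsigned char> sendable;
+//   pcomm.take_buffer(sendable);   // swaps; myBuffer is left empty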
+
+MBErrorCode MBParallelComm::resolve_shared_ents(MBRange &proc_ents,
+                                                const int dim) 
+{
+  MBRange::iterator rit;
+  MBSkinner skinner(mbImpl);
+  
+    // get the skin entities by dimension
+  MBRange skin_ents[3];
+  MBErrorCode result;
+  int upper_dim = MBCN::Dimension(TYPE_FROM_HANDLE(*proc_ents.begin()));
+
+  if (upper_dim > 0) {
+      // first get d-1 skin ents
+    result = skinner.find_skin(proc_ents, skin_ents[upper_dim-1],
+                               skin_ents[upper_dim-1]);
+    if (MB_SUCCESS != result) return result;
+      // then get d-2, d-3, etc. entities adjacent to skin ents 
+    for (int this_dim = upper_dim-1; this_dim >= 0; this_dim--) {
+      result = mbImpl->get_adjacencies(skin_ents[upper_dim-1], this_dim,
+                                       true, skin_ents[this_dim]);
+      if (MB_SUCCESS != result) return result;
+    }
+  }
+  else skin_ents[0] = proc_ents;
+  
+    // global id tag
+  MBTag gid_tag; int def_val = -1;
+  result = mbImpl->tag_create(GLOBAL_ID_TAG_NAME, sizeof(int),
+                              MB_TAG_DENSE, MB_TYPE_INTEGER, gid_tag,
+                              &def_val, true);
+  if (MB_FAILURE == result) return result;
+  else if (MB_ALREADY_ALLOCATED != result) {
+      // just created it, so we need global ids
+    result = assign_global_ids(dim);
+    if (MB_SUCCESS != result) return result;
+  }
+
+    // store index in temp tag; reuse gid_data 
+  std::vector<int> gid_data(skin_ents[0].size());
+  int idx = 0;
+  for (MBRange::iterator rit = skin_ents[0].begin(); 
+       rit != skin_ents[0].end(); rit++) 
+    gid_data[idx] = idx, idx++;
+  MBTag idx_tag;
+  result = mbImpl->tag_create("__idx_tag", sizeof(int), MB_TAG_DENSE,
+                              MB_TYPE_INTEGER, idx_tag, &def_val, true);
+  if (MB_SUCCESS != result && MB_ALREADY_ALLOCATED != result) return result;
+  result = mbImpl->tag_set_data(idx_tag, skin_ents[0], &gid_data[0]);
+  if (MB_SUCCESS != result) return result;
+
+    // get gids for skin verts in a vector, to pass to gs
+  result = mbImpl->tag_get_data(gid_tag, skin_ents[0], &gid_data[0]);
+  if (MB_SUCCESS != result) return result;
+
+    // get a crystal router
+  crystal_data *cd = procConfig.crystal_router();
+  
+    // get total number of verts; will overshoot highest global id, but
+    // that's ok
+  int nverts_total, nverts_local;
+  result = mbImpl->get_number_entities_by_dimension(0, 0, nverts_local);
+  if (MB_SUCCESS != result) return result;
+  int failure = MPI_Allreduce(&nverts_local, &nverts_total, 1,
+                              MPI_INT, MPI_SUM, procConfig.proc_comm());
+  if (failure) return MB_FAILURE;
+  
+    // call gather-scatter to get shared ids & procs
+  gs_data *gsd = gs_data_setup(skin_ents[0].size(), (const ulong_*)&gid_data[0], 1, cd);
+  if (NULL == gsd) return MB_FAILURE;
+  
+    // get shared proc tags
+#define MAX_SHARING_PROCS 10  
+  int def_vals[2] = {-10*procConfig.proc_size(), -10*procConfig.proc_size()};
+  MBTag sharedp_tag, sharedps_tag;
+  result = mbImpl->tag_create(PARALLEL_SHARED_PROC_TAG_NAME, 2*sizeof(int), 
+                              MB_TAG_DENSE,
+                              MB_TYPE_INTEGER, sharedp_tag, &def_vals, true);
+  if (MB_SUCCESS != result && MB_ALREADY_ALLOCATED != result) return result;
+  result = mbImpl->tag_create(PARALLEL_SHARED_PROCS_TAG_NAME, 
+                              MAX_SHARING_PROCS*sizeof(int), 
+                              MB_TAG_SPARSE,
+                              MB_TYPE_INTEGER, sharedps_tag, NULL, true);
+  if (MB_SUCCESS != result && MB_ALREADY_ALLOCATED != result) return result;
+
+    // load shared vertices into a tuple, then sort by index
+  tuple_list shared_verts;
+  tuple_list_init_max(&shared_verts, 0, 2, 0, 
+                      skin_ents[0].size()*MAX_SHARING_PROCS);
+  int i = 0, j = 0;
+  for (unsigned int p = 0; p < gsd->nlinfo->np; p++) 
+    for (unsigned int np = 0; np < gsd->nlinfo->nshared[p]; np++) 
+      shared_verts.vl[i++] = gsd->nlinfo->sh_ind[j++],
+        shared_verts.vl[i++] = gsd->nlinfo->target[p],
+        shared_verts.n++;
+  std::vector<int> sort_buffer(skin_ents[0].size()*MAX_SHARING_PROCS);
+  tuple_list_sort(&shared_verts, 0,(buffer*)&sort_buffer[0]);
+
+    // set sharing procs tags on skin vertices
+  int maxp = -10*procConfig.proc_size();
+  int sharing_procs[MAX_SHARING_PROCS];
+  std::fill(sharing_procs, sharing_procs+MAX_SHARING_PROCS, maxp);
+  j = 0;
+  while (j < 2*shared_verts.n) {
+      // count & accumulate sharing procs
+    int nump = 0, this_idx = shared_verts.vl[j];
+    while (j < 2*shared_verts.n && shared_verts.vl[j] == this_idx)
+      j++, sharing_procs[nump++] = shared_verts.vl[j++];
+
+    sharing_procs[nump++] = procConfig.proc_rank();
+    MBEntityHandle this_ent = skin_ents[0][this_idx];
+    if (2 == nump)
+      result = mbImpl->tag_set_data(sharedp_tag, &this_ent, 1,
+                                    sharing_procs);
+    else
+      result = mbImpl->tag_set_data(sharedps_tag, &this_ent, 1,
+                                    sharing_procs);
+    if (MB_SUCCESS != result) return result;
+
+      // reset sharing proc(s) tags
+    std::fill(sharing_procs, sharing_procs+nump, maxp);
+  }
+  
+    // set sharing procs tags on other skin ents
+  const MBEntityHandle *connect; int num_connect;
+  for (int d = dim-1; d > 0; d--) {
+    for (MBRange::iterator rit = skin_ents[d].begin();
+         rit != skin_ents[d].end(); rit++) {
+        // get connectivity
+      result = mbImpl->get_connectivity(*rit, connect, num_connect);
+      if (MB_SUCCESS != result) return result;
+      MBRange sp_range, vp_range;
+      for (int nc = 0; nc < num_connect; nc++) {
+          // get sharing procs
+        result = mbImpl->tag_get_data(sharedp_tag, &(*rit), 1, sharing_procs);
+        if (MB_SUCCESS != result) return result;
+        if (sharing_procs[0] == maxp) {
+          result = mbImpl->tag_get_data(sharedps_tag, &(*rit), 1, sharing_procs);
+          if (MB_SUCCESS != result) return result;
+        }
+          // build range of sharing procs for this vertex
+        unsigned int p = 0; vp_range.clear();
+        while (p < MAX_SHARING_PROCS && sharing_procs[p] != maxp)
+          vp_range.insert(sharing_procs[p]), p++;
+        assert(p < MAX_SHARING_PROCS);
+          // intersect with range for this skin ent
+        if (0 != nc) sp_range = sp_range.intersect(vp_range);
+        else sp_range = vp_range;
+      }
+        // intersection is the owning proc(s) for this skin ent; should
+        // not be empty
+      assert(!sp_range.empty());
+      MBRange::iterator rit2;
+        // set tag for this ent
+      for (j = 0, rit2 = sp_range.begin(); 
+           rit2 != sp_range.end(); rit2++, j++)
+        sharing_procs[j] = *rit2;
+      if (2 >= j)
+        result = mbImpl->tag_set_data(sharedp_tag, &(*rit), 1,
+                                      sharing_procs);
+      else
+        result = mbImpl->tag_set_data(sharedps_tag, &(*rit), 1,
+                                      sharing_procs);
+
+      if (MB_SUCCESS != result) return result;
+      
+        // reset sharing proc(s) tags
+      std::fill(sharing_procs, sharing_procs+j, maxp);
+    }
+  }
+
+    // done
+  return result;
+}
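+
+// Illustrative sketch of reading back the sharing information this
+// routine stores (tag names come from MBParallelConventions.h; 'mb',
+// 'vert', and 'nprocs' are assumed caller variables):
+//
+//   MBTag sp_tag;
+//   int sp[2];
+//   mb->tag_get_handle(PARALLEL_SHARED_PROC_TAG_NAME, sp_tag);
+//   mb->tag_get_data(sp_tag, &vert, 1, sp);
+//   if (sp[0] == -10*nprocs) { /* not shared between exactly two procs;
+//                                 check the sparse SHARED_PROCS tag */ }
+//   else { /* sp holds the two sharing ranks */ }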
+
+
+  
+  
+#ifdef TEST_PARALLELCOMM
+
+#include <iostream>
+
+#include "MBCore.hpp"
+#include "MBParallelComm.hpp"
+#include "MBRange.hpp"
+
+int main(int argc, char* argv[])
+{
+
+    // Check command line arg
+  if (argc < 2)
+  {
+    std::cout << "Usage: " << argv[0] << " <mesh_file_name>" << std::endl;
+    exit(1);
+  }
+
+  const char* file = argv[1];
+  MBCore *my_impl = new MBCore(0, 2);
+  MBInterface* mbImpl = my_impl;
+
+    // create a communicator class, which will start mpi too
+  MBParallelComm pcomm(mbImpl);
+  MBErrorCode result;
+
+    // load the mesh
+  result = mbImpl->load_mesh(file, 0, 0);
+  if (MB_SUCCESS != result) return result;
+
+    // get the mesh
+  MBRange all_mesh, whole_range;
+  result = mbImpl->get_entities_by_dimension(0, 3, all_mesh);
+  if (MB_SUCCESS != result) return result;
+    
+  int buff_size;
+  result = pcomm.pack_buffer(all_mesh, false, true, true, whole_range, buff_size); RR;
+
+    // allocate space in the buffer
+  pcomm.buffer_size(buff_size);
+
+    // pack the actual buffer
+  int actual_buff_size;
+  result = pcomm.pack_buffer(whole_range, false, true, false, all_mesh, actual_buff_size); RR;
+
+    // list the entities that got packed
+  std::cout << "ENTITIES PACKED:" << std::endl;
+  mbImpl->list_entities(all_mesh);
+
+    // get the buffer
+  std::vector<unsigned char> tmp_buffer;
+  pcomm.take_buffer(tmp_buffer);
+    
+    // stop and restart MOAB
+  delete mbImpl;
+  my_impl = new MBCore(1, 2);
+  mbImpl = my_impl;
+    
+    // create a new communicator class, using our old buffer
+  MBParallelComm pcomm2(mbImpl, tmp_buffer);
+
+    // unpack the results
+  all_mesh.clear();
+  result = pcomm2.unpack_buffer(all_mesh); RR;
+  std::cout << "ENTITIES UNPACKED:" << std::endl;
+  mbImpl->list_entities(all_mesh);
+  
+  std::cout << "Success, processor " << mbImpl->proc_rank() << "." << std::endl;
+  
+  return 0;
+}
+#endif

Added: MOAB/trunk/parallel/MBParallelComm.hpp
===================================================================
--- MOAB/trunk/parallel/MBParallelComm.hpp	                        (rev 0)
+++ MOAB/trunk/parallel/MBParallelComm.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,191 @@
+/**
+ * MOAB, a Mesh-Oriented datABase, is a software component for creating,
+ * storing and accessing finite element mesh data.
+ * 
+ * Copyright 2004 Sandia Corporation.  Under the terms of Contract
+ * DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government
+ * retains certain rights in this software.
+ * 
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ * 
+ */
+
+/**
+ * \class MBParallelComm
+ * \brief Parallel communications in MOAB
+ * \author Tim Tautges
+ *
+ *  This class implements methods to communicate mesh between processors
+ *
+ */
+
+#ifndef MB_PARALLEL_COMM_HPP
+#define MB_PARALLEL_COMM_HPP
+
+#include "MBForward.hpp"
+#include "MBRange.hpp"
+#include "MBProcConfig.hpp"
+
+class TagServer;
+class EntitySequenceManager;
+
+class MBParallelComm 
+{
+public:
+
+    //! constructor
+  MBParallelComm(MBInterface *impl,
+                 MPI_Comm comm = MPI_COMM_WORLD);
+
+    //! constructor taking packed buffer, for testing
+  MBParallelComm(MBInterface *impl,
+                 std::vector<unsigned char> &tmp_buff,
+                 MPI_Comm comm = MPI_COMM_WORLD);
+
+    //! assign a global id space, for largest-dimension or all entities (and
+    //! in either case for vertices too)
+  MBErrorCode assign_global_ids(const int dimension,
+                                const int start_id = 1,
+                                const bool largest_dim_only = true);
+
+    //! communicate entities from/to this range
+  MBErrorCode communicate_entities(const int from_proc, const int to_proc,
+                                   MBRange &entities,
+                                   const bool adjacencies = false,
+                                   const bool tags = true);
+  
+  MBErrorCode broadcast_entities( const int from_proc,
+                                  MBRange& entities,
+                                  const bool adjacencies = false,
+                                  const bool tags = true );
+
+    /** Resolve shared entities between processors
+     * Resolve shared entities between processors for entities in proc_ents,
+     * by comparing global id tag values on vertices on skin of elements in
+     * proc_ents.  Shared entities are assigned a tag that's either
+     * PARALLEL_SHARED_PROC_TAG_NAME, which is 2 integers in length, or 
+     * PARALLEL_SHARED_PROCS_TAG_NAME, whose length depends on the maximum
+     * number of sharing processors.  Values in these tags denote the ranks
+     * of sharing processors, and the list ends with the value -10*#procs.
+     *
+     * \param proc_ents Entities for which to resolve shared entities
+     * \param dim Dimension of entities in proc_ents
+     */
+  MBErrorCode resolve_shared_ents(MBRange &proc_ents, const int dim);
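+
+    /* Illustrative call sequence (sketch only; 'mb' and 'pc' are assumed
+     * to be an MBInterface* and an MBParallelComm, respectively):
+     *   MBRange elems;
+     *   mb->get_entities_by_dimension(0, 3, elems);
+     *   pc.resolve_shared_ents(elems, 3);
+     * Afterwards, shared skin vertices carry the sharing-processor tags
+     * described above.
+     */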
+  
+    //! pack a buffer (stored in this class instance) with ALL data for these entities
+  MBErrorCode pack_buffer(MBRange &entities, 
+                          const bool adjacencies,
+                          const bool tags,
+                          const bool just_count,
+                          MBRange &whole_range,
+                          int &buff_size);
+  
+    //! unpack a buffer; assume information is already in myBuffer
+  MBErrorCode unpack_buffer(MBRange &entities);
+
+    //! set the buffer size; return true if size actually changed
+  bool buffer_size(const unsigned int new_size);
+
+    //! take the buffer from this instance; switches with vector passed in
+  void take_buffer(std::vector<unsigned char> &new_buff);
+
+    //! Get proc config for this communication object
+  const MBProcConfig &proc_config() const {return procConfig;}
+  
+      
+private:
+
+  int num_subranges(const MBRange &this_range);
+  
+  MBErrorCode pack_entities(MBRange &entities,
+                            MBRange::const_iterator &start_rit,
+                            MBRange &whole_range,
+                            unsigned char *&buff_ptr,
+                            int &count,
+                            const bool just_count);
+  
+  MBErrorCode unpack_entities(unsigned char *&buff_ptr,
+                              MBRange &entities);
+  
+  MBErrorCode pack_sets(MBRange &entities,
+                        MBRange::const_iterator &start_rit,
+                        MBRange &whole_range,
+                        unsigned char *&buff_ptr,
+                        int &count,
+                        const bool just_count);
+  
+  MBErrorCode unpack_sets(unsigned char *&buff_ptr,
+                          MBRange &entities);
+  
+  MBErrorCode pack_adjacencies(MBRange &entities,
+                               MBRange::const_iterator &start_rit,
+                               MBRange &whole_range,
+                               unsigned char *&buff_ptr,
+                               int &count,
+                               const bool just_count);
+
+  MBErrorCode unpack_adjacencies(unsigned char *&buff_ptr,
+                                 MBRange &entities);
+  
+  MBErrorCode pack_tags(MBRange &entities,
+                        MBRange::const_iterator &start_rit,
+                        MBRange &whole_range,
+                        unsigned char *&buff_ptr,
+                        int &count,
+                        const bool just_count);
+
+  MBErrorCode unpack_tags(unsigned char *&buff_ptr,
+                          MBRange &entities);
+  
+
+    //! MB interface associated with this writer
+  MBInterface *mbImpl;
+
+    //! Proc config object, keeps info on parallel stuff
+  MBProcConfig procConfig;
+  
+    //! Tag server, so we can get more info about tags
+  TagServer *tagServer;
+  
+    //! Sequence manager, to get more efficient access to entities
+  EntitySequenceManager *sequenceManager;
+  
+    //! data buffer used to communicate
+  std::vector<unsigned char> myBuffer;
+
+    //! types of ranges to be communicated
+  std::vector<MBEntityType> entTypes;
+
+    //! ranges to be communicated
+  std::vector<MBRange> allRanges;
+  
+    //! vertices per entity in ranges
+  std::vector<int> vertsPerEntity;
+
+    //! sets to be communicated
+  MBRange setRange;
+  
+    //! ranges from sets to be communicated
+  std::vector<MBRange> setRanges;
+  
+    //! sizes of vector-based sets to be communicated
+  std::vector<int> setSizes;
+
+    //! tags to be communicated
+  std::vector<MBTag> allTags;
+
+    //! ranges from sparse tags to be communicated
+  std::vector<MBRange> tagRanges;
+
+    //! vector of set options for transferred sets
+  std::vector<unsigned int> optionsVec;
+  
+    //! numbers of parents/children for transferred sets
+  std::vector<int> setPcs;
+};
+
+#endif

Added: MOAB/trunk/parallel/MBProcConfig.cpp
===================================================================
--- MOAB/trunk/parallel/MBProcConfig.cpp	                        (rev 0)
+++ MOAB/trunk/parallel/MBProcConfig.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,54 @@
+/**
+ * MOAB, a Mesh-Oriented datABase, is a software component for creating,
+ * storing and accessing finite element mesh data.
+ * 
+ * Copyright 2004 Sandia Corporation.  Under the terms of Contract
+ * DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government
+ * retains certain rights in this software.
+ * 
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ * 
+ */
+
+#include "MBProcConfig.hpp"
+
+//! Constructor
+MBProcConfig::MBProcConfig(MPI_Comm proc_comm) 
+    : procComm(proc_comm),
+      crystalInit(false)
+{
+#ifdef USE_MPI
+  int rank, size;
+  MPI_Comm_rank(procComm, &rank); 
+  procRank = (unsigned int) rank;
+  MPI_Comm_size(procComm, &size); 
+  procSize = (unsigned int) size;
+#else
+  procRank = 0;
+  procSize = 1;
+#endif
+}
+
+crystal_data *MBProcConfig::crystal_router(bool construct_if_missing) 
+{
+  if (!crystalInit && construct_if_missing)
+#ifdef USE_MPI
+    crystal_init(&crystalData, procComm), crystalInit = true;
+#else
+  ;
+#endif
+
+  return &crystalData;
+}
+
+MBProcConfig::~MBProcConfig() 
+{
+#ifdef USE_MPI
+  if (crystalInit) 
+    crystal_free(&crystalData), crystalInit = false;
+#endif
+}
+

Added: MOAB/trunk/parallel/MBProcConfig.hpp
===================================================================
--- MOAB/trunk/parallel/MBProcConfig.hpp	                        (rev 0)
+++ MOAB/trunk/parallel/MBProcConfig.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,81 @@
+/**
+ * MOAB, a Mesh-Oriented datABase, is a software component for creating,
+ * storing and accessing finite element mesh data.
+ * 
+ * Copyright 2004 Sandia Corporation.  Under the terms of Contract
+ * DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government
+ * retains certain rights in this software.
+ * 
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ * 
+ */
+
+#ifndef MB_PROC_CONFIG_HPP
+#define MB_PROC_CONFIG_HPP
+
+#include "MBTypes.h"
+#include "MBRange.hpp"
+
+class MBInterface;
+
+
+#ifdef USE_MPI
+#include "mpi.h"
+extern "C" 
+{
+#include "types.h"
+#include "errmem.h"
+#include "crystal.h"
+}
+#else
+typedef int MPI_Comm;
+#define MPI_COMM_WORLD 0
+typedef void* crystal_data;
+#endif
+
+/**\brief Multi-CPU information for parallel MOAB */
+class MBProcConfig {
+public:
+
+  MBProcConfig(MPI_Comm proc_comm = MPI_COMM_WORLD);
+  
+  ~MBProcConfig();
+  
+    //! Get the current processor number
+  unsigned proc_rank() const 
+    { return procRank; }
+      
+    //! Get the number of processors
+  unsigned proc_size() const 
+    { return procSize; }
+      
+    //! get a crystal router for this parallel job
+  crystal_data *crystal_router(bool construct_if_missing = true);
+
+    //! get/set the communicator for this proc config
+  MPI_Comm proc_comm() {return procComm;}
+  void proc_comm(MPI_Comm this_comm) {procComm = this_comm;}
+  
+private:
+
+    //! MPI communicator set for this instance
+  MPI_Comm procComm;
+
+    //! rank of this processor
+  unsigned procRank;
+  
+    //! number of processors
+  unsigned procSize;
+
+    //! whether the crystal router's been initialized or not
+  bool crystalInit;
+  
+    //! crystal router for this parallel job
+  crystal_data crystalData;
+  
+};
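+
+/* Illustrative usage (sketch; assumes an MPI build):
+ *   MBProcConfig pc(MPI_COMM_WORLD);
+ *   unsigned me = pc.proc_rank(), np = pc.proc_size();
+ *   crystal_data *cd = pc.crystal_router();  // lazily constructed
+ */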
+
+#endif

Added: MOAB/trunk/parallel/Makefile.am
===================================================================
--- MOAB/trunk/parallel/Makefile.am	                        (rev 0)
+++ MOAB/trunk/parallel/Makefile.am	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,59 @@
+# Don't require GNU-standard files (Changelog, README, etc.)
+AUTOMAKE_OPTIONS = foreign
+
+# Subdirectories to build
+SUBDIRS = 
+
+# Things to build
+noinst_LTLIBRARIES = libMOABpar.la
+         
+# Some variables
+DEFS = $(DEFINES) -DIS_BUILDING_MB
+INCLUDES += -I$(top_builddir) 
+
+# The directory in which to install headers
+libMOABpar_la_includedir = $(includedir)
+
+
+# Conditional sources
+MOAB_PARALLEL_SRCS =
+MOAB_PARALLEL_HDRS =
+if USE_MPI
+  MOAB_PARALLEL_SRCS += \
+     MBParallelComm.cpp \
+     MBParallelConventions.h \
+     MBProcConfig.cpp \
+     ReadParallel.cpp \
+     crystal.c crystal.h errmem.h errmem.c \
+     transfer.c gs.c gs.h tuple_list.c \
+     tuple_list.h types.h \
+     sort.c sort.h
+
+  MOAB_PARALLEL_HDRS += \
+     MBParallelComm.hpp \
+     MBProcConfig.hpp \
+     ReadParallel.hpp 
+
+if PARALLEL_HDF5
+#  libMOABpar_la_LIBADD = $(top_builddir)/mhdf/libmhdf.la
+  INCLUDES += -I$(top_srcdir)/mhdf/include
+  MOAB_PARALLEL_SRCS += WriteHDF5Parallel.cpp 
+  MOAB_PARALLEL_HDRS += WriteHDF5Parallel.hpp
+endif
+
+endif
+
+# The list of source files, and any header files that do not need to be installed
+libMOABpar_la_SOURCES = \
+   $(MOAB_PARALLEL_SRCS)
+
+# The list of header files which are to be installed
+libMOABpar_la_include_HEADERS = \
+  $(MOAB_PARALLEL_HDRS)
+
+# Tests and such
+
+#moab_test_SOURCES = MBTest.cpp
+#moab_test_LDADD = $(top_builddir)/libMOAB.la
+#moab_test_DEPENDENCIES = test/mb_big_test.g test/cell1.gen test/cell2.gen $(top_builddir)/libMOAB.la
+

Added: MOAB/trunk/parallel/ReadParallel.cpp
===================================================================
--- MOAB/trunk/parallel/ReadParallel.cpp	                        (rev 0)
+++ MOAB/trunk/parallel/ReadParallel.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,295 @@
+#include "ReadParallel.hpp"
+#include "MBCore.hpp"
+#include "MBProcConfig.hpp"
+#include "FileOptions.hpp"
+#include "MBError.hpp"
+#include "MBReaderWriterSet.hpp"
+#include "MBReadUtilIface.hpp"
+#include "MBParallelComm.hpp"
+#include "MBCN.hpp"
+
+#define RR if (MB_SUCCESS != result) return result
+
+MBErrorCode ReadParallel::load_file(const char *file_name,
+                                    MBEntityHandle& file_set,
+                                    const FileOptions &opts,
+                                    const int* material_set_list,
+                                    const int num_material_sets ) 
+{
+  MBError *merror = ((MBCore*)mbImpl)->get_error_handler();
+
+  MBCore *impl = dynamic_cast<MBCore*>(mbImpl);
+  
+    // Get parallel settings
+  int parallel_mode;
+  const char* parallel_opts[] = { "NONE", "BCAST", "BCAST_DELETE", "SCATTER", 
+                                  "FORMAT", 0 };
+  enum ParallelOpts {POPT_NONE=0, POPT_BCAST, POPT_BCAST_DELETE, POPT_SCATTER,
+                     POPT_FORMAT, POPT_LAST};
+      
+  MBErrorCode rval = opts.match_option( "PARALLEL", parallel_opts, 
+                                        parallel_mode );
+  if (MB_FAILURE == rval) {
+    merror->set_last_error( "Unexpected value for 'PARALLEL' option\n" );
+    return MB_FAILURE;
+  }
+  else if (MB_ENTITY_NOT_FOUND == rval) {
+    parallel_mode = 0;
+  }
+    // Get partition setting
+  std::string partition_tag_name;
+  rval = opts.get_option("PARTITION", partition_tag_name);
+  if (MB_ENTITY_NOT_FOUND == rval || partition_tag_name.empty())
+    partition_tag_name += "PARTITION";
+
+    // get MPI IO processor rank
+  int reader_rank;
+  rval = opts.get_int_option( "MPI_IO_RANK", reader_rank );
+  if (MB_ENTITY_NOT_FOUND == rval)
+    reader_rank = 0;
+  else if (MB_SUCCESS != rval) {
+    merror->set_last_error( "Unexpected value for 'MPI_IO_RANK' option\n" );
+    return MB_FAILURE;
+  }
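+
+    // Example option string (illustrative only; assumes the usual
+    // semicolon-separated FileOptions syntax):
+    //   "PARALLEL=BCAST_DELETE;PARTITION=MATERIAL_SET;MPI_IO_RANK=0"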
+  
+    // now that we've parsed all the parallel options, return
+    // failure for most of them because we haven't implemented 
+    // most of them yet.
+  if (parallel_mode == POPT_FORMAT) {
+    merror->set_last_error( "Access to format-specific parallel read not implemented.\n");
+    return MB_NOT_IMPLEMENTED;
+  }
+
+  if (parallel_mode == POPT_SCATTER) {
+    merror->set_last_error( "Partitioning for PARALLEL=SCATTER not supported yet.\n");
+    return MB_NOT_IMPLEMENTED;
+  }
+
+  if (parallel_mode != POPT_SCATTER || 
+      reader_rank == (int)(mbImpl->proc_rank())) {
+      // Try using the file extension to select a reader
+    const MBReaderWriterSet* set = impl->reader_writer_set();
+    MBReaderIface* reader = set->get_file_extension_reader( file_name );
+    if (reader)
+    { 
+      rval = reader->load_file( file_name, file_set, opts, 
+                                material_set_list, num_material_sets );
+      delete reader;
+    }
+    else
+    {  
+        // Try all the readers
+      MBReaderWriterSet::iterator iter;
+      for (iter = set->begin(); iter != set->end(); ++iter)
+      {
+        MBReaderIface* reader = iter->make_reader( mbImpl );
+        if (NULL != reader)
+        {
+          rval = reader->load_file( file_name, file_set, opts, 
+                                    material_set_list, num_material_sets );
+          delete reader;
+          if (MB_SUCCESS == rval)
+            break;
+        }
+      }
+    }
+  }
+  else {
+    rval = MB_SUCCESS;
+  }
+  
+  if (parallel_mode == POPT_BCAST ||
+      parallel_mode == POPT_BCAST_DELETE) {
+    MBRange entities; 
+    if (MB_SUCCESS == rval && 
+        reader_rank == (int)(mbImpl->proc_rank())) {
+      rval = mbImpl->get_entities_by_handle( file_set, entities );
+      if (MB_SUCCESS != rval)
+        entities.clear();
+    }
+    
+    MBParallelComm tool( mbImpl);
+    MBErrorCode tmp_rval = tool.broadcast_entities( reader_rank, entities );
+    if (MB_SUCCESS != rval && mbImpl->proc_size() != 1)
+      tmp_rval = rval;
+    else if (MB_SUCCESS != rval) rval = MB_SUCCESS;
+      
+    if (MB_SUCCESS == rval && 
+        reader_rank != (int)(mbImpl->proc_rank())) {
+      rval = mbImpl->create_meshset( MESHSET_SET, file_set );
+      if (MB_SUCCESS == rval) {
+        rval = mbImpl->add_entities( file_set, entities );
+        if (MB_SUCCESS != rval) {
+          mbImpl->delete_entities( &file_set, 1 );
+          file_set = 0;
+        }
+      }
+    }
+
+    if (parallel_mode == POPT_BCAST_DELETE)
+      rval = delete_nonlocal_entities(partition_tag_name, file_set);
+    
+  }
+  
+  return rval;
+}
+
+MBErrorCode ReadParallel::delete_nonlocal_entities(std::string &ptag_name,
+                                                   MBEntityHandle file_set) 
+{
+  MBRange partition_sets;
+  MBErrorCode result;
+
+  MBTag ptag;
+  result = mbImpl->tag_get_handle(ptag_name.c_str(), ptag); RR;
+  
+  result = mbImpl->get_entities_by_type_and_tag(file_set, MBENTITYSET,
+                                                &ptag, NULL, 1,
+                                                partition_sets); RR;
+
+  return delete_nonlocal_entities(partition_sets, file_set);
+}
+
+MBErrorCode ReadParallel::delete_nonlocal_entities(MBRange &partition_sets,
+                                                   MBEntityHandle file_set) 
+{
+  MBErrorCode result;
+  MBError *merror = ((MBCore*)mbImpl)->get_error_handler();
+
+    // get the entities in the partition sets, plus all entities related to/used by them
+  std::string iface_name = "MBReadUtilIface";
+  MBReadUtilIface *read_iface;
+  mbImpl->query_interface(iface_name, reinterpret_cast<void**>(&read_iface));
+  MBRange partition_ents, all_sets;
+  result = read_iface->gather_related_ents(partition_sets, partition_ents,
+                                           &all_sets);
+  RR;
+
+    // get pre-existing entities
+  MBRange file_ents;
+  result = mbImpl->get_entities_by_handle(file_set, file_ents); RR;
+
+    // get deletable entities by subtracting partition ents from file ents
+  MBRange deletable_ents = file_ents.subtract(partition_ents);
+
+    // cache deletable vs. keepable sets
+  MBRange deletable_sets = all_sets.intersect(deletable_ents);
+  MBRange keepable_sets = all_sets.subtract(deletable_sets);
+  
+    // remove deletable ents from all keepable sets
+  for (MBRange::iterator rit = keepable_sets.begin();
+       rit != keepable_sets.end(); rit++) {
+    result = mbImpl->remove_entities(*rit, deletable_ents); RR;
+  }
+
+    // delete sets, then ents
+  result = mbImpl->delete_entities(deletable_sets); RR;
+
+  deletable_ents = deletable_ents.subtract(deletable_sets);
+  result = mbImpl->delete_entities(deletable_ents); RR;
+  
+  result = ((MBCore*)mbImpl)->check_adjacencies();
+
+  return result;
+
+/*  
+
+
+// ================================  
+    // get entities in this partition
+  int my_rank = (int)mbImpl->proc_config().rank();
+  if (my_rank == 0 && mbImpl->proc_config().size() == 1) my_rank = 1;
+  int *my_rank_ptr = &my_rank;
+  MBTag partition_tag;
+  
+  result = mbImpl->tag_get_handle(partition_name.c_str(), partition_tag);
+  if (MB_TAG_NOT_FOUND == result) {
+    merror->set_last_error( "Couldn't find partition tag\n");
+    return result;
+  }
+  else if (MB_SUCCESS != result) return result;
+    
+  MBRange partition_sets;
+  result = mbImpl->get_entities_by_type_and_tag(file_set, MBENTITYSET,
+                                                &partition_tag, 
+                                                (const void* const *) &my_rank_ptr, 
+                                                1, partition_sets); RR;
+  if (MB_SUCCESS != result || partition_sets.empty()) return result;
+  
+  MBRange file_ents, partition_ents, exist_ents, all_ents;
+
+  for (MBRange::iterator rit = partition_sets.begin(); 
+       rit != partition_sets.end(); rit++) {
+    result = mbImpl->get_entities_by_handle(*rit, partition_ents, 
+                                            MBInterface::UNION); RR;
+  }
+
+    // get pre-existing ents, which are all entities minus file ents
+  result = mbImpl->get_entities_by_handle(0, all_ents); RR;
+  result = mbImpl->get_entities_by_handle(file_set, file_ents); RR;
+  exist_ents = all_ents.subtract(file_ents);
+
+    // merge partition ents into pre-existing entities
+  exist_ents.merge(partition_ents);
+  
+    // gather adjacent ents of lower dimension and add to existing ents
+  MBRange tmp_ents;
+  for (int dim = 2; dim >= 0; dim--) {
+    MBEntityType lower_type = MBCN::TypeDimensionMap[dim+1].first,
+      upper_type = MBCN::TypeDimensionMap[3].second;
+    
+    MBRange::const_iterator bit = exist_ents.lower_bound(lower_type),
+      eit = exist_ents.upper_bound(upper_type);
+    MBRange from_ents;
+    from_ents.merge(bit, eit);
+    tmp_ents.clear();
+    result = mbImpl->get_adjacencies(from_ents, dim, false, tmp_ents, 
+                                     MBInterface::UNION); RR;
+    exist_ents.merge(tmp_ents);
+  }
+  
+    // subtract from all ents to get deletable ents
+  all_ents = all_ents.subtract(exist_ents);
+  
+    // go through the sets to which ones we should keep
+  MBRange all_sets, deletable_sets;
+  result = mbImpl->get_entities_by_type(0, MBENTITYSET, all_sets);
+  for (MBRange::iterator rit = all_sets.begin(); rit != all_sets.end(); rit++) {
+    tmp_ents.clear();
+    result = mbImpl->get_entities_by_handle(*rit, tmp_ents, true); RR;
+    MBRange tmp_ents2 = tmp_ents.intersect(exist_ents);
+    
+      // if the intersection is empty, set is deletable
+    if (tmp_ents2.empty()) deletable_sets.insert(*rit);
+    
+    else if (tmp_ents.size() > tmp_ents2.size()) {
+        // more elements in set or contained sets than we're keeping; delete 
+        // the difference from just this set, to remove entities to be deleted below
+        // it's ok if entity isn't contained, doesn't generate an error
+      tmp_ents = tmp_ents.subtract(tmp_ents2);
+      result = mbImpl->remove_entities(*rit, tmp_ents); RR;
+    }
+  }
+
+    // take the deletable sets out of other sets so we don't end up
+    // with stale set handles
+  for (MBRange::iterator rit = all_sets.begin(); rit != all_sets.end(); rit++) {
+    if (deletable_sets.find(*rit) == deletable_sets.end()) {
+      result = mbImpl->remove_entities(*rit, deletable_sets); RR;
+    }
+  }
+
+    // remove sets from all_ents, since they're dealt with separately
+  all_ents = all_ents.subtract(all_sets);
+  
+    // now delete sets first, then ents
+  result = mbImpl->delete_entities(deletable_sets); RR;
+  result = mbImpl->delete_entities(all_ents); RR;
+  
+  result = ((MBCore*)mbImpl)->check_adjacencies();
+  
+  return result;
+
+*/
+}

Added: MOAB/trunk/parallel/ReadParallel.hpp
===================================================================
--- MOAB/trunk/parallel/ReadParallel.hpp	                        (rev 0)
+++ MOAB/trunk/parallel/ReadParallel.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,43 @@
+#ifndef READ_PARALLEL_HPP
+#define READ_PARALLEL_HPP
+
+#include "MBForward.hpp"
+#include "MBReaderIface.hpp"
+
+#include <string>
+
+class MBReadUtilIface;
+
+class ReadParallel : public MBReaderIface
+{
+   
+public:
+
+  static MBReaderIface* factory( MBInterface* );
+
+    //! load a file
+  MBErrorCode load_file(const char *file_name,
+                        MBEntityHandle& file_set,
+                        const FileOptions &opts,
+                        const int* material_set_list,
+                        const int num_material_sets );
+  
+    //! Constructor
+  ReadParallel(MBInterface* impl = NULL) {mbImpl = impl;}
+
+   //! Destructor
+  virtual ~ReadParallel() {}
+
+protected:
+
+private:
+  MBInterface *mbImpl;
+  
+  MBErrorCode delete_nonlocal_entities(std::string &ptag_name,
+                                       MBEntityHandle file_set);
+  
+  MBErrorCode delete_nonlocal_entities(MBRange &partition_sets,
+                                       MBEntityHandle file_set);
+};
+
+#endif

Added: MOAB/trunk/parallel/WriteHDF5Parallel.cpp
===================================================================
--- MOAB/trunk/parallel/WriteHDF5Parallel.cpp	                        (rev 0)
+++ MOAB/trunk/parallel/WriteHDF5Parallel.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,2132 @@
+
+#undef DEBUG
+
+#ifdef DEBUG
+#  include <stdio.h>
+#  include <stdarg.h>
+#endif
+
+#ifndef HDF5_FILE
+#  error Attempt to compile WriteHDF5Parallel with HDF5 support disabled
+#endif
+
+#include <stdlib.h>
+#include <string.h>
+
+#include <vector>
+#include <set>
+#include <map>
+#include <utility>
+
+#include <mpi.h>
+
+#include <H5Tpublic.h>
+#include <H5Ppublic.h>
+#include <H5FDmpi.h>
+#include <H5FDmpio.h>
+
+#include "mhdf.h"
+
+#include "MBInterface.hpp"
+#include "MBInternals.hpp"
+#include "MBTagConventions.hpp"
+#include "MBParallelConventions.h"
+#include "MBCN.hpp"
+#include "MBWriteUtilIface.hpp"
+
+#include "WriteHDF5Parallel.hpp"
+
+
+#ifdef DEBUG
+#  define START_SERIAL                     \
+     for (int _x = 0; _x < numProc; ++_x) {\
+       MPI_Barrier( MPI_COMM_WORLD );      \
+       if (_x != myRank) continue     
+#  define END_SERIAL                       \
+     }                                     \
+     MPI_Barrier( MPI_COMM_WORLD )
+#else
+#  define START_SERIAL
+#  define END_SERIAL
+#endif
+
+
+#define DEBUG_OUT_STREAM stdout
+
+#ifndef DEBUG
+static void printdebug( const char*, ... ) {}
+#else
+static void printdebug( const char* fmt, ... )
+{
+  int rank;
+  MPI_Comm_rank( MPI_COMM_WORLD, &rank );
+  fprintf( DEBUG_OUT_STREAM, "[%d] ", rank );
+  va_list args;
+  va_start( args, fmt );
+  vfprintf( DEBUG_OUT_STREAM, fmt, args );
+  va_end( args );
+  fflush( DEBUG_OUT_STREAM );
+}
+#endif
+
+
+#ifdef NDEBUG
+#  define assert(A)
+#else
+#  define assert(A) if (!(A)) do_assert(__FILE__, __LINE__, #A)
+   static void do_assert( const char* file, int line, const char* condstr )
+   {
+     int rank;
+     MPI_Comm_rank( MPI_COMM_WORLD, &rank );
+     fprintf( DEBUG_OUT_STREAM, "[%d] Assert(%s) failed at %s:%d\n", rank, condstr, file, line );
+     fflush( DEBUG_OUT_STREAM );
+     abort();
+   }
+#endif
+
+
+#ifndef DEBUG
+void WriteHDF5Parallel::printrange( MBRange& ) {}
+#else
+void WriteHDF5Parallel::printrange( MBRange& r )
+{
+  int rank;
+  MPI_Comm_rank( MPI_COMM_WORLD, &rank );
+  MBEntityType type = MBMAXTYPE;
+  for (MBRange::const_pair_iterator i = r.pair_begin(); i != r.pair_end(); ++i)
+  {
+    MBEntityHandle a, b;
+    a = (*i).first;
+    b = (*i).second;
+    MBEntityType mytype = iFace->type_from_handle(a);
+    if (mytype != type)
+    {
+      type = mytype;
+      fprintf(DEBUG_OUT_STREAM, "%s[%d]  %s", type == MBMAXTYPE ? "" : "\n", rank, MBCN::EntityTypeName( type ) );
+    }
+    unsigned long id1 = iFace->id_from_handle( a );
+    unsigned long id2 = iFace->id_from_handle( b );
+    if (id1 == id2)
+      fprintf(DEBUG_OUT_STREAM, " %lu", id1 );
+    else
+      fprintf(DEBUG_OUT_STREAM, " %lu-%lu", id1, id2 );
+  }
+  fprintf(DEBUG_OUT_STREAM, "\n");
+  fflush( DEBUG_OUT_STREAM );
+}
+#endif
+
+
+#ifndef DEBUG
+static void print_type_sets( MBInterface* , int , int , MBRange& ) {}
+#else
+static void print_type_sets( MBInterface* iFace, int myRank, int numProc, MBRange& sets )
+{
+  MBTag gid, did, bid, sid, nid, iid;
+  iFace->tag_get_handle( GLOBAL_ID_TAG_NAME, gid ); 
+  iFace->tag_get_handle( GEOM_DIMENSION_TAG_NAME, did );
+  iFace->tag_get_handle( MATERIAL_SET_TAG_NAME, bid );
+  iFace->tag_get_handle( DIRICHLET_SET_TAG_NAME, nid );
+  iFace->tag_get_handle( NEUMANN_SET_TAG_NAME, sid );
+  iFace->tag_get_handle( PARALLEL_INTERFACE_TAG_NAME, iid );
+  MBRange typesets[10];
+  const char* typenames[] = {"Block", "Sideset", "NodeSet", "Vertex", "Curve", "Surface", "Volume", "Body", "Interfaces", "Other"};
+  for (MBRange::iterator riter = sets.begin(); riter != sets.end(); ++riter)
+  {
+    unsigned dim, id, proc[2], oldsize;
+    if (MB_SUCCESS == iFace->tag_get_data(bid, &*riter, 1, &id)) 
+      dim = 0;
+    else if (MB_SUCCESS == iFace->tag_get_data(sid, &*riter, 1, &id))
+      dim = 1;
+    else if (MB_SUCCESS == iFace->tag_get_data(nid, &*riter, 1, &id))
+      dim = 2;
+    else if (MB_SUCCESS == iFace->tag_get_data(did, &*riter, 1, &dim)) {
+      id = 0;
+      iFace->tag_get_data(gid, &*riter, 1, &id);
+      dim += 3;
+    }
+    else if (MB_SUCCESS == iFace->tag_get_data(iid, &*riter, 1, proc)) {
+      assert(proc[0] == (unsigned)myRank || proc[1] == (unsigned)myRank);
+      id = proc[proc[0] == (unsigned)myRank];
+      dim = 8;
+    }
+    else {
+      id = *riter;
+      dim = 9;
+    }
+
+    oldsize = typesets[dim].size();
+    typesets[dim].insert( id );
+    assert( typesets[dim].size() - oldsize == 1 );  
+  }
+  for (int ii = 0; ii < 10; ++ii)
+  {
+    char num[16];
+    std::string line(typenames[ii]);
+    if (typesets[ii].empty())
+      continue;
+    sprintf(num, "(%u):", typesets[ii].size());
+    line += num;
+    for (MBRange::const_pair_iterator piter = typesets[ii].pair_begin();
+         piter != typesets[ii].pair_end(); ++piter)
+    {
+      sprintf(num," %d", (*piter).first);
+      line += num;
+      if ((*piter).first != (*piter).second) {
+        sprintf(num,"-%d", (*piter).second);
+        line += num;
+      }
+    }
+
+    printdebug ("%s\n", line.c_str());
+  }
+  printdebug("Total: %u\n", sets.size());
+}
+#endif
+
+
+void range_remove( MBRange& from, const MBRange& removed )
+{
+  
+/* The following should be more efficient, but isn't, due
+   to the inefficient implementation of MBRange::erase(iter,iter)
+  MBRange::const_iterator s, e, n = from.begin();
+  for (MBRange::const_pair_iterator p = removed.pair_begin();
+       p != removed.pair_end(); ++p)
+  {
+    e = s = MBRange::lower_bound(n, from.end(), (*p).first);
+    e = MBRange::lower_bound(s, from.end(), (*p).second);
+    if (e != from.end() && *e == (*p).second)
+      ++e;
+    n = from.erase( s, e );
+  }
+*/
+
+  if (removed.size())
+  {
+    MBRange tmp = from.subtract(removed);
+    from.swap( tmp );
+  }
+}
+
+void WriteHDF5Parallel::MultiProcSetTags::add( const std::string& name )
+  { list.push_back( Data(name) ); }
+
+void WriteHDF5Parallel::MultiProcSetTags::add( const std::string& filter, 
+                                               const std::string& data )
+  { list.push_back( Data(filter,data) ); }
+
+void WriteHDF5Parallel::MultiProcSetTags::add( const std::string& filter, 
+                                               int filterval,
+                                               const std::string& data )
+  { list.push_back( Data(filter,data,filterval) ); }
+
+
+WriteHDF5Parallel::WriteHDF5Parallel( MBInterface* iface )
+  : WriteHDF5(iface)
+{
+  multiProcSetTags.add(  MATERIAL_SET_TAG_NAME );
+  multiProcSetTags.add( DIRICHLET_SET_TAG_NAME );
+  multiProcSetTags.add(   NEUMANN_SET_TAG_NAME );
+  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 0, GLOBAL_ID_TAG_NAME );
+  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 1, GLOBAL_ID_TAG_NAME );
+  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 2, GLOBAL_ID_TAG_NAME );
+  multiProcSetTags.add( GEOM_DIMENSION_TAG_NAME, 3, GLOBAL_ID_TAG_NAME );
+}
+
+WriteHDF5Parallel::WriteHDF5Parallel( MBInterface* iface,
+                                      const std::vector<std::string>& tag_names )
+  : WriteHDF5(iface)
+{
+  for(std::vector<std::string>::const_iterator i = tag_names.begin();
+      i != tag_names.end(); ++i)
+    multiProcSetTags.add( *i );
+}
+
+WriteHDF5Parallel::WriteHDF5Parallel( MBInterface* iface,
+                                      const MultiProcSetTags& set_tags )
+  : WriteHDF5(iface), multiProcSetTags(set_tags)
+{}
+
+// The parent WriteHDF5 class has ExportSet structs that are
+// populated with the entities to be written, grouped by type
+// (and for elements, connectivity length).  This function:
+//  o determines which entities are to be written by a remote processor
+//  o removes those entities from the ExportSet structs in WriteMesh
+//  o puts them in the 'remoteMesh' array of MBRanges in this class
+//  o sets their file Id to '1'
+MBErrorCode WriteHDF5Parallel::gather_interface_meshes()
+{
+  MBRange range;
+  MBErrorCode result;
+  MBTag iface_tag, geom_tag;
+  int i, proc_pair[2];
+  
+  START_SERIAL;
+  printdebug( "Pre-interface mesh:\n");
+  printrange(nodeSet.range);
+  for (std::list<ExportSet>::iterator eiter = exportList.begin();
+           eiter != exportList.end(); ++eiter )
+    printrange(eiter->range);
+  printrange(setSet.range);
+  
+    // Allocate space for remote mesh data
+  remoteMesh.resize( numProc );
+  
+    // Get tag handles
+  result = iFace->tag_get_handle( PARALLEL_INTERFACE_TAG_NAME, iface_tag );
+  if (MB_SUCCESS != result) return result;
+  result = iFace->tag_get_handle( PARALLEL_GEOM_TOPO_TAG_NAME, geom_tag );
+  if (MB_SUCCESS != result) return result;
+  
+  
+    // Get interface mesh sets
+  result = iFace->get_entities_by_type_and_tag( 0,
+                                                MBENTITYSET,
+                                                &iface_tag,
+                                                0,
+                                                1,
+                                                range );
+  if (MB_SUCCESS != result) return result;
+  
+  
+    // Populate lists of interface mesh entities
+  for (MBRange::iterator iiter = range.begin(); iiter != range.end(); ++iiter)
+  {
+    result = iFace->tag_get_data( iface_tag, &*iiter, 1, proc_pair );
+    if (MB_SUCCESS != result) return result;
+    const int remote_proc = proc_pair[0];
+    
+      // Get list of all entities in interface and 
+      // the subset of that list that are meshsets.
+    MBRange entities, sets;
+    result = iFace->get_entities_by_handle( *iiter, entities );
+    if (MB_SUCCESS != result) return result;
+    result = iFace->get_entities_by_type( *iiter, MBENTITYSET, sets );
+    if (MB_SUCCESS != result) return result;
+
+      // Put any non-meshset entities in the list directly.
+    //range_remove( entities, sets ); //not necessary, get_entities_by_handle doesn't return sets
+    remoteMesh[remote_proc].merge( entities );
+    //remoteMesh[remote_proc].insert( *iiter );
+    
+    for (MBRange::iterator siter = sets.begin(); siter != sets.end(); ++siter)
+    {
+        // For current parallel meshing code, root processor owns
+        // all curve and geometric vertex meshes.  
+      int dimension;
+      result = iFace->tag_get_data( geom_tag, &*siter, 1, &dimension );
+      if (result == MB_SUCCESS && dimension < 2)
+        continue;
+        
+        // Put entities in list for appropriate processor.
+      //remoteMesh[remote_proc].insert( *siter );
+      entities.clear();
+      result = iFace->get_entities_by_handle( *siter, entities );
+      if (MB_SUCCESS != result) return result;
+      remoteMesh[remote_proc].merge( entities );
+    }
+  }
+  
+    // For current parallel meshing code, root processor owns
+    // all curve and geometric vertex meshes.  Find them and
+    // allocate them appropriately.
+  MBRange curves_and_verts;
+  MBTag tags[] = { geom_tag, geom_tag };
+  int value_ints[] = { 0, 1 };
+  const void* values[] = {value_ints, value_ints + 1};
+  result = iFace->get_entities_by_type_and_tag( 0, MBENTITYSET,
+                                                tags, values, 2,
+                                                curves_and_verts, 
+                                                MBInterface::UNION );
+  assert(MB_SUCCESS == result);
+  MBRange edges, nodes;
+  for (MBRange::iterator riter = curves_and_verts.begin();
+       riter != curves_and_verts.end(); ++riter)
+  {
+    result = iFace->get_entities_by_type( *riter, MBVERTEX, nodes ); assert(MB_SUCCESS == result);
+    result = iFace->get_entities_by_type( *riter, MBEDGE, edges ); assert(MB_SUCCESS == result);
+  }
+  std::list<ExportSet>::iterator eiter = exportList.begin();
+  for ( ; eiter != exportList.end() && eiter->type != MBEDGE; ++eiter );
+  
+  remoteMesh[0].merge( nodes );
+  remoteMesh[0].merge( edges );
+  //remoteMesh[0].merge( curves_and_verts );
+  if (myRank == 0)
+  {
+    nodeSet.range.merge( nodes );
+    //setSet.range.merge(curves_and_verts);
+    eiter->range.merge( edges );
+  } 
+  edges.merge(nodes);
+  //edges.merge(curves_and_verts);
+  for (i = 1; i < numProc; i++)
+  {
+    MBRange diff = edges.intersect( remoteMesh[i] );
+    range_remove(remoteMesh[i], diff);
+  }
+  
+  
+  
+    // For all remote mesh entities, remove them from the
+    // lists of local mesh to be exported and give them a 
+    // junk file Id of 1.  Need to specify a file ID greater
+    // than zero so the code that gathers adjacencies and
+    // such still treats these entities as being exported.
+  for (i = 0; i < numProc; i++)
+  {
+    if (i == myRank) continue;
+    
+    MBRange& range = remoteMesh[i];
+    
+    range_remove( nodeSet.range, range );
+    //range_remove( setSet.range, range );
+    for (std::list<ExportSet>::iterator eiter = exportList.begin();
+         eiter != exportList.end(); ++eiter )
+      range_remove( eiter->range, range );
+    
+    int id = 1;
+    for (MBRange::iterator riter = remoteMesh[i].begin(); 
+         riter != remoteMesh[i].end() && iFace->type_from_handle(*riter) != MBENTITYSET; 
+         ++riter)
+    {
+      result = iFace->tag_set_data( idTag, &*riter, 1, &id );
+      if (MB_SUCCESS != result) return result;
+    }
+  }
+  
+    // print some debug output summarizing what we've accomplished
+  
+  printdebug("Remote mesh:\n");
+  for (int ii = 0; ii < numProc; ++ii)
+  {
+    printdebug("  proc %d : %d\n", ii, remoteMesh[ii].size());
+    printrange( remoteMesh[ii] );
+  }
+
+  printdebug( "Post-interface mesh:\n");
+  printrange(nodeSet.range);
+  for (std::list<ExportSet>::iterator eiter = exportList.begin();
+           eiter != exportList.end(); ++eiter )
+    printrange(eiter->range);
+  printrange(setSet.range);
+
+  END_SERIAL;
+  
+  return MB_SUCCESS;
+}
+
+
+
+MBErrorCode WriteHDF5Parallel::create_file( const char* filename,
+                                            bool overwrite,
+                                            std::vector<std::string>& qa_records,
+                                            int dimension )
+{
+  MBErrorCode rval;
+  int result;
+  mhdf_Status status;
+    
+  result = MPI_Comm_rank( MPI_COMM_WORLD, &myRank );
+  assert(MPI_SUCCESS == result);
+  result = MPI_Comm_size( MPI_COMM_WORLD, &numProc );
+  assert(MPI_SUCCESS == result);
+  
+  rval = gather_interface_meshes();
+  if (MB_SUCCESS != rval) return rval;
+  
+    /**************** Create actual file and write meta info ***************/
+
+  if (myRank == 0)
+  {
+      // create the file
+    const char* type_names[MBMAXTYPE];
+    memset( type_names, 0, MBMAXTYPE * sizeof(char*) );
+    for (MBEntityType i = MBEDGE; i < MBENTITYSET; ++i)
+      type_names[i] = MBCN::EntityTypeName( i );
+   
+    filePtr = mhdf_createFile( filename, overwrite, type_names, MBMAXTYPE, &status );
+    if (!filePtr)
+    {
+      writeUtil->report_error( "%s\n", mhdf_message( &status ) );
+      return MB_FAILURE;
+    }
+    
+    rval = write_qa( qa_records );
+    if (MB_SUCCESS != rval) return rval;
+  }
+  
+  
+     /**************** Create node coordinate table ***************/
+ 
+  rval = create_node_table( dimension );
+  if (MB_SUCCESS != rval) return rval;
+  
+  
+    /**************** Create element tables ***************/
+
+  rval = negotiate_type_list();
+  if (MB_SUCCESS != rval) return rval;
+  rval = create_element_tables();
+  if (MB_SUCCESS != rval) return rval;
+  
+
+    /**************** Communicate all remote IDs ***********************/
+  
+  rval = communicate_remote_ids( MBVERTEX );
+  for (std::list<ExportSet>::iterator ex_itor = exportList.begin(); 
+       ex_itor != exportList.end(); ++ex_itor)
+  {
+    rval = communicate_remote_ids( ex_itor->type );
+    assert(MB_SUCCESS == rval);
+  }
+  
+  
+    /**************** Create adjacency tables *********************/
+  
+  rval = create_adjacency_tables();
+  if (MB_SUCCESS != rval) return rval;
+  
+    /**************** Create meshset tables *********************/
+  
+  rval = create_meshset_tables();
+  if (MB_SUCCESS != rval) return rval;
+  
+  
+    /* Need to write tags for shared sets this proc is responsible for */
+  
+  MBRange parallel_sets;
+  for (std::list<ParallelSet>::const_iterator psiter = parallelSets.begin();
+       psiter != parallelSets.end(); ++psiter)
+    if (psiter->description)
+      parallel_sets.insert( psiter->handle );
+  
+  setSet.range.merge( parallel_sets );
+  rval = gather_tags();
+  if (MB_SUCCESS != rval)
+    return rval;
+  range_remove( setSet.range, parallel_sets );   
+  
+
+    /**************** Create tag data *********************/
+  
+  std::list<SparseTag>::iterator tag_iter;
+  sort_tags_by_name();
+  const int num_tags = tagList.size();
+  std::vector<int> tag_offsets(num_tags), tag_counts(num_tags);
+  std::vector<int>::iterator tag_off_iter = tag_counts.begin();
+  for (tag_iter = tagList.begin(); tag_iter != tagList.end(); ++tag_iter, ++tag_off_iter)
+    *tag_off_iter = tag_iter->range.size();
+  
+  printdebug("Exchanging tag data for %d tags.\n", num_tags);
+  std::vector<int> proc_tag_offsets(num_tags*numProc);
+  result = MPI_Gather( &tag_counts[0], num_tags, MPI_INT,
+                 &proc_tag_offsets[0], num_tags, MPI_INT,
+                       0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+  tag_iter = tagList.begin();
+  for (int i = 0; i < num_tags; ++i, ++tag_iter)
+  {
+    tag_counts[i] = 0;
+    int next_offset = 0;
+    for (int j = 0; j < numProc; j++)
+    {
+      int count = proc_tag_offsets[i + j*num_tags];
+      proc_tag_offsets[i + j*num_tags] = next_offset;
+      next_offset += count;
+      tag_counts[i] += count;
+    }
+
+    if (0 == myRank)
+    {
+      rval = create_tag( tag_iter->tag_id, next_offset );
+      assert(MB_SUCCESS == rval);
+      printdebug( "Creating table of size %d for tag 0x%lx\n", (int)next_offset, (unsigned long)tag_iter->tag_id);
+    }
+  }
+  
+  result = MPI_Bcast( &tag_counts[0], num_tags, MPI_INT, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+  result = MPI_Scatter( &proc_tag_offsets[0], num_tags, MPI_INT,
+                             &tag_offsets[0], num_tags, MPI_INT,
+                             0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+
+
+  tag_iter = tagList.begin();
+  for (int i = 0; i < num_tags; ++i, ++tag_iter)
+  {
+    tag_iter->offset = tag_offsets[i];
+    tag_iter->write = tag_counts[i] > 0;
+  }
+
+  #ifdef DEBUG
+  START_SERIAL;  
+  printdebug("Tags: %16s %8s %8s %8s\n", "Name", "Count", "Offset", "Handle");
+
+  tag_iter = tagList.begin();
+  for (int i = 0; i < num_tags; ++i, ++tag_iter)
+  {
+    std::string name;
+    iFace->tag_get_name( tag_iter->tag_id, name );
+    printdebug("      %16s %8d %8d %8lx\n", name.c_str(), tag_counts[i], tag_offsets[i], (unsigned long)tag_iter->tag_id );
+  }
+  END_SERIAL;  
+  #endif
+  
+  /************** Close serial file and reopen parallel *****************/
+  
+  if (0 == myRank)
+  {
+    mhdf_closeFile( filePtr, &status );
+  }
+  
+  unsigned long junk;
+  hid_t hdf_opt = H5Pcreate( H5P_FILE_ACCESS );
+  H5Pset_fapl_mpio( hdf_opt, MPI_COMM_WORLD, MPI_INFO_NULL );
+  filePtr = mhdf_openFileWithOpt( filename, 1, &junk, hdf_opt, &status );
+  if (!filePtr)
+  {
+    writeUtil->report_error( "%s\n", mhdf_message( &status ) );
+    return MB_FAILURE;
+  }
+  
+  
+  return MB_SUCCESS;
+}
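
The reopen step above is the standard parallel-HDF5 recipe: attach an MPI-IO file
access property list, then open the already-created file collectively.  A minimal
sketch using the plain HDF5 C API rather than the mhdf wrapper (the file name and
function name are illustrative only):

  #include <mpi.h>
  #include <hdf5.h>      /* requires an MPI-enabled HDF5 build */

  /* Reopen an existing HDF5 file for collective read/write access.
   * Assumes the root process has already created and closed the file. */
  hid_t open_parallel( const char* filename )
  {
    hid_t fapl = H5Pcreate( H5P_FILE_ACCESS );
    H5Pset_fapl_mpio( fapl, MPI_COMM_WORLD, MPI_INFO_NULL );
    hid_t file = H5Fopen( filename, H5F_ACC_RDWR, fapl );
    H5Pclose( fapl );
    return file;    /* caller is responsible for H5Fclose() */
  }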
+
+
+MBErrorCode WriteHDF5Parallel::create_node_table( int dimension )
+{
+  int result;
+  mhdf_Status status;
+ 
+    // gather node counts for each processor
+  std::vector<int> node_counts(numProc);
+  int num_nodes = nodeSet.range.size();
+  result = MPI_Gather( &num_nodes, 1, MPI_INT, &node_counts[0], 1, MPI_INT, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // create node data in file
+  long first_id;
+  if (myRank == 0)
+  {
+    int total = 0;
+    for (int i = 0; i < numProc; i++)
+      total += node_counts[i];
+      
+    hid_t handle = mhdf_createNodeCoords( filePtr, dimension, total, &first_id, &status );
+    if (mhdf_isError( &status ))
+    {
+      writeUtil->report_error( "%s\n", mhdf_message( &status ) );
+      return MB_FAILURE;
+    }
+    mhdf_closeData( filePtr, handle, &status );
+  }
+    
+    // send id offset to every proc
+  result = MPI_Bcast( &first_id, 1, MPI_LONG, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  nodeSet.first_id = (id_t)first_id;
+   
+      // calculate per-processor offsets
+  if (myRank == 0)
+  {
+    int prev_size = node_counts[0];
+    node_counts[0] = 0;
+    for (int i = 1; i < numProc; ++i)
+    {
+      int mysize = node_counts[i];
+      node_counts[i] = node_counts[i-1] + prev_size;
+      prev_size = mysize;
+    }
+  }
+  
+    // send each proc its offset in the node table
+  int offset;
+  result = MPI_Scatter( &node_counts[0], 1, MPI_INT, 
+                        &offset, 1, MPI_INT,
+                        0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  nodeSet.offset = offset;
+  
+  writeUtil->assign_ids( nodeSet.range, idTag, (id_t)(nodeSet.first_id + nodeSet.offset) );
+
+  return MB_SUCCESS;
+}
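
This gather / exclusive-scan / scatter idiom recurs below for elements, adjacencies,
tags and sets: the root gathers per-processor counts, converts them in place to
starting offsets, scatters each processor its own offset, and broadcasts the global
total.  A self-contained sketch of just that idiom (names are illustrative, not MOAB
API):

  #include <mpi.h>
  #include <vector>

  // Each rank contributes 'my_count' rows to one shared table.  Returns this
  // rank's starting offset; 'total' holds the table size on every rank on return.
  long negotiate_offset( long my_count, long& total, MPI_Comm comm )
  {
    int rank, nproc;
    MPI_Comm_rank( comm, &rank );
    MPI_Comm_size( comm, &nproc );

    std::vector<long> counts( nproc );
    MPI_Gather( &my_count, 1, MPI_LONG, &counts[0], 1, MPI_LONG, 0, comm );

    if (rank == 0)                      // exclusive prefix sum on the root
    {
      long running = 0;
      for (int i = 0; i < nproc; ++i)
      {
        long tmp = counts[i];
        counts[i] = running;
        running += tmp;
      }
      total = running;
    }

    long offset = 0;
    MPI_Scatter( &counts[0], 1, MPI_LONG, &offset, 1, MPI_LONG, 0, comm );
    MPI_Bcast( &total, 1, MPI_LONG, 0, comm );
    return offset;                      // this rank writes rows [offset, offset+my_count)
  }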
+
+
+
+struct elemtype {
+  int mbtype;
+  int numnode;
+  
+  elemtype( int vals[2] ) : mbtype(vals[0]), numnode(vals[1]) {}
+  elemtype( int t, int n ) : mbtype(t), numnode(n) {}
+  
+  bool operator==( const elemtype& other ) const
+  {
+    return mbtype == other.mbtype &&
+            (mbtype == MBPOLYGON ||
+             mbtype == MBPOLYHEDRON ||
+             mbtype == MBENTITYSET ||
+             numnode == other.numnode);
+  }
+  bool operator<( const elemtype& other ) const
+  {
+    if (mbtype > other.mbtype)
+      return false;
+   
+    return mbtype < other.mbtype ||
+           (mbtype != MBPOLYGON &&
+            mbtype != MBPOLYHEDRON &&
+            mbtype != MBENTITYSET &&
+            numnode < other.numnode);
+  }
+  bool operator!=( const elemtype& other ) const
+    { return !this->operator==(other); }
+};
+
+
+MBErrorCode WriteHDF5Parallel::negotiate_type_list()
+{
+  int result;
+  
+  exportList.sort();
+  
+    // Get number of types each processor has
+  int num_types = 2*exportList.size();
+  std::vector<int> counts(numProc);
+  result = MPI_Gather( &num_types, 1, MPI_INT, &counts[0], 1, MPI_INT, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Get list of types on this processor
+  std::vector<int> my_types(num_types);
+  std::vector<int>::iterator viter = my_types.begin();
+  for (std::list<ExportSet>::iterator eiter = exportList.begin();
+       eiter != exportList.end(); ++eiter)
+  {
+    *viter = eiter->type;      ++viter;
+    *viter = eiter->num_nodes; ++viter;
+  }
+
+  #ifdef DEBUG
+  START_SERIAL;
+  printdebug( "Local Element Types:\n");
+  viter = my_types.begin();
+  while (viter != my_types.end())
+  {
+    int type = *viter; ++viter;
+    int count = *viter; ++viter;
+    printdebug("  %s : %d\n", MBCN::EntityTypeName((MBEntityType)type), count);
+  }
+  END_SERIAL;
+  #endif
+
+    // Get list of types from each processor
+  std::vector<int> displs(numProc + 1);
+  displs[0] = 0;
+  for (int i = 1; i <= numProc; ++i)
+    displs[i] = displs[i-1] + counts[i-1];
+  int total = displs[numProc];
+  std::vector<int> alltypes(total);
+  result = MPI_Gatherv( &my_types[0], my_types.size(), MPI_INT,
+                        &alltypes[0], &counts[0], &displs[0], MPI_INT,
+                        0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Merge type lists
+  std::list<elemtype> type_list;
+  std::list<elemtype>::iterator liter;
+  for (int i = 0; i < numProc; ++i)
+  {
+    int* proc_type_list = &alltypes[displs[i]];
+    liter = type_list.begin();
+    for (int j = 0; j < counts[i]; j += 2)
+    {
+      elemtype type( &proc_type_list[j] );
+        // skip until insertion spot
+      for (; liter != type_list.end() && *liter < type; ++liter);
+      
+      if (liter == type_list.end() || *liter != type)
+        liter = type_list.insert( liter, type );
+    }
+  }
+  
+    // Send total number of types to each processor
+  total = type_list.size();
+  result = MPI_Bcast( &total, 1, MPI_INT, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Send list of types to each processor
+  std::vector<int> intlist(total * 2);
+  viter = intlist.begin();
+  for (liter = type_list.begin(); liter != type_list.end(); ++liter)
+  {
+    *viter = liter->mbtype;  ++viter;
+    *viter = liter->numnode; ++viter;
+  }
+  result = MPI_Bcast( &intlist[0], 2*total, MPI_INT, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+
+  #ifdef DEBUG
+  START_SERIAL;
+  printdebug( "Global Element Types:\n");
+  viter = intlist.begin();
+  while (viter != intlist.end())
+  {
+    int type = *viter; ++viter;
+    int count = *viter; ++viter;
+    printdebug("  %s : %d\n", MBCN::EntityTypeName((MBEntityType)type), count);
+  }
+  END_SERIAL;
+  #endif
+  
+    // Insert missing types into exportList, with an empty
+    // range of entities to export.
+  std::list<ExportSet>::iterator ex_iter = exportList.begin();
+  viter = intlist.begin();
+  for (int i = 0; i < total; ++i)
+  {
+    int mbtype = *viter; ++viter;
+    int numnode = *viter; ++viter;
+    while (ex_iter != exportList.end() && ex_iter->type < mbtype)
+      ++ex_iter;
+    
+    bool equal = ex_iter != exportList.end() && ex_iter->type == mbtype;
+    if (equal && mbtype != MBPOLYGON && mbtype != MBPOLYHEDRON)
+    {
+      while (ex_iter != exportList.end() && ex_iter->num_nodes < numnode)
+        ++ex_iter;
+        
+      equal = ex_iter != exportList.end() && ex_iter->num_nodes == numnode;
+    }
+    
+    if (!equal)
+    {
+      ExportSet insert;
+      insert.type = (MBEntityType)mbtype;
+      insert.num_nodes = numnode;
+      insert.first_id = 0;
+      insert.offset = 0;
+      insert.poly_offset = 0;
+      insert.adj_offset = 0;
+      ex_iter = exportList.insert( ex_iter, insert );
+    }
+  }
+  
+  return MB_SUCCESS;
+}
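
The merge above builds one globally sorted list of (type, nodes-per-element) pairs so
that every processor ends up with exportList entries in the same order, including
empty entries for types it does not own.  Ignoring the special-case ordering that
elemtype::operator< gives polygons, polyhedra and sets, the union itself is just a
sorted-set insert; a minimal sketch over the flattened lists gathered on the root
(illustrative only):

  #include <set>
  #include <utility>
  #include <vector>

  // 'alltypes' is the concatenation of every processor's flat list:
  // type, num_nodes, type, num_nodes, ...
  std::vector< std::pair<int,int> >
  merge_type_lists( const std::vector<int>& alltypes )
  {
    std::set< std::pair<int,int> > merged;
    for (size_t i = 0; i + 1 < alltypes.size(); i += 2)
      merged.insert( std::make_pair( alltypes[i], alltypes[i+1] ) );
    return std::vector< std::pair<int,int> >( merged.begin(), merged.end() );
  }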
+
+MBErrorCode WriteHDF5Parallel::create_element_tables()
+{
+  int result;
+  MBErrorCode rval;
+  std::list<ExportSet>::iterator ex_iter;
+  std::vector<long>::iterator viter;
+  
+    // Get number of each element type from each processor
+  const int numtypes = exportList.size();
+  std::vector<long> my_counts(numtypes);
+  std::vector<long> counts(numtypes * numProc + numtypes);
+  viter = my_counts.begin();
+  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
+    { *viter = ex_iter->range.size(); ++viter; }
+  
+  result = MPI_Gather( &my_counts[0], numtypes, MPI_LONG,
+                       &counts[0],    numtypes, MPI_LONG, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Convert counts to offsets
+  for (int i = 0; i < numtypes; i++) 
+  {
+    long prev = 0;
+    for (int j = 0; j <= numProc; j++)
+    {
+      long tmp = counts[j*numtypes + i];
+      counts[j*numtypes+i] = prev;
+      prev += tmp;
+    }
+  }
+  
+    // Send offsets to each processor
+  result = MPI_Scatter( &counts[0],    numtypes, MPI_LONG,
+                        &my_counts[0], numtypes, MPI_LONG,
+                        0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Store the scattered offsets in the ExportSets
+  viter = my_counts.begin();
+  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
+    ex_iter->offset = (id_t)*(viter++);
+  
+    // If writing polygons or polyhedra, calculate and distribute the poly-data offsets for each
+  std::vector<int> perproc(numProc+1);
+  ExportSet *poly[] = {0,0};
+  int polycount[2];
+  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
+  {
+    if (ex_iter->type == MBPOLYGON)
+    {
+      assert(!poly[0]);
+      poly[0] = &*ex_iter;
+    }
+    else if(ex_iter->type == MBPOLYHEDRON)
+    {
+      assert(!poly[1]);
+      poly[1] = &*ex_iter;
+    }
+  }
+  for (int i = 0; i < 2; i++)
+  {
+    ExportSet* ppoly = poly[i];
+    if (!ppoly)
+      continue;
+  
+    int count;
+    rval = writeUtil->get_poly_array_size( ppoly->range.begin(),
+                                           ppoly->range.end(),
+                                           count );
+    assert(MB_SUCCESS == rval);
+    result = MPI_Gather( &count, 1, MPI_INT, &perproc[0], 1, MPI_INT, 0, MPI_COMM_WORLD );
+    assert(MPI_SUCCESS == result);
+    
+    int prev = 0;
+    for (int j = 1; j <= numProc; j++)
+    {
+      int tmp = perproc[j];
+      perproc[j] = prev;
+      prev += tmp;
+    }
+                                           
+    polycount[i] = perproc[numProc];
+    result = MPI_Scatter( &perproc[0], 1, MPI_INT, &count, 1, MPI_INT, 0, MPI_COMM_WORLD );
+    assert(MPI_SUCCESS == result);
+    ppoly->poly_offset = count;
+  }
+  
+    // Create element tables
+  std::vector<long> start_ids(numtypes);
+  if (myRank == 0)
+  {
+    viter = start_ids.begin();
+    long* citer = &counts[numtypes * numProc];
+    for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
+    {
+      switch(ex_iter->type) {
+      case MBPOLYGON:
+        rval = create_poly_tables( MBPOLYGON,
+                                   *citer,
+                                   polycount[0],
+                                   *viter );
+        break;
+      case MBPOLYHEDRON:
+        rval = create_poly_tables( MBPOLYHEDRON,
+                                   *citer,
+                                   polycount[1],
+                                   *viter );
+        break;
+      default:
+        rval = create_elem_tables( ex_iter->type,
+                                   ex_iter->num_nodes,
+                                   *citer,
+                                   *viter );
+      }
+      assert(MB_SUCCESS == rval);
+      ++citer;
+      ++viter;
+    }
+  }
+  
+    // send start IDs to each processor
+  result = MPI_Bcast( &start_ids[0], numtypes, MPI_LONG, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Assign IDs to local elements
+  viter = start_ids.begin();
+  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
+  {
+    ex_iter->first_id = *(viter++);
+    id_t myfirst = (id_t)(ex_iter->first_id + ex_iter->offset);
+    rval = writeUtil->assign_ids( ex_iter->range, idTag, myfirst );
+    assert(MB_SUCCESS == rval);
+  }
+  
+  return MB_SUCCESS;
+}
+  
+MBErrorCode WriteHDF5Parallel::create_adjacency_tables()
+{
+  MBErrorCode rval;
+  mhdf_Status status;
+  int i, j, result;
+#ifdef WRITE_NODE_ADJACENCIES  
+  const int numtypes = exportList.size()+1;
+#else
+  const int numtypes = exportList.size();
+#endif
+  std::vector<long>::iterator viter;
+  std::list<ExportSet>::iterator ex_iter;
+  std::vector<long> local(numtypes), all(numProc * numtypes + numtypes);
+  
+    // Get adjacency counts for local processor
+  viter = local.begin();
+  id_t num_adj;
+#ifdef WRITE_NODE_ADJACENCIES  
+  rval = count_adjacencies( nodeSet.range, num_adj );
+  assert (MB_SUCCESS == rval);
+  *viter = num_adj; ++viter;
+#endif
+
+  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
+  {
+    rval = count_adjacencies( ex_iter->range, num_adj );
+    assert (MB_SUCCESS == rval);
+    *viter = num_adj; ++viter;
+  }
+  
+    // Send local adjacency counts to root processor
+  result = MPI_Gather( &local[0], numtypes, MPI_LONG,
+                       &all[0],   numtypes, MPI_LONG, 
+                       0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Convert counts to offsets
+  for (i = 0; i < numtypes; i++) 
+  {
+    long prev = 0;
+    for (j = 0; j <= numProc; j++)
+    {
+      long tmp = all[j*numtypes + i];
+      all[j*numtypes+i] = prev;
+      prev += tmp;
+    }
+  }
+  
+    // For each element type for which there is no adjacency data,
+    // send -1 to all processors as the offset
+  for (i = 0; i < numtypes; ++i)
+    if (all[numtypes*numProc+i] == 0)
+      for (j = 0; j < numProc; ++j)
+        all[j*numtypes+i] = -1;
+  
+    // Send offsets back to each processor
+  result = MPI_Scatter( &all[0],   numtypes, MPI_LONG,
+                        &local[0], numtypes, MPI_LONG,
+                        0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Record the adjacency offset in each ExportSet
+  viter = local.begin();
+#ifdef WRITE_NODE_ADJACENCIES  
+  nodeSet.adj_offset = *viter; ++viter;
+#endif
+  for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter)
+    { ex_iter->adj_offset = *viter; ++viter; }
+  
+    // Create data tables in file
+  if (myRank == 0)
+  {
+    viter = all.begin() + (numtypes * numProc);
+#ifdef WRITE_NODE_ADJACENCIES  
+    if (*viter) {
+      hid_t handle = mhdf_createAdjacency( filePtr, 
+                                           mhdf_node_type_handle(),
+                                           *viter,
+                                           &status );
+      if (mhdf_isError( &status ))
+      {
+        writeUtil->report_error( "%s\n", mhdf_message( &status ) );
+        return MB_FAILURE;
+      }
+      mhdf_closeData( filePtr, handle, &status );
+    }
+    ++viter;
+#endif
+    for (ex_iter = exportList.begin(); ex_iter != exportList.end(); ++ex_iter, ++viter)
+    {
+      if (!*viter) 
+        continue;
+      
+      hid_t handle = mhdf_createAdjacency( filePtr,
+                                           ex_iter->name(),
+                                           *viter,
+                                           &status );
+      if (mhdf_isError( &status ))
+      {
+        writeUtil->report_error( "%s\n", mhdf_message( &status ) );
+        return MB_FAILURE;
+      }
+      mhdf_closeData( filePtr, handle, &status );
+    }
+  }
+
+  return MB_SUCCESS;
+}
+
+/*
+MBErrorCode WriteHDF5Parallel::get_interface_set_data( RemoteSetData& data,
+                                                       long& offset )
+{
+  const char* PROC_ID_TAG = "HDF5Writer_Rank";
+  MBTag iface_tag, proc_tag;
+  MBErrorCode rval;
+  
+  rval = iFace->tag_get_handle( PARALLEL_INTERFACE_TAG_NAME, iface_tag );
+  if (MB_SUCCESS != rval) return rval;
+  
+  rval = iFace->tag_get_handle( PROC_ID_TAG, proc_tag );
+  if (MB_SUCCESS == rval) 
+    iFace->tag_delete( proc_tag );
+  rval = iFace->tag_create( PROC_ID_TAG, sizeof(int), MB_TAG_DENSE, MB_TYPE_INTEGER, proc_tag, 0 );
+  if (MB_SUCCESS != rval) return rval;
+    
+  MBRange interface_sets, sets;
+  rval = iFace->get_entities_by_type_and_tag( 0, MBENTITYSET, &iface_tag, 0, 1, interface_sets );
+  if (MB_SUCCESS != rval) return rval;
+  
+  std::vector<int> list;
+  for (MBRange::iterator i = interface_sets.begin(); i != interface_sets.end(); ++i)
+  {
+    int proc_ids[2];
+    rval = iFace->tag_get_data( iface_tag, &*i, 1, proc_ids );
+    if (MB_SUCCESS != rval) return rval;
+    
+    sets.clear();
+    rval = iFace->get_entities_by_type( *i, MBENTITYSET, sets );
+    if (MB_SUCCESS != rval) return rval;
+  
+    list.clear();
+    list.resize( sets.size(), proc_ids[0] );
+    rval = iFace->tag_set_data( proc_tag, sets, &list[0] );
+    if (MB_SUCCESS != rval) return rval;
+  }
+  
+  return get_remote_set_data( PROC_ID_TAG, PARALLEL_GLOBAL_ID_TAG_NAME, data, offset );
+}
+*/
+  
+
+struct RemoteSetData {
+  MBTag data_tag, filter_tag;
+  int filter_value;
+  MBRange range;
+  std::vector<int> counts, displs, all_values, local_values;
+};
+
+MBErrorCode WriteHDF5Parallel::get_remote_set_data( 
+                        const WriteHDF5Parallel::MultiProcSetTags::Data& tags,
+                        RemoteSetData& data, long& offset )
+{
+  MBErrorCode rval;
+  int i, result;
+  MBRange::iterator riter;
+    
+  rval = iFace->tag_get_handle( tags.filterTag.c_str(), data.filter_tag );
+  if (rval != MB_SUCCESS) return rval;
+  if (tags.useFilterValue) 
+  {
+    i = 0;
+    iFace->tag_get_size( data.filter_tag, i );
+    if (i != sizeof(int)) {
+      fprintf(stderr, "Cannot use non-int tag data for filtering remote sets.\n" );
+      assert(0);
+      return MB_FAILURE;
+    }  
+    data.filter_value = tags.filterValue;
+  }
+  else
+  {
+    data.filter_value = 0;
+  }
+  
+  rval = iFace->tag_get_handle( tags.dataTag.c_str(), data.data_tag );
+  if (rval != MB_SUCCESS) return rval;
+  i = 0;
+  iFace->tag_get_size( data.data_tag, i );
+  if (i != sizeof(int)) {
+    fprintf(stderr, "Cannot use non-int tag data for matching remote sets.\n" );
+    assert(0);
+    return MB_FAILURE;
+  }  
+    
+
+  printdebug("Negotiating multi-proc meshsets for tag: \"%s\"\n", tags.filterTag.c_str());
+
+    // Get sets with tag, or leave range empty if the tag
+    // isn't defined on this processor.
+  if (rval != MB_TAG_NOT_FOUND)
+  {
+    MBTag handles[] = { data.filter_tag, data.data_tag };
+    const void* values[] = { tags.useFilterValue ? &tags.filterValue : 0, 0 };
+    rval = iFace->get_entities_by_type_and_tag( 0, 
+                                                MBENTITYSET, 
+                                                handles,
+                                                values,
+                                                2,
+                                                data.range );
+    if (rval != MB_SUCCESS) return rval;
+    data.range = data.range.intersect( setSet.range );
+    range_remove( setSet.range, data.range );
+  }
+  
+  printdebug("Found %d meshsets with \"%s\" tag.\n", data.range.size(), tags.filterTag.c_str() );
+
+    // Exchange number of sets with tag between all processors
+  data.counts.resize(numProc);
+  int count = data.range.size();
+  result = MPI_Allgather( &count,          1, MPI_INT, 
+                          &data.counts[0], 1, MPI_INT,
+                          MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+
+    // Exchange tag values for sets between all processors
+  data.displs.resize(numProc+1);
+  data.displs[0] = 0;
+  for (i = 1; i <= numProc; i++)
+    data.displs[i] = data.displs[i-1] + data.counts[i-1];
+  int total = data.displs[numProc];
+  data.all_values.resize(total);
+  data.local_values.resize(count);
+  rval = iFace->tag_get_data( data.data_tag, data.range, &data.local_values[0] );
+  assert( MB_SUCCESS == rval );
+  result = MPI_Allgatherv( &data.local_values[0], count, MPI_INT,
+                           &data.all_values[0], &data.counts[0], &data.displs[0], MPI_INT,
+                           MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+
+
+    // Remove from the list any sets that are unique to one processor
+  std::vector<int> sorted( data.all_values );
+  std::sort( sorted.begin(), sorted.end() );
+  int r = 0, w = 0;
+  for (i = 0; i < numProc; ++i)
+  {
+    const int start = w;
+    for (int j = 0; j < data.counts[i]; ++j)
+    {
+      std::vector<int>::iterator p 
+        = std::lower_bound( sorted.begin(), sorted.end(), data.all_values[r] );
+      ++p;
+      if (p != sorted.end() && *p == data.all_values[r])
+      {
+        data.all_values[w] = data.all_values[r];
+        ++w;
+      }
+      ++r;
+    }
+    data.counts[i] = w - start;
+  }
+  total = w;
+  data.all_values.resize( total );
+  r = w = 0;
+  for (i = 0; i < count; ++i)
+  {
+    std::vector<int>::iterator p 
+      = std::lower_bound( sorted.begin(), sorted.end(), data.local_values[r] );
+    ++p;
+    if (p != sorted.end() && *p == data.local_values[r])
+    {
+      data.local_values[w] = data.local_values[r];
+      ++w;
+    }
+    else
+    {
+      riter = data.range.begin();
+      riter += w;
+      setSet.range.insert( *riter );
+      data.range.erase( riter );
+    }
+    ++r;
+  }
+  count = data.range.size();
+  assert( count == data.counts[myRank] );
+  assert( count == w );
+  data.local_values.resize( count );
+  sorted.clear(); // release storage
+    // recalculate displacements
+  data.displs[0] = 0;
+  for (i = 1; i <= numProc; i++)
+    data.displs[i] = data.displs[i-1] + data.counts[i-1];
+  
+    // Find sets that span multiple processors and update appropriately.
+    // The first processor (sorted by MPI rank) that contains a given set
+    // will be responsible for writing the set description.  All multi-
+    // processor sets will be written at the beginning of the set tables.
+    // Processors will write set contents/children for a given set in
+    // the order of their MPI rank.
+    //
+    // Identify which meshsets will be managed by this processor and
+    // the corresponding offset in the set description table. 
+  std::map<int,int> val_id_map;
+  int cpu = 0;
+  for (i = 0; i < total; ++i)
+  {
+    if (data.displs[cpu+1] == i)
+      ++cpu;
+
+    int id = 0;
+    std::map<int,int>::iterator p = val_id_map.find( data.all_values[i] );
+    if (p == val_id_map.end())
+    {
+      id = (int)++offset;
+      val_id_map[data.all_values[i]] = id;
+      //const unsigned int values_offset = (unsigned)i - (unsigned)data.displs[myRank];
+      //if (values_offset < (unsigned)count)
+      //{
+      //  riter = data.range.begin();
+      //  riter += values_offset;
+      //  myParallelSets.insert( *riter );
+      //}
+    }
+    std::vector<int>::iterator loc 
+      = std::find( data.local_values.begin(), data.local_values.end(), data.all_values[i] );
+    if (loc != data.local_values.end()) 
+    {
+      riter = data.range.begin();
+      riter += loc - data.local_values.begin();
+      cpuParallelSets[cpu].insert( *riter );
+    }
+  }
+  riter = data.range.begin();
+  for (i = 0; i < count; ++i, ++riter)
+  {
+    std::map<int,int>::iterator p = val_id_map.find( data.local_values[i] );
+    assert( p != val_id_map.end() );
+    int id = p->second;
+    rval = iFace->tag_set_data( idTag, &*riter, 1, &id );
+    assert(MB_SUCCESS == rval);
+  }
+  
+  return MB_SUCCESS;
+}
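
Stripped of the tag bookkeeping, the core of the exchange above is: every processor
publishes the integer tag values of its candidate sets with MPI_Allgatherv, and a
value is kept as a multi-processor set only if it occurs more than once globally.
A minimal sketch of that filter (illustrative names; for brevity it assumes every
rank contributes at least one value):

  #include <mpi.h>
  #include <algorithm>
  #include <vector>

  // Return the values in 'local_vals' that occur more than once across all
  // processors, i.e. whose set is shared with at least one other processor.
  std::vector<int> shared_values( const std::vector<int>& local_vals, MPI_Comm comm )
  {
    int nproc, count = (int)local_vals.size();
    MPI_Comm_size( comm, &nproc );

    std::vector<int> counts( nproc ), displs( nproc + 1, 0 );
    MPI_Allgather( &count, 1, MPI_INT, &counts[0], 1, MPI_INT, comm );
    for (int i = 0; i < nproc; ++i)
      displs[i+1] = displs[i] + counts[i];

    std::vector<int> send( local_vals ), all_vals( displs[nproc] );
    MPI_Allgatherv( &send[0], count, MPI_INT,
                    &all_vals[0], &counts[0], &displs[0], MPI_INT, comm );

    std::vector<int> sorted( all_vals );
    std::sort( sorted.begin(), sorted.end() );

    std::vector<int> result;
    for (size_t i = 0; i < local_vals.size(); ++i)
    {
      std::vector<int>::iterator p =
        std::lower_bound( sorted.begin(), sorted.end(), local_vals[i] );
      if (p + 1 != sorted.end() && *(p + 1) == local_vals[i])   // at least two occurrences
        result.push_back( local_vals[i] );
    }
    return result;
  }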
+
+
+MBErrorCode WriteHDF5Parallel::create_meshset_tables()
+{
+  MBErrorCode rval;
+  int result, i;
+  long total_offset = 0;
+  MBRange::const_iterator riter;
+
+  START_SERIAL;
+  print_type_sets( iFace, myRank, numProc, setSet.range );
+  END_SERIAL;
+
+    // Gather data about multi-processor meshsets - removes sets from setSet.range
+  cpuParallelSets.resize( numProc );
+  std::vector<RemoteSetData> remote_set_data( multiProcSetTags.list.size() );
+  for (i = 0; i< (int)multiProcSetTags.list.size(); i++)
+  {
+    rval = get_remote_set_data( multiProcSetTags.list[i],
+                                remote_set_data[i],
+                                total_offset ); assert(MB_SUCCESS == rval);
+  }
+  //rval = get_interface_set_data( remote_set_data[i], total_offset );
+  if (MB_SUCCESS != rval) return rval;
+
+  START_SERIAL;
+  printdebug("myLocalSets\n");
+  print_type_sets( iFace, myRank, numProc, setSet.range );
+  END_SERIAL;
+
+    // Gather counts of non-shared sets from each proc
+    // to determine total table size.
+  std::vector<long> set_offsets(numProc + 1);
+  long local_count = setSet.range.size();
+  result = MPI_Gather( &local_count,    1, MPI_LONG,
+                       &set_offsets[0], 1, MPI_LONG,
+                       0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  for (i = 0; i <= numProc; i++)
+  {
+    long tmp = set_offsets[i];
+    set_offsets[i] = total_offset;
+    total_offset += tmp;
+  }
+  
+    // Send each proc its offsets in the set description table.
+  long sets_offset;
+  result = MPI_Scatter( &set_offsets[0], 1, MPI_LONG,
+                        &sets_offset,    1, MPI_LONG, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  setSet.offset = (id_t)(sets_offset);
+
+    // Create the set description table
+  long total_count_and_start_id[2] = { set_offsets[numProc], 0 };
+  if (myRank == 0 && total_count_and_start_id[0] > 0)
+  {
+    rval = create_set_meta( (id_t)total_count_and_start_id[0], total_count_and_start_id[1] );
+    assert (MB_SUCCESS == rval);
+  }
+  
+    // Send totals to all procs.
+  result = MPI_Bcast( total_count_and_start_id, 2, MPI_LONG, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  setSet.first_id = total_count_and_start_id[1];
+  writeSets = total_count_and_start_id[0] > 0;
+
+  START_SERIAL;  
+  printdebug("Non-shared sets: %ld local, %ld global, offset = %ld, first_id = %ld\n",
+    local_count, total_count_and_start_id[0], sets_offset, total_count_and_start_id[1] );
+  printdebug("my Parallel Sets:\n");
+  print_type_sets(iFace, myRank, numProc, cpuParallelSets[myRank] );
+  END_SERIAL;
+  
+    // Not writing any sets??
+  if (!writeSets)
+    return MB_SUCCESS;
+  
+    // Assign set IDs
+  writeUtil->assign_ids( setSet.range, idTag, (id_t)(setSet.first_id + setSet.offset) );
+  for (i = 0; i < (int)remote_set_data.size(); ++i)
+    fix_remote_set_ids( remote_set_data[i], setSet.first_id );
+  
+    // Communicate sizes for remote sets
+  long data_offsets[3] = { 0, 0, 0 };
+  for (i = 0; i < (int)remote_set_data.size(); ++i)
+  {
+    rval = negotiate_remote_set_contents( remote_set_data[i], data_offsets ); 
+    assert(MB_SUCCESS == rval);
+  }
+  remote_set_data.clear();
+  
+    // Exchange IDs for remote/adjacent sets not shared between procs
+  //rval = communicate_remote_ids( MBENTITYSET ); assert(MB_SUCCESS == rval);
+  
+    // Communicate counts for local sets
+  long data_counts[3];
+  rval = count_set_size( setSet.range, rangeSets, data_counts[0], data_counts[1], data_counts[2] );
+  if (MB_SUCCESS != rval) return rval;
+  std::vector<long> set_counts(3*numProc);
+  result = MPI_Gather( data_counts,    3, MPI_LONG,
+                       &set_counts[0], 3, MPI_LONG,
+                       0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  for (i = 0; i < 3*numProc; ++i)
+  {
+    long tmp = set_counts[i];
+    set_counts[i] = data_offsets[i%3];
+    data_offsets[i%3] += tmp;
+  }
+  long all_counts[] = {data_offsets[0], data_offsets[1], data_offsets[2]};
+  result = MPI_Scatter( &set_counts[0], 3, MPI_LONG,
+                        data_offsets,   3, MPI_LONG,
+                        0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  setContentsOffset = data_offsets[0];
+  setChildrenOffset = data_offsets[1];
+  setParentsOffset = data_offsets[2];
+  
+    // Create set contents and set children tables
+  if (myRank == 0)
+  {
+    rval = create_set_tables( all_counts[0], all_counts[1], all_counts[2] );
+    if (MB_SUCCESS != rval) return rval;
+  }
+  
+    // Send totals to all processors
+  result = MPI_Bcast( all_counts, 3, MPI_LONG, 0, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  writeSetContents = all_counts[0] > 0;
+  writeSetChildren = all_counts[1] > 0;
+  writeSetParents  = all_counts[2] > 0;
+
+  START_SERIAL;  
+  printdebug("Non-shared set contents: %ld local, %ld global, offset = %ld\n",
+    data_counts[0], all_counts[0], data_offsets[0] );
+  printdebug("Non-shared set children: %ld local, %ld global, offset = %ld\n",
+    data_counts[1], all_counts[1], data_offsets[1] );
+  printdebug("Non-shared set parents: %ld local, %ld global, offset = %ld\n",
+    data_counts[2], all_counts[2], data_offsets[2] );
+  END_SERIAL;
+  
+  return MB_SUCCESS;
+}
+
+void WriteHDF5Parallel::remove_remote_entities( MBEntityHandle relative,
+                                                MBRange& range )
+{
+  MBRange result;
+  result.merge( range.intersect( nodeSet.range ) );
+  result.merge( range.intersect( setSet.range ) );  
+  for (std::list<ExportSet>::iterator eiter = exportList.begin();
+           eiter != exportList.end(); ++eiter )
+  {
+    result.merge( range.intersect( eiter->range ) );
+  }
+  //result.merge( range.intersect( myParallelSets ) );
+  MBRange sets;
+  int junk;
+  sets.merge( MBRange::lower_bound( range.begin(), range.end(), CREATE_HANDLE(MBENTITYSET, 0, junk )), range.end() );
+  remove_remote_sets( relative, sets );
+  result.merge( sets );
+  range.swap(result);
+}
+
+void WriteHDF5Parallel::remove_remote_sets( MBEntityHandle relative, 
+                                            MBRange& range )
+{
+  MBRange result( range.intersect( setSet.range ) );
+  //result.merge( range.intersect( myParallelSets ) );
+  MBRange remaining( range.subtract( result ) );
+  
+  for(MBRange::iterator i = remaining.begin(); i != remaining.end(); ++i)
+  {
+      // Look for the first CPU which knows about both sets.
+    int cpu;
+    for (cpu = 0; cpu < numProc; ++cpu)
+      if (cpuParallelSets[cpu].find(relative) != cpuParallelSets[cpu].end() &&
+          cpuParallelSets[cpu].find(*i) != cpuParallelSets[cpu].end())
+        break;
+      // If we didn't find one, it may indicate a bug.  However,
+      // it could also indicate that it is a link to some set that
+      // exists on this processor but is not being written, because
+      // the caller requested that some subset of the mesh be written.
+    //assert(cpu < numProc);
+      // If I'm the first set that knows about both, I'll handle it.
+    if (cpu == myRank)
+      result.insert( *i );
+  }
+  
+  range.swap( result );
+}
+  
+  
+
+void WriteHDF5Parallel::remove_remote_entities( MBEntityHandle relative,
+                                                std::vector<MBEntityHandle>& vect )
+{
+  MBRange intrsct;
+  for (std::vector<MBEntityHandle>::const_iterator iter = vect.begin();
+       iter != vect.end(); ++iter)
+    intrsct.insert(*iter);
+  remove_remote_entities( relative, intrsct );
+  
+  unsigned int read, write;
+  for (read = write = 0; read < vect.size(); ++read)
+  {
+    if (intrsct.find(vect[read]) != intrsct.end())
+    {
+      if (read != write)
+        vect[write] = vect[read];
+      ++write;
+    }
+  }
+  if (write != vect.size())
+    vect.resize(write);
+}
+
+  
+
+void WriteHDF5Parallel::remove_remote_sets( MBEntityHandle relative,
+                                            std::vector<MBEntityHandle>& vect )
+{
+  MBRange intrsct;
+  for (std::vector<MBEntityHandle>::const_iterator iter = vect.begin();
+       iter != vect.end(); ++iter)
+    intrsct.insert(*iter);
+  remove_remote_sets( relative, intrsct );
+  
+  unsigned int read, write;
+  for (read = write = 0; read < vect.size(); ++read)
+  {
+    if (intrsct.find(vect[read]) != intrsct.end())
+    {
+      if (read != write)
+        vect[write] = vect[read];
+      ++write;
+    }
+  }
+  if (write != vect.size())
+    vect.resize(write);
+}
+
+// Given a RemoteSetData object describing the set information for a 
+// single tag (or tag pair), populate the list of parallel sets
+// (this->parallelSets) with the per-entityset data.
+MBErrorCode WriteHDF5Parallel::negotiate_remote_set_contents( RemoteSetData& data,
+                                                              long* offsets /* long[3] */ )
+{
+  unsigned i;
+  MBErrorCode rval;
+  MBRange::const_iterator riter;
+  int result;
+  const unsigned count = data.range.size();
+  const unsigned total = data.all_values.size();
+  std::vector<int>::iterator viter, viter2;
+
+    // Calculate counts for each meshset
+  std::vector<long> local_sizes(3*count);
+  std::vector<long>::iterator sizes_iter = local_sizes.begin();
+  MBRange tmp_range;
+  std::vector<MBEntityHandle> child_list;
+  for (riter = data.range.begin(); riter != data.range.end(); ++riter)
+  {
+      // Count contents
+    *sizes_iter = 0;
+    tmp_range.clear();
+    rval = iFace->get_entities_by_handle( *riter, tmp_range );
+    remove_remote_entities( *riter, tmp_range );
+    assert (MB_SUCCESS == rval);
+    for (MBRange::iterator iter = tmp_range.begin(); iter != tmp_range.end(); ++iter)
+    {
+      int id = 0;
+      rval = iFace->tag_get_data( idTag, &*iter, 1, &id );
+      if (rval != MB_TAG_NOT_FOUND && rval != MB_SUCCESS)
+        { assert(0); return MB_FAILURE; }
+      if (id > 0)
+        ++*sizes_iter;
+    }
+    ++sizes_iter;
+    
+      // Count children
+    *sizes_iter = 0;
+    child_list.clear();
+    rval = iFace->get_child_meshsets( *riter, child_list );
+    remove_remote_sets( *riter, child_list );
+    assert (MB_SUCCESS == rval);
+    for (std::vector<MBEntityHandle>::iterator iter = child_list.begin();
+         iter != child_list.end(); ++iter)
+    {
+      int id = 0;
+      rval = iFace->tag_get_data( idTag, &*iter, 1, &id );
+      if (rval != MB_TAG_NOT_FOUND && rval != MB_SUCCESS)
+        { assert(0); return MB_FAILURE; }
+      if (id > 0)
+        ++*sizes_iter;
+    }
+    ++sizes_iter;
+    
+      // Count parents
+    *sizes_iter = 0;
+    child_list.clear();
+    rval = iFace->get_parent_meshsets( *riter, child_list );
+    remove_remote_sets( *riter, child_list );
+    assert (MB_SUCCESS == rval);
+    for (std::vector<MBEntityHandle>::iterator iter = child_list.begin();
+         iter != child_list.end(); ++iter)
+    {
+      int id = 0;
+      rval = iFace->tag_get_data( idTag, &*iter, 1, &id );
+      if (rval != MB_TAG_NOT_FOUND && rval != MB_SUCCESS)
+        { assert(0); return MB_FAILURE; }
+      if (id > 0)
+        ++*sizes_iter;
+    }
+    ++sizes_iter;
+  }
+  
+    // Exchange sizes for sets between all processors.
+  std::vector<long> all_sizes(3*total);
+  std::vector<int> counts(numProc), displs(numProc);
+  for (i = 0; i < (unsigned)numProc; i++)
+    counts[i] = 3 * data.counts[i];
+  displs[0] = 0;
+  for (i = 1; i < (unsigned)numProc; i++)
+    displs[i] = displs[i-1] + counts[i-1];
+  result = MPI_Allgatherv( &local_sizes[0], 3*count, MPI_LONG,
+                           &all_sizes[0], &counts[0], &displs[0], MPI_LONG,
+                           MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+
+  
+    // Update information in-place in the array from the Allgatherv.
+    
+    // Change the corresponding sizes for the first instance of a tag
+    // value such that it ends up being the total size of the set.
+    // Change the size to -1 for the later instances of a tag value.
+    //
+    // For the sets that this processor has, update the offsets at
+    // which the set data is to be written.  Store the offset of the data
+    // on this processor for the set *relative* to the start of the
+    // data of *the set*.
+  std::vector<long> local_offsets(3*count);
+  std::map<int,int> tagsort;  // Map of {tag value, index of first set w/ value}
+  for (i = 0; i < total; ++i)
+  {
+    const std::map<int,int>::iterator p = tagsort.find( data.all_values[i] );
+    const unsigned r = (unsigned)(i - data.displs[myRank]);  // offset in "local" array
+    
+      // If this is the first instance of this tag value, 
+      // then the processor with this instance is responsible
+      // for writing the tag description
+    if ( p == tagsort.end() )  
+    {
+      tagsort[data.all_values[i]] = i;
+        // If within the range for this processor, save offsets
+      if (r < (unsigned)count) 
+      {
+        local_offsets[3*r] = local_offsets[3*r+1] = local_offsets[3*r+2] = 0;
+      }
+    }
+      // Otherwise update the total size in the table
+      // for the processor that is responsible for writing
+      // the data and mark the data for the current processor
+      // with a -1.
+    else 
+    {
+        // If within the range for this processor, save offsets
+      int j = p->second;
+      if (r < (unsigned)count) 
+      {
+          // the offset for this processor, from the start of the data
+          // for this group of sets, is the current total count for the
+          // group of sets.
+        local_offsets[3*r  ] = all_sizes[3*j  ];  // contents
+        local_offsets[3*r+1] = all_sizes[3*j+1];  // children
+        local_offsets[3*r+2] = all_sizes[3*j+2];  // parents
+      }
+      
+        // update the total count for the set in the first position in
+        // all_sizes at which the set occurs (the one corresponding to
+        // the processor that owns the set.)
+      all_sizes[3*j  ] += all_sizes[3*i  ]; // contents
+      all_sizes[3*j+1] += all_sizes[3*i+1]; // children
+      all_sizes[3*j+2] += all_sizes[3*i+2]; // parents
+        // set the size to -1 in the positions corresponding to non-owning processor
+      all_sizes[3*i  ] = all_sizes[3*i+1] = all_sizes[3*i+2] = -1;
+    }
+  }  
+    
+  
+    // Store the total size of each set (rather than the
+    // number of entities local to this processor) in the
+    // local_sizes array for each meshset.  Only need this
+    // for the sets this processor is writing the description
+    // for, but it's easier to get it for all of them.
+  sizes_iter = local_sizes.begin();
+  viter = data.local_values.begin();
+  for (riter = data.range.begin(); riter != data.range.end(); ++riter, ++viter)
+  {
+    const std::map<int,int>::iterator p = tagsort.find( *viter ); 
+    assert( p != tagsort.end() );
+    int j = 3 * p->second;
+    *sizes_iter = all_sizes[j  ]; ++sizes_iter;  // contents
+    *sizes_iter = all_sizes[j+1]; ++sizes_iter;  // children
+    *sizes_iter = all_sizes[j+2]; ++sizes_iter;  // parents
+  }
+  
+    // Now calculate the offset of the data for each (entire, parallel) set in
+    // the set contents, children and parents tables.  offsets is long[3], and
+    // is both input and output of this function.  We increment offsets by the
+    // total count (over all processors) for each set such that it contains
+    // the next open row in the table.  This will be passed back into this
+    // function for the next tag (or tag pair) such that ultimately it will
+    // contain the beginning of the non-shared set data in each of the three tables.
+    // all_sizes is re-used to store the global offset in each table for each 
+    // set with the tag.
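+    //
+    // Continuing the example above: if offsets comes in as {0,0,0} and the
+    // owned set has totals (10,1,2), then after this loop offsets is
+    // {10,1,2} (the next free rows in each table) and that set's entries
+    // in all_sizes are {0,0,0}, the rows at which its data begins.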
+  for (i = 0; i < all_sizes.size(); ++i)
+  {
+    if (all_sizes[i] >= 0) // value is -1 (from above) unless this is the owning instance of the set
+    {
+      int j = i % 3;              // contents, children or parents list ?
+      long tmp = offsets[j];      // save current, running offset
+      offsets[j] += all_sizes[i]; // next set's offset is current plus the size of this set
+      all_sizes[i] = tmp;         // this set's starting row is the previous running offset.
+    }
+  }
+  
+    // Local offsets for this processor are stored as values relative to the
+    // start of each set's data.  Convert them to offsets relative to the
+    // start of all the set data.  Add the offset *from* the start of the set
+    // data (local_offsets) to the offset *of* the start of the set data 
+    // (stored in all_sizes in the previous loop) 
+  std::vector<long>::iterator offset_iter = local_offsets.begin();
+  viter = data.local_values.begin();
+  for (riter = data.range.begin(); riter != data.range.end(); ++riter, ++viter)
+  {
+    const std::map<int,int>::iterator p = tagsort.find( *viter );
+    assert( p != tagsort.end() );
+    int j = 3 * p->second;
+    *offset_iter += all_sizes[j  ]; ++offset_iter; // contents
+    *offset_iter += all_sizes[j+1]; ++offset_iter; // children
+    *offset_iter += all_sizes[j+2]; ++offset_iter; // parents
+  }
+
+#ifdef DEBUG  
+START_SERIAL; if (counts[myRank]) {
+std::string name1, name2;
+iFace->tag_get_name( data.data_tag, name1 );
+iFace->tag_get_name( data.filter_tag, name2 );
+printdebug("Remote set data\n" );
+printdebug("    %13s %13s owner local_offsets total_counts\n", name1.c_str(), name2.c_str());
+for (unsigned d = 0; d < (unsigned)counts[myRank]; ++d) {
+switch(d%3) {
+  case 0: // data/contents
+printdebug("   %13d %13d %5s %13d %12d\n", data.all_values[(d+displs[myRank])/3], 
+ data.filter_value, 
+ all_sizes[d+displs[myRank]] < 0 ? "no" : "yes", 
+ local_offsets[d], local_sizes[d] );
+  break;
+  case 1: // children
+printdebug("                          (children) %13d %12d\n", local_offsets[d], local_sizes[d] );
+  break;
+  case 2: // parents
+printdebug("                           (parents) %13d %12d\n", local_offsets[d], local_sizes[d] );
+  break;
+} 
+}
+} 
+END_SERIAL;
+#endif
+  
+    // Store each parallel meshset in the list
+  sizes_iter = local_sizes.begin();
+  offset_iter = local_offsets.begin();
+  std::vector<long>::iterator all_iter = all_sizes.begin() + displs[myRank];
+  for (riter = data.range.begin(); riter != data.range.end(); ++riter)
+  {
+    ParallelSet info;
+    info.handle = *riter;
+    info.contentsOffset = *offset_iter; ++offset_iter;
+    info.childrenOffset = *offset_iter; ++offset_iter;
+    info.parentsOffset = *offset_iter; ++offset_iter;
+    info.contentsCount = *sizes_iter; ++sizes_iter;
+    info.childrenCount = *sizes_iter; ++sizes_iter;
+    info.parentsCount = *sizes_iter; ++sizes_iter;
+    info.description = *all_iter >= 0; all_iter += 3;
+    parallelSets.push_back( info );
+  }
+  
+  return MB_SUCCESS;
+}
+
+MBErrorCode WriteHDF5Parallel::fix_remote_set_ids( RemoteSetData& data, long first_id )
+{
+  const id_t id_diff = (id_t)(first_id - 1);
+  id_t file_id;
+  MBErrorCode rval;
+
+  for (MBRange::iterator iter = data.range.begin(); iter != data.range.end(); ++iter)
+  {
+    rval = iFace->tag_get_data( idTag, &*iter, 1, &file_id );
+    assert( MB_SUCCESS == rval );
+    file_id += id_diff;
+    rval = iFace->tag_set_data( idTag, &*iter, 1, &file_id );
+    assert( MB_SUCCESS == rval );
+  }
+  
+  return MB_SUCCESS;
+}   
+
+
+MBErrorCode WriteHDF5Parallel::write_shared_set_descriptions( hid_t table )
+{
+  const id_t start_id = setSet.first_id;
+  MBErrorCode rval;
+  mhdf_Status status;
+  
+  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
+        iter != parallelSets.end(); ++iter)
+  {
+    if (!iter->description)
+      continue;  // handled by a different processor
+    
+      // Get offset in table at which to write data
+    int file_id;
+    rval = iFace->tag_get_data( idTag, &(iter->handle), 1, &file_id );
+    file_id -= start_id;
+    
+      // Get flag data
+    unsigned int flags;
+    rval = iFace->get_meshset_options( iter->handle, flags );
+    assert( MB_SUCCESS == rval );
+      
+      // Write the data
+    long data[4] = { iter->contentsOffset + iter->contentsCount - 1, 
+                     iter->childrenOffset + iter->childrenCount - 1, 
+                     iter->parentsOffset  + iter->parentsCount  - 1,
+                     flags };
+    mhdf_writeSetMeta( table, file_id, 1, H5T_NATIVE_LONG, data, &status );
+    if (mhdf_isError(&status))
+      printdebug("Meshset %d : %s\n", ID_FROM_HANDLE(iter->handle), mhdf_message(&status));
+    assert( !mhdf_isError( &status ) );
+  }
+
+  return MB_SUCCESS;
+}
+    
+
+MBErrorCode WriteHDF5Parallel::write_shared_set_contents( hid_t table )
+{
+  MBErrorCode rval;
+  mhdf_Status status;
+  std::vector<MBEntityHandle> handle_list;
+  std::vector<id_t> id_list;
+  
+  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
+        iter != parallelSets.end(); ++iter)
+  {
+    handle_list.clear();
+    rval = iFace->get_entities_by_handle( iter->handle, handle_list );
+    assert( MB_SUCCESS == rval );
+    remove_remote_entities( iter->handle, handle_list );
+    
+    id_list.clear();
+    for (unsigned int i = 0; i < handle_list.size(); ++i)
+    {
+      int id;
+      rval = iFace->tag_get_data( idTag, &handle_list[i], 1, &id );
+      assert( MB_SUCCESS == rval );
+      if (id > 0)
+        id_list.push_back(id);
+    }
+    
+    if (id_list.empty())
+      continue;
+    
+    mhdf_writeSetData( table, 
+                       iter->contentsOffset, 
+                       id_list.size(),
+                       id_type,
+                       &id_list[0],
+                       &status );
+    assert(!mhdf_isError(&status));
+  }
+  
+  return MB_SUCCESS;
+}
+    
+
+MBErrorCode WriteHDF5Parallel::write_shared_set_children( hid_t table )
+{
+  MBErrorCode rval;
+  mhdf_Status status;
+  std::vector<MBEntityHandle> handle_list;
+  std::vector<id_t> id_list;
+  
+  printdebug("Writing %d parallel sets.\n", parallelSets.size());
+  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
+        iter != parallelSets.end(); ++iter)
+  {
+    handle_list.clear();
+    rval = iFace->get_child_meshsets( iter->handle, handle_list );
+    assert( MB_SUCCESS == rval );
+    remove_remote_sets( iter->handle, handle_list );
+    
+    id_list.clear();
+    for (unsigned int i = 0; i < handle_list.size(); ++i)
+    {
+      int id;
+      rval = iFace->tag_get_data( idTag, &handle_list[i], 1, &id );
+      assert( MB_SUCCESS == rval );
+      if (id > 0)
+        id_list.push_back(id);
+    }
+    
+    if (!id_list.empty())
+    {
+      mhdf_writeSetParentsChildren( table, 
+                                    iter->childrenOffset, 
+                                    id_list.size(),
+                                    id_type,
+                                    &id_list[0],
+                                    &status );
+      assert(!mhdf_isError(&status));
+    }
+  }
+
+  return MB_SUCCESS;
+}
+    
+
+MBErrorCode WriteHDF5Parallel::write_shared_set_parents( hid_t table )
+{
+  MBErrorCode rval;
+  mhdf_Status status;
+  std::vector<MBEntityHandle> handle_list;
+  std::vector<id_t> id_list;
+  
+  printdebug("Writing %d parallel sets.\n", parallelSets.size());
+  for( std::list<ParallelSet>::iterator iter = parallelSets.begin();
+        iter != parallelSets.end(); ++iter)
+  {
+    handle_list.clear();
+    rval = iFace->get_parent_meshsets( iter->handle, handle_list );
+    assert( MB_SUCCESS == rval );
+    remove_remote_sets( iter->handle, handle_list );
+    
+    id_list.clear();
+    for (unsigned int i = 0; i < handle_list.size(); ++i)
+    {
+      int id;
+      rval = iFace->tag_get_data( idTag, &handle_list[i], 1, &id );
+      assert( MB_SUCCESS == rval );
+      if (id > 0)
+        id_list.push_back(id);
+    }
+    
+    if (!id_list.empty())
+    {
+      mhdf_writeSetParentsChildren( table, 
+                                    iter->parentsOffset, 
+                                    id_list.size(),
+                                    id_type,
+                                    &id_list[0],
+                                    &status );
+      assert(!mhdf_isError(&status));
+    }
+  }
+
+  return MB_SUCCESS;
+}
+
+MBErrorCode WriteHDF5Parallel::write_finished()
+{
+  parallelSets.clear();
+  cpuParallelSets.clear();
+  //myParallelSets.clear();
+  return WriteHDF5::write_finished();
+}
+
+
+class TagNameCompare {
+  MBInterface* iFace;
+  std::string name1, name2;
+public:
+  TagNameCompare( MBInterface* iface ) : iFace(iface) {}
+  bool operator() (const WriteHDF5::SparseTag& t1, 
+                   const WriteHDF5::SparseTag& t2);
+};
+bool TagNameCompare::operator() (const WriteHDF5::SparseTag& t1, 
+                                 const WriteHDF5::SparseTag& t2)
+{
+  MBErrorCode rval;
+  rval = iFace->tag_get_name( t1.tag_id, name1 );
+  rval = iFace->tag_get_name( t2.tag_id, name2 );
+  return name1 < name2;
+}  
+
+void WriteHDF5Parallel::sort_tags_by_name( )
+{
+  tagList.sort( TagNameCompare( iFace ) );
+}
+
+
+MBErrorCode WriteHDF5Parallel::communicate_remote_ids( MBEntityType type )
+{
+  int result;
+  MBErrorCode rval;
+
+    // Get the export set for the specified type
+  ExportSet* export_set = 0;
+  if (type == MBVERTEX)
+    export_set = &nodeSet;
+  else if(type == MBENTITYSET)
+    export_set = &setSet;
+  else
+  {
+    for (std::list<ExportSet>::iterator esiter = exportList.begin();
+         esiter != exportList.end(); ++esiter)
+      if (esiter->type == type)
+      {
+        export_set = &*esiter;
+        break;
+      }
+  }
+  assert(export_set != NULL);
+  
+    // Get the ranges in the set
+  std::vector<unsigned long> myranges;
+  MBRange::const_pair_iterator p_iter = export_set->range.const_pair_begin();
+  const MBRange::const_pair_iterator p_end = export_set->range.const_pair_end();
+  for ( ; p_iter != p_end; ++p_iter)
+  {
+    myranges.push_back( (*p_iter).first );
+    myranges.push_back( (*p_iter).second );
+  }
+
+  START_SERIAL;
+  printdebug("%s ranges to communicate:\n", MBCN::EntityTypeName(type));
+  for (unsigned int xx = 0; xx != myranges.size(); xx+=2)
+    printdebug("  %lu - %lu\n", myranges[xx], myranges[xx+1] );
+  END_SERIAL;
+  
+    // Communicate the number of ranges and the start_id for
+    // each processor.
+  std::vector<int> counts(numProc), offsets(numProc), displs(numProc);
+  int mycount = myranges.size();
+  int mystart = export_set->first_id + export_set->offset;
+  result = MPI_Allgather( &mycount, 1, MPI_INT, &counts[0], 1, MPI_INT, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  result = MPI_Allgather( &mystart, 1, MPI_INT, &offsets[0], 1, MPI_INT, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+    // Communicate the ranges 
+  displs[0] = 0;
+  for (int i = 1; i < numProc; ++i)
+    displs[i] = displs[i-1] + counts[i-1];
+  std::vector<unsigned long> allranges( displs[numProc-1] + counts[numProc-1] );
+  result = MPI_Allgatherv( &myranges[0], myranges.size(), MPI_UNSIGNED_LONG,
+                           &allranges[0], &counts[0], &displs[0],
+                           MPI_UNSIGNED_LONG, MPI_COMM_WORLD );
+  assert(MPI_SUCCESS == result);
+  
+  MBTag global_id_tag;
+  rval = iFace->tag_get_handle( PARALLEL_GLOBAL_ID_TAG_NAME, global_id_tag );
+  assert(MB_SUCCESS == rval);
+  
+    // Set file IDs for each communicated entity
+    
+    // For each processor
+  for (int proc = 0; proc < numProc; ++proc)
+  {
+    if (proc == myRank)
+      continue;
+    
+      // Get data for corresponding processor
+    const int offset = offsets[proc];
+    const int count = counts[proc];
+    const unsigned long* const ranges = &allranges[displs[proc]];
+    
+      // For each entity of the given type shared with processor 'proc'
+    MBRange::iterator r_iter = MBRange::lower_bound( remoteMesh[proc].begin(),
+                                                     remoteMesh[proc].end(),
+                                                     CREATE_HANDLE(type,0,result) );
+    MBRange::iterator r_stop = MBRange::lower_bound( r_iter,
+                                                     remoteMesh[proc].end(),
+                                                     CREATE_HANDLE(type+1,0,result) );
+    for ( ; r_iter != r_stop; ++r_iter)
+    {
+      MBEntityHandle entity = *r_iter;
+
+        // Get handle on other processor
+      MBEntityHandle global;
+      rval = iFace->tag_get_data( global_id_tag, &entity, 1, &global );
+      assert(MB_SUCCESS == rval);
+
+        // Find corresponding fileid on other processor.
+        // This could potentially be n**2, but we will assume that
+        // the range list from each processor is short (typically 1).
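+        // For example (hypothetical ranges): if the remote proc advertised
+        // ranges {10,19, 30,34}, a global handle of 32 falls in the second
+        // pair, so steps is 10 (for handles 10..19) and the file id is
+        // offset + 10 + (32 - 30).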
+      int j, steps = 0;
+      unsigned long low, high;
+      for (j = 0; j < count; j += 2)
+      {
+        low = ranges[j];
+        high = ranges[j+1];
+        if (low <= global && high >= global)
+          break;
+        steps += (high - low) + 1;
+      }
+      if (j >= count) {
+      printdebug("*** handle = %u, type = %d, id = %d, proc = %d\n",
+      (unsigned)global, (int)(iFace->type_from_handle(global)), (int)(iFace->id_from_handle(global)), proc);
+      for (int ii = 0; ii < count; ii+=2) 
+      printdebug("***  %u to %u\n", (unsigned)ranges[ii], (unsigned)ranges[ii+1] );
+      MBRange junk; junk.insert( global );
+      print_type_sets( iFace, myRank, numProc, junk );
+      }
+      assert(j < count);
+      int fileid = offset + steps + (global - low);
+      rval = iFace->tag_set_data( idTag, &entity, 1, &fileid );
+      assert(MB_SUCCESS == rval);
+    } // for(r_iter->range)
+  } // for(each processor)
+  
+  return MB_SUCCESS;
+}

Added: MOAB/trunk/parallel/WriteHDF5Parallel.hpp
===================================================================
--- MOAB/trunk/parallel/WriteHDF5Parallel.hpp	                        (rev 0)
+++ MOAB/trunk/parallel/WriteHDF5Parallel.hpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,250 @@
+/** 
+ * \class WriteHDF5Parallel
+ * \brief Write MOAB HDF5 file in parallel.
+ * \author Jason Kraftcheck
+ * \date   22 July 2004
+ */
+
+#ifndef WRITE_HDF5_PARALLEL_HPP
+#define WRITE_HDF5_PARALLEL_HPP
+
+#include "WriteHDF5.hpp"
+#include <mpi.h>
+
+struct RemoteSetData;
+
+class MB_DLL_EXPORT WriteHDF5Parallel : public WriteHDF5
+{
+  public:
+    
+      /** Constructor
+       *
+       * This constructor will automatically register the tags for
+       * material set (block), dirichlet set (nodeset), neumann set
+       * (sideset), and geometry grouping sets for use in identifying
+       * sets that are shared across multiple processors.  To explicitly
+       * disable this functionality, call one of the other constructors
+       * with an empty list of tags.
+       */
+    WriteHDF5Parallel( MBInterface* iface );
+     
+    
+      /** Constructor
+       *\param multiproc_set_tags Null-terminated list of strings.
+       *
+       * multiproc_set_tags is a null-terminated list of tag names.
+       * Each tag specified must have a native integer (int) data 
+       * type.  The tag data is used to identify meshsets that span
+       * multiple processors such that they are written as a single
+       * meshset in the resulting file.  
+       *
+       * NOTE: This list must be identical on all processors, including
+       *       the order!
+       */
+    WriteHDF5Parallel( MBInterface* iface,
+                       const std::vector<std::string>& multiproc_set_tags );
+    
+    /**\brief Define tags used to identify sets spanning multiple processors */
+    class MultiProcSetTags {
+      friend class WriteHDF5Parallel;
+      public:
+
+        /**Specify the name of a tag used to identify parallel entity sets.
+         * The tag must have a native integer (int) data type.  The value
+         * of the tag will be used to match sets on different processors.
+         */
+      void add( const std::string& name );
+ 
+        /**Specify separate tags for identifying parallel entity sets and
+         * matching them across processors.
+         *\param filter_name The name of a tag used to identify parallel entity sets
+         *\param value_name  The name of a tag having a native integer (int) data
+         *                   type.  The value of this tag is used as an ID to match
+         *                   entity sets on different processors.
+         */
+      void add( const std::string& filter_name, const std::string& value_name );
+ 
+        /**Specify separate tags for identifying parallel entity sets and
+         * matching them across processors.
+         *\param filter_name The name of a tag used to identify parallel entity sets.
+         *                   The data type of this tag must be a native integer (int).
+         *\param filter_value The value of the filter_name tag to use to identify
+         *                   parallel entity sets.
+         *\param value_name  The name of a tag having a native integer (int) data
+         *                   type.  The value of this tag is used as an ID to match
+         *                   entity sets on different processors.
+         */
+      void add( const std::string& filter_name, int filter_value, const std::string& value_name );
+      
+      private:
+      class Data;
+      std::vector<Data> list;
+    };
+     
+      /** Constructor
+       *\param multiproc_set_tags Data used to identify sets spanning multiple processors.
+       *                          NOTE:  This must be identical on all processors, including
+       *                          the order in which tags were added to the object!
+       */
+    WriteHDF5Parallel( MBInterface* iface, const MultiProcSetTags& multiproc_set_tags );
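+
+      // Illustrative usage (tag names and the 'mb' pointer are examples
+      // only, not part of this interface):
+      //
+      //   WriteHDF5Parallel::MultiProcSetTags tags;
+      //   tags.add( "MATERIAL_SET" );                // match sets by this tag's value
+      //   tags.add( "GEOM_DIMENSION", "GLOBAL_ID" ); // filter by one tag, match IDs with another
+      //   WriteHDF5Parallel writer( mb, tags );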
+      
+    
+  
+  protected:
+  
+      //! Called by normal (non-parallel) writer.  Sets up
+      //! necessary data for parallel write.
+    virtual MBErrorCode create_file( const char* filename,
+                                     bool overwrite,
+                                     std::vector<std::string>& qa_records,
+                                     int dimension = 3 );
+    
+      //! Figure out which local mesh is duplicated on
+      //! remote processors and which processor will write
+      //! that mesh.
+    MBErrorCode gather_interface_meshes();
+    
+      //! For entities that will be written by another 
+      //! processor, get the file Ids that will be assigned
+      //! to them so they can be referenced by
+      //! entities to be written on this processor.
+    MBErrorCode communicate_remote_ids(MBEntityType type);
+    
+      //! Sort the list of tag information in the parent
+      //! class by name so all procs have them in the same
+      //! order.
+    void sort_tags_by_name();
+    
+      //! Create the node table in the file.
+    MBErrorCode create_node_table( int dimension );
+    
+      //! Communicate with other processors to negotiate 
+      //! the types of elements that will be written
+      //! (the union of the types defined on each proc.)
+    MBErrorCode negotiate_type_list();
+    
+      //! Create tables to hold element connectivity
+    MBErrorCode create_element_tables();
+    
+      //! Create tables to hold element adjacencies.
+    MBErrorCode create_adjacency_tables();
+    
+      //! Identify and set up meshsets that span multiple
+      //! processors.
+      //!\param offsets Output array of three values.
+    MBErrorCode negotiate_shared_meshsets( long* offsets );
+    
+      //! Setup meshsets spanning multiple processors
+    MBErrorCode get_remote_set_data( const MultiProcSetTags::Data& tag,
+                                     RemoteSetData& data,
+                                     long& offset );
+                                     
+      //! Setup interface meshsets spanning multiple processors
+    MBErrorCode get_interface_set_data( RemoteSetData& data, long& offset );
+    
+      //! Determine offsets in contents and children tables for 
+      //! meshsets shared between processors.
+    MBErrorCode negotiate_remote_set_contents( RemoteSetData& data,
+                                               long* offsets );
+    
+      //! Create tables for mesh sets
+    MBErrorCode create_meshset_tables();
+    
+      //! Write tag descriptions and create tables to hold tag data.
+    MBErrorCode create_tag_tables();
+    
+      //! Mark multiple-processor meshsets with correct file Id
+      //! from the set description offset stored in that tag by
+      //! negotiate_shared_meshsets(..).
+    MBErrorCode fix_remote_set_ids( RemoteSetData& data, long first_id );
+      
+      //! Write set descriptions for multi-processor meshsets.
+      //! Virtual function called by non-parallel code after
+      //! the normal (single-processor) meshset descriptions have
+      //! been written.
+    MBErrorCode write_shared_set_descriptions( hid_t table );
+       
+      //! Write set contents for multi-processor meshsets.
+      //! Virtual function called by non-parallel code after
+      //! the normal (single-processor) meshset contents have
+      //! been written.
+    MBErrorCode write_shared_set_contents( hid_t table );
+       
+      //! Write set children for multi-processor meshsets.
+      //! Virtual function called by non-parallel code after
+      //! the normal (single-processor) meshset children have
+      //! been written.
+    MBErrorCode write_shared_set_children( hid_t table );
+       
+      //! Write set parents for multi-processor meshsets.
+      //! Virtual function called by non-parallel code after
+      //! the normal (single-processor) meshset parents have
+      //! been written.
+    MBErrorCode write_shared_set_parents( hid_t table );
+  
+      //! Virtual function overridden from WriteHDF5.  
+      //! Release memory by clearing member lists.
+    MBErrorCode write_finished();
+    
+      //! Remove any remote mesh entities from the passed range.
+    void remove_remote_entities( MBEntityHandle relative, MBRange& range );
+    void remove_remote_entities( MBEntityHandle relative, std::vector<MBEntityHandle>& vect );
+    void remove_remote_sets( MBEntityHandle relative, MBRange& range );
+    void remove_remote_sets( MBEntityHandle relative, std::vector<MBEntityHandle>& vect );
+    
+  private:
+    
+      //! MPI environment
+    int numProc, myRank;
+                                     
+      //! An array of interface mesh which is to be written by
+      //! remote processors.  Indexed by MPI rank (processor number).
+    std::vector<MBRange> remoteMesh;
+    
+      //! Tag names for identifying multi-processor meshsets
+    MultiProcSetTags multiProcSetTags;
+    
+      //! Struct describing a multi-processor meshset
+    struct ParallelSet {
+      MBEntityHandle handle;// set handle on this processor
+      long contentsOffset;  // offset in table at which to write set contents
+      long childrenOffset;  // offset in table at which to write set children
+      long parentsOffset;   // offset in table at which to write set parents
+      long contentsCount;   // total size of set contents (all processors)
+      long childrenCount;   // total number of set children (all processors)
+      long parentsCount;    // total number of set parents (all processors)
+      bool description;     // true if this processor 'owns' the set
+    };
+    
+      //! List of multi-processor meshsets
+    std::list<ParallelSet> parallelSets;
+    
+      //! Vector indexed by MPI rank, containing the list
+      //! of parallel sets that each processor knows about.
+    std::vector<MBRange> cpuParallelSets;
+    
+      //! List of parallel sets "owned" by this processor
+    //MBRange myParallelSets;
+    
+    void printrange( MBRange& );
+};
+
+
+
+class WriteHDF5Parallel::MultiProcSetTags::Data
+{
+  public:
+  Data( const std::string& name ) 
+   : filterTag(name), dataTag(name), useFilterValue(false) {}
+  Data( const std::string& fname, const std::string& dname )
+   : filterTag(fname), dataTag(dname), useFilterValue(false) {}
+  Data( const std::string& fname, const std::string& dname, int fval )
+   : filterTag(fname), dataTag(dname), filterValue(fval), useFilterValue(true) {}
+   
+  std::string filterTag;
+  std::string dataTag;
+  int filterValue;
+  bool useFilterValue;
+};
+
+#endif

Added: MOAB/trunk/parallel/crystal.c
===================================================================
--- MOAB/trunk/parallel/crystal.c	                        (rev 0)
+++ MOAB/trunk/parallel/crystal.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,159 @@
+/*------------------------------------------------------------------------------
+  
+  Crystal Router
+  
+  Accomplishes all-to-all communication in log P msgs per proc
+  The routine is low-level; the format of the input/output is an
+  array of integers, consisting of a sequence of messages with format:
+  
+      target proc
+      source proc
+      m
+      integer
+      integer
+      ...
+      integer  (m integers in total)
+
+  Before crystal_router is called, the source of each message should be
+  set to this proc id; upon return from crystal_router, the target of each
+  message will be this proc id.
+
+  Usage:
+  
+      MPI_Comm comm = ... ;
+      crystal_data crystal;
+      
+      crystal_init(&crystal, comm);  // initialize the data structure
+      // now crystal.id  = this proc
+      // and crystal.num = num of procs
+      
+      // allocate space for at least MAX ints
+      buffer_reserve(&crystal.all->buf, MAX*sizeof(uint));
+      
+      // fill up ((uint*)crystal.all->buf.ptr)[0 ... n-1]
+      // and set crystal.all->n
+      
+      crystal_router(&crystal);
+      
+      // incoming messages available as
+      // ((uint*)crystal.all->buf.ptr)[0 ... crystal.all->n-1]
+      
+      crystal_free(&crystal); // release acquired memory
+
+  ----------------------------------------------------------------------------*/
+
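+/* Example of the packed format (illustrative values): a single message of
+   three payload integers {7,8,9} from proc 2 to proc 5 occupies six uints
+   in crystal.all->buf.ptr,
+
+       5  2  3  7  8  9
+     (target, source, m, payload...)
+
+   with crystal.all->n == 6; multiple messages are simply concatenated. */
+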
+#ifdef USE_MPI
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <mpi.h>
+
+#include "errmem.h"
+#include "types.h"
+
+typedef struct { uint n; buffer buf; } crystal_buf;
+
+typedef struct {
+  crystal_buf buffers[3];
+  crystal_buf *all, *keep, *send;
+  MPI_Comm comm;
+  uint num, id;
+} crystal_data;
+
+void crystal_init(crystal_data *p, MPI_Comm comm)
+{
+  int num,id;
+  buffer_init(&p->buffers[0].buf,1024);
+  buffer_init(&p->buffers[1].buf,1024);
+  buffer_init(&p->buffers[2].buf,1024);
+  p->all=&p->buffers[0];
+  p->keep=&p->buffers[1];
+  p->send=&p->buffers[2];
+  memcpy(&p->comm,&comm,sizeof(MPI_Comm));
+  MPI_Comm_rank(comm,&id ); p->id =id ;
+  MPI_Comm_size(comm,&num); p->num=num;
+}
+
+void crystal_free(crystal_data *p)
+{
+  buffer_free(&p->buffers[0].buf);
+  buffer_free(&p->buffers[1].buf);
+  buffer_free(&p->buffers[2].buf);
+}
+
+static void crystal_partition(crystal_data *p, uint cutoff,
+                              crystal_buf *lo, crystal_buf *hi)
+{
+  const uint *src = p->all->buf.ptr;
+  const uint *end = src+p->all->n;
+  uint *target, *lop, *hip;
+  lo->n=hi->n=0;
+  buffer_reserve(&lo->buf,p->all->n*sizeof(uint));
+  buffer_reserve(&hi->buf,p->all->n*sizeof(uint));
+  lop = lo->buf.ptr, hip = hi->buf.ptr;
+  while(src!=end) {
+    uint chunk_len = 3 + src[2];
+    if(src[0]<cutoff) target=lop,lo->n+=chunk_len,lop+=chunk_len;
+                 else target=hip,hi->n+=chunk_len,hip+=chunk_len;
+    memcpy(target,src,chunk_len*sizeof(uint));
+    src+=chunk_len;
+  }
+}
+
+static void crystal_send(crystal_data *p, uint target, int recvn)
+{
+  MPI_Request req[3];
+  MPI_Status status[3];
+  uint count[2]={0,0},sum,*recv[2];
+  crystal_buf *t;
+  int i;
+  
+  MPI_Isend(&p->send->n,sizeof(uint),MPI_UNSIGNED_CHAR,
+            target  ,p->id   ,p->comm,&req[  0]);
+  for(i=0;i<recvn;++i)
+  MPI_Irecv(&count[i]  ,sizeof(uint),MPI_UNSIGNED_CHAR,
+            target+i,target+i,p->comm,&req[i+1]);
+  MPI_Waitall(recvn+1,req,status);
+  sum = p->keep->n;
+  for(i=0;i<recvn;++i) sum+=count[i];
+  buffer_reserve(&p->keep->buf,sum*sizeof(uint));
+  recv[0]=p->keep->buf.ptr;
+  recv[0]+=p->keep->n;
+  recv[1]=recv[0]+count[0];
+  p->keep->n=sum;
+
+  MPI_Isend(p->send->buf.ptr,p->send->n*sizeof(uint),
+            MPI_UNSIGNED_CHAR,target,p->id,p->comm,&req[0]);
+  if(recvn) {
+    MPI_Irecv(recv[0],count[0]*sizeof(uint),MPI_UNSIGNED_CHAR,
+              target,target,p->comm,&req[1]);
+    if(recvn==2)
+    MPI_Irecv(recv[1],count[1]*sizeof(uint),MPI_UNSIGNED_CHAR,
+              target+1,target+1,p->comm,&req[2]);
+  }
+  MPI_Waitall(recvn+1,req,status);
+
+  t=p->all,p->all=p->keep,p->keep=t;
+}
+
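+/* Recursively halve the processor range [bl, bl+n): each proc partitions its
+   messages by target into a half it keeps and a half it sends, exchanges the
+   "send" half with the partner nl ranks away, and recurses on its own half.
+   When n is odd, the last proc of the lower half receives from two partners
+   (recvn==2) while the unmatched last proc of the upper half only sends
+   (recvn==0). */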
+void crystal_router(crystal_data *p)
+{
+  uint bl=0, bh, n=p->num, nl, target;
+  int recvn;
+  crystal_buf *lo, *hi;
+  while(n>1) {
+    nl = n/2, bh = bl+nl;
+    if(p->id<bh)
+      target=p->id+nl,recvn=(n&1 && p->id==bh-1)?2:1   ,lo=p->keep,hi=p->send;
+    else
+      target=p->id-nl,recvn=(target==bh)?(--target,0):1,hi=p->keep,lo=p->send;
+    crystal_partition(p,bh,lo,hi);
+    crystal_send(p,target,recvn);
+    if(p->id<bh) n=nl; else n-=nl,bl=bh;
+  }
+}
+
+#endif
+

Added: MOAB/trunk/parallel/crystal.h
===================================================================
--- MOAB/trunk/parallel/crystal.h	                        (rev 0)
+++ MOAB/trunk/parallel/crystal.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,26 @@
+#ifndef CRYSTAL_H
+#define CRYSTAL_H
+
+/* requires <mpi.h>, "types.h", and "errmem.h" */
+#if !defined(TYPES_H) || !defined(ERRMEM_H)
+#warning "crystal.h" requires "types.h" and "errmem.h"
+#endif
+
+#ifdef USE_MPI
+
+typedef struct { uint n; buffer buf; } crystal_buf;
+
+typedef struct {
+  crystal_buf buffers[3];
+  crystal_buf *all, *keep, *send;
+  MPI_Comm comm;
+  uint num, id;
+} crystal_data;
+
+void crystal_init(crystal_data *, MPI_Comm);
+void crystal_free(crystal_data *);
+void crystal_router(crystal_data *);
+
+#endif
+
+#endif

Added: MOAB/trunk/parallel/errmem.c
===================================================================
--- MOAB/trunk/parallel/errmem.c	                        (rev 0)
+++ MOAB/trunk/parallel/errmem.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,13 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+
+void fail(const char *fmt, ...)
+{
+  va_list ap;
+  va_start(ap, fmt);
+  vfprintf(stderr, fmt, ap);
+  va_end(ap);
+  exit(1);
+}
+

Added: MOAB/trunk/parallel/errmem.h
===================================================================
--- MOAB/trunk/parallel/errmem.h	                        (rev 0)
+++ MOAB/trunk/parallel/errmem.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,64 @@
+#ifndef ERRMEM_H
+#define ERRMEM_H
+
+/* requires:
+     <stdlib.h> for malloc, calloc, realloc, free
+*/
+
+/*--------------------------------------------------------------------------
+   Error Reporting
+   Memory Allocation Wrappers to Catch Out-of-memory
+  --------------------------------------------------------------------------*/
+
+#ifdef __GNUC__
+void fail(const char *fmt, ...) __attribute__ ((noreturn));
+#else
+void fail(const char *fmt, ...);
+#endif
+
+#if 0
+{}
+#endif
+
+static void *smalloc(size_t size, const char *file)
+{
+  void *res = malloc(size);
+  if(!res && size) fail("%s: allocation of %d bytes failed\n",file,(int)size);
+  return res;
+}
+
+static void *srealloc(void *ptr, size_t size, const char *file)
+{
+  void *res = realloc(ptr, size);
+  if(!res && size) fail("%s: allocation of %d bytes failed\n",file,(int)size);
+  return res;
+}
+
+#define tmalloc(type, count) \
+  ((type*) smalloc((count)*sizeof(type),__FILE__) )
+#define tcalloc(type, count) \
+  ((type*) scalloc((count),sizeof(type),__FILE__) )
+#define trealloc(type, ptr, count) \
+  ((type*) srealloc((ptr),(count)*sizeof(type),__FILE__) )
+
+typedef struct { size_t size; void *ptr; } buffer;
+static void buffer_init_(buffer *b, size_t size, const char *file)
+{
+  b->size=size, b->ptr=smalloc(size,file);
+}
+static void buffer_reserve_(buffer *b, size_t min, const char *file)
+{
+  size_t size = b->size;
+  if(size<min) {
+    size+=size/2+1;
+    if(size<min) size=min;
+    b->ptr=srealloc(b->ptr,size,file);
+  }
+}
+static void buffer_free(buffer *b) { free(b->ptr); }
+
+#define buffer_init(b,size) buffer_init_(b,size,__FILE__)
+#define buffer_reserve(b,min) buffer_reserve_(b,min,__FILE__)
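+
+/* Illustrative usage of the buffer type:
+     buffer b;
+     buffer_init(&b, 1024);     // allocate an initial 1024 bytes
+     buffer_reserve(&b, 4096);  // grow geometrically to at least 4096 bytes
+     buffer_free(&b);
+*/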
+
+#endif
+

Added: MOAB/trunk/parallel/fcrystal.c
===================================================================
--- MOAB/trunk/parallel/fcrystal.c	                        (rev 0)
+++ MOAB/trunk/parallel/fcrystal.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,105 @@
+/*------------------------------------------------------------------------------
+  
+  FORTRAN interface for crystal router
+  
+  integer h, np
+  MPI_Comm comm
+  call crystal_new(h,comm,np)  ! set h to handle to new instance
+  ! it is a runtime error if MPI_Comm_size gives a value different than np
+  call crystal_done(h)         ! release instance
+
+  integer*? vi(mi,max)         ! these integer and real types
+  integer*? vl(ml,max)         !   better match up with what is
+  real      vr(mr,max)         !   in "types.h" 
+  call crystal_transfer(h,n,max,vi,mi,vl,ml,vr,mr,p)
+  
+  - this treats  { vi(:,i), vl(:,i), vr(:,i) } , i in [1 ... n]
+      as a list of n tuples with mi integers, ml long integers, and mr reals each
+  - the parameter p indicates that the tuple
+      { vi(:,i), vl(:,i), vr(:,i) } should be sent to proc vi(p,i),
+      and that on return, vi(p,j) will be the source proc of tuple j
+  - n will be set to the number of tuples that came in
+  - if more tuples come in than max, n will be set to max+1,
+      although only max tuples were stored (the rest are lost)
+
+  ----------------------------------------------------------------------------*/
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#ifdef MPI
+#  include <mpi.h>
+#endif
+
+#include "fname.h"
+#include "errmem.h"
+#include "types.h"
+#ifdef MPI
+#  include "crystal.h"
+#  include "tuple_list.h"
+#  include "transfer.h"
+#else
+   typedef void MPI_Comm;
+#endif
+
+#define crystal_new      FORTRAN_NAME(crystal_new,CRYSTAL_NEW)
+#define crystal_done     FORTRAN_NAME(crystal_done,CRYSTAL_DONE)
+#define crystal_transfer FORTRAN_NAME(crystal_transfer,CRYSTAL_TRANSFER)
+
+#ifdef MPI
+static crystal_data **handle=0;
+static int n=0, max=0;
+#endif
+
+void crystal_new(sint *h, const MPI_Comm *comm, const sint *np)
+{
+#ifdef MPI
+  MPI_Comm local_com;
+  if(n==max) max+=max/2+1,handle=trealloc(crystal_data*,handle,max);
+  handle[n] = tmalloc(crystal_data,1);
+  MPI_Comm_dup(*comm,&local_com);
+  crystal_init(handle[n],local_com);
+  if(*np!=(sint)handle[n]->num)
+    fail("crystal_new: passed P=%d, but MPI_Comm_size gives P=%d\n",
+         *np,handle[n]->num);
+  *h=n++;
+#else
+  if(*np!=1)
+    fail("crystal_new: passed P=%d, but not compiled with -DMPI\n",*np);
+  *h=-1;
+#endif
+}
+
+#ifdef MPI
+crystal_data *fcrystal_handle(sint h)
+{
+  if(h<0 || h>=n || handle[h]==0) failwith("invalid crystal router handle");
+  return handle[h];
+}
+#endif
+
+void crystal_done(sint *h)
+{
+#ifdef MPI
+  crystal_data *p = fcrystal_handle(*h);
+  handle[*h]=0;
+  MPI_Comm_free(&p->comm);
+  crystal_free(p);
+  free(p);
+#endif  
+}
+
+void crystal_transfer(const sint *h, sint *n, const sint *max,
+                      sint  vi[], const sint *mi,
+                      slong vl[], const sint *ml,
+                      real  vr[], const sint *mr,
+                      const sint *p)
+{
+#ifdef MPI
+  crystal_data *crystal = fcrystal_handle(*h);
+  tuple_list tl = { *mi, *ml, *mr, *n, *max, vi, vl, vr };
+  transfer(0,&tl,*p,crystal);
+  *n = tl.n;
+#endif
+}
+

Added: MOAB/trunk/parallel/gs.c
===================================================================
--- MOAB/trunk/parallel/gs.c	                        (rev 0)
+++ MOAB/trunk/parallel/gs.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,563 @@
+/* compile-time settings:
+
+   FORTRAN naming convention
+     default      cpgs_setup, etc.
+     -DUPCASE     CPGS_SETUP, etc.
+     -DUNDERSCORE cpgs_setup_, etc.
+
+   -DMPI             parallel version (sequential otherwise)
+   -DCRYSTAL_STATIC  avoid some message exchange at the risk of
+                     crashing b/c of insufficient buffer size
+   
+   -DINITIAL_BUFFER_SIZE=expression
+      ignored unless CRYSTAL_STATIC is defined.
+      arithmetic expression controlling the initial buffer size for the crystal
+      router; this needs to be large enough to hold the intermediate messages
+      during all stages of the crystal router
+      
+      variables that can be used in expression include
+         num   - the number of processors
+         n     - the length of the global index array
+
+*/
+
+/* default for INITIAL_BUFFER_SIZE */
+#ifdef CRYSTAL_STATIC
+#  ifndef INITIAL_BUFFER_SIZE
+#    define INITIAL_BUFFER_SIZE 2*(3*num+n*9)
+#  endif
+#endif
+
+/* FORTRAN usage:
+
+   integer hc, np
+   call crystal_new(hc,comm,np)  ! get a crystal router handle (see fcrystal.c)
+
+   integer hgs
+   integer n, max_vec_dim
+   integer*? global_index_array(1:n) ! type corresponding to slong in "types.h"
+
+   call cpgs_setup(hgs,hc,global_index_array,n,max_vec_dim)
+     sets hgs to new handle
+
+   !ok to call crystal_done(hc) here, or any later time
+
+   call cpgs_op(hgs, u, op)
+     integer handle, op : 1-add, 2-multiply, 3-min, 4-max
+     real    u(1:n) - same layout as global_index_array provided to cpgs_setup
+
+   call cpgs_op_vec(hgs, u, d, op)
+     integer op : 1-add, 2-multiply, 3-min, 4-max
+     integer d    <= max_vec_dim
+     real    u(1:d, 1:n) - vector components for each node stored together
+
+   call cpgs_op_many(hgs, u1, u2, u3, u4, u5, u6, d, op)
+     integer op : 1-add, 2-multiply, 3-min, 4-max
+     integer d : in {1,2,3,4,5,6}, <= max_vec_dim
+     real    u1(1:n), u2(1:n), u3(1:n), etc.
+     
+     same effect as: call cpgs_op(hgs, u1, op)
+                     if(d.gt.1) call cpgs_op(hgs, u2, op)
+                     if(d.gt.2) call cpgs_op(hgs, u3, op)
+                     etc.
+     with possibly some savings as fewer messages are exchanged
+   
+   call cpgs_free(hgs)
+*/
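+
+/* C-level sketch of the same interface (illustrative; 'labels' is an
+   application-supplied ulong array of length n, and 'crystal' an
+   initialized crystal_data*):
+
+     gs_data *gs = gs_data_setup(n, labels, 1, crystal);
+     gs_op(u, GS_OP_ADD, gs);   // sum u[] across procs sharing each label
+     gs_data_free(gs);
+
+   GS_OP_ADD etc. are defined in "gs.h". */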
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <math.h>
+#ifdef USE_MPI
+#  include <mpi.h>
+#endif
+
+#include "errmem.h"     
+#include "types.h"
+#include "minmax.h"
+#include "tuple_list.h"
+#ifdef USE_MPI
+#  include "crystal.h"  
+#  include "transfer.h"
+
+typedef struct {
+  uint np;           /* number of processors to communicate with          */
+  uint *target;      /* int target[np]: array of processor ids to comm w/ */
+  uint *nshared;     /* nshared[i] = number of points shared w/ target[i] */
+  uint *sh_ind;      /* list of shared point indices                      */
+  MPI_Request *reqs; /* pre-allocated for MPI calls                       */
+  real *buf;         /* pre-allocated buffer to receive data              */
+  uint maxv;         /* maximum vector size                               */
+} nonlocal_info;
+
+#else
+   typedef void crystal_data;
+#endif
+
+typedef struct {
+  sint *local_cm; /* local condense map */
+#ifdef USE_MPI
+  nonlocal_info *nlinfo;
+  MPI_Comm comm;
+#endif
+} gs_data;
+
+#define OP_ADD 1
+#define OP_MUL 2
+#define OP_MIN 3
+#define OP_MAX 4
+#define OP_BPR 5
+
+/*--------------------------------------------------------------------------
+   Local Execution Phases
+  --------------------------------------------------------------------------*/
+
+#define DO_SET(a,b) b=a
+#define DO_ADD(a,b) a+=b
+#define DO_MUL(a,b) a*=b
+#define DO_MIN(a,b) if(b<a) a=b
+#define DO_MAX(a,b) if(b>a) a=b
+#define DO_BPR(a,b) \
+  do { uint a_ = a; uint b_ = b; \
+       for(;;) { if(a_<b_) b_>>=1; else if(b_<a_) a_>>=1; else break; } \
+       a = a_; \
+     } while(0)
+
+
+#define LOOP(op) do { \
+  sint i,j; \
+  while((i=*cm++) != -1) \
+    while((j=*cm++) != -1) \
+      op(u[i],u[j]); \
+} while(0)
+  
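+/* local_cm encodes groups of local indices that share a label: each group
+   lists its indices followed by -1, and the whole map ends with an extra -1.
+   local_condense accumulates the op into the first index of each group;
+   local_uncondense copies that value back to the rest of the group. */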
+static void local_condense(real *u, int op, const sint *cm)
+{
+  switch(op) {
+    case OP_ADD: LOOP(DO_ADD); break;
+    case OP_MUL: LOOP(DO_MUL); break;
+    case OP_MIN: LOOP(DO_MIN); break;
+    case OP_MAX: LOOP(DO_MAX); break;
+    case OP_BPR: LOOP(DO_BPR); break;
+  }
+}
+
+static void local_uncondense(real *u, const sint *cm)
+{
+  LOOP(DO_SET);
+}
+
+#undef LOOP
+
+#define LOOP(op) do { \
+  sint i,j,k; \
+  while((i=*cm++) != -1) { \
+    real *pi=u+n*i; \
+    while((j=*cm++) != -1) { \
+      real *pj=u+n*j; \
+      for(k=n;k;--k) { op(*pi,*pj); ++pi, ++pj; } \
+    } \
+  } \
+} while(0)
+
+static void local_condense_vec(real *u, uint n, int op, const sint *cm)
+{
+  switch(op) {
+    case OP_ADD: LOOP(DO_ADD); break;
+    case OP_MUL: LOOP(DO_MUL); break;
+    case OP_MIN: LOOP(DO_MIN); break;
+    case OP_MAX: LOOP(DO_MAX); break;
+    case OP_BPR: LOOP(DO_BPR); break;
+  }
+}
+
+static void local_uncondense_vec(real *u, uint n, const sint *cm)
+{
+  LOOP(DO_SET);
+}
+
+#undef LOOP
+
+/*--------------------------------------------------------------------------
+   Non-local Execution Phases
+  --------------------------------------------------------------------------*/
+
+#ifdef USE_MPI
+
+static nonlocal_info *nlinfo_alloc(uint np, uint count, uint maxv)
+{
+  nonlocal_info *info = tmalloc(nonlocal_info,1);
+  info->np = np;
+  info->target = tmalloc(uint,2*np+count);
+  info->nshared = info->target + np;
+  info->sh_ind = info->nshared + np;
+  info->reqs = tmalloc(MPI_Request,2*np);
+  info->buf = tmalloc(real,2*count*maxv);
+  info->maxv = maxv;
+  return info;
+}
+
+static void nlinfo_free(nonlocal_info *info)
+{
+  free(info->buf);
+  free(info->reqs);
+  free(info->target);
+  free(info);
+}
+
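+/* For each partner proc: pack the values of the points shared with it into
+   buf and Isend them, post an Irecv for the partner's values into the region
+   of buf after the packed send data, wait on all requests, then fold the
+   received values into u with the requested op. */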
+static void nonlocal(real *u, int op, const nonlocal_info *info, MPI_Comm comm)
+{
+  MPI_Status status;
+  uint np = info->np, i;
+  MPI_Request *reqs = info->reqs;
+  uint *targ = info->target;
+  uint *nshared = info->nshared;
+  uint *sh_ind = info->sh_ind;
+  uint id;
+  real *buf = info->buf, *start;
+  { int i; MPI_Comm_rank(comm,&i); id=i; }
+  for(i=0;i<np;++i) {
+    uint c = nshared[i];
+    start = buf;
+    for(;c;--c) *buf++ = u[*sh_ind++];
+    MPI_Isend(start,nshared[i]*sizeof(real),MPI_UNSIGNED_CHAR,
+              targ[i],id,comm,reqs++);
+  }
+  start = buf;
+  for(i=0;i<np;++i) {
+    MPI_Irecv(start,nshared[i]*sizeof(real),MPI_UNSIGNED_CHAR,
+              targ[i],targ[i],comm,reqs++);
+    start+=nshared[i];
+  }
+  for(reqs=info->reqs,i=np*2;i;--i) MPI_Wait(reqs++,&status);
+  sh_ind = info->sh_ind;
+# define LOOP(OP) do { \
+    for(i=0;i<np;++i) { \
+      uint c; \
+      for(c=nshared[i];c;--c) { OP(u[*sh_ind],*buf); ++sh_ind, ++buf; } \
+    } \
+  } while(0)
+  switch(op) {
+    case OP_ADD: LOOP(DO_ADD); break;
+    case OP_MUL: LOOP(DO_MUL); break;
+    case OP_MIN: LOOP(DO_MIN); break;
+    case OP_MAX: LOOP(DO_MAX); break;
+    case OP_BPR: LOOP(DO_BPR); break;
+  }
+# undef LOOP
+}
+
+static void nonlocal_vec(real *u, uint n, int op,
+                         const nonlocal_info *info, MPI_Comm comm)
+{
+  MPI_Status status;
+  uint np = info->np, i;
+  MPI_Request *reqs = info->reqs;
+  uint *targ = info->target;
+  uint *nshared = info->nshared;
+  uint *sh_ind = info->sh_ind;
+  uint id;
+  real *buf = info->buf, *start;
+  uint size = n*sizeof(real);
+  { int i; MPI_Comm_rank(comm,&i); id=i; }
+  for(i=0;i<np;++i) {
+    uint ns=nshared[i], c=ns;
+    start = buf;
+    for(;c;--c) memcpy(buf,u+n*(*sh_ind++),size), buf+=n;
+    MPI_Isend(start,ns*size,MPI_UNSIGNED_CHAR,targ[i],id,comm,reqs++);
+  }
+  start = buf;
+  for(i=0;i<np;++i) {
+    int nsn=n*nshared[i];
+    MPI_Irecv(start,nsn*size,MPI_UNSIGNED_CHAR,targ[i],targ[i],comm,reqs++);
+    start+=nsn;
+  }
+  for(reqs=info->reqs,i=np*2;i;--i) MPI_Wait(reqs++,&status);
+  sh_ind = info->sh_ind;
+# define LOOP(OP) do { \
+    for(i=0;i<np;++i) { \
+      uint c,j; \
+      for(c=nshared[i];c;--c) { \
+        real *uu=u+n*(*sh_ind++); \
+        for(j=n;j;--j) { OP(*uu,*buf); ++uu, ++buf; } \
+      } \
+    } \
+  } while(0)
+  switch(op) {
+    case OP_ADD: LOOP(DO_ADD); break;
+    case OP_MUL: LOOP(DO_MUL); break;
+    case OP_MIN: LOOP(DO_MIN); break;
+    case OP_MAX: LOOP(DO_MAX); break;
+    case OP_BPR: LOOP(DO_BPR); break;
+  }
+# undef LOOP
+}
+
+static void nonlocal_many(real **u, uint n, int op,
+                          const nonlocal_info *info, MPI_Comm comm)
+{
+  MPI_Status status;
+  uint np = info->np, i;
+  MPI_Request *reqs = info->reqs;
+  uint *targ = info->target;
+  uint *nshared = info->nshared;
+  uint *sh_ind = info->sh_ind;
+  uint id;
+  real *buf = info->buf, *start;
+  { int i; MPI_Comm_rank(comm,&i); id=i; }
+  for(i=0;i<np;++i) {
+    uint c, j, ns = nshared[i];
+    start = buf;
+    for(j=0;j<n;++j) {real*uu=u[j]; for(c=0;c<ns;++c) *buf++=uu[sh_ind[c]];}
+    sh_ind+=ns;
+    MPI_Isend(start,n*ns*sizeof(real),MPI_UNSIGNED_CHAR,targ[i],id,comm,reqs++);
+  }
+  start = buf;
+  for(i=0;i<np;++i) {
+    int nsn = n*nshared[i];
+    MPI_Irecv(start,nsn*sizeof(real),MPI_UNSIGNED_CHAR,
+              targ[i],targ[i],comm,reqs++);
+    start+=nsn;
+  }
+  for(reqs=info->reqs,i=np*2;i;--i) MPI_Wait(reqs++,&status);
+  sh_ind = info->sh_ind;
+# define LOOP(OP) do { \
+    for(i=0;i<np;++i) { \
+      uint c,j,ns=nshared[i]; \
+      for(j=0;j<n;++j) { \
+        real *uu=u[j]; \
+        for(c=0;c<ns;++c) { OP(uu[sh_ind[c]],*buf); ++buf; } \
+      } \
+      sh_ind+=ns; \
+    } \
+  } while(0)
+  switch(op) {
+    case OP_ADD: LOOP(DO_ADD); break;
+    case OP_MUL: LOOP(DO_MUL); break;
+    case OP_MIN: LOOP(DO_MIN); break;
+    case OP_MAX: LOOP(DO_MAX); break;
+    case OP_BPR: LOOP(DO_BPR); break;
+  }
+# undef LOOP
+}
+#endif
+
+/*--------------------------------------------------------------------------
+   Combined Execution
+  --------------------------------------------------------------------------*/
+
+void gs_op(real *u, int op, const gs_data *data)
+{
+  local_condense(u,op,data->local_cm);
+#ifdef USE_MPI
+  nonlocal(u,op,data->nlinfo,data->comm);
+#endif
+  local_uncondense(u,data->local_cm);
+}
+
+void gs_op_vec(real *u, uint n, int op, const gs_data *data)
+{
+#ifdef USE_MPI
+  if(n>data->nlinfo->maxv)
+    fail("%s: initialized with max vec size = %d,"
+         " but called with vec size = %d\n",__FILE__,data->nlinfo->maxv,n);
+#endif
+  local_condense_vec(u,n,op,data->local_cm);
+#ifdef USE_MPI
+  nonlocal_vec(u,n,op,data->nlinfo,data->comm);
+#endif
+  local_uncondense_vec(u,n,data->local_cm);
+}
+
+void gs_op_many(real **u, uint n, int op, const gs_data *data)
+{
+  uint i;
+#ifdef USE_MPI
+  if(n>data->nlinfo->maxv)
+    fail("%s: initialized with max vec size = %d,"
+         " but called with vec size = %d\n",__FILE__,data->nlinfo->maxv,n);
+#endif
+  for(i=0;i<n;++i) local_condense(u[i],op,data->local_cm);
+#ifdef USE_MPI
+  nonlocal_many(u,n,op,data->nlinfo,data->comm);
+#endif
+  for(i=0;i<n;++i) local_uncondense(u[i],data->local_cm);
+}
+
+/*--------------------------------------------------------------------------
+   Setup
+  --------------------------------------------------------------------------*/
+
+gs_data *gs_data_setup(uint n, const ulong *label,
+                       uint maxv, crystal_data *crystal)
+{
+  gs_data *data=tmalloc(gs_data,1);
+  tuple_list nonzero, primary;
+  const int nz_index=0, nz_size=1, nz_label=0;
+  const int pr_nzindex=0, pr_index=1, pr_count=2, pr_size=3, pr_label=0;
+#ifdef USE_MPI
+  tuple_list shared;
+  const int pr_proc=0;
+  const int sh_dproc=0, sh_proc2=1, sh_index=2, sh_size=3, sh_label=0;
+#else
+  buffer buf;
+#endif
+#ifdef USE_MPI
+  MPI_Comm_dup(crystal->comm,&data->comm);
+#else
+  buffer_init(&buf,1024);
+#endif
+
+  /* construct list of nonzeros: (index ^, label) */
+  tuple_list_init_max(&nonzero,nz_size,1,0,n);
+  {
+    uint i; sint *nzi = nonzero.vi; slong *nzl = nonzero.vl;
+    for(i=0;i<n;++i)
+      if(label[i]!=0) 
+        nzi[nz_index]=i,
+        nzl[nz_label]=label[i],
+        nzi+=nz_size, ++nzl, nonzero.n++;
+  }
+
+  /* sort nonzeros by label: (index ^2, label ^1) */
+#ifndef USE_MPI
+  tuple_list_sort(&nonzero,nz_size+nz_label,&buf);
+#else
+  tuple_list_sort(&nonzero,nz_size+nz_label,&crystal->all->buf);
+#endif
+
+  /* build list of unique labels w/ lowest associated index:
+     (index in nonzero ^, primary (lowest) index in label, count, label) */
+  tuple_list_init_max(&primary,pr_size,1,0,nonzero.n);
+  {
+    uint i;
+    sint  *nzi=nonzero.vi, *pi=primary.vi;
+    slong *nzl=nonzero.vl, *pl=primary.vl;
+    sint last=-1;
+    for(i=0;i<nonzero.n;++i,nzi+=nz_size,++nzl) {
+      if(nzl[nz_label]==last) {
+        ++pi[-pr_size+pr_count];
+        continue;
+      }
+      last=nzl[nz_label];
+      pi[pr_nzindex]=i;
+      pi[pr_index]=nzi[nz_index];
+      pl[pr_label]=nzl[nz_label];
+      pi[pr_count]=1;
+      pi+=pr_size, ++pl; primary.n++;
+    }
+  }
+
+  /* calculate size of local condense map */
+  {
+    uint i, count=1; sint *pi=primary.vi;
+    for(i=primary.n;i;--i,pi+=pr_size)
+      if(pi[pr_count]>1) count+=pi[pr_count]+1;
+    data->local_cm = tmalloc(sint,count);
+  }
+
+  /* sort unique labels by primary index:
+     (nonzero index ^2, primary index ^1, count, label ^2) */
+#ifndef USE_MPI
+  tuple_list_sort(&primary,pr_index,&buf);
+  buffer_free(&buf);
+#else
+  tuple_list_sort(&primary,pr_index,&crystal->all->buf);
+#endif
+  
+  /* construct local condense map */
+  {
+    uint i, n; sint *pi=primary.vi;
+    sint *cm = data->local_cm;
+    for(i=primary.n;i;--i,pi+=pr_size) if((n=pi[pr_count])>1) {
+      uint j; sint *nzi=nonzero.vi+nz_size*pi[pr_nzindex];
+      for(j=n;j;--j,nzi+=nz_size) *cm++ = nzi[nz_index];
+      *cm++ = -1;
+    }
+    *cm++ = -1;
+  }
+  tuple_list_free(&nonzero);
+  
+#ifndef USE_MPI
+  tuple_list_free(&primary);
+#else
+  /* assign work proc by label modulo np */
+  {
+    uint i; sint *pi=primary.vi; slong *pl=primary.vl;
+    for(i=primary.n;i;--i,pi+=pr_size,++pl)
+      pi[pr_proc]=pl[pr_label]%crystal->num;
+  }
+  gs_transfer(1,&primary,pr_proc,crystal); /* transfer to work procs */
+  /* primary: (source proc, index on src, useless, label) */
+  /* sort by label */
+  tuple_list_sort(&primary,pr_size+pr_label,&crystal->all->buf);
+  /* add sentinel to primary list */
+  if(primary.n==primary.max) tuple_list_grow(&primary);
+  primary.vl[primary.n] = -1;
+  /* construct shared list: (proc1, proc2, index1, label) */
+  tuple_list_init_max(&shared,sh_size,1,0,primary.n);
+  {
+    sint *pi1=primary.vi, *si=shared.vi;
+    slong lbl, *pl1=primary.vl, *sl=shared.vl;
+    for(;(lbl=pl1[pr_label])!=-1;pi1+=pr_size,++pl1) {
+      sint *pi2=pi1+pr_size; slong *pl2=pl1+1;
+      for(;pl2[pr_label]==lbl;pi2+=pr_size,++pl2) {
+        if(shared.n+2>shared.max)
+          tuple_list_grow(&shared),
+          si=shared.vi+shared.n*sh_size, sl=shared.vl+shared.n;
+        si[sh_dproc] = pi1[pr_proc];
+        si[sh_proc2] = pi2[pr_proc];
+        si[sh_index] = pi1[pr_index];
+        sl[sh_label] = lbl;
+        si+=sh_size, ++sl, shared.n++;
+        si[sh_dproc] = pi2[pr_proc];
+        si[sh_proc2] = pi1[pr_proc];
+        si[sh_index] = pi2[pr_index];
+        sl[sh_label] = lbl;
+        si+=sh_size, ++sl, shared.n++;
+      }
+    }
+  }
+  tuple_list_free(&primary);
+  gs_transfer(1,&shared,sh_dproc,crystal); /* transfer to dest procs */
+  /* shared list: (useless, proc2, index, label) */
+  /* sort by label */
+  tuple_list_sort(&shared,sh_size+sh_label,&crystal->all->buf);
+  /* sort by partner proc */
+  tuple_list_sort(&shared,sh_proc2,&crystal->all->buf);
+  /* count partner procs */
+  {
+    uint i, count=0; sint proc=-1,*si=shared.vi;
+    for(i=shared.n;i;--i,si+=sh_size)
+      if(si[sh_proc2]!=proc) ++count, proc=si[sh_proc2];
+    data->nlinfo = nlinfo_alloc(count,shared.n,maxv);
+  }
+  /* construct non-local info */
+  {
+    uint i; sint proc=-1,*si=shared.vi;
+    uint *target  = data->nlinfo->target;
+    uint *nshared = data->nlinfo->nshared;
+    uint *sh_ind  = data->nlinfo->sh_ind;
+    for(i=shared.n;i;--i,si+=sh_size) {
+      if(si[sh_proc2]!=proc)
+        proc=si[sh_proc2], *target++ = proc, *nshared++ = 0;
+      ++nshared[-1], *sh_ind++=si[sh_index];
+    }
+  }
+  tuple_list_free(&shared);
+#endif
+  return data;
+}
+
+void gs_data_free(gs_data *data)
+{
+  free(data->local_cm);
+#ifdef USE_MPI
+  nlinfo_free(data->nlinfo);
+  MPI_Comm_free(&data->comm);
+#endif
+  free(data);
+}
+

Added: MOAB/trunk/parallel/gs.h
===================================================================
--- MOAB/trunk/parallel/gs.h	                        (rev 0)
+++ MOAB/trunk/parallel/gs.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,54 @@
+#ifndef GS_H
+#define GS_H
+
+/* requires "types.h", and, when MPI is defined, "crystal.h" */
+#if !defined(TYPES_H) || ( defined(MPI) && !defined(CRYSTAL_H) )
+#warning "gs.h" requires "types.h" and "crystal.h"
+#endif
+
+// typedef struct gs_data_ gs_data;
+
+#ifndef MPI
+#  define crystal_data void
+#endif
+
+#ifdef USE_MPI
+typedef struct {
+  uint np;           /* number of processors to communicate with          */
+  uint *target;      /* int target[np]: array of processor ids to comm w/ */
+  uint *nshared;     /* nshared[i] = number of points shared w/ target[i] */
+  uint *sh_ind;      /* list of shared point indices                      */
+  MPI_Request *reqs; /* pre-allocated for MPI calls                       */
+  real *buf;         /* pre-allocated buffer to receive data              */
+  uint maxv;         /* maximum vector size                               */
+} nonlocal_info;
+#endif
+
+typedef struct {
+  sint *local_cm; /* local condense map */
+#ifdef USE_MPI
+  nonlocal_info *nlinfo;
+  MPI_Comm comm;
+#endif
+} gs_data;
+
+gs_data *gs_data_setup(uint n, const ulong *label,
+                       uint maxv, crystal_data *crystal);
+
+#ifndef MPI
+#  undef crystal_data
+#endif
+
+void gs_data_free(gs_data *data);
+void gs_op(real *u, int op, const gs_data *data);
+void gs_op_vec(real *u, uint n, int op, const gs_data *data);
+void gs_op_many(real **u, uint n, int op, const gs_data *data);
+
+#define GS_OP_ADD 1
+#define GS_OP_MUL 2
+#define GS_OP_MIN 3
+#define GS_OP_MAX 4
+#define GS_OP_BPR 5
+
+#endif
+

Added: MOAB/trunk/parallel/minmax.h
===================================================================
--- MOAB/trunk/parallel/minmax.h	                        (rev 0)
+++ MOAB/trunk/parallel/minmax.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,42 @@
+#ifndef MINMAX_H
+#define MINMAX_H
+
+/* requires <math.h> and "types.h" */
+
+#ifndef TYPES_H
+#warning "minmax.h" depends on "types.h"
+#endif
+
+/*--------------------------------------------------------------------------
+   Min, Max, Norm
+  --------------------------------------------------------------------------*/
+
+#define DECLMINMAX(type, prefix) \
+static type prefix##min_2(type a, type b) { return b<a?b:a; } \
+static type prefix##max_2(type a, type b) { return a>b?a:b; } \
+static void prefix##minmax_2(type *min, type *max, type a, type b) \
+{ if(b<a) *min=b, *max=a; else *min=a, *max=b; } \
+static type prefix##min_3(type a, type b, type c) \
+{ return b<a?(c<b?c:b):(c<a?c:a); } \
+static type prefix##max_3(type a, type b, type c) \
+{ return a>b?(a>c?a:c):(b>c?b:c); } \
+static void prefix##minmax_3(type *min, type *max, type a, type b, type c) \
+{ if(b<a)  *min=prefix##min_2(b,c), *max=prefix##max_2(a,c); \
+  else    *min=prefix##min_2(a,c), *max=prefix##max_2(b,c); }
+
+DECLMINMAX(int, i)
+DECLMINMAX(unsigned, u)
+DECLMINMAX(real, r)
+#undef DECLMINMAX
+
+static real r1norm_1(real a) { return fabsr(a); }
+static real r1norm_2(real a, real b) { return fabsr(a)+fabsr(b); }
+static real r1norm_3(real a, real b, real c)
+{ return fabsr(a)+fabsr(b)+fabsr(c); }
+static real r2norm_1(real a) { return sqrtr(a*a); }
+static real r2norm_2(real a, real b) { return sqrtr(a*a+b*b); }
+static real r2norm_3(real a, real b, real c)
+{ return sqrtr(a*a+b*b+c*c); }
+
+#endif
+
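The DECLMINMAX instantiations above generate small static helpers named by prefix and arity (imin_2, umax_3, rminmax_2, and so on). A trivial sketch of a call site:

  #include <math.h>
  #include "types.h"
  #include "minmax.h"

  /* umax_3 is generated by DECLMINMAX(unsigned, u) above */
  unsigned widest(unsigned a, unsigned b, unsigned c)
  {
    return umax_3(a, b, c);
  }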

Added: MOAB/trunk/parallel/sort.c
===================================================================
--- MOAB/trunk/parallel/sort.c	                        (rev 0)
+++ MOAB/trunk/parallel/sort.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,37 @@
+#include <limits.h>
+#include <string.h>
+
+#include "types.h"
+
+typedef uint Index;
+
+#define sort jl_sort
+#define Value uint
+#define Data sort_data
+typedef struct { Value v; Index i; } Data;
+#include "sort_imp.c"
+
+#undef Value
+#undef Data
+
+#ifdef GLOBAL_INT
+#  define Value ulong
+#  define Data sort_data_long
+   typedef struct { Value v; Index i; } Data;
+#  define radix_count         radix_count_long
+#  define radix_offsets       radix_offsets_long
+#  define radix_zeros         radix_zeros_long
+#  define radix_pass          radix_pass_long
+#  define radix_sort          radix_sort_long
+#  define radix_index_pass_b  radix_index_pass_b_long
+#  define radix_index_pass_m  radix_index_pass_m_long
+#  define radix_index_pass_e  radix_index_pass_e_long
+#  define radix_index_pass_be radix_index_pass_be_long
+#  define radix_index_sort    radix_index_sort_long
+#  define merge_sort          merge_sort_long
+#  define merge_index_sort    merge_index_sort_long
+#  define sort                sort_long
+#  define index_sort          index_sort_long
+#  include "sort_imp.c"
+#endif
+

Added: MOAB/trunk/parallel/sort.h
===================================================================
--- MOAB/trunk/parallel/sort.h	                        (rev 0)
+++ MOAB/trunk/parallel/sort.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,59 @@
+#ifndef SORT_H
+#define SORT_H
+
+/* requires "types.h" */
+#ifndef TYPES_H
+#warning "sort.h" requires "types.h"
+#endif
+
+/*------------------------------------------------------------------------------
+  
+  Sort
+  
+  O(n) stable sort with good performance for all n
+
+  A, n, stride : specifies the input
+  
+  sort:
+     Value out[n] : the sorted values (output)
+     Value work[n]: scratch area
+  
+  index_sort:
+     uint idx[n]   : the sorted indices (output)
+     Data work[2*n]: scratch area
+
+  example:
+  
+    uint b[N][M];                      or ulong b...
+    sort_data work[2*N];               or sort_data_long work...
+    uint p[N];
+    ...
+    index_sort(&b[0][key],N,M, p, work);   or index_sort_long(...
+    
+    now the array can be accessed in sorted order as
+       b[p[0]][key]
+       b[p[1]][key]
+       ...
+
+  ----------------------------------------------------------------------------*/
+
+#define sort jl_sort
+void sort(const uint *A, uint n, uint stride, uint *out, uint *work);
+
+typedef struct { uint v; uint i; } sort_data;
+void index_sort(const uint *A, uint n, uint stride,
+                uint *idx, sort_data *work);
+
+#ifdef GLOBAL_INT
+  void sort_long(const ulong *A, uint n, uint stride, ulong *out, ulong *work);
+  typedef struct { ulong v; uint i; } sort_data_long;
+  void index_sort_long(const ulong *A, uint n, uint stride,
+                       uint *idx, sort_data_long *work);
+#else
+#  define sort_long       sort
+#  define sort_data_long  sort_data
+#  define index_sort_long index_sort
+#endif
+
+#endif
+
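A concrete version of the example in the comment above, under the default types.h configuration where uint is plain unsigned int (the names below are illustrative only):

  #include <stdio.h>
  #include "types.h"
  #include "sort.h"

  enum { N = 4, M = 3 };

  void print_sorted_by(uint b[N][M], unsigned key)
  {
    uint p[N];             /* sorted indices (output)         */
    sort_data work[2*N];   /* scratch area, as required above */
    uint i;
    index_sort(&b[0][key], N, M, p, work);
    for(i=0;i<N;++i)       /* rows visited in increasing b[..][key] order */
      printf("%u %u %u\n", b[p[i]][0], b[p[i]][1], b[p[i]][2]);
  }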

Added: MOAB/trunk/parallel/sort_imp.c
===================================================================
--- MOAB/trunk/parallel/sort_imp.c	                        (rev 0)
+++ MOAB/trunk/parallel/sort_imp.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,346 @@
+
+/* this file may be included multiple times by sort.c,
+   once for each integer size being sorted;
+   
+   look in sort.c for some controlling macro definitions,
+   like Value, Index, Data, and function names */
+
+#ifdef Value
+
+/*------------------------------------------------------------------------------
+  
+  Radix Sort
+  
+  stable; O(n+k) time and extra storage
+    where k = (digits in an int) * 2^(bits per digit)
+    (e.g. k = 4 * 256 = 1024 for 32-bit ints with 8-bit digits)
+
+  brief description:
+    input sorted stably on each digit, starting with the least significant
+    counting sort is used for each digit:
+      a pass through the input counts the occurrences of each digit value
+      on a second pass, each input has a known destination
+  
+  tricks:
+    all counting passes are combined into one
+    the counting pass also computes the inclusive bit-wise or of all inputs,
+      which is used to skip digit positions for which all inputs have zeros
+
+  ----------------------------------------------------------------------------*/
+
+#define DIGIT_BITS   8
+#define DIGIT_VALUES (1<<DIGIT_BITS)
+#define DIGIT_MASK   ((Value)(DIGIT_VALUES-1))
+#define CEILDIV(a,b) (((a)+(b)-1)/(b))
+#define DIGITS       CEILDIV(CHAR_BIT*sizeof(Value),DIGIT_BITS)
+#define VALUE_BITS   (DIGIT_BITS*DIGITS)
+#define COUNT_SIZE   (DIGITS*DIGIT_VALUES)
+
+/* used to unroll a tiny loop: */
+#define COUNT_DIGIT_01(n,i) \
+    if(n>i) count[i][val&DIGIT_MASK]++, val>>=DIGIT_BITS
+#define COUNT_DIGIT_02(n,i) COUNT_DIGIT_01(n,i); COUNT_DIGIT_01(n,i+ 1)
+#define COUNT_DIGIT_04(n,i) COUNT_DIGIT_02(n,i); COUNT_DIGIT_02(n,i+ 2)
+#define COUNT_DIGIT_08(n,i) COUNT_DIGIT_04(n,i); COUNT_DIGIT_04(n,i+ 4)
+#define COUNT_DIGIT_16(n,i) COUNT_DIGIT_08(n,i); COUNT_DIGIT_08(n,i+ 8)
+#define COUNT_DIGIT_32(n,i) COUNT_DIGIT_16(n,i); COUNT_DIGIT_16(n,i+16)
+#define COUNT_DIGIT_64(n,i) COUNT_DIGIT_32(n,i); COUNT_DIGIT_32(n,i+32)
+
+static Value radix_count(const Value *A, const Value *end, Index stride,
+                         Index count[DIGITS][DIGIT_VALUES])
+{
+  Value bitorkey = 0;
+  memset(count,0,COUNT_SIZE*sizeof(Index));
+  do {
+    Value val=*A;
+    bitorkey|=val;
+    COUNT_DIGIT_64(DIGITS,0);
+    /* above macro expands to:
+    if(DIGITS> 0) count[ 0][val&DIGIT_MASK]++, val>>=DIGIT_BITS;
+    if(DIGITS> 1) count[ 1][val&DIGIT_MASK]++, val>>=DIGIT_BITS;
+      ...
+    if(DIGITS>63) count[63][val&DIGIT_MASK]++, val>>=DIGIT_BITS;
+    */
+  } while(A+=stride,A!=end);
+  return bitorkey;
+}
+
+#undef COUNT_DIGIT_01
+#undef COUNT_DIGIT_02
+#undef COUNT_DIGIT_04
+#undef COUNT_DIGIT_08
+#undef COUNT_DIGIT_16
+#undef COUNT_DIGIT_32
+#undef COUNT_DIGIT_64
+
+static void radix_offsets(Index *c)
+{
+  Index sum=0, t, *ce=c+DIGIT_VALUES;
+  do t=*c, *c++ = sum, sum+=t; while(c!=ce);
+}
+
+static unsigned radix_zeros(Value bitorkey, Index count[DIGITS][DIGIT_VALUES],
+                            unsigned *shift, Index **offsets)
+{
+  unsigned digits=0, sh=0; Index *c = &count[0][0];
+  do {
+    if(bitorkey&DIGIT_MASK) *shift++ = sh, *offsets++ = c, ++digits,
+                            radix_offsets(c);
+  } while(bitorkey>>=DIGIT_BITS,sh+=DIGIT_BITS,c+=DIGIT_VALUES,sh!=VALUE_BITS);
+  return digits;
+}
+
+static void radix_pass(const Value *A, const Value *end, Index stride,
+                       unsigned sh, Index *off, Value *out)
+{
+  do out[off[(*A>>sh)&DIGIT_MASK]++] = *A; while(A+=stride,A!=end);
+}
+
+static void radix_sort(const Value *A, Index n, Index stride,
+                       Value *out, Value *work)
+{
+  Index count[DIGITS][DIGIT_VALUES];
+  const Value *end = A+n*stride;
+  Value bitorkey = radix_count(A, end, stride, count);
+  unsigned shift[DIGITS]; Index *offsets[DIGITS];
+  unsigned digits = radix_zeros(bitorkey,count,shift,offsets);
+  if(digits==0) {
+    memset(out,0,sizeof(Value)*n);
+  } else {
+    Value *src, *dst; unsigned d;
+    if((digits&1)==1) src=out,dst=work;
+                 else dst=out,src=work;
+    radix_pass(A,end,stride,shift[0],offsets[0],src);
+    for(d=1;d!=digits;++d) {
+      Value *t;
+      radix_pass(src,src+n,1,shift[d],offsets[d],dst);
+      t=src,src=dst,dst=t;
+    }
+  }
+}
+
+static void radix_index_pass_b(const Value *A, Index n, Index stride,
+                               unsigned sh, Index *off, Data *out)
+{
+  Index i=0;
+  do {
+    Value v = *A;
+    Data *d = &out[off[(v>>sh)&DIGIT_MASK]++];
+    d->v=v, d->i=i++;
+  } while(A+=stride,i!=n);
+}
+
+static void radix_index_pass_m(const Data *src, const Data *end,
+                               unsigned sh, Index *off, Data *out)
+{
+  do {
+    Data *d = &out[off[(src->v>>sh)&DIGIT_MASK]++];
+    d->v=src->v,d->i=src->i;
+  } while(++src!=end);
+}
+
+static void radix_index_pass_e(const Data *src, const Data *end,
+                               unsigned sh, Index *off,
+                               Index *out)
+{
+  do out[off[(src->v>>sh)&DIGIT_MASK]++]=src->i; while(++src!=end);
+}
+
+static void radix_index_pass_be(const Value *A, Index n, Index stride,
+                                unsigned sh, Index *off, Index *out)
+{
+  Index i=0;
+  do out[off[(*A>>sh)&DIGIT_MASK]++]=i++; while(A+=stride,i!=n);
+}
+
+static void radix_index_sort(const Value *A, Index n, Index stride,
+                             Index *idx, Data *work)
+{
+  Index count[DIGITS][DIGIT_VALUES];
+  Value bitorkey = radix_count(A, A+n*stride, stride, count);
+  unsigned shift[DIGITS]; Index *offsets[DIGITS];
+  unsigned digits = radix_zeros(bitorkey,count,shift,offsets);
+  if(digits==0) {
+    Index i=0; do *idx++=i++; while(i!=n);
+  } else if(digits==1) {
+    radix_index_pass_be(A,n,stride,shift[0],offsets[0],idx);
+  } else {
+    Data *src, *dst; unsigned d;
+    if((digits&1)==0) dst=work,src=dst+n;
+                 else src=work,dst=src+n;
+    radix_index_pass_b(A,n,stride,shift[0],offsets[0],src);
+    for(d=1;d!=digits-1;++d) {
+      Data *t;
+      radix_index_pass_m(src,src+n,shift[d],offsets[d],dst);
+      t=src,src=dst,dst=t;
+    }
+    radix_index_pass_e(src,src+n,shift[d],offsets[d],idx);
+  }
+}
+
+/*------------------------------------------------------------------------------
+  
+  Merge Sort
+  
+  stable; O(n log n) time
+
+  ----------------------------------------------------------------------------*/
+
+static void merge_sort(const Value *A, Index n, Index stride,
+                       Value *out, Value *work)
+{
+  Value *const buf[2]={out,work};
+  Index base=-n, odd=0, c=0, b=1;
+  for(;;) {
+    Value *p;
+    if((c&1)==0) {
+      base+=n, n+=(odd&1), c|=1, b^=1;
+      while(n>3) odd<<=1,odd|=(n&1),n>>=1,c<<=1,b^=1;
+    } else
+      base-=n-(odd&1),n<<=1,n-=(odd&1),odd>>=1,c>>=1;
+    if(c==0) break;
+    p = buf[b]+base;
+    if(n==2) {
+      Value v[2]; v[0]=*A,A+=stride,v[1]=*A,A+=stride;
+      if(v[1]<v[0]) p[0]=v[1],p[1]=v[0];
+               else p[0]=v[0],p[1]=v[1];
+    } else if(n==3) {
+      Value v[3]; v[0]=*A,A+=stride,v[1]=*A,A+=stride,v[2]=*A,A+=stride;
+      if(v[1]<v[0]) {
+        if(v[2]<v[1])        p[0]=v[2],p[1]=v[1],p[2]=v[0];
+        else { if(v[2]<v[0]) p[0]=v[1],p[1]=v[2],p[2]=v[0];
+                        else p[0]=v[1],p[1]=v[0],p[2]=v[2]; }
+      } else {
+        if(v[2]<v[0])        p[0]=v[2],p[1]=v[0],p[2]=v[1];
+        else { if(v[2]<v[1]) p[0]=v[0],p[1]=v[2],p[2]=v[1];
+                        else p[0]=v[0],p[1]=v[1],p[2]=v[2]; }
+      }
+    } else {
+      const Index na = n>>1, nb = (n+1)>>1;
+      const Value *ap = buf[b^1]+base, *ae = ap+na;
+      Value *bp = p+na, *be = bp+nb;
+      for(;;) {
+        if(*bp<*ap) {
+          *p++=*bp++;
+          if(bp!=be) continue;
+          do *p++=*ap++; while(ap!=ae);
+          break;
+        } else {
+          *p++=*ap++;
+          if(ap==ae) break;
+        }
+      }
+    }
+  }
+}
+
+static void merge_index_sort(const Value *A, const Index An, Index stride,
+                             Index *idx, Data *work)
+{
+  Data *const buf[2]={work+An,work};
+  Index n=An, base=-n, odd=0, c=0, b=1;
+  Index i=0;
+  for(;;) {
+    Data *p;
+    if((c&1)==0) {
+      base+=n, n+=(odd&1), c|=1, b^=1;
+      while(n>3) odd<<=1,odd|=(n&1),n>>=1,c<<=1,b^=1;
+    } else
+      base-=n-(odd&1),n<<=1,n-=(odd&1),odd>>=1,c>>=1;
+    if(c==0) break;
+    p = buf[b]+base;
+    if(n==2) {
+      Value v[2]; v[0]=*A,A+=stride,v[1]=*A,A+=stride;
+      if(v[1]<v[0]) p[0].v=v[1],p[0].i=i+1, p[1].v=v[0],p[1].i=i  ;
+               else p[0].v=v[0],p[0].i=i  , p[1].v=v[1],p[1].i=i+1;
+      i+=2;
+    } else if(n==3) {
+      Value v[3]; v[0]=*A,A+=stride,v[1]=*A,A+=stride,v[2]=*A,A+=stride;
+      if(v[1]<v[0]) {
+        if(v[2]<v[1])        p[0].v=v[2],p[1].v=v[1],p[2].v=v[0],
+                             p[0].i=i+2 ,p[1].i=i+1 ,p[2].i=i   ;
+        else { if(v[2]<v[0]) p[0].v=v[1],p[1].v=v[2],p[2].v=v[0],
+                             p[0].i=i+1 ,p[1].i=i+2 ,p[2].i=i   ;
+                        else p[0].v=v[1],p[1].v=v[0],p[2].v=v[2],
+                             p[0].i=i+1 ,p[1].i=i   ,p[2].i=i+2 ; }
+      } else {
+        if(v[2]<v[0])        p[0].v=v[2],p[1].v=v[0],p[2].v=v[1],
+                             p[0].i=i+2 ,p[1].i=i   ,p[2].i=i+1 ;
+        else { if(v[2]<v[1]) p[0].v=v[0],p[1].v=v[2],p[2].v=v[1],
+                             p[0].i=i   ,p[1].i=i+2 ,p[2].i=i+1 ;
+                        else p[0].v=v[0],p[1].v=v[1],p[2].v=v[2],
+                             p[0].i=i   ,p[1].i=i+1 ,p[2].i=i+2 ; }
+      }
+      i+=3;
+    } else {
+      const Index na = n>>1, nb = (n+1)>>1;
+      const Data *ap = buf[b^1]+base, *ae = ap+na;
+      Data *bp = p+na, *be = bp+nb;
+      for(;;) {
+        if(bp->v<ap->v) {
+          *p++=*bp++;
+          if(bp!=be) continue;
+          do *p++=*ap++; while(ap!=ae);
+          break;
+        } else {
+          *p++=*ap++;
+          if(ap==ae) break;
+        }
+      }
+    }
+  }
+  {
+    const Data *p = buf[0], *pe = p+An;
+    do *idx++ = (p++)->i; while(p!=pe);
+  }
+}
+
+/*------------------------------------------------------------------------------
+  
+  Hybrid Stable Sort
+  
+  low-overhead merge sort when n is small,
+  otherwise asymptotically superior radix sort
+
+  result = O(n) sort with good performance for all n
+  
+  A, n, stride : specifies the input
+  
+  sort:
+     Value out[n] : the sorted values (output)
+     Value work[n]: scratch area
+  
+  index_sort:
+     Index idx[n]  : the sorted indices (output)
+     Data work[2*n]: scratch area
+
+  ----------------------------------------------------------------------------*/
+
+void sort(const Value *A, Index n, Index stride, Value *out, Value *work)
+{
+  if(n<DIGIT_VALUES) {
+    if(n==0) return;
+    if(n==1) *out = *A;
+    else     merge_sort(A,n,stride,out,work);
+  } else     radix_sort(A,n,stride,out,work);
+}
+
+void index_sort(const Value *A, Index n, Index stride,
+                Index *idx, Data *work)
+{
+  if(n<DIGIT_VALUES) {
+    if(n==0) return;
+    if(n==1) *idx=0;
+    else     merge_index_sort(A,n,stride,idx,work);
+  } else     radix_index_sort(A,n,stride,idx,work);
+}
+
+#undef DIGIT_BITS
+#undef DIGIT_VALUES
+#undef DIGIT_MASK
+#undef CEILDIV
+#undef DIGITS
+#undef VALUE_BITS
+#undef COUNT_SIZE
+
+#endif
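For readers unfamiliar with the technique, the counting-sort-per-digit idea described at the top of this file looks roughly like the standalone sketch below (fixed 32-bit keys, 8-bit digits, none of the unrolling or zero-digit-skipping tricks; an illustration, not the code above):

  #include <stddef.h>
  #include <stdint.h>

  /* one stable counting-sort pass on the 8-bit digit at bit offset sh */
  static void digit_pass(const uint32_t *in, uint32_t *out, size_t n, unsigned sh)
  {
    size_t count[256] = {0}, off[256], i, sum = 0;
    for(i=0;i<n;++i) ++count[(in[i]>>sh)&0xff];            /* count digit values   */
    for(i=0;i<256;++i) off[i]=sum, sum+=count[i];          /* exclusive prefix sum */
    for(i=0;i<n;++i) out[off[(in[i]>>sh)&0xff]++] = in[i]; /* scatter, stably      */
  }

  /* LSD radix sort: four stable passes, least significant digit first */
  void radix_sort_u32(uint32_t *a, uint32_t *tmp, size_t n)
  {
    uint32_t *src = a, *dst = tmp, *t;
    unsigned sh;
    if(n==0) return;
    for(sh=0;sh<32;sh+=8) {
      digit_pass(src, dst, n, sh);
      t = src, src = dst, dst = t;
    }
    /* an even number of passes leaves the sorted data back in a */
  }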

Added: MOAB/trunk/parallel/transfer.c
===================================================================
--- MOAB/trunk/parallel/transfer.c	                        (rev 0)
+++ MOAB/trunk/parallel/transfer.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,95 @@
+#ifdef USE_MPI
+
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <math.h>
+#include <mpi.h>
+#include "errmem.h"
+#include "types.h"
+#include "minmax.h"
+#include "sort.h"
+#include "tuple_list.h"
+#include "crystal.h"
+
+#define UINT_PER_X(X) ((sizeof(X)+sizeof(uint)-1)/sizeof(uint))
+#define UINT_PER_REAL UINT_PER_X(real)
+#define UINT_PER_LONG UINT_PER_X(slong)
+
+/*------------------------------------------------------------------------------
+  
+  Transfer
+ 
+  Treats one integer (not long) member of the tuple list as a target proc;
+  Sends out tuples accordingly, using the crystal router.
+  Target proc member overwritten with source proc.
+  
+  dynamic: non-zero if the tuple list should grow to accommodate arrivals
+  tl:      the tuple list
+  pf:      which tuple member specifies target proc
+  crystal: an initialized crystal router structure (cf. crystal.h)
+
+  ----------------------------------------------------------------------------*/
+
+void gs_transfer(int dynamic, tuple_list *tl,
+                 unsigned pf, crystal_data *crystal)
+{
+  const unsigned mi=tl->mi,ml=tl->ml,mr=tl->mr;
+  const unsigned tsize = (mi-1) + ml*UINT_PER_LONG + mr*UINT_PER_REAL;
+  sint p, lp = -1;
+  sint *ri; slong *rl; real *rr;
+  uint i, j, *buf, *len=0, *buf_end;
+
+  /* sort to group by target proc */
+  tuple_list_sort(tl,pf,&crystal->all->buf);
+
+  /* pack into buffer for crystal router */
+  buffer_reserve(&crystal->all->buf,(tl->n*(3+tsize))*sizeof(uint));
+  crystal->all->n=0, buf = crystal->all->buf.ptr;
+  ri=tl->vi,rl=tl->vl,rr=tl->vr;
+  for(i=tl->n;i;--i) {
+    p = ri[pf];
+    if(p!=lp) {
+      lp = p;
+      *buf++ = p;           /* target */
+      *buf++ = crystal->id; /* source */
+      len = buf++; *len=0;  /* length */
+      crystal->all->n += 3;
+    }
+    for(j=0;j<mi;++j,++ri) if(j!=pf) *buf++ = *ri;
+    for(j=ml;j;--j,++rl)
+      memcpy(buf,rl,sizeof(slong)), buf+=UINT_PER_LONG;
+    for(j=mr;j;--j,++rr)
+      memcpy(buf,rr,sizeof(real )), buf+=UINT_PER_REAL;
+    *len += tsize, crystal->all->n += tsize;
+  }
+  
+  crystal_router(crystal);
+  
+  /* unpack */
+  buf = crystal->all->buf.ptr, buf_end = buf + crystal->all->n;
+  tl->n = 0;
+  ri=tl->vi,rl=tl->vl,rr=tl->vr;
+  while(buf != buf_end) {
+    sint p, len;
+    buf++;        /* target ( == this proc ) */
+    p = *buf++;   /* source */
+    len = *buf++; /* length */
+    while(len>0) {
+      if(tl->n==tl->max) {
+        if(!dynamic) { tl->n = tl->max + 1; return; }
+        tuple_list_grow(tl);
+        ri = tl->vi + mi*tl->n, rl = tl->vl + ml*tl->n, rr = tl->vr + mr*tl->n;
+      }
+      ++tl->n;
+      for(j=0;j<mi;++j) if(j!=pf) *ri++ = *buf++; else *ri++ = p;
+      for(j=ml;j;--j) memcpy(rl++,buf,sizeof(slong)), buf+=UINT_PER_LONG;
+      for(j=mr;j;--j) memcpy(rr++,buf,sizeof(real )), buf+=UINT_PER_REAL;
+      len-=tsize;
+    }
+  }
+}
+
+#endif
+

Added: MOAB/trunk/parallel/transfer.h
===================================================================
--- MOAB/trunk/parallel/transfer.h	                        (rev 0)
+++ MOAB/trunk/parallel/transfer.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,30 @@
+#ifndef TRANSFER_H
+#define TRANSFER_H
+
+#ifdef USE_MPI
+
+#if !defined(TUPLE_LIST_H) || !defined(CRYSTAL_H)
+#warning "transfer.h" requires "tuple_list.h" and "crystal.h"
+#endif
+
+/*------------------------------------------------------------------------------
+  
+  Transfer
+ 
+  Treats one integer (not long) member of the tuple list as a target proc;
+  Sends out tuples accordingly, using the crystal router.
+  Target proc member overwritten with source proc.
+  
+  dynamic: non-zero if the tuple list should grow to accommodate arrivals
+  tl:      the tuple list
+  pf:      which tuple member specifies target proc
+  crystal: an initialized crystal router structure (cf. crystal.h)
+
+  ----------------------------------------------------------------------------*/
+void gs_transfer(int dynamic, tuple_list* tl,
+                 unsigned pf, crystal_data *crystal);
+
+#endif
+
+#endif
+
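A minimal sketch of the call pattern described above, with USE_MPI defined. The crystal_init/crystal_free calls are assumed to be the crystal.h entry points taking a crystal_data pointer and an MPI communicator; the function below is purely illustrative:

  #include <mpi.h>
  #include "errmem.h"
  #include "types.h"
  #include "crystal.h"
  #include "tuple_list.h"
  #include "transfer.h"

  /* send one (dest, payload) tuple to every rank, including ourselves */
  void scatter_payload(uint payload, MPI_Comm comm)
  {
    crystal_data crystal;
    tuple_list tl;
    int size, i;
    MPI_Comm_size(comm, &size);
    crystal_init(&crystal, comm);                   /* assumed crystal.h interface */
    tuple_list_init_max(&tl, 2, 0, 0, (uint)size);  /* mi=2: [dest, payload]       */
    for(i=0;i<size;++i) {
      tl.vi[2*tl.n  ] = i;                          /* integer slot 0: target proc */
      tl.vi[2*tl.n+1] = payload;                    /* integer slot 1: the data    */
      ++tl.n;
    }
    gs_transfer(1, &tl, 0, &crystal);               /* pf=0; on return slot 0 holds */
                                                    /* the source proc instead      */
    tuple_list_free(&tl);
    crystal_free(&crystal);                         /* assumed crystal.h interface */
  }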

Added: MOAB/trunk/parallel/tuple_list.c
===================================================================
--- MOAB/trunk/parallel/tuple_list.c	                        (rev 0)
+++ MOAB/trunk/parallel/tuple_list.c	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,58 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <math.h>
+#include "errmem.h"
+#include "types.h"
+#include "minmax.h"
+#include "sort.h"
+
+typedef struct {
+  unsigned mi,ml,mr;
+  uint n, max;
+  sint *vi; slong *vl; real *vr;
+} tuple_list;
+
+void tuple_list_permute(tuple_list *tl, uint *perm, void *work)
+{
+  const unsigned mi=tl->mi, ml=tl->ml, mr=tl->mr;
+  const unsigned int_size  = mi*sizeof(sint),
+                 long_size = ml*sizeof(slong),
+                 real_size = mr*sizeof(real);
+  if(mi) {
+    uint *p=perm, *pe=p+tl->n; char *sorted=work;
+    while(p!=pe) memcpy(sorted,&tl->vi[mi*(*p++)],int_size),sorted+=int_size;
+    memcpy(tl->vi,work,int_size*tl->n);
+  }
+  if(ml) {
+    uint *p=perm, *pe=p+tl->n; char *sorted=work;
+    while(p!=pe) memcpy(sorted,&tl->vl[ml*(*p++)],long_size),sorted+=long_size;
+    memcpy(tl->vl,work,long_size*tl->n);
+  }
+  if(mr) {
+    uint *p=perm, *pe=p+tl->n; char *sorted=work;
+    while(p!=pe) memcpy(sorted,&tl->vr[mr*(*p++)],real_size),sorted+=real_size;
+    memcpy(tl->vr,work,real_size*tl->n);
+  }
+}
+
+void tuple_list_sort(tuple_list *tl, unsigned key, buffer *buf)
+{
+  const unsigned mi=tl->mi, ml=tl->ml, mr=tl->mr;
+  const unsigned int_size =  mi*sizeof(sint);
+  const unsigned long_size = ml*sizeof(slong);
+  const unsigned real_size = mr*sizeof(real);
+  const unsigned width = umax_3(int_size,long_size,real_size);
+  const unsigned data_size = key>=mi ? sizeof(sort_data_long):sizeof(sort_data);
+  uint work_min=tl->n * umax_2(2*data_size,sizeof(sint)+width);
+  uint *work;
+  buffer_reserve(buf,work_min);
+  work = buf->ptr;
+  if(key<mi)
+    index_sort     ((uint *)&tl->vi[key   ],tl->n,mi, work, (void*)work);
+  else
+    index_sort_long((ulong*)&tl->vl[key-mi],tl->n,ml, work, (void*)work);
+  tuple_list_permute(tl,work,work+tl->n);
+}
+

Added: MOAB/trunk/parallel/tuple_list.h
===================================================================
--- MOAB/trunk/parallel/tuple_list.h	                        (rev 0)
+++ MOAB/trunk/parallel/tuple_list.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,66 @@
+/*------------------------------------------------------------------------------
+  
+  Tuple list definition and utilities
+  
+  Conceptually, a tuple list is a list of n records or tuples,
+  each with mi integers, ml longs, and mr reals
+  (these types are defined in "types.h" as sint, slong, real;
+   it may be that sint==slong)
+  
+  There are three arrays, one for each type (vi,vl,vr),
+  with records laid out contiguously within each array
+
+  ----------------------------------------------------------------------------*/
+
+#ifndef TUPLE_LIST_H
+#define TUPLE_LIST_H
+
+/* requires "errmem.h" and "types.h" */
+#if !defined(ERRMEM_H) || !defined(TYPES_H)
+#warning "tuple_list.h" requires "errmem.h" and "types.h"
+#endif
+
+typedef struct {
+  unsigned mi,ml,mr;
+  uint n, max;
+  sint *vi; slong *vl; real *vr;
+} tuple_list;
+
+/* storage laid out as: vi[max][mi], vl[max][ml], vr[max][mr]
+   where "tuple" i is given by (vi[i][0:mi-1],vl[i][0:ml-1],vr[i][0:mr-1]).
+   only the first n tuples are in use */
+
+static void tuple_list_init_max(tuple_list *tl,
+  unsigned mi, unsigned ml, unsigned mr, uint max)
+{
+  tl->n=0; tl->max=max;
+  tl->mi=mi,tl->ml=ml,tl->mr=mr;
+  tl->vi=tmalloc(sint, max*mi);
+  tl->vl=tmalloc(slong,max*ml);
+  tl->vr=tmalloc(real, max*mr);
+}
+
+static void tuple_list_free(tuple_list *tl) {
+  free(tl->vi), free(tl->vl), free(tl->vr);
+}
+
+static void tuple_list_resize(tuple_list *tl, uint max)
+{
+  tl->max = max;
+  tl->vi=trealloc(sint, tl->vi,tl->max*tl->mi);
+  tl->vl=trealloc(slong,tl->vl,tl->max*tl->ml);
+  tl->vr=trealloc(real, tl->vr,tl->max*tl->mr);
+}
+
+static void tuple_list_grow(tuple_list *tl)
+{
+  tuple_list_resize(tl,tl->max+tl->max/2+1);
+}
+
+void tuple_list_permute(tuple_list *tl, uint *perm, void *work);
+/* sort tuples by the field specified by key<mi+ml;
+   entries in vi[:][key] (or vl[:][key-mi]) assumed nonnegative */
+void tuple_list_sort(tuple_list *tl, unsigned key, buffer *buf);
+
+#endif
+
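A small sketch of filling and sorting a tuple list. The buffer type and the buffer_init/buffer_free helpers are assumed to come from errmem.h (tuple_list_sort grows the buffer itself via buffer_reserve), so treat those two calls as assumptions:

  #include "errmem.h"
  #include "types.h"
  #include "tuple_list.h"

  /* two integers per tuple, no longs or reals; sort on the first integer */
  void tuple_list_demo(void)
  {
    tuple_list tl;
    buffer buf;
    tuple_list_init_max(&tl, 2, 0, 0, 8);
    buffer_init(&buf, 1024);               /* assumed errmem.h helper */
    tl.vi[0] = 3, tl.vi[1] = 30, ++tl.n;   /* tuple 0: key 3          */
    tl.vi[2] = 1, tl.vi[3] = 10, ++tl.n;   /* tuple 1: key 1          */
    tuple_list_sort(&tl, 0, &buf);         /* keys now in order 1, 3  */
    buffer_free(&buf);                     /* assumed errmem.h helper */
    tuple_list_free(&tl);
  }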

Added: MOAB/trunk/parallel/types.h
===================================================================
--- MOAB/trunk/parallel/types.h	                        (rev 0)
+++ MOAB/trunk/parallel/types.h	2007-10-03 20:28:42 UTC (rev 1297)
@@ -0,0 +1,74 @@
+#ifndef TYPES_H
+#define TYPES_H
+
+/* integer type to use for everything */
+#if   defined(USE_LONG)
+#  define INTEGER long
+#elif defined(USE_LONG_LONG)
+#  define INTEGER long long
+#elif defined(USE_SHORT)
+#  define INTEGER short
+#else
+#  define INTEGER int
+#endif
+
+/* when defined, use the given type for global indices instead of INTEGER */
+#if   defined(USE_GLOBAL_LONG_LONG)
+#  define GLOBAL_INT long long
+#elif defined(USE_GLOBAL_LONG)
+#  define GLOBAL_INT long
+#endif
+
+/* floating point type to use for everything */
+#if   defined(USE_FLOAT)
+   typedef float real;
+#  define floorr floorf
+#  define ceilr  ceilf
+#  define sqrtr  sqrtf
+#  define fabsr  fabsf
+#  define cosr   cosf
+#  define sinr   sinf
+#  define EPS   (128*FLT_EPSILON)
+#  define PI 3.1415926535897932384626433832795028841971693993751058209749445923F
+#elif defined(USE_LONG_DOUBLE)
+   typedef long double real;
+#  define floorr floorl
+#  define ceilr  ceill
+#  define sqrtr  sqrtl
+#  define fabsr  fabsl
+#  define cosr   cosl
+#  define sinr   sinl
+#  define EPS   (128*LDBL_EPSILON)
+#  define PI 3.1415926535897932384626433832795028841971693993751058209749445923L
+#else
+   typedef double real;
+#  define floorr floor
+#  define ceilr  ceil
+#  define sqrtr  sqrt
+#  define fabsr  fabs
+#  define cosr   cos
+#  define sinr   sin
+#  define EPS   (128*DBL_EPSILON)
+#  define PI 3.1415926535897932384626433832795028841971693993751058209749445923
+#endif
+
+/* apparently uint and ulong can be defined already in standard headers */
+#define uint uint_
+#define ulong ulong_
+#define sint sint_
+#define slong slong_
+
+typedef   signed INTEGER sint;
+typedef unsigned INTEGER uint;
+#undef INTEGER
+
+#ifdef GLOBAL_INT
+  typedef   signed GLOBAL_INT slong;
+  typedef unsigned GLOBAL_INT ulong;
+#else
+  typedef sint slong;
+  typedef uint ulong;
+#endif
+
+#endif
+
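With none of the USE_* switches defined, the defaults above make sint/uint plain (unsigned) int, leave slong/ulong as aliases of those, and make real a double; a trivial check:

  #include <stdio.h>
  #include "types.h"

  int main(void)
  {
    /* default build: expect sizeof(int), sizeof(int), sizeof(double) */
    printf("%zu %zu %zu\n", sizeof(uint), sizeof(ulong), sizeof(real));
    return 0;
  }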

Modified: MOAB/trunk/test/h5file/Makefile.am
===================================================================
--- MOAB/trunk/test/h5file/Makefile.am	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/test/h5file/Makefile.am	2007-10-03 20:28:42 UTC (rev 1297)
@@ -3,6 +3,7 @@
 check_PROGRAMS = h5test
 if PARALLEL_HDF5
   check_PROGRAMS += parallel
+  INCLUDES += -I$(top_srcdir)/parallel
 endif
 h5test_SOURCES = h5file_test.cpp
 h5test_LDADD = $(top_builddir)/libMOAB.la

Modified: MOAB/trunk/test/h5file/parallel.cpp
===================================================================
--- MOAB/trunk/test/h5file/parallel.cpp	2007-09-27 14:18:44 UTC (rev 1296)
+++ MOAB/trunk/test/h5file/parallel.cpp	2007-10-03 20:28:42 UTC (rev 1297)
@@ -9,6 +9,7 @@
 #include "MBTagConventions.hpp"
 #include "MBParallelConventions.h"
 #include "WriteHDF5Parallel.hpp"
+#include "FileOptions.hpp"
 
 #include "testdir.h"
 
@@ -400,7 +401,9 @@
   WriteHDF5Parallel *writer = new WriteHDF5Parallel( iFace );
   
   printerror ("Writing parallel file: \"%s\"", fnames[0] );
-  rval = writer->write_file( fnames[0], true, &list[0], list.size(), qa );
+  rval = writer->write_file( fnames[0], true, 
+                             FileOptions("PARALLEL"),
+                             &list[0], list.size(), qa );
   if (MB_SUCCESS != rval)
   {
     printerror( "Failed to write parallel file: \"%s\"", fnames[0] );



