[MOAB-dev] exchanging sets of entities between processors in parallel

Iulian Grindeanu iulian at mcs.anl.gov
Mon Apr 16 10:45:35 CDT 2012


Hello,

Thank you for the detailed report, and sorry for the silence; my email wasn't working. I am using ParMETIS 3.1 in my build, so it may take a while to reproduce your error. I will try compiling your code and let you know. Still, what kind of capability do you need? Is mbpart not enough for you, or do you want to do the partitioning in the same run?

Thanks,
Iulian

----- Original Message -----
| By clearing the ranges
| ParallelComm::SharedEnts
| ParallelComm::interfaceSets
| and calling the function ParallelComm::reset_all_buffers,
| I've gotten past the resolve_shared_ents function and now
| get an error in exchange_ghost_cells:
| Invalid entity handle: Hex 550
| Couldn't get pstatus tag.
| cant get_sharing_data
| Trouble setting remote data range on sent entities in ghost exchange.
| Failed to unpack remote handles.
| or
| Found bad entities in check_local_shared, proc rank 5,
| ...
| Reason: Vertex proc set not same size as entity proc set.
| Probably there are some other stored ranges that should be cleared?
| On Apr 14, 2012, at 6:35 AM, Iulian Grindeanu wrote:
| | Hello
| | ----- Original Message -----
| | | Hello,
| | | I'm trying to implement distribution of entities between multiple
| | | processors in parallel, based on ParMETIS graph partitioning.
| | | I've used MOAB 4.5.0 and 4.6.0pre.
| | Can you try current dev version?
| I've downloaded moab from
| http://ftp.mcs.anl.gov/pub/fathom/moab-nightly.tar.gz which appeared
| to be 4.6.0pre.
| Is it the current dev version?
this is the current dev version
| | | I've tested different approaches for exchanging entities, which are:
| | | 1. exchange_owned_meshs - hangs in the while(incoming2) loop
| | | 2. send_entities / recv_entities - works if the number of
| | | processors is 2, otherwise hangs
| | | 3. pack_buffer / unpack_buffer
| | | The 3rd one works if I don't set the "store_remote_handles" flag,
| | | because otherwise it stops on the following assertion while
| | | unpacking new entities:
| | | Assertion failed: (new_ents[ind] == from_vec[i]), function
| | | get_remote_handles, file ParallelComm.cpp, line 1728.
| | Can you send test files for us to be able to reproduce the error?
| I attach source file test.cpp which may be compiled by
| mpicxx test.cpp -lmoab -lnetcdf -lparmetis -lmetis
| and run for example by
| mpirun -n 10 ./a.out mesh1.mhdf 0
| The last number may be 0, 1, or 2:
| 0 - load the file on one processor
| 1 - load an already distributed file (for example, by mbpart)
| 2 - load the file on every processor, then determine its own part of
| the mesh by processor rank and element global IDs.
| It is written for ParMETIS 4.0.2.
| It may be compiled with:
| -DUSE_EXCHANGE_OWNED_MESHES - to use
| exchange_owned_meshs for entities distribution
| -DUSE_SENDRECV - to use send_entities / recv_entities
| -DSTORE_REMOTE_HANDLES - set store_remote_handles
| flag in pack_buffer / unpack_buffer and exchange
| remote handles between processors
| -DDEBUG_OUTPUT - to let program output information about distribution
| I attach two files with mesh:
| mesh1.mhdf - serial file with mesh
| mesh.mhdf - same mesh, but distributed for 4 processors by mbpart
| | | So if I don't use store_remote_handles, I obtain the final
| | | distribution.
| | | It works well if I load the mesh on one processor and distribute
| | | it across many processors.
| | | But if I load an already distributed mesh, the program fails in
| | | resolve_shared_ents with the following errors:
| | | Invalid entity handle: Vertex 21
| | | Failed to get pstatus flag.
| | | Couldn't filter on owner.
| | | get_sent_ents failed.
| | | Trouble resolving shared entity remote handles.
| | | cannot resolve ents
| | How do you load? What options do you use? How is the mesh
| | partitioned?
| When I load a mesh already distributed by mbpart, I use
| "PARALLEL=READ_PART;
| PARTITION=PARALLEL_PARTITION;
| PARTITION_BY_RANK;
| PARALLEL_COMM=xxx;", where xxx is the id of the communicator.
| Otherwise I load the file without any options, on one processor or
| on all processors; in the second case I erase part of the mesh,
| and then in both cases I create a set with the PARALLEL_PARTITION tag.
| | | Which is strange, because after the distribution I set, for all
| | | entities of all dimensions, zeros for pstatus_tag, zeros for
| | | sharedh_tag, and -1 for sharedp_tag.
| | | Could you help me with that problem?
-------------- next part --------------
Non-text attachments were scrubbed:
- mesh.mhdf (application/octet-stream, 177388 bytes):
  <http://lists.mcs.anl.gov/pipermail/moab-dev/attachments/20120416/cdfb0527/attachment-0003.obj>
- mesh1.mhdf (application/octet-stream, 174188 bytes):
  <http://lists.mcs.anl.gov/pipermail/moab-dev/attachments/20120416/cdfb0527/attachment-0004.obj>
- test.cpp (application/octet-stream, 71515 bytes):
  <http://lists.mcs.anl.gov/pipermail/moab-dev/attachments/20120416/cdfb0527/attachment-0005.obj>

