[MOAB-dev] ghost/halo elements

Ryan O'Kuinghttons ryan.okuinghttons at noaa.gov
Tue Oct 30 14:44:35 CDT 2018


Hi Iulian, yep, good catch, sorry about that. Those are typos that came 
over in the process of switching from the 4-proc to the 2-proc test case.

Is it possible for moab to use the ESMF global ids to resolve shared 
entities? We create a tag called gid_tag to store the ids of the nodes 
and elements as they are created, and the ids should match across 
processes by construction.

Ryan

On 10/30/18 13:12, Grindeanu, Iulian R. wrote:
> Hi Ryan,
> I was able to build and run your code; there are some issues with your model.
> Please see the corrected code.
> It still crashes in the same spot; I need to figure out why.
> One problem is that the global ids have to match between processes, otherwise moab cannot resolve shared entities. The resolution is not based on position; it is based only on the global id, a unique integer across tasks.
>
> There are other methods to resolve based on position (that involve a merge parallel method)
>
>
> Thanks,
> Iulian
>
> ________________________________________
> From: Ryan O'Kuinghttons <ryan.okuinghttons at noaa.gov>
> Sent: Tuesday, October 30, 2018 12:32:27 PM
> To: Grindeanu, Iulian R.; Vijay S. Mahadevan
> Cc: moab-dev at mcs.anl.gov
> Subject: Re: [MOAB-dev] ghost/halo elements
>
> Hi Iulian,
>
> First, I pushed this code with a reproducer to a remote branch. You can
> clone esmf here: (read only)
>
> git clone git://git.code.sf.net/p/esmf/esmf esmf
>
> and the branch is called moab4.
>
> Building ESMF for this case should only require MPI and a Fortran/C
> compiler. Set the following environment variables and type make at the
> top-level esmf directory:
>
> ESMF_DIR=<path to top level esmf dir>
>
> ESMF_COMPILER=gfortran
>
> ESMF_COMM=mpich3
>
> ESMF_BOPT=g
>
> If the build goes awry, please send me the output of 'make info'. Then
> once that is built, the reproducer is here:
>
> esmf/src/Infrastructure/Mesh/tests/ESMC_MeshMOABGhostUTest.C
>
> You can build it by typing make in the tests directory, and run it with:
>
> mpirun -n 2 $ESMF_DIR/test/*/*/ESMC_MeshMOABGhostUTest
>
> -----------------------------------------------------------------------------------
>
> As to your questions:
>
> - Yes, the elem_coords tag is what stores the coordinates of an ESMF
> MBMesh element; this is what we need to access on the ghost elements.
>
> - It is actually the elem_coords that are not exchanged in this case.
> The vertex coordinates are in a different tag (node_coords), and those
> go through exchange_tags just fine.
>
> - I don't know what you mean by 'instanced', but before the ghost
> exchange is called the mesh is valid, and the vtk file is created
> correctly. After the ghost exchange most, but not all, of the shared
> cells have 0 values for all tags which have not been exchanged yet.
> Then, during the tag exchange, we get the segv when trying to operate
> on the elem_coords tag.
>
> - I will need to check further to see if we are handling adjacencies in
> the way that you describe. I will do that now.
>
> FYI, I have an appointment now, I will bring my laptop and keep looking
> at this as I am able, should be back online in about an hour.
>
> Ryan
>
> On 10/30/18 10:25, Grindeanu, Iulian R. wrote:
>> Hi Ryan,
>> So you are using this to exchange actual coordinates?
>> So the vertex coordinates of the nodes that are not "exchanged" yet are 0?
>> When the mesh is instanced on each task, is it a valid "mesh", or not yet? I understand that some coordinates could be 0 (not defined?), but the rest of the vtk file should be "valid".
>> One issue is that we do use adjacency information to build up the ghosting layer. Adjacency should work fine even if the coordinates of the nodes are wrong, as long as the connectivity information is correct; for the adjacency logic to work, the vertex adjacencies have to be defined/set correctly.
>> If you create an element/a cell somewhere in the middle, make sure that the "adjacency" information is updated for the affected vertices.
>>
>> for example, if you use
>> ReadUtilIface::get_element_connect to create elements, you need to also call ReadUtilIface::update_adjacencies
>>
>> What entities are you passing to tag exchange, are they including cells, edges, and/or vertices ?
>> Is this tag defined on vertices and cells too, or just on vertices? Does it make sense to be created on cells? Maybe the center coordinate of cell?
>>
>> Iulian
>>
>> ________________________________________
>> From: Ryan O'Kuinghttons <ryan.okuinghttons at noaa.gov>
>> Sent: Tuesday, October 30, 2018 9:27:33 AM
>> To: Grindeanu, Iulian R.; Vijay S. Mahadevan
>> Cc: moab-dev at mcs.anl.gov
>> Subject: Re: [MOAB-dev] ghost/halo elements
>>
>> Hi Iulian,
>>
>> I'll answer your questions one at a time:
>>
>> - sdim is 2 in this test; it is the simple 2D mesh in the ascii drawing in
>> the previous mail.
>>
>> - Yes, elem_coords is a tag with 2 doubles per entity
>>
>> - Yes, merr is checked after both tag creation and tag set, both are fine
>>
>> - The MB_ALREADY_ALLOCATED comes from the moab error code returned after
>> exchange_tags, but only when the elem_coords tag is in the vector of
>> tags to exchange.
>>
>> - I have written out the vtk files already, the tags that are not
>> exchanged (elem_coords) come back with value of 0, so the mesh has a
>> bunch of connections that go back to (0,0)
>>
>> - elemCoords+mbmp->sdim*e points to the coordinates of the e'th element in the
>> elemCoords array of coordinate values. This code is widely tested in a
>> variety of scenarios and passes everywhere. Additionally, the
>> elem_coords tag is created and set as part of the original mesh creation
>> in my test code, and writing out the mesh before the ghost_exchange
>> shows a healthy looking mesh.
>>
>> I'm a bit hesitant to ask you to look at the esmf moab code directly
>> because it requires building esmf, which can be a hassle. However, if
>> you think that is the best way to go, I will send instructions on where
>> to get the code and how to build.
>>
>> Ryan
>>
>>
>> On 10/29/18 19:54, Grindeanu, Iulian R. wrote:
>>> Hi Ryan,
>>> Yes, nothing jumps out right away; it would be easier if I could debug this on my computer.
>>> What is sdim? Is it 2? So is this a tag with 2 doubles per entity?
>>> Did you check merr after tag creation?
>>> If you use MB_TAG_EXCL, it means the tag should not have been created before.
>>>
>>> When did you get MB_ALREADY_ALLOCATED? After you called tag exchange? I did not see it in the output; maybe cerr is redirected?
>>>
>>> You can try to write out the file (in serial), from each task; to check that ghosting did what you expect:
>>> something like this:
>>>
>>>       std::ostringstream ent_str;
>>>       ent_str << "mesh." <<pcomm->rank() << ".vtk";
>>>       moab_mesh->write_mesh(ent_str.str().c_str());
>>>
>>>
>>> What is elemCoords+mbmp->sdim*e in your code? I assume you are using some pointer arithmetic; are all those variables OK?
>>> Is elemCoords a double pointer, and e an index?
>>>
>>> Iulian
>>>
>>> ________________________________________
>>> From: Ryan O'Kuinghttons <ryan.okuinghttons at noaa.gov>
>>> Sent: Monday, October 29, 2018 4:57:43 PM
>>> To: Grindeanu, Iulian R.; Vijay S. Mahadevan
>>> Cc: moab-dev at mcs.anl.gov
>>> Subject: Re: [MOAB-dev] ghost/halo elements
>>>
>>> Hi Iulian, I know this isn't a clean or easy problem, I really
>>> appreciate you looking into it with me.
>>>
>>> The tag is created like this:
>>>
>>> merr=moab_mesh->tag_get_handle("elem_coords", mbmp->sdim,
>>> MB_TYPE_DOUBLE, mbmp->elem_coords_tag, MB_TAG_EXCL|MB_TAG_DENSE,
>>> &dbl_def_val);
>>>
>>> and it is set like this:
>>>
>>> merr=moab_mesh->tag_set_data(mbmp->elem_coords_tag, &new_elem, 1,
>>> elemCoords+mbmp->sdim*e);
>>>
>>>
>>> This is a small test mesh, distributed like this:
>>>
>>>       //
>>>       //   3.0   13 ------ 14 ------ 15     [15] ----------- 16
>>>       //         |         |         |       |               |
>>>       //         |         |         |       |               |
>>>       //         |    8    |    9    |       |       10      |
>>>       //         |         |         |       |               |
>>>       //         |         |         |       |               |
>>>       //   2.0  [9] ----- [10] ---- [11]    [11] ---------- [12]
>>>       //
>>>       //       1.0       1.5       2.0     2.0             3.0
>>>       //
>>>       //                PET 2                      PET 3
>>>       //
>>>       //
>>>       //   2.0   9 ------- 10 ------ 11     [11] ----------- 12
>>>       //         |         |         |       |               |
>>>       //         |    5    |    6    |       |       7       |
>>>       //         |         |         |       |               |
>>>       //   1.5   5 ------- 6 ------- 7      [7] -----------  8
>>>       //         |         |         |       |               |
>>>       //         |    1    |    2    |       |       4       |
>>>       //         |         |         |       |               |
>>>       //   1.0   1 ------- 2 ------- 3      [3] ------------ 4
>>>       //
>>>       //         1.0       1.5     2.0      2.0             3.0
>>>       //
>>>       //                PET 0                      PET 1
>>>       //
>>>       //               Node Id labels at corners
>>>       //              Element Id labels in centers
>>>
>>> It would be difficult to create a small reproducer for you, given that
>>> I'm working on a development branch, but I could probably send you a
>>> tarball if we get to that. Likewise, it would be some work to create a
>>> similar test on 2 processors, but not out of the question.
>>>
>>> Is there anything that comes to mind with the above information that I
>>> could look into first? Is it possible that this is duplicating a default
>>> tag in moab (given the message MB_ALREADY_ALLOCATED)?
>>>
>>>
>>> On 10/29/18 15:31, Grindeanu, Iulian R. wrote:
>>>> Sorry, I still don't know what is wrong.
>>>> What kind of tag is it? Is it fixed length? Does it have a default value?
>>>> How was this tag created/set?
>>>>
>>>> The sizes of the buffers sent seem to be small; is this a small test mesh? How is it partitioned?
>>>> Can you replicate the bug in a way I can run on my machine? I do have an old version of esmf, and I assume this is on newer code. Or can you replicate on a smaller number of tasks (maybe 2)?
>>>>
>>>> Iulian
>>>>
>>>> ________________________________________
>>>> From: Ryan O'Kuinghttons <ryan.okuinghttons at noaa.gov>
>>>> Sent: Monday, October 29, 2018 4:10:16 PM
>>>> To: Grindeanu, Iulian R.; Vijay S. Mahadevan
>>>> Cc: moab-dev at mcs.anl.gov
>>>> Subject: Re: [MOAB-dev] ghost/halo elements
>>>>
>>>>        0  ParallelComm(0.03 s) Entering exchange_tags
>>>>        3  ParallelComm(0.03 s) Entering exchange_tags
>>>>        1  ParallelComm(0.03 s) Entering exchange_tags
>>>>        2  ParallelComm(0.03 s) Entering exchange_tags
>>>>        3  ParallelComm(0.03 s) Irecv, 1<-3, buffer ptr = 0x13a5a20, tag=7,
>>>> size=1024, incoming=1
>>>>        1  ParallelComm(0.03 s) Irecv, 0<-1, buffer ptr = 0x1115b30, tag=7,
>>>> size=1024, incoming=1
>>>>        0  ParallelComm(0.03 s) Irecv, 1<-0, buffer ptr = 0x11cc7e0, tag=7,
>>>> size=1024, incoming=1
>>>>        1  ParallelComm(0.03 s) Irecv, 2<-1, buffer ptr = 0x17a4330, tag=7,
>>>> size=1024, incoming=2
>>>>        0  ParallelComm(0.03 s) Irecv, 2<-0, buffer ptr = 0x11cdf60, tag=7,
>>>> size=1024, incoming=2
>>>>        0  ParallelComm(0.03 s) Irecv, 3<-0, buffer ptr = 0xb4b9e0, tag=7,
>>>> size=1024, incoming=3
>>>>        3  ParallelComm(0.03 s) Irecv, 2<-3, buffer ptr = 0x13a6360, tag=7,
>>>> size=1024, incoming=2
>>>>        3  ParallelComm(0.03 s) Irecv, 0<-3, buffer ptr = 0xd23290, tag=7,
>>>> size=1024, incoming=3
>>>>        1  ParallelComm(0.03 s) Irecv, 3<-1, buffer ptr = 0x11135a0, tag=7,
>>>> size=1024, incoming=3
>>>>        2  ParallelComm(0.03 s) Irecv, 0<-2, buffer ptr = 0x16505f0, tag=7,
>>>> size=1024, incoming=1
>>>>        2  ParallelComm(0.03 s) Irecv, 1<-2, buffer ptr = 0x2362770, tag=7,
>>>> size=1024, incoming=2
>>>>        2  ParallelComm(0.03 s) Irecv, 3<-2, buffer ptr = 0x1651720, tag=7,
>>>> size=1024, incoming=3
>>>>        1  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        0  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        3  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        2  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        1  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 1->0,
>>>> buffer ptr = 0x17a5500, tag=7, size=83
>>>>        0  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 0->1,
>>>> buffer ptr = 0x11ceee0, tag=7, size=83
>>>>        3  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 3->1,
>>>> buffer ptr = 0x1a27200, tag=7, size=83
>>>>        2  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 2->0,
>>>> buffer ptr = 0x2363980, tag=7, size=107
>>>>        0  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        3  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        1  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        2  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        0  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 0->2,
>>>> buffer ptr = 0xb4a0d0, tag=7, size=83
>>>>        3  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 3->2,
>>>> buffer ptr = 0xd22a20, tag=7, size=131
>>>>        1  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 1->2,
>>>> buffer ptr = 0x1114230, tag=7, size=107
>>>>        2  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 2->1,
>>>> buffer ptr = 0x1652330, tag=7, size=107
>>>>        0  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        1  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        3  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        1  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 1->3,
>>>> buffer ptr = 0x11163b0, tag=7, size=107
>>>>        0  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 0->3,
>>>> buffer ptr = 0xb4bdf0, tag=7, size=59
>>>>        3  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 3->0,
>>>> buffer ptr = 0xd22610, tag=7, size=59
>>>>        0  ParallelComm(0.03 s) Unpacking tag elem_coords
>>>>        1  ParallelComm(0.03 s) Unpacking tag elem_coords
>>>>        2  ParallelComm(0.03 s) Packing tag "elem_coords"(0.03 s)
>>>>        3  ParallelComm(0.03 s) Unpacking tag elem_coords
>>>>        2  ParallelComm(0.03 s) Done packing tags.(0.03 s) Isend, 2->3,
>>>> buffer ptr = 0x1652bd0, tag=7, size=107
>>>>        2  ParallelComm(0.03 s) Unpacking tag elem_coords
>>>> [3]MOAB ERROR: --------------------- Error Message
>>>> ------------------------------------
>>>> [0]MOAB ERROR: --------------------- Error Message
>>>> ------------------------------------
>>>> [0]MOAB ERROR: Failed to recv-unpack-tag message!
>>>> [1]MOAB ERROR: --------------------- Error Message
>>>> ------------------------------------
>>>> [1]MOAB ERROR: Failed to recv-unpack-tag message!
>>>> [3]MOAB ERROR: Failed to recv-unpack-tag message!
>>>> [3]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>> [0]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>> [1]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>> [2]MOAB ERROR: --------------------- Error Message
>>>> ------------------------------------
>>>> [2]MOAB ERROR: Failed to recv-unpack-tag message!
>>>> [2]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>> terminate called after throwing an instance of 'int'
>>>> terminate called after throwing an instance of 'int'
>>>> terminate called after throwing an instance of 'int'
>>>> terminate called after throwing an instance of 'int'
>>>>
>>>> Program received signal SIGABRT: Process abort signal.
>>>>
>>>> On 10/29/18 15:02, Grindeanu, Iulian R. wrote:
>>>>> Hi Ryan,
>>>>> I don't know what could be wrong; can you set the verbosity level in your code?
>>>>> I mean, can you use something like
>>>>>
>>>>> pcomm->set_debug_verbosity(4)
>>>>>
>>>>> before calling exchange tags? And send me the output?
>>>>>
>>>>> I don't think MB_ALREADY_ALLOCATED is a problem in this case; I could be wrong
>>>>>
>>>>> Iulian
>>>>> ________________________________________
>>>>> From: Ryan O'Kuinghttons <ryan.okuinghttons at noaa.gov>
>>>>> Sent: Monday, October 29, 2018 3:44:01 PM
>>>>> To: Grindeanu, Iulian R.; Vijay S. Mahadevan
>>>>> Cc: moab-dev at mcs.anl.gov
>>>>> Subject: Re: [MOAB-dev] ghost/halo elements
>>>>>
>>>>> Hi Iulian, thanks for responding :)
>>>>>
>>>>> So I did get the exchange_tags call to work for all but one of our tags.
>>>>> However, when I add the last tag to the vector that is passed into
>>>>> exchange_tags I get the following segfault:
>>>>>
>>>>> [3]MOAB ERROR: --------------------- Error Message
>>>>> ------------------------------------
>>>>> [3]MOAB ERROR: Failed to recv-unpack-tag message!
>>>>> [3]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>>> terminate called after throwing an instance of 'int'
>>>>>
>>>>> Program received signal SIGABRT: Process abort signal.
>>>>>
>>>>> Backtrace for this error:
>>>>> [0]MOAB ERROR: --------------------- Error Message
>>>>> ------------------------------------
>>>>> [0]MOAB ERROR: Failed to recv-unpack-tag message!
>>>>> [1]MOAB ERROR: --------------------- Error Message
>>>>> ------------------------------------
>>>>> [2]MOAB ERROR: --------------------- Error Message
>>>>> ------------------------------------
>>>>> [2]MOAB ERROR: Failed to recv-unpack-tag message!
>>>>> [2]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>>> [0]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>>> [1]MOAB ERROR: Failed to recv-unpack-tag message!
>>>>> [1]MOAB ERROR: exchange_tags() line 7414 in ParallelComm.cpp
>>>>> terminate called after throwing an instance of 'int'
>>>>> terminate called after throwing an instance of 'int'
>>>>> terminate called after throwing an instance of 'int'
>>>>>
>>>>>
>>>>>
>>>>> Do you have any ideas what I could be doing wrong?
>>>>>
>>>>> Ryan
>>>>>
>>>>> On 10/29/18 12:52, Grindeanu, Iulian R. wrote:
>>>>>> Hi Ryan,
>>>>>> Sorry for missing the messages, my mistake;
>>>>>> If you have your tags defined on your entities, you will need to call the "exchange_tags" method, after your ghost method;
>>>>>> (call this one:
>>>>>> ErrorCode ParallelComm::exchange_tags(const std::vector<Tag> &src_tags,
>>>>>>                                       const std::vector<Tag> &dst_tags,
>>>>>>                                       const Range &entities_in)
>>>>>> usually, source and dest tags are the same lists)
>>>>>>
>>>>>>
>>>>>> the augment method is for "membership" in specific moab sets (material, boundary conditions, partition);
>>>>>>
>>>>>> Entities (cells, usually) do not have a tag defined on them that would signal they are part of those sets, which is why we had to transfer that information in a different way.
>>>>>>
>>>>>> What exactly is your use case? Do you need to transfer information about a set or about a tag?
>>>>>>
>>>>>> Iulian
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Vijay S. Mahadevan <vijay.m at gmail.com>
>>>>>> Sent: Monday, October 29, 2018 1:05:53 PM
>>>>>> To: Ryan OKuinghttons - NOAA Affiliate
>>>>>> Cc: Grindeanu, Iulian R.; moab-dev at mcs.anl.gov
>>>>>> Subject: Re: [MOAB-dev] ghost/halo elements
>>>>>>
>>>>>> Hi Ryan,
>>>>>>
>>>>>> The method `augment_default_sets_with_ghosts` [1] in ParallelComm
>>>>>> enables this for some of the default tags. We haven't exposed this as
>>>>>> a very general mechanism as the user could define the tag data layout
>>>>>> in many possible ways and it would be difficult to cover all use cases
>>>>>> from a library standpoint. Remember that for custom defined Tags, you
>>>>>> can also use exchange_tag/reduce_tags in ParallelComm to get the data
>>>>>> that you need on shared entities. If exchange_tags does not suit your
>>>>>> needs, you can see if you can adapt augment_default_sets_with_ghosts
>>>>>> method for your use case. Let me know if that helps.
>>>>>>
>>>>>> Vijay
>>>>>>
>>>>>> [1] http://ftp.mcs.anl.gov/pub/fathom/moab-docs-develop/classmoab_1_1ParallelComm.html#a76dced0b6dbe2fb99340945c7ca3c0f2
>>>>>> On Mon, Oct 29, 2018 at 1:49 PM Ryan O'Kuinghttons
>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>> Hi Iulian, Vijay,
>>>>>>>
>>>>>>> We have a set of custom tags that we use in ESMF, would it be possible
>>>>>>> to transfer these as well?
>>>>>>>
>>>>>>> Ryan
>>>>>>>
>>>>>>> On 10/29/18 10:54, Vijay S. Mahadevan wrote:
>>>>>>>> The default set of tags do get transferred. These would be the
>>>>>>>> MATERIAL_SET, DIRICHLET/NEUMANN sets etc.
>>>>>>>>
>>>>>>>> Iulian, did we add a separate method to transfer these to the ghosted
>>>>>>>> entities ? I vaguely remember something like that.
>>>>>>>>
>>>>>>>> Vijay
>>>>>>>> On Mon, Oct 29, 2018 at 12:42 PM Ryan O'Kuinghttons
>>>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>>>> Hi Vijay,
>>>>>>>>>
>>>>>>>>> I ran into another issue with the call to exchange_ghost_cells, but this
>>>>>>>>> is probably just a misunderstanding on my part. I find that the tag
>>>>>>>>> information that is part of the mesh before the call does not
>>>>>>>>> transfer to the newly created faces. Is there some way to ask moab to
>>>>>>>>> transfer tags when creating ghost entities? (FWIW, I could not find this
>>>>>>>>> in the example you sent me.) Thanks,
>>>>>>>>>
>>>>>>>>> Ryan
>>>>>>>>>
>>>>>>>>> On 10/26/18 07:36, Ryan O'Kuinghttons wrote:
>>>>>>>>>> Hi Vijay,
>>>>>>>>>>
>>>>>>>>>> I resolved the issue with not getting any shared, owned entities back
>>>>>>>>>> by using the other version of resolve_shared_ents, as demonstrated in
>>>>>>>>>> the example you indicated. I created a Range containing all elements
>>>>>>>>>> (faces for moab), and used that as input.
>>>>>>>>>>
>>>>>>>>>>          Range range_ent;
>>>>>>>>>>          merr = mb->get_entities_by_dimension(0, this->sdim, range_ent);
>>>>>>>>>>          MBMESH_CHECK_ERR(merr, localrc);
>>>>>>>>>>
>>>>>>>>>>          merr = pcomm->resolve_shared_ents(0, range_ent, this->sdim, 1);
>>>>>>>>>>          MBMESH_CHECK_ERR(merr, localrc);
>>>>>>>>>>
>>>>>>>>>> The first version of the interface (which does not use a Range input)
>>>>>>>>>> does not work for me:
>>>>>>>>>>
>>>>>>>>>>          merr = pcomm->resolve_shared_ents(0, this->sdim, 1);
>>>>>>>>>>
>>>>>>>>>> Otherwise, everything else seems to be resolved with this issue,
>>>>>>>>>> thanks again for all of your help!!
>>>>>>>>>>
>>>>>>>>>> Ryan
>>>>>>>>>>
>>>>>>>>>> On 10/25/18 15:13, Vijay S. Mahadevan wrote:
>>>>>>>>>>> Ryan, I don't see anything obviously wrong in the code below. Does
>>>>>>>>>>> your shared_ents not have any entities after parallel resolve? If you
>>>>>>>>>>> can replicate your issue in a small standalone test, we should be able
>>>>>>>>>>> to quickly find out what's going wrong.
>>>>>>>>>>>
>>>>>>>>>>> Look at the test_reduce_tag_explicit_dest routine in
>>>>>>>>>>> test/parallel/parallel_unit_tests.cpp:1617. It has the workflow where
>>>>>>>>>>> you create a mesh in memory, resolve shared entities, get shared
>>>>>>>>>>> entities and even set/reduce tag data.
>>>>>>>>>>>
>>>>>>>>>>> Vijay
>>>>>>>>>>> On Thu, Oct 25, 2018 at 4:25 PM Ryan O'Kuinghttons
>>>>>>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>>>>>>> OK, that allows the resolve_shared_ents routine to complete without
>>>>>>>>>>>> error, but I still don't get any shared, owned entities back from the
>>>>>>>>>>>> get_shared_entities/filter_pstatus calls:
>>>>>>>>>>>>
>>>>>>>>>>>>            Range shared_ents;
>>>>>>>>>>>>            // Get entities shared with all other processors
>>>>>>>>>>>>            merr = pcomm->get_shared_entities(-1, shared_ents);
>>>>>>>>>>>>            MBMESH_CHECK_ERR(merr, localrc);
>>>>>>>>>>>>
>>>>>>>>>>>>            // Filter shared entities with not not_owned, which means owned
>>>>>>>>>>>>            Range owned_entities;
>>>>>>>>>>>>            merr = pcomm->filter_pstatus(shared_ents, PSTATUS_NOT_OWNED,
>>>>>>>>>>>> PSTATUS_NOT, -1, &owned_entities);
>>>>>>>>>>>>            MBMESH_CHECK_ERR(merr, localrc);
>>>>>>>>>>>>
>>>>>>>>>>>>            unsigned int nums[4] = {0}; // to store the owned entities per
>>>>>>>>>>>> dimension
>>>>>>>>>>>>            for (int i = 0; i < 4; i++)
>>>>>>>>>>>>              nums[i] = (int)owned_entities.num_of_dimension(i);
>>>>>>>>>>>>
>>>>>>>>>>>>            vector<int> rbuf(nprocs*4, 0);
>>>>>>>>>>>>            MPI_Gather(nums, 4, MPI_INT, &rbuf[0], 4, MPI_INT, 0, mpi_comm);
>>>>>>>>>>>>            // Print the stats gathered:
>>>>>>>>>>>>            if (0 == rank) {
>>>>>>>>>>>>              for (int i = 0; i < nprocs; i++)
>>>>>>>>>>>>                cout << " Shared, owned entities on proc " << i << ": " <<
>>>>>>>>>>>> rbuf[4*i] << " verts, " <<
>>>>>>>>>>>>                    rbuf[4*i + 1] << " edges, " << rbuf[4*i + 2] << " faces,
>>>>>>>>>>>> " <<
>>>>>>>>>>>> rbuf[4*i + 3] << " elements" << endl;
>>>>>>>>>>>>            }
>>>>>>>>>>>>
>>>>>>>>>>>> OUTPUT:
>>>>>>>>>>>>
>>>>>>>>>>>>           Shared, owned entities on proc 0: 0 verts, 0 edges, 0 faces, 0
>>>>>>>>>>>> elements
>>>>>>>>>>>>           Shared, owned entities on proc 1: 0 verts, 0 edges, 0 faces, 0
>>>>>>>>>>>> elements
>>>>>>>>>>>>           Shared, owned entities on proc 2: 0 verts, 0 edges, 0 faces, 0
>>>>>>>>>>>> elements
>>>>>>>>>>>>           Shared, owned entities on proc 3: 0 verts, 0 edges, 0 faces, 0
>>>>>>>>>>>> elements
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Note: I also tried changing the -1 to 2 in the
>>>>>>>>>>>> get_shared_entities/filter_pstatus calls to only retrieve elements.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/25/18 13:57, Vijay S. Mahadevan wrote:
>>>>>>>>>>>>> No problem. Can you try 1 for the last argument, to look for edges as
>>>>>>>>>>>>> the shared_dim?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vijay
>>>>>>>>>>>>> On Thu, Oct 25, 2018 at 3:41 PM Ryan O'Kuinghttons
>>>>>>>>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>>>>>>>>> OK, thanks again for that explanation; I never would have been
>>>>>>>>>>>>>> able to figure out that this_set was the root set. I apologize for
>>>>>>>>>>>>>> all the dumb questions; hopefully they are all getting me closer
>>>>>>>>>>>>>> to self-sufficiency.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm getting an error that doesn't make sense to me. I am specifying
>>>>>>>>>>>>>> the resolve_dim and shared_dim in the call:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>            // Get the ParallelComm instance
>>>>>>>>>>>>>>            ParallelComm* pcomm = new ParallelComm(mb, mpi_comm);
>>>>>>>>>>>>>>            int nprocs = pcomm->proc_config().proc_size();
>>>>>>>>>>>>>>            int rank = pcomm->proc_config().proc_rank();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>            // get the root set handle
>>>>>>>>>>>>>>            EntityHandle root_set = mb->get_root_set();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>            merr = pcomm->resolve_shared_ents(root_set, 2, -1);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> but I'm getting this message:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [0]MOAB ERROR: Unable to guess shared_dim or resolve_dim!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What am I missing?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10/25/18 12:51, Vijay S. Mahadevan wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is your root set or fileset if you added entities to it. You
>>>>>>>>>>>>>> can get the root set handle from Core object.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Vijay
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Oct 25, 2018, 14:38 Ryan O'Kuinghttons
>>>>>>>>>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>>>>>>>>>> Thanks again, Vijay. However, I still don't understand what this_set
>>>>>>>>>>>>>>> should be; is it an output, maybe?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ryan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10/25/18 12:27, Vijay S. Mahadevan wrote:
>>>>>>>>>>>>>>>> You can try either the first or the second variant instead [1], with
>>>>>>>>>>>>>>>> the following arguments.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> ErrorCode moab::ParallelComm::resolve_shared_ents(
>>>>>>>>>>>>>>>>     EntityHandle this_set,
>>>>>>>>>>>>>>>>     int resolve_dim = 2,
>>>>>>>>>>>>>>>>     int shared_dim = -1,
>>>>>>>>>>>>>>>>     const Tag *id_tag = 0)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That should resolve 2-dim entities with shared edges across
>>>>>>>>>>>>>>>> partitions. You can leave id_tag pointer as zero since the
>>>>>>>>>>>>>>>> default is
>>>>>>>>>>>>>>>> GLOBAL_ID.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This brings up a more general question I've had about the moab
>>>>>>>>>>>>>>>> documentation for a while. In the doc for this routine, it only
>>>>>>>>>>>>>>>> lists 2
>>>>>>>>>>>>>>>> parameters, proc_ents and shared_dim, even though in the function
>>>>>>>>>>>>>>>> signature above it clearly shows more. I've had trouble
>>>>>>>>>>>>>>>> understanding
>>>>>>>>>>>>>>>> which parameters are relevant in the past, or what they do
>>>>>>>>>>>>>>>> because I'm
>>>>>>>>>>>>>>>> not quite sure how to read the documentation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is an oversight. We will go through and rectify some of the
>>>>>>>>>>>>>>>> inconsistencies in the documentation. We are preparing for an
>>>>>>>>>>>>>>>> upcoming
>>>>>>>>>>>>>>>> release and I'll make sure that routines in Core/ParallelComm have
>>>>>>>>>>>>>>>> updated documentation that match the interfaces. Meanwhile, if you
>>>>>>>>>>>>>>>> have questions, feel free to shoot an email to the list.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hope that helps.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Vijay
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>> http://ftp.mcs.anl.gov/pub/fathom/moab-docs-develop/classmoab_1_1ParallelComm.html#a29a3b834b3fc3b4ddb3a5d8a78a37c8a
>>>>>>>>>>>>>>>> On Thu, Oct 25, 2018 at 1:58 PM Ryan O'Kuinghttons
>>>>>>>>>>>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>>>>>>>>>>>> Hi Vijay,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks again for that explanation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> ESMF does use unique global ids for the vertices and sets them
>>>>>>>>>>>>>>>>> to the GLOBAL_ID in the moab mesh. So I think we are good there.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I can't quite figure out how to use resolve_shared_entities
>>>>>>>>>>>>>>>>> though.
>>>>>>>>>>>>>>>>> There are three versions of the call in the documentation, and
>>>>>>>>>>>>>>>>> I assume
>>>>>>>>>>>>>>>>> that I should use the first and pass in a Range containing all
>>>>>>>>>>>>>>>>> entities
>>>>>>>>>>>>>>>>> for proc_ents:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://ftp.mcs.anl.gov/pub/fathom/moab-docs-develop/classmoab_1_1ParallelComm.html#a59e35d9906f2e33fe010138a144a5cb6
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> However, I'm not sure what the this_set EntityHandle should be.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This brings up a more general question I've had about the moab
>>>>>>>>>>>>>>>>> documentation for a while. In the doc for this routine, it only
>>>>>>>>>>>>>>>>> lists 2
>>>>>>>>>>>>>>>>> parameters, proc_ents and shared_dim, even though in the function
>>>>>>>>>>>>>>>>> signature above it clearly shows more. I've had trouble
>>>>>>>>>>>>>>>>> understanding
>>>>>>>>>>>>>>>>> which parameters are relevant in the past, or what they do
>>>>>>>>>>>>>>>>> because I'm
>>>>>>>>>>>>>>>>> not quite sure how to read the documentation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Any example or explanation you give me for
>>>>>>>>>>>>>>>>> resolve_shared_entities is
>>>>>>>>>>>>>>>>> much appreciated!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ryan
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 10/25/18 09:40, Vijay S. Mahadevan wrote:
>>>>>>>>>>>>>>>>>> Hi Ryan,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Glad the example helped.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The example shows the ghost exchange happening between shared
>>>>>>>>>>>>>>>>>>> owned
>>>>>>>>>>>>>>>>>> entities. In my experiments I've been unable to make the ghost
>>>>>>>>>>>>>>>>>> exchange
>>>>>>>>>>>>>>>>>> work, and I think that might be because my entities are not
>>>>>>>>>>>>>>>>>> shared.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You need to call resolve_shared_entities on the entities prior to
>>>>>>>>>>>>>>>>>> doing ghost exchange. When the mesh is loaded from a file, this
>>>>>>>>>>>>>>>>>> happens
>>>>>>>>>>>>>>>>>> automatically based on the read options, but if you are forming
>>>>>>>>>>>>>>>>>> a mesh
>>>>>>>>>>>>>>>>>> in memory you need to make sure that the shared vertices carry
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> same GLOBAL_ID numbering consistently across processes; that is,
>>>>>>>>>>>>>>>>>> each shared
>>>>>>>>>>>>>>>>>> vertex has one unique global id on every process that sees it.
>>>>>>>>>>>>>>>>>> Once that is set, shared entity
>>>>>>>>>>>>>>>>>> resolution will
>>>>>>>>>>>>>>>>>> work correctly out of the box, and the shared
>>>>>>>>>>>>>>>>>> edges/entities
>>>>>>>>>>>>>>>>>> queries will work correctly as well. A call to get ghosted
>>>>>>>>>>>>>>>>>> layers once
>>>>>>>>>>>>>>>>>> this is done would be the way to go.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I assume ESMF has a unique global numbering for the vertices?
>>>>>>>>>>>>>>>>>> Use
>>>>>>>>>>>>>>>>>> that to set the GLOBAL_ID tag. Let us know if you are still
>>>>>>>>>>>>>>>>>> having
>>>>>>>>>>>>>>>>>> issues.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Vijay
>>>>>>>>>>>>>>>>>> On Thu, Oct 25, 2018 at 11:17 AM Ryan O'Kuinghttons
>>>>>>>>>>>>>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>>>>>>>>>>>>>> Hi Vijay,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I've had some time to play with this now, and I have another
>>>>>>>>>>>>>>>>>>> question.
>>>>>>>>>>>>>>>>>>> Thank you for sending along the example code, it has been
>>>>>>>>>>>>>>>>>>> extremely
>>>>>>>>>>>>>>>>>>> helpful.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The example shows the ghost exchange happening between shared
>>>>>>>>>>>>>>>>>>> owned
>>>>>>>>>>>>>>>>>>> entities. In my experiments I've been unable to make the
>>>>>>>>>>>>>>>>>>> ghost exchange
>>>>>>>>>>>>>>>>>>> work, and I think that might be because my entities are not
>>>>>>>>>>>>>>>>>>> shared. The
>>>>>>>>>>>>>>>>>>> situation I have is entities that are owned wholly on a single
>>>>>>>>>>>>>>>>>>> processor, which need to be communicated to other processors
>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>> require them as part of a halo region for mesh-based
>>>>>>>>>>>>>>>>>>> computation. In
>>>>>>>>>>>>>>>>>>> this situation, would I need to "share" my entities across the
>>>>>>>>>>>>>>>>>>> whole
>>>>>>>>>>>>>>>>>>> processor space before requesting a ghost exchange? I'm not
>>>>>>>>>>>>>>>>>>> even really
>>>>>>>>>>>>>>>>>>> sure what "shared entities" means; is there a good place to look
>>>>>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>>>>>> documentation to learn more about the terminology? Thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Ryan
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 10/8/18 12:17, Vijay S. Mahadevan wrote:
>>>>>>>>>>>>>>>>>>>> Ryan,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You need to use the ParallelComm object in MOAB to call
>>>>>>>>>>>>>>>>>>>> exchange_ghost_cells [1] with appropriate levels of ghost
>>>>>>>>>>>>>>>>>>>> layers for
>>>>>>>>>>>>>>>>>>>> your mesh. This needs to be done after the mesh is loaded,
>>>>>>>>>>>>>>>>>>>> or you can
>>>>>>>>>>>>>>>>>>>> pass this information also as part of the options when
>>>>>>>>>>>>>>>>>>>> loading a file
>>>>>>>>>>>>>>>>>>>> and MOAB will internally load the file with ghosted layers.
>>>>>>>>>>>>>>>>>>>> Here's an
>>>>>>>>>>>>>>>>>>>> example [2].
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Vijay
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>> http://ftp.mcs.anl.gov/pub/fathom/moab-docs-develop/classmoab_1_1ParallelComm.html#a55dfa308f56fd368319bfb4244428878
>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>> http://ftp.mcs.anl.gov/pub/fathom/moab-docs-develop/HelloParMOAB_8cpp-example.html
>>>>>>>>>>>>>>>>>>>> On Mon, Oct 8, 2018 at 1:23 PM Ryan O'Kuinghttons
>>>>>>>>>>>>>>>>>>>> <ryan.okuinghttons at noaa.gov> wrote:
>>>>>>>>>>>>>>>>>>>>> Hi, I am wondering if there is a way to create ghost
>>>>>>>>>>>>>>>>>>>>> elements in MOAB.
>>>>>>>>>>>>>>>>>>>>> By this I mean a list of copies of the elements surrounding a
>>>>>>>>>>>>>>>>>>>>> specific MOAB
>>>>>>>>>>>>>>>>>>>>> element; ghost elements may exist on a different processor
>>>>>>>>>>>>>>>>>>>>> than the
>>>>>>>>>>>>>>>>>>>>> source element. I see a ghost_elems class in the appData
>>>>>>>>>>>>>>>>>>>>> namespace, but
>>>>>>>>>>>>>>>>>>>>> there is not much documentation on how to use it. Thank you,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> Ryan O'Kuinghttons
>>>>>>>>>>>>>>>>>>>>> Cherokee Nation Management and Consulting
>>>>>>>>>>>>>>>>>>>>> NESII/NOAA/Earth System Research Laboratory
>>>>>>>>>>>>>>>>>>>>> ryan.okuinghttons at noaa.gov
>>>>>>>>>>>>>>>>>>>>> https://www.esrl.noaa.gov/gsd/nesii/
>>>>>>>>>>>>>>>>>>>>>

