itaps-parallel removal of
Carl Ollivier-Gooch
cfog at mech.ubc.ca
Tue Mar 18 01:48:49 CDT 2008
Mark Miller wrote:
> Onkar pointed out to me that he thinks
> the decision to remove iMeshP_getPartsOnRank from the interface
> yesterday makes it impossible for a processor to obtain the part
> handle for any part that is both a) remote and b) not a 'neighbor'.
> If you look at any of the methods that return part handles, I think
> you'll find that they all require that either the associated part is local
> or that it is 'stored with' a 'copy' of an entity.
>
> This is bad for a parallel mesh decomposer, for example, which
> needs to be able to 'assign' entities to 'any' parts. It will need to
> be able to identify parts to 'send' entities to that are neither local
> nor neighbors of the 'initial' decomposition. I suppose one could
> argue that it could achieve this by sending entities on a 'journey'
> through all the part neighbors between the originating and destination
> parts. But, that wouldn't be too efficient.
By "parallel mesh decomposer" I assume you do -not- mean a full-up
partitioner (which will have a list of part handles that come with the
entities to be partitioned). Instead, you're presumably talking about a
scenario qualitatively similar to having a single process read a
parallel mesh and dice it -before- the partitioner gets involved.
In this context, I could argue that the parallel read will have to do
some outside-iMeshP communication so that the reading proc can have all
the part handles.
Or I could argue (and this is the alternative I prefer) that it's
possible to have part handles that:
a) are globally unique;
b) allow identification of what process they live on;
c) can be computed knowing only process rank and local part number; and
d) can still be universally distinguished from pointers.
So here we go:
We know what the number of processors is (MPI communicator size or
equivalent). So we can easily compute the number of bits required to
store the process rank. Call this m.
This leaves n = 30-m bits for the number of parts per process. That 30
is deliberate.
Then the part handle can be an int that looks like this:
  Bit:    31 | 30 ... 31-m    | n ... 1             | 0
  Value:   1 | [process rank] | [part # in process] | 1
That 1 at the end is there because a pointer to a struct (assuming at
least 4-byte alignment) never ends in anything but 00. The 1 at the
beginning disambiguates this from integer
handles (which will surely be positive). And having "only" 30 bits left
over for rank and part # in process isn't that much of a restriction.
Yes, I can imagine a million processes or a million parts on one
process, but not as many as 1000 parts on each of a million processes.
And of course 64 bit handles make questions of size moot.
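To make the layout concrete, here is a minimal sketch in C of packing and unpacking such a handle. The function names (rank_bits, make_part_handle, and so on) are hypothetical and not part of iMeshP; the sketch assumes a 32-bit unsigned handle and at most 2^30 processes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; none of these names exist in iMeshP. */

/* Number of bits m needed to store ranks 0 .. num_procs-1. */
static unsigned rank_bits(uint32_t num_procs)
{
    unsigned m = 0;
    while (m < 30 && (UINT32_C(1) << m) < num_procs)
        ++m;
    return m;
}

/* Pack: bit 31 = 1, then m bits of rank, then n = 30-m bits of
 * local part number, then bit 0 = 1. */
static uint32_t make_part_handle(uint32_t num_procs, uint32_t rank,
                                 uint32_t local_part)
{
    unsigned m = rank_bits(num_procs);
    unsigned n = 30 - m;
    assert((UINT32_C(1) << m) >= num_procs); /* ranks must fit in m bits */
    assert(rank < num_procs);
    assert(local_part < (UINT32_C(1) << n)); /* part # must fit in n bits */
    return (UINT32_C(1) << 31) | (rank << (n + 1)) | (local_part << 1)
           | UINT32_C(1);
}

/* Unpack the process rank (bits 30 .. 31-m). */
static uint32_t part_handle_rank(uint32_t num_procs, uint32_t handle)
{
    unsigned m = rank_bits(num_procs);
    unsigned n = 30 - m;
    return (handle >> (n + 1)) & ((UINT32_C(1) << m) - 1);
}

/* Unpack the local part number (bits n .. 1). */
static uint32_t part_handle_local(uint32_t num_procs, uint32_t handle)
{
    unsigned n = 30 - rank_bits(num_procs);
    return (handle >> 1) & ((UINT32_C(1) << n) - 1);
}

/* A value with this layout has both the top and bottom bits set. */
static int is_part_handle(uint32_t value)
{
    return (value >> 31) == 1 && (value & 1) == 1;
}
```

For example, with 1024 processes (m = 10, n = 20), part 5 on rank 3 packs to 0x8060000B under this sketch.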
Given this approach, one can compute (from the number of parts on a
process and the process rank) the handles of all those parts. This way,
you "only" have to know the number of parts on a process (which should
be very compressible data: most will be the same...) instead of all the
part handles.
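As a rough illustration of that enumeration, the sketch below fills an array with every part handle in the job from nothing more than a per-rank part count. The function name all_part_handles and its parameters are hypothetical, and it assumes the 32-bit layout described earlier in this message, with the caller supplying m as rank_bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of iMeshP.
 * parts_per_rank[r] is the number of parts on rank r; rank_bits is the
 * "m" of the layout. Writes at most out_cap handles into out and returns
 * how many were written. */
static size_t all_part_handles(const uint32_t *parts_per_rank,
                               uint32_t num_procs, unsigned rank_bits,
                               uint32_t *out, size_t out_cap)
{
    assert(rank_bits <= 30);
    unsigned n = 30 - rank_bits; /* bits left for the local part number */
    size_t count = 0;

    for (uint32_t r = 0; r < num_procs; ++r) {
        for (uint32_t p = 0; p < parts_per_rank[r]; ++p) {
            if (count == out_cap)
                return count; /* output buffer is full */
            out[count++] = (UINT32_C(1) << 31) | (r << (n + 1))
                           | (p << 1) | UINT32_C(1);
        }
    }
    return count;
}
```

No communication of handles is needed in this scheme: only the part counts have to be shared, and every process can then reconstruct the same list locally.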
Having said all that, it's possible that this cure is worse than the
disease. Any thoughts on that?
Carl
--
------------------------------------------------------------------------
Dr. Carl Ollivier-Gooch, P.Eng. Voice: +1-604-822-1854
Associate Professor Fax: +1-604-822-2403
Department of Mechanical Engineering email: cfog at mech.ubc.ca
University of British Columbia http://www.mech.ubc.ca/~cfog
Vancouver, BC V6T 1Z4 http://tetra.mech.ubc.ca/ANSLab/
------------------------------------------------------------------------