[cgma-dev] Switching unique IDs to RFC4122 UUIDs

Tue Sep 24 10:43:01 CDT 2013

On Monday, September 23, 2013 03:07:29 PM Tim Tautges wrote:
> [cc'ing cubit-dev, since they will also want to weigh in on this...]
> 
> On 09/20/2013 04:08 PM, David Thompson wrote:
> > Hi all,
> > 
> > I am looking into switching CGMA's IDs from ints (4-8 bytes, signed) to
> > RFC4122-style UUIDs (16 bytes, unsigned) so that cross-references across
> > multiple large models are much less likely to have duplicate IDs.
> While I think this is worth considering (and doing, if it were just CGM),
> I'm not sure it's so critical for file operations.  Large models are indeed
> likely to have conflicting ids with other large files, but those could be
> handled by keeping track of the current file you're reading and the
> entities read by each file (not efficient, I know, but simple to
> implement).
> 
> > One problem I am having is that CGMA performs math (addition/subtraction)
> > on UUIDs in places. For instance, SubEntitySet::renumerate,
> > CAEntityId::actuate, and PartitionEngine::restore_from_attrib all do some
> > kind of math (increment/decrement) on UIDs. This is not supported by any
> > UUID implementation I'm aware of.
> Looking at CAEntityId::actuate, and the GSaveOpen::get_id_inc, it appears
> that this increment is used in place of a file id or indicator (haven't
> checked that to be sure, but I'd be very surprised if that wasn't the
> case), but that's for entity ids, not for unique ids.  SubEntitySet's use
> seems like EntityId, not UID; same goes for
> PartitionEngine::restore_from_attrib.
> 
> > Also, CGMA compares/swaps CAUniqueId and TDUniqueId values, which I
> > thought lived in different "namespaces" (sequential integers for
> > CAUniqueId and random integers for TDUniqueId). Was I wrong to think
> > this? If not, why does CAEntityId add the same id_inc to both its own ID
> > and its merge-partner's (which appears to be obtained in some places
> > using TDUniqueId::find_td_unique_id)?
> 
> Here I think you're wrong, the unique id and entity id are different
> concepts, and the find_td_unique_id in CAEntityId is used strictly with
> UIDs and not with entity ids.
> 
> To summarize, while I think it would be useful to use a standards-compliant
> UUID, in practice I'm guessing it's going to be difficult to motivate the
> Cubit project to change what they do to accomplish this, even if you/we
> implement all the changes in CGM and below.  Cubit guys, please chime in
> here.  It appears to me that GSaveOpen was probably meant to do what a true
> UUID change would, that is, guarantee different UUID spaces in different
> files, alas they solved it a different way.
> 

Hi,

Keeping track of the current file and entities being read is a good approach.  
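
A minimal sketch of that bookkeeping in Python (this is not CGM code; 
ImportTracker, its methods, and the file names are hypothetical, made up 
for illustration).  Each file read gets its own index, and entities are 
looked up by (file index, stored id), so equal ids coming from different 
files never clash:

```python
class ImportTracker:
    """Hypothetical helper: remembers which entities came from which file read."""

    def __init__(self):
        self._files = []      # one entry per file read, in order
        self._entities = {}   # (file_index, stored_id) -> entity

    def begin_file(self, filename):
        """Start reading a file and return the index of this read."""
        self._files.append(filename)
        return len(self._files) - 1

    def register(self, file_index, stored_id, entity):
        """Record an entity under the file read it came from."""
        self._entities[(file_index, stored_id)] = entity

    def lookup(self, file_index, stored_id):
        """Resolve a stored id relative to one particular file read."""
        return self._entities[(file_index, stored_id)]


tracker = ImportTracker()
a = tracker.begin_file("a.cub")   # placeholder file names
b = tracker.begin_file("b.cub")
tracker.register(a, 1, "vertex from a")
tracker.register(b, 1, "vertex from b")   # same stored id, different file
```

As noted above, this is simple rather than efficient: every lookup has to 
carry the file index along with the id.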

Even if you have a better unique id (e.g. RFC 4122), you will still have 
collisions when importing the same file twice.  In that case, GSaveOpen still 
serves a purpose: it helps reference entities local to the file currently 
being imported, which UUIDs cannot help with.
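
To illustrate with a small Python sketch (again not CGM code; the "file" 
contents and import_file are made up, and the plain counter is only a 
stand-in, not a description of what GSaveOpen does internally): ids stored 
in a file are the same on every read, so importing it twice yields the same 
UUIDs twice, and something per-import is still needed to tell the copies 
apart:

```python
import itertools
import uuid

# Pretend file contents: the UUIDs were generated once, when the file was saved.
saved_file = [uuid.uuid4() for _ in range(3)]

# Assumption for illustration: a simple counter identifies each import.
_import_counter = itertools.count()


def import_file(stored_ids):
    """Return (import_index, stored_id) pairs for one read of the file."""
    import_index = next(_import_counter)
    return [(import_index, sid) for sid in stored_ids]


first = import_file(saved_file)
second = import_file(saved_file)

# The stored UUIDs alone collide between the two imports...
uuid_collisions = {sid for _, sid in first} & {sid for _, sid in second}
# ...but the (import_index, UUID) pairs do not.
pair_collisions = set(first) & set(second)
```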

Aside from that, a couple of us have informally talked in the past about 
wanting to use a standard UUID in CGM to reduce collisions.  I think it's a 
good idea.

By the way, it's nice to see Kitware using CGM.  I'm curious what it's being 
used for.

-- 
Clinton Stimpson
Elemental Technologies, Inc
Computational Simulation Software, LLC
www.csimsoft.com