[Mochi-devel] Memory Registration time

Philip Davis philip.davis at sci.utah.edu
Mon Nov 29 11:50:42 CST 2021


Hello,

Working on Frontera, I am seeing some fairly long times for memory registration. A margo_bulk_create for an 8 MiB buffer pretty consistently takes 3.2 ms. In light of recent discoveries regarding huge pages, I disabled THP on Frontera with prctl and saw the registration time increase to 3.8 ms. Anecdotally, pinning memory is a fairly expensive operation, but I’m not sure what to expect. The same call on Summit takes about 500-600 usec. When multiple bulk handle creates are in contention, I see more variance, including some registrations that take only about 1.5 ms, so I assume there’s some amount of caching somewhere. However, I’m running with FI_MR_CACHE_MAX_COUNT=0, since that’s the only way I can make rxm;verbs work for most (all?) recent versions of libfabric.
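
For reference, the kind of measurement described above can be sketched roughly as follows. This is a minimal illustration, not the actual benchmark: disable_thp, now_usec, operation_under_test and time_operation are hypothetical helper names, the 4096-byte alignment is an assumption, and operation_under_test is a stand-in that only touches the buffer, marking the place where the margo_bulk_create call would go. Only prctl(PR_SET_THP_DISABLE, ...) and clock_gettime are real interfaces here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <sys/prctl.h>

#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41 /* value from linux/prctl.h, Linux >= 3.15 */
#endif

/* Ask the kernel not to back this process with transparent huge pages.
 * Returns 0 on success and -1 on failure (for example on older kernels). */
static int disable_thp(void)
{
    return prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);
}

/* Monotonic clock reading in microseconds. */
static double now_usec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e6 + (double)ts.tv_nsec / 1e3;
}

/* Hypothetical placeholder for the operation being measured. In a real
 * run this is where margo_bulk_create would be called on buf; here the
 * buffer is only written to, so the sketch builds without Margo. */
static void operation_under_test(void *buf, size_t len)
{
    memset(buf, 0, len);
}

/* Time a single call of op on a freshly allocated buffer of len bytes.
 * Returns the elapsed time in microseconds, or -1.0 if allocation fails. */
static double time_operation(void (*op)(void *, size_t), size_t len)
{
    void *buf = NULL;
    assert(op != NULL);
    if (posix_memalign(&buf, 4096, len) != 0)
        return -1.0;
    double start = now_usec();
    op(buf, len);
    double elapsed = now_usec() - start;
    free(buf);
    return elapsed;
}
```

Calling disable_thp() once at startup and then time_operation(operation_under_test, (size_t)8 << 20) gives the per-call time for an 8 MiB buffer. Allocating a fresh buffer for every call keeps an earlier registration of the same pages from hiding the cost, although allocators may still reuse the same address range.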

This leads me to a couple of questions. First, is there any bulk handle caching inside Margo? Second, does margo_bulk_create block the calling execution stream in Margo, or is the registration completed asynchronously while the handler yields? If the former, is there a different call I can use to do the latter?

Thanks,
Philip

