<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>Thanks Philip, the "= {0};" initialization of that struct got me
going.</p>
<p>I can run the test case now and it is producing output in the
client and server perf dirs. Just to sanity check what to look
for, I think the problem should be exhibited in the "put_wait" or
maybe "do_put" trace events on the client? For example on my
laptop I see this:</p>
<p>carns-x1-7g ~/w/d/d/m/client.perf [SIGINT]> grep do_put
trace_events.0.json<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352591977.250000,"dur":350.464053,"args":{"GUID":2,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352593065.000000,"dur":36.858053,"args":{"GUID":4,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352593617.000000,"dur":32.954053,"args":{"GUID":6,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352594193.000000,"dur":36.026053,"args":{"GUID":8,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352594850.750000,"dur":34.404053,"args":{"GUID":10,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352595400.750000,"dur":33.524053,"args":{"GUID":12,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352595927.500000,"dur":34.390053,"args":{"GUID":14,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352596416.000000,"dur":37.922053,"args":{"GUID":16,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352596870.000000,"dur":35.506053,"args":{"GUID":18,"Parent
GUID":0}},<br>
{"name":"do_put","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352597287.500000,"dur":34.774053,"args":{"GUID":20,"Parent
GUID":0}},<br>
carns-x1-7g ~/w/d/d/m/client.perf> grep put_wait
trace_events.0.json<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352592427.750000,"dur":570.428053,"args":{"GUID":3,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352593122.750000,"dur":429.156053,"args":{"GUID":5,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352593671.250000,"dur":465.616053,"args":{"GUID":7,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352594248.500000,"dur":547.054053,"args":{"GUID":9,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352594906.750000,"dur":428.964053,"args":{"GUID":11,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352595455.750000,"dur":416.796053,"args":{"GUID":13,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352595981.250000,"dur":371.040053,"args":{"GUID":15,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352596485.500000,"dur":334.758053,"args":{"GUID":17,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352596934.250000,"dur":298.168053,"args":{"GUID":19,"Parent
GUID":0}},<br>
{"name":"put_wait","cat":"CPU","ph":"X","pid":0,"tid":0,"ts":1635872352597342.250000,"dur":389.624053,"args":{"GUID":21,"Parent
GUID":0}},</p>
<p>I should look at the "dur" field, right? What are the units on
that?</p>
<p>I'll see if I can run this on a "real" system shortly.<br>
</p>
<p>thanks!</p>
<p>-Phil<br>
</p>
<div class="moz-cite-prefix">On 11/2/21 12:11 PM, Philip Davis
wrote:<br>
</div>
<blockquote type="cite" cite="mid:104B5045-328B-48EE-A32C-09D25817EED7@rutgers.edu">
Hi Phil,
<div class=""><br class="">
</div>
<div class="">Sorry the data structures are like that; I wanted to
preserve as much of the RPC size and ordering in case it ended
up being important.</div>
<div class=""><br class="">
</div>
<div class="">I’m surprised in.odsc.size is troublesome, as I
set in.odsc.size with the line `in.odsc.size = sizeof(odsc);`.
I’m not sure what could be corrupting that value in the
meantime.</div>
<div class=""><br class="">
</div>
<div class="">I don’t set in.odsc.gdim_size (which was an
oversight, since that’s non-zero normally), so I’m less
surprised that’s an issue. I thought I initialized `in` to zero,
but I see I didn’t do that after all.</div>
<div class=""><br class="">
</div>
<div class="">Maybe change the line `bulk_gdim_t in;` to
`bulk_gdim_t in = {0};`</div>
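<div class=""><br class="">
</div>
<div class="">For illustration, a minimal sketch of the difference
(this bulk_gdim_t layout is a hypothetical stand-in, not the real
header):</div>
<pre>
#include &lt;stdint.h&gt;

/* hypothetical stand-in for the real bulk_gdim_t */
typedef struct {
    struct {
        uint64_t size;      /* serialized by the encoder on forward */
        uint64_t gdim_size; /* also serialized; garbage if never set */
        char     raw[512];
    } odsc;
} bulk_gdim_t;

void example(void)
{
    bulk_gdim_t in1;        /* members are indeterminate: the encoder
                               can read a huge bogus gdim_size */
    bulk_gdim_t in2 = {0};  /* every member zero-initialized, so any
                               field we forget to set is well-defined */
    (void)in1; (void)in2;
}
</pre>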
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Nov 2, 2021, at 11:48 AM, Phil Carns <<a href="mailto:carns@mcs.anl.gov" class="moz-txt-link-freetext" moz-do-not-send="true">carns@mcs.anl.gov</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">
<p class="">Awesome, thanks Philip. It came through
fine.</p>
<p class="">I started by modifying the job script
slightly to just run it on my laptop with sm (I wanted
to make sure I understood the test case, and how to
use apex, before trying elsewhere). Does in.size
need to be set in client.c? For me there is a random
value in that field and it is causing the encoder on
the forward to attempt a very large allocation. The
same might be true of gdim_size if it got past that
step. I started to alter them but then I wasn't sure
what the implications were.<br class="">
</p>
<p class="">(fwiw I needed to include stdlib.h in
common.h, but I've hit that a couple of times recently
on codes that didn't previously generate warnings; I
think something in Ubuntu has gotten strict about that
recently).</p>
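<p class="">Concretely, something like this at the top of common.h
(just the standard header; otherwise malloc/free are used without
prototypes, which newer toolchains treat as an error rather than a
warning):</p>
<pre>
/* common.h */
#include &lt;stdlib.h&gt;  /* malloc/free prototypes; newer gcc/glibc
                          reject implicit function declarations */
</pre>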
<p class="">thanks,</p>
<p class="">-Phil<br class="">
</p>
<p class=""><br class="">
</p>
<div class="moz-cite-prefix">On 11/1/21 4:51 PM, Philip
Davis wrote:<br class="">
</div>
<blockquote type="cite" cite="mid:67D76603-B4B9-48E5-9452-966D1E4866D2@rutgers.edu" class="">
<div class="" style="word-wrap:break-word;
line-break:after-white-space">
Hi Phil,</div>
<div class="" style="word-wrap:break-word;
line-break:after-white-space">
<div class=""><br class="">
</div>
<div class="">I’ve attached the reproducer. I see
the 4th and 8th issue on Frontera, but not Summit.
Hopefully it will build and run without too much
modification. Let me know if there are any issues
with running it (or if the anl listserv eats the
tarball, which I kind of expect).</div>
<div class=""><br class="">
</div>
<div class="">Thanks,</div>
<div class="">Philip</div>
<div class=""><br class="">
<div class=""><br class="">
<blockquote type="cite" class="">
<div class="">On Nov 1, 2021, at 11:14 AM,
Phil Carns <<a href="mailto:carns@mcs.anl.gov" class="moz-txt-link-freetext" moz-do-not-send="true">carns@mcs.anl.gov</a>>
wrote:</div>
<br class="x_Apple-interchange-newline">
<div class="">
<div class="">
<p class="">Hi Philip,</p>
<p class="">(FYI I think the first image
didn't come through in your email, but I
think the others are sufficient to get
across what you are seeing)</p>
<p class="">I don't have any idea what
would cause that. The recently released
libfabric 1.13.2 (available in spack
from the mochi-spack-packages repo)
includes some fixes to the rxm provider
that could be relevant to Frontera and
Summit, but nothing that aligns with
what you are observing.</p>
<p class="">If it were later in the
sequence (much later) I would speculate
that memory allocation/deallocation
cycles were eventually causing a
hiccup. We've seen something like that
in the past, and it's a theory that we
could then test with alternative
allocators like jemalloc. But memory
allocation jitter wouldn't show up that
early in the run.<br class="">
</p>
<p class="">Please do share your
reproducer if you don't mind! We can
try a few systems here to at least
isolate if it is something peculiar to
the InfiniBand path or if there is a
more general problem in Margo.</p>
<p class="">thanks,</p>
<p class="">-Phil<br class="">
</p>
<div class="x_moz-cite-prefix">On 10/29/21
3:20 PM, Philip Davis wrote:<br class="">
</div>
<blockquote type="cite" class="">
<div class="">Hello,</div>
<div class=""><br class="">
</div>
<div class="">I apologize in advance for
the winding nature of this email. I’m
not sure how to ask my question
without explaining the story of my
results some.</div>
<div class=""><br class="">
</div>
I’m doing some characterization of our
server performance under load, and I
have a quirk of performance that I
wanted to run by people to see if it
makes sense. My testing so far has been
to iteratively send batches of RPCs
using margo_iforward, and then measure
the wait time until they are all
complete. On the server side, handling
the RPC includes a margo_bulk_transfer
as a pull initiated on the server to
pull (for now) 8 bytes. The payload of
the RPC request is about 500 bytes, and
the response payload is 4 bytes.
<div class=""><br class="">
</div>
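<div class="">The client-side pattern, roughly (a sketch of the
loop described above, not the actual test-case code; put_id,
server_addr, and the input struct are assumed to be set up
elsewhere):</div>
<pre>
#include &lt;margo.h&gt;

#define BATCH 8

/* sketch: forward a batch of RPCs, then time how long it takes
   for all of them to complete */
static double timed_batch(margo_instance_id mid, hg_id_t put_id,
                          hg_addr_t server_addr, void *in)
{
    hg_handle_t   h[BATCH];
    margo_request req[BATCH];
    double start = ABT_get_wtime();

    for (int i = 0; i &lt; BATCH; i++) {
        margo_create(mid, server_addr, put_id, &h[i]);
        margo_iforward(h[i], in, &req[i]); /* non-blocking forward */
    }
    for (int i = 0; i &lt; BATCH; i++) {
        margo_wait(req[i]);                /* block until complete */
        margo_destroy(h[i]);
    }
    return ABT_get_wtime() - start;        /* batch wait time, seconds */
}
</pre>
<div class=""><br class="">
</div>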
<div class="">I’ve isolated my results
down to one server rank and one client
rank, because it’s an easier starting
point to reason from. Below is a graph
of some of my initial results. These
results are from Frontera. The median
times are good (a single RPC takes on
the order of 10s of microseconds,
which seems fantastic). However, the
outliers are fairly high (note the log
scale of the y-axis). With only one
RPC per timestep, for example, there
is a 30x spread between the median and
the max.</div>
<div class=""><br class="">
</div>
<div class=""><img id="x_126DD6C1-A420-4D70-8D1C-96320B7F54E7" src="cid:5EB5DF63-97AA-48FC-8BBE-E666E933D79F" class="" moz-do-not-send="true"></div>
<div class=""><br class="">
</div>
<div class="">I was hoping (expecting)
the first timestep would be where the
long replies resided, but that turned
out not to be the case. Below are
traces from the 1 RPC (blue) and 2 RPC
(orange) per timestep cases, 5 trials
of 10 timesteps for each case
(normalized to fit the same x-axis):</div>
<div class=""><br class="">
</div>
<div class=""><span id="x_cid:part1.PbJ8iWDr.ROdFhNeX@mcs.anl.gov" class=""><PastedGraphic-6.png></span></div>
<div class=""><br class="">
</div>
<div class="">What strikes me is how
consistent these results are across
trials. For the 1 RPC per timestep
case, the 3rd and 7th timesteps are
consistently slow
(and the rest are fast). For the 2
RPC per timestep case, the 2nd and
4th timesteps are always slow and
sometimes the 10th is. These results
are repeatable with very rare
variation.</div>
<div class=""><span class="" style="font-style:normal"><br class="">
</span></div>
<div class="">For the single RPC case, I
recorded some timers on the server
side, and attempted to overlay them
with the client side (there is some
unknown offset, but probably on the
order of 10s of microseconds at worst,
given the pattern):</div>
<div class=""><br class="">
</div>
<div class=""><span id="x_cid:part2.1KhzfDFv.Yq10fEh8@mcs.anl.gov" class=""><PastedGraphic-7.png></span></div>
<div class=""><br class="">
</div>
<div class="">I blew up the first few
timesteps of one of the trials:</div>
<div class=""><span id="x_cid:part3.B6Byp6pi.aa0hKH2Q@mcs.anl.gov" class=""><PastedGraphic-8.png></span></div>
<div class=""><br class="">
</div>
<div class="">The different colors are
different segments of the handler, but
there doesn’t seem to be anything too
interesting going on inside the
handler. So it looks like the time is
being introduced before the 3rd RPC
handler starts, based on where the
gap appears on the server side.</div>
<div class=""><br class="">
</div>
<div class="">To try and isolate any
dataspaces-specific behavior, I
created a pure Margo test case that
just sends a single RPC of the same
size as dataspaces iteratively, where
the server side does an 8-byte bulk
transfer initiated by the server, and
sends a response. The results are
similar, except that it is now the 4th
and 8th timestep that are slow (and
the first timestep is VERY long,
presumably because rxm communication
state is being established. DataSpaces
has an earlier RPC in its init that
was absorbing this latency).</div>
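<div class=""><br class="">
</div>
<div class="">For reference, the handler in that test case has
roughly this shape (a sketch of what was just described; put_in_t,
put_out_t, and the in.bulk field are stand-in names, not the real
types):</div>
<pre>
#include &lt;margo.h&gt;
#include &lt;stdint.h&gt;

/* sketch: pull 8 bytes from the client, then send a 4-byte response */
static void put_rpc_ult(hg_handle_t handle)
{
    margo_instance_id mid = margo_hg_handle_get_instance(handle);
    const struct hg_info *info = margo_get_info(handle);
    put_in_t  in;   /* ~500-byte request payload (stand-in type) */
    put_out_t out;  /* 4-byte response payload (stand-in type) */
    uint64_t  data; /* 8-byte bulk transfer target */
    void     *buf = &data;
    hg_size_t len = sizeof(data);
    hg_bulk_t local;

    margo_get_input(handle, &in);

    /* server-initiated pull from the client's exposed memory */
    margo_bulk_create(mid, 1, &buf, &len, HG_BULK_WRITE_ONLY, &local);
    margo_bulk_transfer(mid, HG_BULK_PULL, info->addr, in.bulk, 0,
                        local, 0, len);

    out.ret = 0;
    margo_respond(handle, &out);

    margo_bulk_free(local);
    margo_free_input(handle, &in);
    margo_destroy(handle);
}
DEFINE_MARGO_RPC_HANDLER(put_rpc_ult)
</pre>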
<div class=""><br class="">
</div>
<div class="">I got margo profiling
results for this test case:</div>
<div class=""><br class="">
</div>
<div class="">```</div>
<div class="">
<div class="">3</div>
<div class="">18446744025556676964,ofi+verbs;ofi_rxm://192.168.72.245:39573</div>
<div class="">0xa2a1,term_rpc</div>
<div class="">0x27b5,put_rpc</div>
<div class="">0xd320,__shutdown__</div>
<div class="">0x27b5
,0.000206208,10165,18446744027256353016,0,0.041241646,0.000045538,0.025733232,200,18446744073709551615,286331153,0,18446744073709551615,286331153,0</div>
<div class="">0x27b5
,0;0.041241646,200.000000000, 0;</div>
<div class="">0xa2a1
,0.000009298,41633,18446744027256353016,0,0.000009298,0.000009298,0.000009298,1,18446744073709551615,286331153,0,18446744073709551615,286331153,0</div>
<div class="">0xa2a1
,0;0.000009298,1.000000000, 0;</div>
</div>
<div class="">```</div>
<div class=""><br class="">
</div>
<div class="">So I guess my question at
this point is, is there any sensible
reason why the 4th and 8th RPC sent
would have a long response time? I
think I’ve cleared my code on the
client side and server side, so it
appears to be latency being introduced
by Margo, LibFabric, Argobots, or the
underlying OS. I do see long timesteps
occasionally after this (perhaps every
20-30 timesteps) but these are not
consistent.</div>
<div class=""><br class="">
</div>
<div class="">One last detail: this does
not happen on Summit. On Summit, I see
about 5-7x worse single-RPC
performance (250-350 microseconds per
RPC), but without the intermittent
long timesteps.</div>
<div class=""><br class="">
</div>
<div class="">I can provide the minimal
test case if it would be helpful. I am
using APEX for timing results, and the
following dependencies with Spack:</div>
<div class=""><br class="">
</div>
<div class=""><a class="moz-txt-link-freetext
x_moz-txt-link-abbreviated" href="mailto:argobots@1.1" moz-do-not-send="true">argobots@1.1</a>
<a class="moz-txt-link-freetext
x_moz-txt-link-abbreviated" href="mailto:json-c@0.15" moz-do-not-send="true">json-c@0.15</a>
<a class="moz-txt-link-freetext
x_moz-txt-link-abbreviated" href="mailto:libfabric@1.13.1" moz-do-not-send="true">libfabric@1.13.1</a>
<a class="moz-txt-link-freetext
x_moz-txt-link-abbreviated" href="mailto:mercury@2.0.1" moz-do-not-send="true">mercury@2.0.1</a>
<a class="moz-txt-link-freetext
x_moz-txt-link-abbreviated" href="mailto:mochi-margo@0.9.5" moz-do-not-send="true">mochi-margo@0.9.5</a>
rdma-core@20</div>
<div class=""><br class="">
</div>
<div class="">Thanks,</div>
<div class="">Philip</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<br class="">
<fieldset class="x_mimeAttachmentHeader"></fieldset>
<pre class="x_moz-quote-pre">_______________________________________________
mochi-devel mailing list
<a class="moz-txt-link-freetext x_moz-txt-link-abbreviated" href="mailto:mochi-devel@lists.mcs.anl.gov" moz-do-not-send="true">mochi-devel@lists.mcs.anl.gov</a>
<a class="x_moz-txt-link-freetext" href="https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.mcs.anl.gov%2Fmailman%2Flistinfo%2Fmochi-devel&data=04%7C01%7Cphilip.e.davis%40rutgers.edu%7C7f152c3f72534603d01008d99e184d07%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C637714650202547739%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=dpAo6xraOBJUBYZl9u%2FTKlgE0SxgsPHA%2BkcLHGUlRzE%3D&reserved=0" originalsrc="https://lists.mcs.anl.gov/mailman/listinfo/mochi-devel" shash="NyZqvUYuffwdW6QtaT9uKVTegoYHMu39M6JpDeqx7fxGDPocMr2brCxbIba4j9baATNThhsKwln6QXSPijuHGJf2KrU+LichGX9Kdax4uDisRyLEp0lzOKluyTYbdyg1fJMcqJuL52bD6IwhqBMUfKIoq2ZkeHNi7WvAh2YEJpQ=" moz-do-not-send="true">https://lists.mcs.anl.gov/mailman/listinfo/mochi-devel</a>
<a class="x_moz-txt-link-freetext" href="https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.mcs.anl.gov%2Fresearch%2Fprojects%2Fmochi&data=04%7C01%7Cphilip.e.davis%40rutgers.edu%7C7f152c3f72534603d01008d99e184d07%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C637714650202557723%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=AkqAutFJvngA4OtsQa4vLGyAgddum5BXjpwqgxXqjtw%3D&reserved=0" originalsrc="https://www.mcs.anl.gov/research/projects/mochi" shash="LPhxsc+GTgGuYzh8ZXI5vXzHVu/W1QXzh8vKlAOtBcF7lotuqJp6nstqkJ7ARZPDdzbY5jgLmdgMOdYnoDYKrUlWIooD1XUAT1CtRet19gaB+7kcjzeQT7cJxbV64r75fcLo0mO/uE/W+Z8cI3WI0g3BKz89ZW8QejDJ0FwF1eE=" moz-do-not-send="true">https://www.mcs.anl.gov/research/projects/mochi</a>
</pre>
</blockquote>
</div>
_______________________________________________<br class="">
mochi-devel mailing list<br class="">
<a href="mailto:mochi-devel@lists.mcs.anl.gov" class="moz-txt-link-freetext" moz-do-not-send="true">mochi-devel@lists.mcs.anl.gov</a><br class="">
<a class="moz-txt-link-freetext" href="https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.mcs.anl.gov%2Fmailman%2Flistinfo%2Fmochi-devel&data=04%7C01%7Cphilip.e.davis%40rutgers.edu%7C7f152c3f72534603d01008d99e184d07%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C637714650202567719%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=AOe9pjQ%2FoA3ZJTbxO9EYcxlBq%2FZu8%2Bs%2BXx%2BM8Mv2v0U%3D&reserved=0" originalsrc="https://lists.mcs.anl.gov/mailman/listinfo/mochi-devel" shash="A8T+t+K8Gd/85hkgtmy5s26CUuyTKh/aQ9iUyXotXGE1c3HXYuVfABE9Ok2V+X2Kot0ryOFQFvhjy5EzXi/YP4d7DTId4nazL1YhZKNxViU6q43HauJ6WRjzM9kzEcnl88MgzLVs5dXYtVRV7DMMDwDkdhBa9FtAB2lzw9QX5iY=" moz-do-not-send="true">https://lists.mcs.anl.gov/mailman/listinfo/mochi-devel</a><br class="">
<a class="moz-txt-link-freetext" href="https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.mcs.anl.gov%2Fresearch%2Fprojects%2Fmochi&data=04%7C01%7Cphilip.e.davis%40rutgers.edu%7C7f152c3f72534603d01008d99e184d07%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C637714650202567719%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=pWkmT6aXdad0gI%2B5nEWwp6MSHnBnn51TijDRd29oQeo%3D&reserved=0" originalsrc="https://www.mcs.anl.gov/research/projects/mochi" shash="XIvXb1ct7GRIthcN3ti6mzEyUz7VfxRy+jSloufhrTDrBlz5XCxykRoI8K0VlrPv0Zb3saDmSkitqnYnHZoa9DLzI6wHuFUb5ZJbXN4G+YkBxiWt7pvkSu9WnrdT7DYJMkFg5OYzHJUR83cOOaTlv3tcieEQNUj3NdQifmk23VM=" moz-do-not-send="true">https://www.mcs.anl.gov/research/projects/mochi</a><br class="">
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</blockquote>
</body>
</html>