<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Hi Philip,<div class=""><br class=""></div><div class="">As for LP distribution, CODES maps LPs to PEs linearly, based on the LP Groups section of the CODES configuration file you use. I don’t think there’s currently any way to change the mapping (besides going into src/util/codes_mapping.c and changing it to do something else). Recently we’ve had some discussion about adding load-balancing support to ROSS, but I’m unsure when this will happen. </div><div class=""><br class=""></div><div class="">I do have some tips on settings for optimistic simulation. You may want to try the real-time optimistic (sync=5) mode. I think we’ve seen some improvement in CODES models in that mode. Normal optimistic (sync=3) bases the synchronization frequency on the number of events that have been processed, whereas real-time sync performs it after a set amount of real time has passed. If you choose sync=5, then you probably want to set --gvt-interval=32, which is 32 ms between GVT computations (if you stick with sync=3, a gvt-interval of 128 may be a good choice). Also, regardless of whether you set sync to 3 or 5, set --batch=1. This will increase how often the network is polled for newly arrived events and can help reduce rollbacks. </div><div class=""><br class=""></div><div class="">Another setting you can try (in addition to the ones I’ve already listed) is --max-opt-lookahead. I’m not sure of the exact value you should try, but I’ve had decent success with setting it somewhere from 100 to 1000 when using the dragonfly model. This puts a window on the events that can be executed, keeping any PE from getting too far ahead of the other PEs in virtual time. 
So it should slow down your PEs that have the lighter load, hopefully keeping them from causing rollbacks on the PE that has a heavy load.</div><div class=""><br class=""></div><div class="">Hopefully this helps!</div><div class="">Caitlin</div><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jun 29, 2018, at 12:10 PM, Taffet, Philip Adam <<a href="mailto:taffet2@llnl.gov" class="">taffet2@llnl.gov</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="WordSection1" style="page: WordSection1; font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: Calibri, sans-serif;" class=""><span style="font-size: 11pt;" class="">Hi,<o:p class=""></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: Calibri, sans-serif;" class=""><span style="font-size: 11pt;" class="">I’m trying to run a TraceR OTF simulation with lots of messages and lots of congestion. This is the first time that I’ve had a big enough simulation that I need to run it in parallel, and I’m having a really hard time getting any sort of parallel speedup. I tried running on 4-8 nodes with --sync=2 and --sync=3, as well as various values of --nkp, and the best I’ve gotten is only a few percent faster than serial. I looked into it some, and found that the cause appears to be massive load imbalance. I’m attaching a screenshot from hpctraceviewer that shows that rank 0 does almost all the work while the other ranks spend a large amount of time in MPI_Allreduce, waiting for rank 0 to arrive. I don’t know this part of ROSS/CODES very well, but does this mean the LPs are not being distributed evenly? 
If so, how can I change the distribution?<o:p class=""></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: Calibri, sans-serif;" class=""><span style="font-size: 11pt;" class="">It wouldn’t surprise me if my traffic pattern caused some load imbalance because there are 4 endpoints that receive way more traffic than the others, but I don’t think the imbalance should be this bad.<o:p class=""></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: Calibri, sans-serif;" class=""><span style="font-size: 11pt;" class="">Thank you very much,<o:p class=""></o:p></span></div><div style="margin: 0in 0in 0.0001pt; font-size: 12pt; font-family: Calibri, sans-serif;" class=""><span style="font-size: 11pt;" class="">Philip Taffet<o:p class=""></o:p></span></div></div><span id="cid:86A9F17FEDA15F4BBA062DBB235F8854@namprd09.prod.outlook.com"><Screen Shot 2018-06-26 at 4.16.09 PM[1].png></span><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">_______________________________________________</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">codes-ross-users mailing list</span><br 
style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><a href="mailto:codes-ross-users@lists.mcs.anl.gov" style="color: rgb(149, 79, 114); text-decoration: underline; font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">codes-ross-users@lists.mcs.anl.gov</a><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><a href="https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users" style="color: rgb(149, 79, 114); text-decoration: underline; font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;" class="">https://lists.mcs.anl.gov/mailman/listinfo/codes-ross-users</a></div></blockquote></div><br class=""></div></body></html>