<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Thanks Jeff...you provided some good suggestions. I'll consult the DMAPP documentation and also go back to the code to see if I can reuse window buffers in some way.<div><br></div><div>Would you happen to have links to the DMAPP docs on-hand? I couldn't seem to find any tutorials etc. after a quick browse.<br><div><br></div><div>Cheers,</div><div><br></div><div>Tim.</div><div><br></div><div><div><div>On May 30, 2012, at 11:40 AM, Jeff Hammond wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div>If you don't care about portability, translating from MPI-2 RMA to<br>DMAPP is mostly trivial and you can eliminate collective window<br>creation altogether. However, I will note that my experience getting<br>MPI and DMAPP to inter-operate properly on XE6 (Hopper, in fact) was<br>terrible. And yes, I did everything the NERSC documentation and Cray<br>told me to do.<br><br>I wonder if you can reduce the time spent in MPI_WIN_CREATE by calling<br>it less often. Can you not allocate the window once and keep reusing<br>it? 
You might need to restructure your code to reuse the underlying<br>local buffers but that isn't that complicated in some cases.<br><br>Best,<br><br>Jeff<br><br>On Wed, May 30, 2012 at 10:36 AM, Jed Brown <<a href="mailto:jedbrown@mcs.anl.gov">jedbrown@mcs.anl.gov</a>> wrote:<br><blockquote type="cite">On Wed, May 30, 2012 at 10:29 AM, Timothy Stitt <<a href="mailto:Timothy.Stitt.9@nd.edu">Timothy.Stitt.9@nd.edu</a>><br></blockquote><blockquote type="cite">wrote:<br></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Hi all,<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">I am currently trying to improve the scaling of a CFD code on some Cray<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">machines at NERSC (I believe Cray systems leverage mpich2 for their MPI<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">communications, hence the posting to this list) and I am running into some<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">scalability issues with the MPI_WIN_CREATE() routine.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">To cut a long story short, the CFD code requires each process to receive<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">values from some neighborhood processes. Unfortunately, each process doesn't<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">know who its neighbors should be in advance.<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">How often do the neighbors change? 
By what mechanism?<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">To overcome this we exploit the one-sided MPI_PUT() routine to communicate<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">data from neighbors directly.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Recent profiling at 256, 512 and 1024 processes shows that the<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">MPI_WIN_CREATE routine is starting to dominate the walltime and reduce our<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">scalability quite rapidly. For instance the %walltime for MPI_WIN_CREATE<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">over various process sizes increases as follows:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">256 cores - 4.0%<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">512 cores - 9.8%<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">1024 cores - 24.3%<br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">The current implementation of MPI_Win_create uses an Allgather which is<br></blockquote><blockquote type="cite">synchronizing and relatively expensive.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">I was wondering if anyone in the MPICH2 community had any 
advice on how<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">one can improve the performance of MPI_WIN_CREATE? Or maybe someone has a<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">better strategy for communicating the data that bypasses the (poorly<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">scaling?) MPI_WIN_CREATE routine.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Thanks in advance for any help you can provide.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Regards,<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">Tim.<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">_______________________________________________<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">mpich-discuss mailing list <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite">To manage subscription options or unsubscribe:<br></blockquote></blockquote><blockquote type="cite"><blockquote type="cite"><a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br></blockquote></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">_______________________________________________<br></blockquote><blockquote type="cite">mpich-discuss mailing list <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br></blockquote><blockquote type="cite">To manage 
subscription options or unsubscribe:<br></blockquote><blockquote type="cite"><a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br></blockquote><blockquote type="cite"><br></blockquote><br><br><br>-- <br>Jeff Hammond<br>Argonne Leadership Computing Facility<br>University of Chicago Computation Institute<br><a href="mailto:jhammond@alcf.anl.gov">jhammond@alcf.anl.gov</a> / (630) 252-5381<br><a href="http://www.linkedin.com/in/jeffhammond">http://www.linkedin.com/in/jeffhammond</a><br>https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond<br>_______________________________________________<br>mpich-discuss mailing list mpich-discuss@mcs.anl.gov<br>To manage subscription options or unsubscribe:<br>https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss<br></div></blockquote></div><br><div>
<div id="sig" style="border-top: 1px dotted rgb(153, 153, 153); border-bottom: 1px dotted rgb(153, 153, 153); margin: 6px 0pt; padding: 8px; min-height: 50px; line-height: 17px; font-family: 'Lucida Grande',Verdana,Arial,Sans-Serif; font-size: 11px; color: rgb(96, 111, 120); min-width: 250px;">
<strong style="color: rgb(255, 102, 0); font-weight: bold;">Tim
Stitt</strong><span style="font-weight: bold;"> </span><span style="color: rgb(255, 102, 0); font-weight: bold;">PhD</span><span style="font-weight: bold;"></span>
<span style="color: rgb(0, 153, 0);">(User Support Manager)</span>.<br>
Center for Research Computing | University of Notre Dame<br>
P.O.
Box 539,
Notre Dame, IN 46556 | Phone:
574-631-5287
| Email: <span style="color: rgb(51, 51, 255);"><a href="mailto:tstitt@nd.edu">tstitt@nd.edu</a> </span></div>
</div>
<br></div></div></body></html>