On Sat, Jan 22, 2011 at 12:45 AM, Pavan Balaji <balaji@mcs.anl.gov> wrote:

> On 01/21/2011 09:17 AM, Colin Hercus wrote:
>
>> OK, I'm pretty much sorted. The MPI job is running at about 99% (times 4
>> servers) vs. the non-MPI single-server version, even with I/O to stdout. I
>
> Great. Though, my recommendation to use MPI-IO still stands.

OK, but later :)
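For when I do get to it, the kind of change I have in mind is roughly the sketch below: each rank writes its own block of results into one shared file instead of funnelling everything through stdout. The file name, record layout, and the choice of write_ordered are just placeholders for illustration, not anything from the real code.

/* Sketch: collective output to a single shared file with MPI-IO.
 * "results.out" and the fixed-size text record are placeholders. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Pretend each rank produced one text record of results. */
    char record[64];
    snprintf(record, sizeof(record), "rank %d: ...results...\n", rank);
    int len = (int)strlen(record);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "results.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* write_ordered keeps the records in rank order via the shared file
     * pointer, so the ranks don't have to agree on explicit offsets. */
    MPI_Status status;
    MPI_File_write_ordered(fh, record, len, MPI_CHAR, &status);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}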
>> tried ch3:nem and it was 2% slower than sock. I'm using multi-threaded
>> slaves with processor affinity; it's about 5% faster than single-threaded
>> slaves.
>
> That's surprising. ch3:nemesis should always be faster than ch3:sock as
> long as you don't have more threads/processes than the available number
> of cores.
There was a slave thread for every core, so I was a bit overcommitted and the master, mpiexec, etc. have to queue up. With nemesis the master ran at around 30% CPU; with sock, around 1%.

Colin
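For context, "processor affinity" here just means each slave thread gets pinned to its own core, something along the lines of the sketch below. All of the names and the thread body are placeholders, not the actual code; it's only meant to show the one-thread-per-core layout that leaves the master and mpiexec competing for CPU.

/* Sketch: one slave thread pinned to each online core. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

static void *slave_main(void *arg)
{
    int core = (int)(long)arg;

    /* Pin this thread to its assigned core. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    /* ... do the actual compute work here ... */
    return NULL;
}

int main(void)
{
    int ncores = (int)sysconf(_SC_NPROCESSORS_ONLN);
    pthread_t tid[64];

    /* One slave per core, so anything else (master, mpiexec) has to
     * queue up for CPU time once a busy-polling channel is in the mix. */
    for (int c = 0; c < ncores && c < 64; c++)
        pthread_create(&tid[c], NULL, slave_main, (void *)(long)c);

    for (int c = 0; c < ncores && c < 64; c++)
        pthread_join(tid[c], NULL);
    return 0;
}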
>  -- Pavan
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji