<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
If I understand correctly, each node has 2 sockets (i7) with 4 cores
each, so your nodes are NUMA. Do you pin your processes and threads to
specific cores?<br>
<br>
If you do not pin your processes, the scheduler can migrate them from
one socket to the other, away from the memory they allocated. That
could be the contention you observed.<br>
I suggest you use 2 MPI processes per node and bind each one to its own
socket with the taskset command (Linux).<br>
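<br>
If you would rather set the binding from inside the program than with
taskset, here is a minimal Linux-only sketch using sched_setaffinity.
The helper names pin_to_cores and cores_for_socket are mine, and the
mapping "cores 0-3 on socket 0, cores 4-7 on socket 1" is an assumption:
core numbering differs between machines, so check it with lscpu or
/proc/cpuinfo first.<br>

```cpp
// Linux-only sketch: bind the calling process to the cores of one socket.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for CPU_SET and sched_setaffinity with glibc
#endif
#include <sched.h>
#include <cassert>
#include <vector>

// Restrict the caller to the given core ids. Returns true on success.
bool pin_to_cores(const std::vector<int>& cores) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    for (int c : cores) {
        if (c < 0 || c >= CPU_SETSIZE) return false;  // id out of range
        CPU_SET(c, &mask);
    }
    // A pid of 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}

// ASSUMPTION (hypothetical layout): socket s owns cores
// [s * cores_per_socket, (s + 1) * cores_per_socket).
std::vector<int> cores_for_socket(int socket, int cores_per_socket = 4) {
    assert(socket >= 0 && cores_per_socket > 0);
    std::vector<int> cores;
    for (int i = 0; i < cores_per_socket; ++i)
        cores.push_back(socket * cores_per_socket + i);
    return cores;
}
```

Calling pin_to_cores(cores_for_socket(0)) early in main, before the
OpenMP threads are created, lets those threads inherit the mask. Under
the same layout assumption, the shell equivalent is
taskset -c 0-3 ./app, where ./app is a placeholder for your binary.<br>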
<br>
Pascal<br>
<br>
Hiatt, Dave M wrote:
<blockquote
cite="mid:C547B18DBDE4504493D3745C6FB985270393A29B71@exgtmb13.nam.nsroot.net"
type="cite">
<div class="WordSection1">
<p class="MsoNormal">What approaches are others using to maximize
performance on multi-core, multi-processor systems? For performance
reasons I’d really like to have only one MPI process running on each
node and then communicate the data between those per-node processes in
the most efficient way.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I’ve got a bunch of twin i7 nodes. For a while
I was able to get by with the round-robin approach, assigning enough
processes to keep every core busy. The advantage was that it was easy
and let me be lazy. But as the development has matured, the volume of
data has begun to badly clog the bandwidth of the cluster. That’s
because I have 8 cores per node, so the same data goes to each process
(i.e. each core) and is sent 8 times per node. It is then also held 8
times in memory, which is expensive too. This happens because one
particular component of the data is common to every MPI process, so
replicating it is wasteful. We knew this would be an issue eventually,
and eventually has become now.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Right now, as a start, I’ve confined the MPI
calls to a single thread in the process, so the data volume is very
reasonable and network performance is fine. I also have only one copy
of the large amount of shared data. In my current design I then spawn
threads to fill up the cores. This works, but there is still a moderate
amount of contention, even after using a number of performance analysis
tools to find hot spots. The threading has also, in my opinion,
introduced logic “clutter” which is distracting. That said, if
threading is the best solution then the threading stays. But the
performance issue seems to have its roots in OpenMP, the threading
model I’m using. (I am not using critical sections; they were too
expensive in terms of contention, so I went to individual locks.) I’m
using OpenMP because this system runs on both Linux and Windows and I
want a common denominator for how threading is implemented. So I’m
looking to see if there is a better solution than this. Hence the
following questions.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Would creating one independent communicator for
each node, and communicating within that node over this local
communicator, be better than using, say, pipes, or trying to referee
with the OpenMP locks? I would then have to make an
“inter-communicator” on the real COMM_WORLD communicator to pass data
only to the local rank-0 process on each node. Those processes in turn
would use intra-communicators to communicate locally. Or would it be
better to use “named pipes” locally and leave MPI to move the freight
between nodes? I understand named pipes in Windows, but I’m not as
well versed in them for my Linux components. If so, does anyone know if
there is a Boost or GNU package out there that creates named pipes for
Linux? I looked, and either I’m just malfunctioning mentally or there
isn’t one. Can anyone help there? It looks to me as though, in the
Linux space, I need to create a socket-level class to emulate the
“named pipes” that exist in Windows. Or am I missing something? This is
all in C++, by the way.<o:p></o:p></p>
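<p class="MsoNormal">A note on the named-pipe question: Linux provides
named pipes natively as FIFOs, created with the POSIX call mkfifo(), so
a socket-level emulation is probably not required. Below is a minimal
sketch, not a finished class. The helper names make_fifo and
fifo_round_trip and any path passed to them are hypothetical, and the
fork is only there to give the example a second local process.</p>

```cpp
// Linux/POSIX sketch: pass a message between two local processes via a FIFO.
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <string>

// Create the FIFO if it is missing. Returns true if it now exists.
// Note: this sketch does not verify that an existing file is really a FIFO.
bool make_fifo(const std::string& path) {
    assert(!path.empty());
    return mkfifo(path.c_str(), 0600) == 0 || errno == EEXIST;
}

// Fork a child that writes `msg` into the FIFO; the parent reads it back.
// Returns the bytes received, or an empty string on failure.
std::string fifo_round_trip(const std::string& path, const std::string& msg) {
    if (!make_fifo(path)) return "";
    pid_t pid = fork();
    if (pid < 0) { unlink(path.c_str()); return ""; }
    if (pid == 0) {
        // Child: opening for write blocks until the parent opens for read.
        int wfd = open(path.c_str(), O_WRONLY);
        if (wfd >= 0) {
            ssize_t n = write(wfd, msg.data(), msg.size());
            (void)n;  // a real writer would loop on short writes
            close(wfd);
        }
        _exit(0);
    }
    std::string out;
    int rfd = open(path.c_str(), O_RDONLY);
    if (rfd >= 0) {
        char buf[256];
        ssize_t n;
        // read() returns 0 once the writer has closed its end.
        while ((n = read(rfd, buf, sizeof(buf))) > 0) out.append(buf, n);
        close(rfd);
    }
    waitpid(pid, nullptr, 0);
    unlink(path.c_str());  // remove the FIFO from the filesystem
    return out;
}
```

<p class="MsoNormal">One design difference to keep in mind: a FIFO is
a one-way byte stream, whereas Windows named pipes can be duplex and
message-oriented, so a portable wrapper would likely need two FIFOs and
its own message framing on the Linux side.</p>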
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Or are others attacking this problem with better
ideas than these? If so, could you share them?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">“People get held back by the voice inside em” –
K’naan – In the Beginning<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b><span style="color: rgb(0, 112, 192);">Dave
M. Hiatt<o:p></o:p></span></b></p>
<p class="MsoNormal">Director, Risk Analytics<o:p></o:p></p>
<p class="MsoNormal">CitiMortgage<o:p></o:p></p>
<p class="MsoNormal">1000 Technology Drive<br>
O'Fallon, MO 63368-2240<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Telephone: 636-261-1408<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<pre wrap="">
<hr size="4" width="90%">
_______________________________________________
mpich-discuss mailing list
<a class="moz-txt-link-abbreviated" href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a>
<a class="moz-txt-link-freetext" href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a>
</pre>
</blockquote>
<br>
</body>
</html>