<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";
        mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri","sans-serif";
        mso-fareast-language:EN-US;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:70.85pt 70.85pt 70.85pt 70.85pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=FR link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><span lang=EN-US>Hello,<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>I am currently working on a Win32 program that makes some intensive calculation, and is already written to be multithreaded. As a result, it uses all the available cores on the PC it runs on.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>The basic behavior is for the user to open a model, click the “start” button, then the threads are spawned, and once all is finished, control is given back to the user.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>While this works great, we have found that for larger models, the computation time is limited by the number of cores as the pool of tasks that could run in parallel is not empty.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>As a result, we are investigating the possibility to use grid computing to somehow multiply the number of available cores.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>This, of course, has technical challenges and reading documentation on various websites led me to the MPICH2 one and to this list.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>I’m not sure it’s the appropriate place to ask my questions, but should it not be the case, please tell me what an appropriate place might be.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>I understand that MPI is a framework that would facilitate the communication between the user’s computer and the nodes that perform the distributed tasks.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>What I have a hard time grasping are these :<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>What communication layer is used? How do I choose it?<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>What is the behavior in case a node dies or becomes unreachable?<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>What makes any given machine become a node available for tasks?<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>Is there some sort of load balancing ?<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>Is there a monitoring tool that would give me indications of the status and health of the nodes?<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>How does the “MPI enabled” code gets transferred to the nodes? If I understand things correctly, I would have to write a separate command line exe that takes care of the tasks and this would be the exe that gets sent over to node.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>I’m quite sure all these are trivial questions for those with more experience, but I’m having a hard time finding resources that would answer those.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US>Thanks in advance for your help<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US>Olivier<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p></div></body></html>