[mpich-discuss] Help with some fundamentals

Thu Jan 20 11:17:29 CST 2011

Hi,
 I will try to answer the questions below from the perspective of a Windows developer (MPICH2 on Windows).

>> What communication layer is used? How do I choose it? 
 You typically run your MPI job (program) using the mpiexec utility. For example,
     mpiexec -n 2 mympipgm.exe
 launches two instances of mympipgm.exe on the local host (the host where the mpiexec command is run), and
     mpiexec -n 2 -machinefile mf.txt mympipgm.exe
 launches two instances of mympipgm.exe on the hosts/machines listed in the file mf.txt. By default MPICH2 uses TCP/IP sockets to communicate across nodes/hosts and shared memory for communication on the local host/node. On Windows we plan to provide support for IB (InfiniBand) in a couple of months.

>> What is the behavior in case a node dies or becomes unreachable? 
 MPICH2 1.3.2 (the upcoming release) on Unix will be tolerant of node failures: in 1.3.2, if a node fails and the MPI process on it dies, the job need not abort, although all communication with the failed node/process will fail. This feature is not yet implemented on Windows (we will be supporting it soon). Currently, on Windows, if a process belonging to a job dies, the whole job is aborted.

>> What makes any given machine become a node available for tasks? 
 Once MPICH2 is installed on a machine you can specify the machine in the node list while launching your job (mpiexec -n 2 -machinefile mf.txt mympipgm.exe ; the machinefile, mf.txt, contains the list of nodes to be used for launching the MPI job).
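
 As an illustration only, a minimal machinefile might look like the sketch below, assuming the simplest layout of one host name per line. The names node01 and node02 are hypothetical placeholders for machines on which MPICH2 is already installed; check the MPICH2 documentation for any additional per-line options.

```text
node01
node02
```

 With this saved as mf.txt, the command shown above (mpiexec -n 2 -machinefile mf.txt mympipgm.exe) would start the two processes on hosts taken from that list.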

>> Is there some sort of load balancing?
 MPICH2 does not support load balancing. You will have to incorporate any load balancing inside your MPI program.
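
 To make "load balancing inside your program" a little more concrete, here is a minimal sketch of one common approach, a pull-based work queue: tasks are handed out one at a time, and whichever worker becomes free takes the next one. This is not MPICH2 code and not an MPICH2 API. It simulates the idea with Python threads, and the names run_dynamic, worker and handled are made up for the example. In a real MPI program the shared queue would typically be replaced by one rank that answers work requests from the other ranks using MPI_Send/MPI_Recv.

```python
import queue
import threading


def run_dynamic(tasks, n_workers, work):
    """Distribute tasks dynamically: each worker pulls the next task when free.

    Returns (results, handled), where results[i] == work(tasks[i]) and
    handled[r] is the number of tasks that worker r ended up processing.
    """
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))

    results = [None] * len(tasks)
    handled = [0] * n_workers
    lock = threading.Lock()

    def worker(rank):
        while True:
            try:
                # Ask for the next unit of work; stop when none is left.
                index, task = pending.get_nowait()
            except queue.Empty:
                return
            value = work(task)
            with lock:
                results[index] = value
                handled[rank] += 1

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, handled


if __name__ == "__main__":
    squares, per_worker = run_dynamic(list(range(20)), 4, lambda n: n * n)
    print(squares)
    print(per_worker)  # how the 20 tasks were split; varies from run to run
```

 Because workers ask for work instead of receiving a fixed share up front, a slow node simply ends up handling fewer tasks, which is the effect one usually wants from load balancing.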

>> Is there a monitoring tool that would give me indications of the status and health of the nodes? 
 You can use the wmpiconfig utility provided with MPICH2 to check the status of the nodes. If you really want a sophisticated tool you might want to consider using Microsoft's HPC Server (which comes with utilities to manage nodes in your cluster).

>> How does the “MPI enabled” code get transferred to the nodes? If I understand things correctly, I would have to write a separate command line exe that takes care of the tasks and this would be the exe that gets sent over to the node. 
 Currently users do this in one of two ways: either share the directory containing the executable with all nodes (once a user has access to a network share, you just need to provide a command line option when launching your job to map the shared drive), or explicitly copy the executable to each node. MPICH2 does not copy it for you, so, as you guessed, you will have to copy the executable to each node yourself if you use the second option.

 Hope this helps. Let us know if you have more questions.

Regards,
Jayesh
----- Original Message -----
From: "Olivier SANNIER" <Olivier.SANNIER at actuaris.com>
To: mpich-discuss at mcs.anl.gov
Sent: Thursday, January 20, 2011 9:18:52 AM
Subject: [mpich-discuss] Help with some fundamentals

Hello, 

I am currently working on a Win32 program that performs intensive calculations and is already written to be multithreaded. As a result, it uses all the available cores on the PC it runs on. 

The basic behavior is for the user to open a model, click the “start” button, then the threads are spawned, and once all is finished, control is given back to the user. 

While this works great, we have found that for larger models, the computation time is limited by the number of cores: even with every core busy, the pool of tasks that could run in parallel is not empty. 

As a result, we are investigating the possibility of using grid computing to multiply the number of available cores. 

This, of course, has technical challenges and reading documentation on various websites led me to the MPICH2 one and to this list. 

I’m not sure it’s the appropriate place to ask my questions, but should it not be the case, please tell me what an appropriate place might be. 

I understand that MPI is a framework that would facilitate the communication between the user’s computer and the nodes that perform the distributed tasks. 

What I have a hard time grasping are the following points: 

What communication layer is used? How do I choose it? 

What is the behavior in case a node dies or becomes unreachable? 

What makes any given machine become a node available for tasks? 

Is there some sort of load balancing? 

Is there a monitoring tool that would give me indications of the status and health of the nodes? 

How does the “MPI enabled” code get transferred to the nodes? If I understand things correctly, I would have to write a separate command line exe that takes care of the tasks and this would be the exe that gets sent over to the node. 

I’m quite sure all these are trivial questions for those with more experience, but I’m having a hard time finding resources that would answer those. 

Thanks in advance for your help 

Olivier 

_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss