[mpich-discuss] Some questions for understanding

Hiatt, Dave M dave.m.hiatt at citi.com
Wed Jun 23 09:07:25 CDT 2010


What kind of network utilization are you seeing?  Are the single-machine results obtained using local I/O for input and output?  There are lots of variables that can affect performance.  My first guess would be that you have distribution or efficiency issues that stay hidden when running on a single machine and are exposed once the job is distributed.  You might start by profiling the network to see whether it is the bottleneck, or whether you have contention on network-accessible disk.
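To make the reasoning above concrete, here is a rough back-of-the-envelope sketch of how per-step communication over Gigabit Ethernet can outweigh the gain from extra CPUs. Everything in it is an assumption for illustration: the function name `step_time`, the 40 ms of compute per step, the 20 messages of 100 kB per step, and the 50 microsecond latency are made-up values, not measurements from cpmd.x or sander.mpi. Only the 125 MB/s figure follows from the nominal 1 Gbit/s link rate.

```python
def step_time(compute_s, procs, nodes, msgs_per_step, msg_bytes,
              latency_s=50e-6, bandwidth_bytes_per_s=125e6):
    """Estimate the wall time of one simulation step.

    compute_s is the serial compute time of the step. It is assumed to
    divide perfectly across procs. Communication inside a single node is
    assumed to be negligible; between nodes every message pays a fixed
    latency plus its size divided by the link bandwidth.
    """
    compute = compute_s / procs
    if nodes == 1:
        return compute
    comm = msgs_per_step * (latency_s + msg_bytes / bandwidth_bytes_per_s)
    return compute + comm


# Hypothetical workload: 40 ms of compute and 20 messages of 100 kB per step.
one_node = step_time(0.040, procs=4, nodes=1, msgs_per_step=20, msg_bytes=100_000)
two_nodes = step_time(0.040, procs=8, nodes=2, msgs_per_step=20, msg_bytes=100_000)

# Fraction of each step the CPUs spend computing rather than waiting.
busy_fraction = (0.040 / 8) / two_nodes

print(f"1 node : {one_node * 1e3:.1f} ms per step")
print(f"2 nodes: {two_nodes * 1e3:.1f} ms per step, CPUs busy {busy_fraction:.0%}")
```

With these made-up numbers the two-node run is slower per step than the one-node run, and the CPUs sit idle most of the time waiting on the network. Real codes overlap communication with computation and have different message patterns, so this only shows the shape of the problem; profiling is what tells you the actual numbers.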

From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Norman Geist
Sent: Wednesday, June 23, 2010 1:28 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] Some questions for understanding

Hello,

My name is Norman, and I administer a 32x FSC Primergy cluster installed with openSUSE, Sun Grid Engine, and MPICH, linked with Gigabit Ethernet. It is used for running scientific jobs with cpmd.x or sander.mpi on an input file of atoms. If I run such a job on one machine (4 CPUs), it runs at nearly 100%. But if I run the same job on 8 CPUs (2 machines), the average drops to just under 50%. That means one machine alone is faster than two machines. But I thought the point of a cluster is to combine the CPU power, isn't it? Am I right to assume that MPI only works when the programmer himself splits the tasks in his program or script using MPI commands?

It is complicated =)! Or could it be that, when a program is compiled, the normal routines, operators, and functions are replaced with MPI functions that contain the communication routines, so that the compiled program example.x can communicate by itself? But if so, how does the program know which machine it should cooperate with?
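For context on the question above: MPI is an explicit message-passing model, so the split is written by the programmer rather than introduced by the compiler. Each process learns its rank and the total number of processes (in C, via MPI_Comm_rank and MPI_Comm_size), and the program decides from those two numbers which part of the work it owns. Which hosts the processes run on is decided by the launcher from its host list, not by the program. The sketch below shows only the decomposition step in plain Python, with no MPI library involved; `block_range`, `n_atoms`, and the value of `size` are hypothetical names and values chosen for illustration.

```python
def block_range(rank, size, n_items):
    """Return the half-open range [start, stop) of items owned by rank.

    Items are split into contiguous blocks. When n_items is not divisible
    by size, the first (n_items % size) ranks get one extra item each.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Stand-ins for what MPI_Comm_size would report and for the problem size.
size = 8
n_atoms = 1000

# In a real MPI program every process computes only its own range;
# here all ranks are listed in one place to show the full split.
ranges = [block_range(rank, size, n_atoms) for rank in range(size)]
for rank, (start, stop) in enumerate(ranges):
    print(f"rank {rank}: atoms {start} to {stop - 1}")
```

After working on its own block, each process would exchange boundary data or partial results with the others through explicit send, receive, or collective calls, and that exchange is where the network cost comes from.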

At first I thought that maybe the job isn't cooperating but is instead executed separately on, for example, two machines, but the output files are normal, with no duplicates or anything like that.

Please help me understand the real mechanics of MPICH so I can try to optimize my cluster to run jobs on multiple machines with a good average.

Thank you very much.






