[MPICH] RE: [MPICH] MPICH2 doesn't distribute jobs when running applications

Matthew Chambers matthew.chambers at vanderbilt.edu
Tue May 15 09:28:29 CDT 2007


I’m still quite confused about what you’re doing.  I don’t know whether or
not you’re using the --ncpus option because you haven’t posted how you are
starting your MPDs.  Personally, I start my nodes’ MPDs at boot time in the
/etc/rc.local script.  I always start my jobs from the head node (which runs
the main MPD) without a machine file, because I don’t care which process
ranks run on which nodes as long as all the nodes get the correct load.  In
your case, you should start each MPD with the --ncpus option set to however
many cores that machine has, and then start a job WITHOUT a machine file,
using “-np <total number of cores>”.  That should create a job that properly
distributes your processes over every core (i.e. it runs 8 processes on each
machine with 8 cores, 4 processes on each machine with 4 cores, and 2
processes on each machine with 2 cores).  The caveat is that if you run with
fewer than “-np <total number of cores>” processes, the end result won’t be
balanced (e.g. your 2-core machines would be unused if they were the last
nodes in your MPD ring and you ran “-np <total number of cores - 4>”).
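To make that concrete, here is a rough sketch (the hostnames and the head
node’s host/port are placeholders, and I’m assuming your 2x8 + 3x4 + 2x2
layout, i.e. 32 cores total):

    # On the head node (e.g. from /etc/rc.local): start the first MPD,
    # telling it this machine has 8 cores
    mpd --ncpus=8 &

    # Find the host and port of the ring's entry point
    mpdtrace -l

    # On each other node: join the ring, passing that node's own core
    # count (4 on your quad-core boxes, 2 on the dual-core ones)
    mpd -h <head_host> -p <head_port> --ncpus=4 &

    # Then launch from the head node with no machine file
    mpiexec -np 32 ./your_app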

 

Hope this helps,

Matt

 

  _____  

From: Christian M. Probst [mailto:cmacprobst at gmail.com] 
Sent: Monday, May 14, 2007 9:35 PM
To: Matthew Chambers
Subject: Re: [MPICH] MPICH2 doesn't distribute jobs when running
applications

 

I thought the machine file would be a good place to balance resources... I
have 2 servers with 8 cores, 3 with 4, and 2 with 2... So, in the machine
file, I list how many processes can run on each host in the MPD ring...
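For example (with placeholder hostnames in place of mine), the file lists
each host with its process count in the host:ncpus form that mpiexec's
-machinefile option accepts:

    node01:8
    node02:8
    node03:4
    node04:4
    node05:4
    node06:2
    node07:2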

 

As you probably saw, I am not using the --ncpus option, although now I
wonder if listing the processor count in the machine file would be
equivalent?

 

Thanks.
Christian

 
