[MPICH2-dev] maximum number of processes per node
William Gropp
gropp at mcs.anl.gov
Fri Jan 12 14:55:19 CST 2007
Are you running the latest version of MPICH2 (1.0.5)? It includes
some changes to mpd that might help here. I also believe there are
some timeouts in mpd that may need to be increased when running many
processes on the same node.
Another option when running all of the processes on the same node is
to use the gforker process manager; to use it, just configure
MPICH2 with --with-pm=gforker. In fact, you can use either mpd or
gforker by configuring with --with-pm=gforker:mpd.
One problem that you may run into is running out of file descriptors:
in the gforker case, the mpiexec process needs several file
descriptors for each process it launches. Make sure that your OS
allows more than 1024 file descriptors per process.
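
As a rough way to check the file descriptor point above, the following Python sketch compares an estimated descriptor count against the current per-process limit. The message only says "several" descriptors per process, so the figure of 3 per process and the reserve of 16 for mpiexec itself are assumptions for illustration, not values taken from MPICH2; the function names are hypothetical as well.

```python
import resource


def estimate_fds(num_procs, fds_per_proc=3, reserved=16):
    """Estimate the descriptors mpiexec might need.

    fds_per_proc and reserved are illustrative assumptions, not
    numbers documented by MPICH2.
    """
    return num_procs * fds_per_proc + reserved


def check_fd_headroom(num_procs, fds_per_proc=3, reserved=16):
    """Compare the estimate with this process's RLIMIT_NOFILE limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = estimate_fds(num_procs, fds_per_proc, reserved)
    # An unlimited soft limit always fits; otherwise compare directly.
    fits = soft == resource.RLIM_INFINITY or needed <= soft
    return needed, soft, hard, fits


if __name__ == "__main__":
    needed, soft, hard, fits = check_fd_headroom(512)
    print(f"estimated fds needed: {needed}")
    print(f"soft limit: {soft}, hard limit: {hard}")
    print("fits within soft limit" if fits else "soft limit too low")
```

If the soft limit turns out to be too low, it can usually be raised up to the hard limit in the shell that launches mpiexec (for example with ulimit -n); raising the hard limit itself generally requires administrator privileges.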
Bill
On Jan 11, 2007, at 4:09 PM, Eric Grobelny wrote:
> Hello,
>
> I am running some experiments to see the effects of time sharing
> when running many processes on a single node. I have been able to
> run up to 256 processes, but when I try to increase this value to
> 512, I get the following error:
>
> mpiexec_compute-0-10.local (mpiexec 375): no msg recvd from mpd
> when expecting ack of request
>
> I am guessing that the problem is caused by a limit on the
> number of processes that can run on a single node. Is this the
> case? Can I solve this problem by redefining some #define in the
> source code? Or is this a limitation imposed by the OS?
>
> By the way, the single node I am running my experiments on is
> running CentOS with kernel version 2.6.9-22.ELsmp.
>
> Thanks,
>
> Eric Grobelny
>
> =======================================================
> Eric Grobelny
> ECE Ph.D. Candidate, Research Assistant
> Advanced Space Computing (ASC) group member
> Modeling and Simulation (Performance Prediction) focus
> High-performance Computing and Simulation (HCS) Research Laboratory
>
> Dept. of Electrical and Computer Engineering, University of Florida
> PO Box 116200, 330 Benton Hall, Gainesville, FL 32611-6200
> Lab: (352)392-9034/9046 FAX: (352)392-8671