[MPICH2-dev] running out of fd's?
Dave Goodell
goodell at mcs.anl.gov
Wed Jan 30 15:46:27 CST 2008
If his ulimit is set to 256, then it makes perfect sense that he
would run out of fd's at anything much higher than 75. gforker
probably uses 3 fd's per process (stdin,stdout,stderr). floor(256/3)
=85, so I would expect to see a problem in the transition from 84->85
processes (or maybe a little lower due to the 3 fd's for gforker's
own stdio).
In fact, I just tested it, and on my linux box (with "ulimit -n 256")
I hit the error at 84 processes.
-Dave
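
The estimate above can be written out as a small helper. This is only a sketch of the arithmetic, not anything taken from gforker: the name estimate_max_procs and the "3 per child, 3 for the launcher" figures are assumptions based on the guess in this message.

```c
/* Hypothetical helper (not part of MPICH2): estimate how many child
 * processes a launcher can start before it runs out of descriptors.
 *   fd_limit     - per-process descriptor limit (e.g. "ulimit -n")
 *   fds_per_proc - descriptors held per child (assumed 3: stdin/out/err)
 *   reserved     - descriptors the launcher needs for itself (assumed 3)
 */
long estimate_max_procs(long fd_limit, long fds_per_proc, long reserved)
{
    if (fds_per_proc <= 0 || fd_limit <= reserved)
        return 0;
    /* Integer division gives the floor for positive operands. */
    return (fd_limit - reserved) / fds_per_proc;
}
```

With a limit of 256, estimate_max_procs(256, 3, 0) gives 85 and estimate_max_procs(256, 3, 3) gives 84. Since the error actually showed up at 84 processes, the launcher presumably holds a few more descriptors than this simple model counts.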
On Jan 30, 2008, at 3:01 PM, Rob Ross wrote:
> i guess what i was getting at was that it seems to me that at 100
> processes he wouldn't be hitting the limit?
>
> there's a tool called "lsof" that can be used to look at open files
> for a specific process; you could use this to see what's going on
> with the mpd.
>
> rob
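
lsof inspects another process from the outside. When the code in question is your own, a rough in-process check is also possible. The sketch below is an illustration only (count_open_fds is a hypothetical name, and the 65536 cap is an arbitrary assumption to keep the loop short): it probes each descriptor number with fcntl(F_GETFD), which fails for numbers that are not open.

```c
#include <fcntl.h>
#include <unistd.h>

/* Count the descriptors currently open in the calling process.
 * Hypothetical helper for debugging; not part of MPICH2 or lsof. */
int count_open_fds(void)
{
    long max_fd = sysconf(_SC_OPEN_MAX);
    int count = 0;

    /* Fall back to (and cap at) an assumed upper bound so the scan
     * stays cheap even when the reported limit is huge or unknown. */
    if (max_fd < 0 || max_fd > 65536)
        max_fd = 65536;

    for (long fd = 0; fd < max_fd; fd++) {
        /* F_GETFD returns -1 (errno EBADF) if fd is not open. */
        if (fcntl((int)fd, F_GETFD) != -1)
            count++;
    }
    return count;
}
```

Calling this before and after spawning children would show how many descriptors each child costs the parent.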
>
> On Jan 30, 2008, at 2:47 PM, Brian R. Toonen wrote:
>
>> Nick,
>>
>> The maximum number of file descriptors per process on your machine is
>> 10240, and the maximum number for all processes is 12288. These numbers
>> were obtained using the following command.
>>
>> % sysctl -a | grep maxfiles
>> kern.exec: unknown type returned
>> kern.maxfiles = 12288
>> kern.maxfilesperproc = 10240
>> kern.maxfiles: 12288
>> kern.maxfilesperproc: 10240
>>
>> Assuming you are using (t)csh as your shell, you can increase your
>> per-process limit to the maximum by adding "limit descriptors 10240" to
>> your .cshrc file. If you use (ba)sh, then adding "ulimit -n 10240" to
>> your .profile should do the trick.
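
The shell settings above change the soft limit that child processes inherit. A program can make the same request for itself through the POSIX getrlimit/setrlimit calls. The following is a minimal sketch, assuming you control the program's source; ensure_fd_limit is a hypothetical name, and on some systems (Mac OS X among them) the kernel may still refuse values above its own per-process maximum.

```c
#include <sys/resource.h>

/* Try to make the soft RLIMIT_NOFILE at least `wanted`, capped at the
 * hard limit. Returns 0 on success, -1 if a system call fails.
 * Hypothetical helper; not part of MPICH2. */
int ensure_fd_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    /* An unprivileged process cannot exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    /* Only touch the limit if it actually needs raising. */
    if (wanted > rl.rlim_cur) {
        rl.rlim_cur = wanted;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }
    return 0;
}
```

This only affects the calling process and anything it starts afterwards, so for a launcher you did not build, setting the limit in the shell is still the simpler route.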
>>
>> --brian
>>
>> |-----Original Message-----
>> |From: owner-mpich2-dev at mcs.anl.gov [mailto:owner-mpich2-dev at mcs.anl.gov] On Behalf Of Rob Ross
>> |Sent: Wednesday, January 30, 2008 13:53
>> |To: Darius Buntinas
>> |Cc: Nicholas Karonis; mpich2-dev at mcs.anl.gov; Brian Toonen
>> |Subject: Re: [MPICH2-dev] running out of fd's?
>> |
>> |default socket max is i think 1024? -- rob
>> |
>> |On Jan 30, 2008, at 1:26 PM, Darius Buntinas wrote:
>> |
>> |> I bet it's gforker. It creates O(N) sockets for stdio, etc. Try
>> |> mpd and see if that helps.
>> |>
>> |> -d
>> |>
>> |> On 01/30/2008 01:06 PM, Nicholas Karonis wrote:
>> |>> Hi,
>> |>> I just installed MPICH2-1.0.6 on Mac OS X 10.5.1 (i.e., Leopard).
>> |>> I configured it with the gforker and it was all compiled using
>> |>> Gnu's C and C++ compilers that came with the developer tools
>> |>> on the Mac OS X disk.
>> |>> The build seemed to go OK and so I tried testing it with a small
>> |>> ring program (source at bottom). When I run the ring with -np 75
>> |>> all is OK but when I increase it to 100 I get an error message:
>> |>> /* running with 75, all OK */
>> |>> mpro% mpiexec -np 75 ring
>> |>> nprocs 75 received 75
>> |>> /* attempting to run with 100, problem :-( */
>> |>> mpro% mpiexec -np 100 ring
>> |>> Error in system call select: Bad file descriptor
>> |>> mpro%
>> |>> Any suggestions?
>> |>> Thanks in advance,
>> |>> Nick
>> |>> --- app source
>> |>> mpro% cat ring.c
>> |>> #include "mpi.h"
>> |>> #include <stdio.h>
>> |>> #include <stdlib.h>
>> |>> int main(int argc, char *argv[])
>> |>> {
>> |>>     int nprocs, myid;
>> |>>     int val;
>> |>>     MPI_Status st;
>> |>>     MPI_Init(&argc, &argv);
>> |>>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>> |>>     if (nprocs > 1)
>> |>>     {
>> |>>         MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>> |>>         if (myid == 0)
>> |>>         {
>> |>>             val = 1;
>> |>>             MPI_Send(&val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
>> |>>             MPI_Recv(&val, 1, MPI_INT, nprocs-1, 0, MPI_COMM_WORLD, &st);
>> |>>             printf("nprocs %d received %d\n", nprocs, val);
>> |>>         }
>> |>>         else
>> |>>         {
>> |>>             MPI_Recv(&val, 1, MPI_INT, myid-1, 0, MPI_COMM_WORLD, &st);
>> |>>             val++;
>> |>>             MPI_Send(&val, 1, MPI_INT, (myid+1)%nprocs, 0, MPI_COMM_WORLD);
>> |>>         } /* endif */
>> |>>     } /* endif */
>> |>>     MPI_Finalize();
>> |>>     exit(0);
>> |>> } /* end main() */
>> |>> mpro%
>> |>
>>
>