[MPICH2-dev] running out of fd's?

Dave Goodell goodell at mcs.anl.gov
Wed Jan 30 15:46:27 CST 2008


If his ulimit is set to 256, then it makes perfect sense that he
would run out of fd's at anything much higher than 75.  gforker
probably uses 3 fd's per process (stdin, stdout, stderr), and
floor(256/3) = 85, so I would expect to see a problem in the
transition from 84->85 processes (or maybe a little lower due to
the 3 fd's for gforker's own stdio).

In fact, I just tested it, and on my linux box (with "ulimit -n 256")  
I hit the error at 84 processes.
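
A quick way to sanity-check the arithmetic (bash syntax; the exact
cutoff depends on how many fd's gforker itself is holding, so treat
85 as a rough ceiling rather than an exact number):

  ulimit -n              # per-process fd limit; 256 in this case
  echo $((256 / 3))      # 3 fd's (stdin/stdout/stderr) per rank -> 85
  mpiexec -np 75 ring    # comfortably under that ceiling
  mpiexec -np 100 ring   # well over it; fails with the select error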

-Dave

On Jan 30, 2008, at 3:01 PM, Rob Ross wrote:

> i guess what i was getting at was that it seems to me that at 100  
> processes he wouldn't be hitting the limit?
>
> there's a tool called "lsof" that can be used to look at open files  
> for a specific process; you could use this to see what's going on  
> with the mpd.
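>
> for example (the pid below is just a placeholder for whatever ps
> reports for the mpd process):
>
> % lsof -p <mpd-pid>             (list that process's open files)
> % lsof -p <mpd-pid> | wc -l     (or just count them)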
>
> rob
>
> On Jan 30, 2008, at 2:47 PM, Brian R. Toonen wrote:
>
>> Nick,
>>
>> The maximum number of file descriptors per process on your machine  
>> is 10240
>> and the maximum number for all processes is 12288.  These numbers  
>> were
>> obtained using the following command.
>>
>> % sysctl -a | grep maxfiles
>> kern.exec: unknown type returned
>> kern.maxfiles = 12288
>> kern.maxfilesperproc = 10240
>> kern.maxfiles: 12288
>> kern.maxfilesperproc: 10240
>>
>> Assuming you are using (t)csh as your shell, you can increase your  
>> limit per
>> process to the maximum by adding "limit descriptors 10240" to  
>> your .cshrc
>> file.  If you use (ba)sh, then adding "ulimit -n 10240" to  
>> your .profile
>> should do the trick.
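>>
>> A quick way to confirm that the new limit took effect is to open a
>> fresh shell and check it directly:
>>
>> % limit descriptors      (csh/tcsh: should now report 10240)
>> % ulimit -n              (sh/bash: likewise prints 10240)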
>>
>> --brian
>>
>> |-----Original Message-----
>> |From: owner-mpich2-dev at mcs.anl.gov [mailto:owner-mpich2- 
>> dev at mcs.anl.gov] On
>> |Behalf Of Rob Ross
>> |Sent: Wednesday, January 30, 2008 13:53
>> |To: Darius Buntinas
>> |Cc: Nicholas Karonis; mpich2-dev at mcs.anl.gov; Brian Toonen
>> |Subject: Re: [MPICH2-dev] running out of fd's?
>> |
>> |default socket max is i think 1024? -- rob
>> |
>> |On Jan 30, 2008, at 1:26 PM, Darius Buntinas wrote:
>> |
>> |> I bet it's gforker.  It creates O(N) sockets for stdio, etc.  Try
>> |> mpd and see if that helps.
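>> |>
>> |> for example, once you've rebuilt with the mpd process manager
>> |> (e.g. configuring with --with-pm=mpd) and set up mpd per the
>> |> install guide, something like:
>> |>
>> |>   mpd &                  # start a single mpd on this machine
>> |>   mpiexec -np 100 ring   # run through mpd instead of gforker
>> |>   mpdallexit             # shut the mpd down afterwards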
>> |>
>> |> -d
>> |>
>> |> On 01/30/2008 01:06 PM, Nicholas Karonis wrote:
>> |>> Hi,
>> |>> I just installed MPICH2-1.0.6 on Mac OS X 10.5.1 (i.e., Leopard).
>> |>> I configured it with the gforker process manager, and it was all
>> |>> compiled using the GNU C and C++ compilers that came with the
>> |>> developer tools on the Mac OS X disk.
>> |>> The build seemed to go OK, so I tried testing it with a small
>> |>> ring program (source at bottom).  When I run the ring with -np 75
>> |>> all is OK but when I increase it to 100 I get an error message:
>> |>> /* running with 75, all OK */
>> |>> mpro% mpiexec -np 75 ring
>> |>> nprocs 75 received 75
>> |>> /* attempting to run with 100, problem :-( */
>> |>> mpro% mpiexec -np 100 ring
>> |>> Error in system call select: Bad file descriptor
>> |>> mpro%
>> |>> Any suggestions?
>> |>> Thanks in advance,
>> |>> Nick
>> |>> --- app source
>> |>> mpro% cat ring.c
>> |>> #include "mpi.h"
>> |>> #include <stdio.h>
>> |>> #include <stdlib.h>
>> |>> int main(int argc, char *argv[])
>> |>> {
>> |>>    int nprocs, myid;
>> |>>    int val;
>> |>>    MPI_Status st;
>> |>>    MPI_Init(&argc, &argv);
>> |>>    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>> |>>    if (nprocs > 1)
>> |>>    {
>> |>>    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
>> |>>    if (myid == 0)
>> |>>    {
>> |>>        val = 1;
>> |>>        MPI_Send(&val, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
>> |>>        MPI_Recv(&val, 1, MPI_INT, nprocs-1, 0, MPI_COMM_WORLD, &st);
>> |>>        printf("nprocs %d received %d\n", nprocs, val);
>> |>>    }
>> |>>    else
>> |>>    {
>> |>>        MPI_Recv(&val, 1, MPI_INT, myid-1, 0, MPI_COMM_WORLD, &st);
>> |>>        val++;
>> |>>        MPI_Send(&val, 1, MPI_INT, (myid+1)%nprocs, 0, MPI_COMM_WORLD);
>> |>>    } /* endif */
>> |>>    } /* endif */
>> |>>    MPI_Finalize();
>> |>>    exit(0);
>> |>> } /* end main() */
>> |>> mpro%
>> |>
>>
>



