[mpich-discuss] Problem with tcsh and ppn >= 5

Frank Riley fhr at rincon.com
Tue Jun 14 14:22:21 CDT 2011


> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-bounces at mcs.anl.gov]
> On Behalf Of Frank Riley
> Sent: Tuesday, June 14, 2011 12:20 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [mpich-discuss] Problem with tcsh and ppn >= 5
> 
> Hello,
> 
> We are having a problem running more than 4 processes per node when
> using the tcsh shell. Has anyone seen this? Here is a simple test case:
> 
> mpiexec -n 1 -ppn 5 /bin/csh -c  /path/to/a.out
> 
> where a.out is a simple C test executable that calls MPI_Init and
> MPI_Finalize and nothing else. The error is as follows:
> 
> [cli_3]: write_line error; fd=18 buf=:cmd=init pmi_version=1
> pmi_subversion=1 system message for write_line failure : Bad file descriptor
> 
> Note that the following command (bash shell) works fine:
> 
> mpiexec -n 1 -ppn 5 /bin/sh -c  /path/to/a.out
> 
> Our mpich2 is version 1.3.2p1 and is built with the following flags:
> 
> --enable-fast --enable-romio --enable-debuginfo --enable-smpcoll
> --enable-mpe --enable-threads=runtime --enable-shared --with-mpe

I forgot to mention that we do not see the failure on our cluster that has nodes with 2 cores each. It only fails on our clusters that have nodes with 8 cores each.

