[MPICH] Problem handling redirected stdin in MPICH2 1.0.4p1, F90, dual Opteron

Xavier Cartoixa Soler Xavier.Cartoixa at uab.es
Wed Nov 8 14:16:14 CST 2006


  Thanks, I changed the code to read from a file and it worked.
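
  In case it helps anyone else who hits this, the change is essentially
  the open/read pattern Ralph suggests below: rank 0 opens and reads the
  input file and then broadcasts whatever the other ranks need. A rough
  sketch (the variable and the file contents are placeholders, not
  SIESTA's actual input handling):

    program read_then_bcast
      ! Sketch only: rank 0 reads the input file and broadcasts the
      ! values, so no process depends on redirected stdin.
      implicit none
      include 'mpif.h'
      integer :: ierr, rank, nsteps

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

      if (rank == 0) then
         open(unit=10, file='emmental.fdf', status='old', action='read')
         read(10,*) nsteps        ! placeholder read
         close(10)
      end if

      ! every rank gets the value rank 0 read; stdin is not involved
      call MPI_BCAST(nsteps, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

      call MPI_FINALIZE(ierr)
    end program read_then_bcast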

  Xavier

Ralph Butler wrote:
> The current implementation of mpiexec/mpd only supports low-volume,
> slow (e.g. via tty) stdin.  This is because it provides options to route
> stdin to arbitrary subsets of ranks, and it does not do its own buffering.
> There is no standard on this, and implementations sometimes only permit
> stdin to go to rank 0, or to no rank at all.
> Other input should be obtained via open and read, or via parallel I/O
> operations.
> 
> On Wed, Nov 8, 2006, at 5:26 AM, Xavier Cartoixa Soler wrote:
> 
>>  Hi everyone,
>>
>>  I am trying to run a parallel program (SIESTA, an electronic 
>> structure code) compiled with the Intel compilers under MPICH2 
>> 1.0.4p1, and I am facing serious difficulties that I think are related 
>> to the redirection of standard input. I am running a dual Opteron 
>> cluster under Rocks 4.1 (a clone of Red Hat Enterprise Linux 4.0).
>>  My MPICH2 was configured with
>>
>> CC=icc CFLAGS=-O0 CXX=icc CXXFLAGS=-O0 F77=ifort \
>>   FFLAGS="-O0 -assume 2underscores" \
>>   F90=ifort F90FLAGS="-O0 -assume 2underscores" \
>>   ./configure --prefix=/opt/mpich2-1.0.4p1/ch3_intel_eth/ \
>>   --with-device=ch3:sock --enable-f90 --enable-cxx \
>>   --disable-sharedlibs --enable-timer-type=gettimeofday
>>
>> (the -O0 option is there to rule out problems caused by optimization). 
>> The mpd daemon starts all right on one of the dual-CPU nodes:
>>
>> [compute-0-0 ~]$ /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpdtrace -l
>> compute-0-0.local_33552 (10.255.255.254)
>>
>> [compute-0-0 ~]$ /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpdringtest 1000
>> time for 1000 loops = 0.160723924637 seconds
>>
>> and I can even run small parallel programs with redirected input:
>>
>> [compute-0-0 mpi_bug_test]$
>> /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpiexec -n 2 ./example1 < inf.txt
>>  Hello world from number            0          54
>>  Hello world from number            1           1
>>
>> but when I go for the real deal, the siesta program hangs on most of 
>> my attempts:
>>
>> [xcs at hydra partest]$ /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpiexec \
>>   -n 2 ./siesta < emmental.fdf
>>  Before MPI_Init...
>>
>> it just hangs at that point. The "Before MPI_Init..." is a write(0,*) 
>> statement I have added before the MPI initialization block. Sometimes 
>> both processes print the statement, sometimes only one does, and 
>> sometimes neither prints it. Sometimes (about one run in ~30) 
>> everything works as it is supposed to. After hitting Ctrl+C and 
>> killing the siesta processes by hand, I receive these mpd error messages:
>>
>> hydra.uab.es_mpdman_0 (handle_console_input 1281): cannot send stdin 
>> to client
>> [... many times ...]
>> hydra.uab.es_mpdman_0 (handle_console_input 1281): cannot send stdin 
>> to client
>> hydra.uab.es_mpdman_0: mpd_uncaught_except_tb handling:
>>   exceptions.AttributeError: 'int' object has no attribute 
>> 'send_dict_msg'
>>     /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpdman.py  1270 
>> handle_console_input
>>         self.ring.rhsSock.send_dict_msg(msg)
>>     /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpdlib.py  527 
>> handle_active_streams
>>         handler(stream,*args)
>> plus traceback to mpd.run()
>>
>> it can also fail as
>>
>> hydra.uab.es_mpdman_0: mpd_uncaught_except_tb handling:
>>   exceptions.AttributeError: 'int' object has no attribute 
>> 'send_char_msg'
>>     /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpdman.py  538  
>> handle_lhs_input
>>         self.pmiSock.send_char_msg(pmiMsgToSend)
>>     /opt/mpich2-1.0.4p1/ch3_intel_eth/bin/mpdlib.py  527 
>> handle_active_streams
>>         handler(stream,*args)
>> plus traceback to mpd.run()
>>
>>
>> Incidentally, using gcc and/or MPICH2 1.0.3 gives me the same problem.
>> After some unsuccessful googling, I have now run out of things to try, 
>> so any pointer would be extremely appreciated! Thanks if you've made it 
>> thus far in the message!!
>>
>>  Xavier
>>
> 



