[MPICH] MPICH2 hangs on diskless SuSE 10.2 based cluster

Rajeev Thakur thakur at mcs.anl.gov
Tue Dec 19 22:19:49 CST 2006


Are you using the latest release, 1.0.5?

If you want to run a job in the background, you should run it as
mpiexec -n 2 process < /dev/null &
(although I don't know if it will help in this case).
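
A minimal sketch of what the redirect does, using `cat` as a stand-in for a program that reads standard input (mpiexec itself is not invoked here, and the variable names `bytes_read` and `bg_status` are only illustrative). With stdin attached to /dev/null, a read returns end-of-file immediately, so a background job has no reason to wait on the terminal:

```shell
#!/bin/sh
# `cat` stands in for any program that reads stdin (illustration only).
# Reading from /dev/null yields EOF right away, so nothing blocks.
bytes_read=$(cat < /dev/null | wc -c | tr -d ' ')
echo "bytes read from stdin: $bytes_read"

# The same redirect applied to a background job; `wait` collects its exit status.
cat < /dev/null > /dev/null &
wait $!
bg_status=$?
echo "background job exit status: $bg_status"
```

The mpiexec line above follows the same pattern: the trailing `&` puts the job in the background and `< /dev/null` keeps it from touching the terminal's input.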

Rajeev

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Shaun Qualheim
> Sent: Tuesday, December 19, 2006 7:04 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] MPICH2 hangs on diskless SuSE 10.2 based cluster
> 
> Hey everyone...
> 
> I'm trying to figure out why I'm having an issue with running a job on
> multiple machines here.
> 
> It worked fine with a 32-bit SuSE 9.3 based setup.
> 
> I started it up with a 64-bit base distro and kernel...
> 
> I can start up the process with:
> mpd &; mpiexec -n 2 process &
> and it starts with no issues...
> 
> When I try doing that with 2 machines though...
> mpdboot -n 2 &; mpiexec -n 2 process &
> It hangs for about 5 minutes and then starts up.
> 
> After it starts up and runs to completion, I can issue the same command
> again and it starts up right away.
> 
> Any ideas of where to start here or what might be causing this issue?
> 
> Thanks!
> Shaun
> 
> 



