[mpich-discuss] mpich2 hangs on Ubuntu Beowulf cluster (with NFS)

Konstantinos Varotsos kvarotso at gmail.com
Wed Jan 4 15:19:02 CST 2012


Hi


I am trying to run a Fortran executable on two four quad machines, les0 and les1.

The machines are set up with Ubuntu 11.04 and NFS.

I have installed the latest stable mpich2 with gfortran and gcc.

The problem is that when I try to run the code on both machines at once, the run hangs without any error message.

The executable runs fine on each machine separately and produces output.

The cpi example also runs fine:

mpiexec -f machinefile -n 8 ./cpi

Process 4 of 8 is on les1
Process 5 of 8 is on les1
Process 7 of 8 is on les1
Process 6 of 8 is on les1
Process 0 of 8 is on les0
Process 1 of 8 is on les0
Process 2 of 8 is on les0
Process 3 of 8 is on les0

pi is approximately 3.1415926544231247, Error is 0.0000000008333316
wall clock time = 0.002584


hello.f runs fine too:


mpiexec -f machinefile -n 8 ./hellow_exe
  Process            0  of            8  is alive
  Process            1  of            8  is alive
  Process            2  of            8  is alive
  Process            3  of            8  is alive
  Process            5  of            8  is alive
  Process            7  of            8  is alive
  Process            4  of            8  is alive
  Process            6  of            8  is alive


The machinefile:
les1:4
les0:4

The command I use for the run that hangs:

mpiexec -f machinefile -n 8 ./test.x_RESTART > output/les_$time.output &
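
For anyone reproducing this launch, the command above can be wrapped in a small script. This is only a sketch under stated assumptions: the post does not show how `time` is set, so a `date`-based timestamp is assumed here, and the `output` directory is created first because the shell cannot open the redirect target otherwise. Redirecting stderr into the same file is an addition to the original command, made so that any launcher messages end up next to the program output.

```shell
#!/bin/sh
# Assumption: "time" holds a timestamp; the original post does not show how it is set.
time=$(date +%Y%m%d_%H%M%S)

# The redirect target directory must exist before the shell opens the file.
mkdir -p output
outfile="output/les_${time}.output"

# Launch only if mpiexec, the machinefile and the executable are all present;
# otherwise just print the command that would have been run.
if command -v mpiexec >/dev/null 2>&1 && [ -f machinefile ] && [ -x ./test.x_RESTART ]; then
    # 2>&1 (not in the original command) also captures stderr in the output file.
    mpiexec -f machinefile -n 8 ./test.x_RESTART > "$outfile" 2>&1 &
else
    echo "would run: mpiexec -f machinefile -n 8 ./test.x_RESTART > $outfile"
fi
```

Checking the `output/les_*.output` file while the job appears stuck shows whether any rank wrote anything before the hang.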


I looked through the mpich forum and found a post with a similar title to mine, but it concerned hybrid code.

That is not the case here. The code is MPI only.


I am stuck! Any help will be appreciated.


Thanks,  Kwstas





