[mpich-discuss] Cluster problem running MPI programs

Pavan Balaji balaji at mcs.anl.gov
Tue Apr 10 19:41:48 CDT 2012


Hello,

How did you configure mpich2?  Please note that mpd is now deprecated 
and no longer supported.  In 1.4.1p1, mpd should not be built at all 
by default.
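
If you reconfigure without --with-pm=mpd, the default Hydra process 
manager is built instead, and no mpd ring is needed.  Roughly (the 
prefix is taken from your mpich2version output; the hosts file name 
is only a placeholder):

   ./configure --disable-f77 --disable-fc --prefix=/home/bchaffin/mpich2
   make && make install
   mpiexec -f hosts -n 2 ./examples/cpi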

  -- Pavan

On 04/10/2012 07:27 PM, Brice Chaffin wrote:
> Hi all,
>
> I have built a small cluster, but seem to be having a problem.
>
> I am using Ubuntu Linux 11.04 server edition on two nodes, with an NFS
> share providing a common directory when running as a cluster.
>
> According to mpdtrace the ring is fully functional. Both machines are
> recognized and communicating.
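>
> For reference, the ring was started roughly like this (mpd.hosts just
> lists both node hostnames):
>
>     mpdboot -n 2 -f mpd.hosts
>     mpdtrace -l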
>
> I can run regular C programs compiled with gcc using mpiexec or mpirun,
> and results are returned from both nodes. When I run actual MPI
> programs, such as the examples included with MPICH2 or ones I compile
> myself with mpicc, I get this:
>
> rank 1 in job 8  node1_33851   caused collective abort of all ranks
>    exit status of rank 1: killed by signal 4
>
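> The programs I compile myself are no more complicated than a minimal
> hello world along these lines (an illustrative sketch, not my exact
> code):
>
>     #include <mpi.h>
>     #include <stdio.h>
>
>     int main(int argc, char **argv)
>     {
>         int rank, size;
>         MPI_Init(&argc, &argv);               /* start the MPI runtime */
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's rank */
>         MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of ranks */
>         printf("Hello from rank %d of %d\n", rank, size);
>         MPI_Finalize();                       /* shut down cleanly */
>         return 0;
>     }
>
> built and launched with:
>
>     mpicc hello.c -o hello
>     mpiexec -n 2 ./hello
>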
> I am including mpich2version output so you can see exactly how I built
> it.
>
> MPICH2 Version:    	1.4.1p1
> MPICH2 Release date:	Thu Sep  1 13:53:02 CDT 2011
> MPICH2 Device:    	ch3:nemesis
> MPICH2 configure: 	--disable-f77 --disable-fc --with-pm=mpd
> --prefix=/home/bchaffin/mpich2
> MPICH2 CC: 	gcc    -O2
> MPICH2 CXX: 	c++   -O2
> MPICH2 F77: 	
> MPICH2 FC:
>
> This is my first time working with a cluster, so any advice or
> suggestions are more than welcome.

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

