[mpich-discuss] mvapich2 on multiple nodes: 2 problems

Dave Goodell goodell at mcs.anl.gov
Thu Apr 22 10:11:58 CDT 2010


On Apr 22, 2010, at 2:59 AM, abc def wrote:

> DK, Thanks for the explanation about mvapich and mpich - I've  
> checked and I'm definitely using mpich.

Then this is the right list for support.

> Bill, I have now ensured that the directories are all the same on  
> both computers, and these directories contain the same files, but  
> they're not linked by nfs - is this necessary? (I'm hoping not,  
> because setting up nfs is somewhat beyond my skills!)

No, NFS isn't necessary.  It just makes it easier to avoid  
accidentally running mismatched copies of the binaries.
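
Without NFS, one way to check for mismatched copies is to compare checksums of the binary on each machine. The following is only a minimal sketch: it assumes `sha256sum` (GNU coreutils) is installed on both hosts, and the helper names `checksum_of` and `matches_checksum` are hypothetical, made up for illustration.

```shell
# Hypothetical helpers, not part of MPICH2.

# Print only the SHA-256 checksum of a file (assumes GNU coreutils sha256sum).
checksum_of() {
    sha256sum "$1" | awk '{ print $1 }'
}

# Succeed only if the local file's checksum equals the expected checksum
# (for example, one computed on the other host).
matches_checksum() {
    [ "$(checksum_of "$1")" = "$2" ]
}

# Example use, assuming password-less ssh from quad to december:
#   remote=$(ssh december sha256sum /home/me/software.ex | awk '{ print $1 }')
#   matches_checksum /home/me/software.ex "$remote" && echo "binaries match"
```

The same comparison applies to the MPICH2 installation itself (for instance the `mpiexec` binary and the MPI libraries), since a mismatch there can cause similar startup failures.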

> Just a reminder about this specific problem:
> mpiexec -n 8 /home/me/software.ex
>
> produces the following error:
> MPIR_Init_thread(310): Initialization failed
> MPID_Init(113).......: channel initialization failed
> MPIDI_CH3_Init(244)..: process not on the same host (quad != december)
> Fatal error in MPI_Init: Other MPI error, error stack:
>
> And running:
> mpirun_rsh -hostfile ./machinefile -n 8 /home/me/software.ex >  
> job.out 2> job.err
>
> produces the same error.

This error happens because you are using the ch3:shm channel.  This
channel is deprecated; please don't use it unless you know that you
specifically need it.  The shm channel communicates only over shared
memory and has no network capability, so every process in the job has
to run on the same host.

You should probably use the default channel instead.  In the old  
version of MPICH2 you are using (1.0.8p1 maybe?) that is ch3:sock.  In  
a more recent stable version that will be ch3:nemesis.
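
To confirm which channel an installation was built with, the `mpich2version` program shipped with MPICH2 reports the device. The sketch below assumes its output contains a line of the form `MPICH2 Device: ch3:sock`; the helper name `mpich_device` is hypothetical, and the exact output layout may differ between releases.

```shell
# Hypothetical helper: read `mpich2version` output on stdin and print the
# device/channel it reports. Assumes a line containing "Device:" followed
# by the channel name (an assumption about the output format).
mpich_device() {
    sed -n 's/^.*[Dd]evice:[[:space:]]*//p' | head -n 1
}

# Example use on each host:
#   mpich2version | mpich_device
#
# If this prints ch3:shm, rebuilding MPICH2 with the channel chosen at
# configure time should select a network-capable channel, e.g.:
#   ./configure --with-device=ch3:sock
#   make && make install
```

The channel is fixed when MPICH2 is configured and built, so changing it means rebuilding the library and then recompiling the application against the new build on both machines.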

-Dave


