[mpich-discuss] Fwd: jobs of mpich2 & mvapich2 on the same cluster?

Pavan Balaji balaji at mcs.anl.gov
Fri Oct 8 08:54:37 CDT 2010


Sangamesh,

You are using a *very* old version of MPICH2. I don't even remember when 
1.0.4p1 was released :-).

It's possible that you are using a version of MVAPICH2 that's based on a 
newer MPICH2; that's why you are not seeing the error there.

You might want to upgrade to a newer version of MPICH2 (1.3rc2 is the 
latest). In the MPICH2-1.3.x series we moved to the Hydra process 
manager, so there's no MPD or MPDBOOT.
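
As a rough illustration of what changes with Hydra (a sketch, not something taken from this thread): there is no separate mpdboot step, and the host list is passed to mpiexec directly with -f. The small helper below only assembles that command line. The file name "hosts.gige", the process count, and "./a.out" are placeholders I made up for the example, and it assumes that a Hydra-based mpiexec is the first one on your PATH.

```python
import subprocess


def hydra_command(hostfile, nprocs, program, *args):
    """Build the argv for a Hydra launch: hosts come from a file, no mpdboot."""
    return ["mpiexec", "-f", hostfile, "-n", str(nprocs), program, *args]


def launch(hostfile, nprocs, program, *args):
    """Run the job. Assumes a Hydra-based mpiexec is first on PATH."""
    return subprocess.run(hydra_command(hostfile, nprocs, program, *args), check=True)


if __name__ == "__main__":
    # Placeholder names, not from the original post: one hostname per line
    # in "hosts.gige", and "./a.out" standing in for the MPI program.
    print(" ".join(hydra_command("hosts.gige", 8, "./a.out")))
```

With two MPI installations on the same master node, it is probably also worth checking which mpiexec the shell actually resolves before launching.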

  -- Pavan

On 10/08/2010 02:27 AM, Sangamesh B wrote:
> Dear MPICH2 team,
>
> We have a Rocks-5.1 Linux HPC cluster with two sets of nodes that share
> a common master node. The first set is connected with InfiniBand and
> the second with gigabit Ethernet.
>
> mpdboot works well for both MVAPICH2 and MPICH2 (version 1.0.4p1) on
> the two sets of nodes respectively; the mpds are booted from the same
> master node.
>
> The jobs also run well with MVAPICH2 on InfiniBand.
>
> But the MPICH2 jobs fail with the following error:
>
> mpiexec_node28.local (mpiexec 371): no msg recvd from mpd when expecting
> ack of request
>
> Is this a bug in MPICH2? Or is it not possible to run mpdboot for both
> MPICH2 and MVAPICH2 from the same master node (as the same user)?
>
> Thank you

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

