[mpich-discuss] Fwd: jobs of mpich2 & mvapich2 on the same cluster?
Pavan Balaji
balaji at mcs.anl.gov
Fri Oct 8 08:54:37 CDT 2010
Sangamesh,
You are using a *very* old version of MPICH2. I don't even remember when
1.0.4p1 was released :-).
It's possible that you are using a version of MVAPICH2 that's based on a
newer MPICH2; that's why you are not seeing the error there.
You might want to upgrade to a newer version of MPICH2 (1.3rc2 is the
latest). In the MPICH2-1.3.x series we moved to the Hydra process
manager, so there's no MPD or MPDBOOT.
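
For what it's worth, here is a rough sketch of what a launch could look like under Hydra, written as a small Python helper. It writes a host file (one "hostname:processes" entry per line) and builds an mpiexec command that points at it with -f. The second node name, the per-node process counts, the file name hosts.txt, and the program ./a.out are placeholders I made up for illustration; substitute your own.

```python
import shlex


def hydra_command(hosts, nprocs, program, hostfile="hosts.txt"):
    """Write a Hydra-style host file and return the mpiexec argument list.

    hosts    -- list of (hostname, processes_on_that_host) pairs
    nprocs   -- total number of MPI processes to start
    program  -- path to the MPI executable (placeholder in the example below)
    hostfile -- where to write the host file (hypothetical default name)
    """
    with open(hostfile, "w") as f:
        for name, slots in hosts:
            # One "hostname:count" entry per line.
            f.write(f"{name}:{slots}\n")
    # No mpdboot step: mpiexec starts the processes itself from the host file.
    return ["mpiexec", "-f", hostfile, "-n", str(nprocs), program]


if __name__ == "__main__":
    # node29.local and ./a.out are assumed names, not taken from your cluster.
    cmd = hydra_command([("node28.local", 4), ("node29.local", 4)], 8, "./a.out")
    print(shlex.join(cmd))
```

The helper only assembles the command; you would run the printed line yourself (or hand the list to subprocess.run) once the upgraded MPICH2's mpiexec is first in your PATH.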
-- Pavan
On 10/08/2010 02:27 AM, Sangamesh B wrote:
> Dear MPICH2 team,
>
> We have a Rocks-5.1 Linux HPC cluster with two sets of nodes that
> share a common master node. The first set is connected with InfiniBand
> and the second with gigabit Ethernet.
>
> mpdboot works well for both MVAPICH2 and MPICH2 (version 1.0.4p1) on
> the two sets of nodes respectively; the mpds are booted from the same
> master node.
>
> The jobs also run well with MVAPICH2 on InfiniBand.
>
> But the MPICH2 jobs fail with the following error:
>
> mpiexec_node28.local (mpiexec 371): no msg recvd from mpd when expecting
> ack of request
>
> Is this a bug in MPICH2? Or is it not possible to run both MPICH2 and
> MVAPICH2 mpdboot from the same master node (as the same user)?
>
> Thank you
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji