[mpich-discuss] MPI and slurm

Pavan Balaji balaji at mcs.anl.gov
Sat Oct 15 04:45:40 CDT 2011


Hello,

If you want to use MPICH2's mpiexec, you should not pass the --with-pmi 
and --with-pm configure options. The behavior you described (MPI_Recv 
blocking) is expected when a build configured that way is launched with 
mpiexec: all processes end up with rank 0 because of a mismatch between 
the internal protocols used by the native SLURM daemons (slurmd) and 
mpiexec.
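
If you want to confirm that this is what is happening, a small check along 
these lines (the file name ranks.c is just for illustration) should show 
every process reporting itself as rank 0, and most likely a world size of 
1, when launched with mpiexec against your current build:

/* ranks.c: print the rank and size each process sees. With the PMI
   mismatch described above, every process is expected to report rank 0. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

(Compile with mpicc ranks.c -o ranks and launch it the same way you 
launched mat_mul.)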

Can you try configuring without the above configure options and see if 
it helps?
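
Something along these lines (just reusing the paths from your original 
command; adjust as needed) is what I mean. Hydra should be picked up as 
the default process manager, so no --with-pm option is required:

./configure --sysconfdir=/etc --localstatedir=/var
make && make install

After rebuilding, launch either with mpiexec directly or through sbatch 
with the HYDRA_BOOTSTRAP=slurm setting you already use.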

  -- Pavan

On 10/07/2011 10:33 AM, Evan Patton wrote:
> Hello all,
>
> I'm trying to configure MPICH2 1.4.1p1 and slurm 2.3.0 on a 4-node cluster.
> I've tried this in two different ways with no success and could use some
> pointers. First, I configured MPICH2 to use slurm as the PMI:
>
> ./configure --sysconfdir=/etc --localstatedir=/var --with-pmi=slurm
> --with-pm=none
>
> When I used srun to run 32 ranks across 2 machines in the cluster, MPI
> bailed out with this error:
>
> Fatal error in PMPI_Isend: Other MPI error, error stack:
> PMPI_Isend(148)..........: MPI_Isend(buf=0x7ff41b42b010, count=2097152,
> MPI_DOUBLE, dest=0, tag=0, MPI_COMM_WORLD, request=0x6c8a38) failed
> MPID_nem_lmt_RndvSend(81):
> MPIDI_CH3_RndvSend(63)...: failure occurred while attempting to send RTS
> packet
> MPIDI_CH3_iStartMsg(36)..: Communication error with rank 0
> srun: error: hercules-2: task 31: Exited with exit code 1
> Fatal error in PMPI_Isend: Other MPI error, error stack:
> PMPI_Isend(148)..........: MPI_Isend(buf=0x7f5df0e30010, count=2097152,
> MPI_DOUBLE, dest=24, tag=0, MPI_COMM_WORLD, request=0x6c8a38) failed
> MPID_nem_lmt_RndvSend(81):
> MPIDI_CH3_RndvSend(63)...: failure occurred while attempting to send RTS
> packet
> MPIDI_CH3_iStartMsg(36)..: Communication error with rank 24
> srun: error: hercules-1: task 23: Exited with exit code 1
> srun: First task exited 30s ago
> srun: tasks 0-22,24-30: running
> srun: tasks 23,31: exited abnormally
> srun: Terminating job step 26.0
>
> Given that ranks 23 and 31 terminated and that running the same command
> on a single machine works correctly, I assume it must be some
> inter-machine communication issue. I went to the FAQs
> (http://wiki.mcs.anl.gov/mpich2/index.php/Frequently_Asked_Questions#Q:_How_do_I_use_MPICH2_with_slurm.3F)
> to see if there was a solution to this. That document suggests verifying
> that one can ssh between machines and that the firewalls are off. Both are
> already true in my configuration, so that appeared to be a dead end.
>
> I went back and reconfigured MPICH2 to use hydra, in case the problem was
> with the slurm PMI. I then put the following mpiexec call in a bash script
> and submitted it with sbatch:
>
> #!/bin/bash
> # run.sh
> export HYDRA_BOOTSTRAP=slurm
> mpiexec -n 16 ./mat_mul 8192
> #END run.sh
>
> hercules-1# sbatch -t 10 -p two -n 16 ./run.sh
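>
> (If I am reading the Hydra documentation correctly, the same thing can
> also be requested on the command line, e.g. something like
>
> mpiexec -bootstrap slurm -n 16 ./mat_mul 8192
>
> but I used the environment variable inside the batch script.)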
>
> This also failed with the same error as above. Just to see what would
> happen, I added a -hosts option to the mpiexec call. With that, the
> processes no longer crashed, but they all blocked at an MPI_Recv
> operation, suggesting that, as before, the MPI_Isend operations were not
> completing correctly. At this point I've run out of ideas on how to
> proceed, so if anyone can point me in the right direction I would greatly
> appreciate it.
>
> Thanks for your time,
> Evan
>

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

