[mpich-discuss] HP-XC 3000 cluster issues

Rajeev Thakur thakur at mcs.anl.gov
Mon Feb 23 23:38:06 CST 2009


To run MPICH2 with SLURM, configure with the options "--with-pmi=slurm
--with-pm=no" as described in the MPICH2 README file. Also see the
instructions on how to run MPICH2 with SLURM at
https://computing.llnl.gov/linux/slurm/quickstart.html .
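
A minimal sketch of that build, assuming an MPICH2 source tree; the
install prefix is hypothetical. Note that with --with-pm=no there is no
process manager built, so programs are launched directly with SLURM's
srun rather than with mpirun:

  ./configure --with-pmi=slurm --with-pm=no \
              --prefix=/usr/local/mpich2-slurm   # prefix is hypothetical
  make
  make install

  # launch directly with srun; a --with-pm=no build has no mpirun
  srun -n 15 ./helloworld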
 
Rajeev
 


  _____  

From: mpich-discuss-bounces at mcs.anl.gov On Behalf Of Gauri Kulkarni
Sent: Monday, February 23, 2009 11:19 PM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] HP-XC 3000 cluster issues


Hi,

I am a newbie to MPI in general. Our institute has a cluster of 16 nodes
with 8 processors each. It is an HP-XC 3000 cluster, which basically means
it is quite proprietary. It has its own MPI implementation - HP-MPI - in
which process launching is managed by SLURM (Simple Linux Utility for
Resource Management). There is also a batch job scheduler - LSF (Load
Sharing Facility) - which works in tandem with SLURM to schedule parallel
batch jobs. We have installed both MPICH and MPICH2 and are testing them,
but we are running into compatibility issues. For a simple helloworld.c
program:
1. For HP-MPI: compiled with this implementation's mpicc and executed with
its own mpirun, "mpirun -np 4 helloworld" works correctly. For batch
scheduling, we need to issue "bsub -n4 [other options] mpirun -srun
helloworld", and that runs fine too. srun is the SLURM utility that
launches the parallel tasks.
2. For MPICH and MPICH2: again, compiled with the mpicc of the respective
implementation and executed with its own mpirun:
    i) mpirun -np 4 helloworld : Works.
   ii) mpirun -np 15 helloworld : All 15 processes land on a single node -
the first 8 on that node's 8 processors and the remaining ones on the same
node - instead of spreading across nodes.
  iii) bsub -n4 [options] mpirun -srun helloworld : Job terminated; the
-srun option is not recognized.
   iv) bsub [options] mpirun -np 4 helloworld : Works.
    v) bsub [options] mpirun -np 15 helloworld : Same as (iii).
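
Note that -srun is a flag of HP-MPI's own mpirun; the mpirun shipped with
MPICH and MPICH2 does not implement it, which is why (iii) fails. As a
rough sketch, once MPICH2 is rebuilt against SLURM's PMI (see the reply at
the top of this message), the batch submission would look something like
the following; the bsub options and output file name are placeholders:

  # hypothetical LSF submission for a SLURM-PMI build of MPICH2;
  # on XC systems srun runs inside the allocation LSF obtains from SLURM
  bsub -n 15 -o out.%J srun ./helloworld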

Is anybody aware of HP cluster issues with MPICH? Am I misinterpreting
something? Any help is appreciated.

Gauri.
---------

