[mpich-discuss] HP-XC 3000 cluster issues

Gauri Kulkarni gaurivk at gmail.com
Mon Feb 23 23:19:20 CST 2009


Hi,

I am new to MPI in general. Our institute currently has a cluster of 16
nodes with 8 processors each. It is an HP-XC 3000 cluster, which basically
means it is quite proprietary. It has its own MPI implementation, HP-MPI,
in which the parallel launch is managed by SLURM (Simple Linux Utility for
Resource Management). There is also a batch job scheduler, LSF (Load
Sharing Facility), which works in tandem with SLURM to run batch jobs in
parallel. We have installed both MPICH and MPICH2 and are testing them, but
we are running into compatibility issues. For a simple helloworld.c program:
1. For HP-MPI: Compiled with the mpicc of this implementation and executed
with its mpirun, "mpirun -np 4 helloworld" works correctly. For batch
scheduling, we need to issue "bsub -n4 [other options] mpirun -srun
helloworld", and that runs fine too. "srun" is the SLURM utility that
launches the job's processes in parallel.
2. For MPICH and MPICH2: Again, compiled with the mpicc of the respective
implementation and executed with its own mpirun:
    i) mpirun -np 4 helloworld: Works.
   ii) mpirun -np 15 helloworld: The parallelization is limited to a single
node. That is, 8 processes run first on the 8 processors of a single node,
and then the remaining ones run.
  iii) bsub -n4 [options] mpirun -srun helloworld: Job terminated. The
-srun option is not recognized.
   iv) bsub [options] mpirun -np 4 helloworld: Works.
    v) bsub [options] mpirun -np 15 helloworld: (Same as iii)

Is anybody aware of issues running MPICH on HP clusters? Am I
misinterpreting something? Any help is appreciated.

Gauri.