[mpich-discuss] MPICH2 and OpenPBS/Torque using MPICH2-Hydra

Pavan Balaji balaji at mcs.anl.gov
Fri Sep 25 10:48:20 CDT 2009


Simon,

I did an initial version of the code; tarball here: 
http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/nightly/hydra

However, I'm not sure that skipping the "-n" option is a good idea. Even 
if your allocation has 100 nodes, the application might need to be 
launched on only 10 of them. The "-f" option, on the other hand, is 
redundant in PBS environments, so you no longer need to specify it with 
PBS. Keeping "-n" and dropping "-f" would also keep the behavior 
consistent with other environments such as SLURM. Thoughts?
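
To make the distinction concrete, here is a rough sketch (in Python, not 
the actual Hydra code) of what picking up the node list from PBS amounts 
to. It assumes the usual PBS/Torque convention that $PBS_NODEFILE names a 
file with one hostname per allocated slot. The function names and the 
"fill hosts in order" placement are purely illustrative assumptions: the 
host list comes from the allocation (so no "-f"), while the process count 
stays a separate, optional input (the role of "-n").

```python
import os
from collections import Counter


def read_pbs_nodes(path):
    """Parse a PBS-style nodefile into {host: slots}, in first-seen order.

    Assumes one hostname per line, repeated once per allocated slot.
    """
    with open(path) as f:
        hosts = [line.strip() for line in f if line.strip()]
    counts = Counter(hosts)
    # dict.fromkeys keeps the order in which hosts first appear.
    return {host: counts[host] for host in dict.fromkeys(hosts)}


def plan_launch(nodes, nprocs=None):
    """Return [(host, nprocs_on_host), ...] for a launch.

    Hypothetical policy: fill hosts in nodefile order. If nprocs is None,
    use every allocated slot; asking for more than the allocation raises.
    """
    total = sum(nodes.values())
    if nprocs is None:
        nprocs = total
    if nprocs > total:
        raise ValueError(f"requested {nprocs} processes, only {total} slots")
    plan = []
    remaining = nprocs
    for host, slots in nodes.items():
        take = min(slots, remaining)
        if take == 0:
            break
        plan.append((host, take))
        remaining -= take
    return plan


if __name__ == "__main__":
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile is None:
        print("PBS_NODEFILE is not set; not running inside a PBS job?")
    else:
        for host, n in plan_launch(read_pbs_nodes(nodefile)):
            print(f"{host}: {n} process(es)")
```

With a 100-node allocation, plan_launch(nodes) would use every slot, 
while plan_launch(nodes, 10) would start only 10 processes, which is the 
case I'd like "-n" to keep covering.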

  -- Pavan

On 09/24/2009 03:25 AM, Simon Hammond wrote:
> This would definitely be helpful.
> 
> I'm guessing these processes (under SSH) wouldn't be managed by PBS in
> the same way that, say, an OpenMPI job would be? Getting rid of the -n
> would be a good start. Can I ask how node placement and rank
> allocation etc. would work?
> 
> S.
> 
> 
> 2009/9/23  <balaji at mcs.anl.gov>:
>> PBS support is not added in Hydra yet. See: https://trac.mcs.anl.gov/projects/mpich2/ticket/443
>>
>> However, so far we were considering PBS support as a bootstrap server (which has a lot more functionality than just providing the number of nodes). But adding PBS as a resource management kernel (RMK) is possible too and should be simpler as well. Doing this will allow you to skip the "-n" option in PBS environments, but you'll internally still be using ssh to start the processes. If this is helpful for you, we can consider adding it for the next release.
>>
>>  -- Pavan
>>
>> ----- "Si Hammond" <simon.hammond at gmail.com> wrote:
>>
>>> Hi everyone,
>>>
>>> We're trying to get MPICH2 to work under PBS so that users don't need
>>> to specify the number of processors etc. We have built the install
>>> with the MPICH2-Hydra device and this seems to work (in that jobs will
>>> run) but users still have to specify the number of MPI ranks using
>>> -n.
>>>
>>> I found the -rmk flag in some documentation using Google but "-rmk
>>> pbs" doesn't seem to work either (the example uses "-rmk lsf").
>>>
>>> How does a user like myself go about making a PBS-enabled MPICH2 build
>>> so the job placement etc. is all handled under the covers? Is this
>>> possible? OpenMPI seems to have this covered but we'd like to install
>>> MPICH2 as well since one of our codes only runs with MPICH2.
>>>
>>> Thanks for your help.
>>>
>>>
>>>
>>> ---------------------------------------------------------------------------------------
>>> Si Hammond
>>>
>>> Performance Modelling, Analysis and Optimisation Laboratory
>>> High Performance Systems Group
>>> Department of Computer Science
>>> University of Warwick, CV4 7AL, UK
>>> ----------------------------------------------------------------------------------------

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

