[mpich-discuss] Sorting nodes by CPUs with -ppn (1.4 and r9834)

Yauheni Zelenko zelenko at cadence.com
Tue May 8 17:57:37 CDT 2012


No, for the multi-process case (without --ppn) we wanted to use as much shared-memory communication as possible (placing as many slave processes as possible on the same host as the master). We always use the hosts assigned by SGE/LSF, without mixing them with local ones.

Eugene.
________________________________________
From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Reuti [reuti at staff.uni-marburg.de]
Sent: Tuesday, May 08, 2012 3:52 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Sorting nodes by CPUs with -ppn (1.4 and r9834)

On 09.05.2012 at 00:25, Pavan Balaji wrote:

>
> The -ppn setting has always specified how many cores you want Hydra to consider on each node (we could call it -cpn, for cores per node, if that makes it clearer).  This is needed if you have more processes than nodes, either at the start or through dynamic processes, when we might have to wrap back to the first node.
>
> Also, if you are creating threads based on the number of cores available, don't you already know the number of cores on each node?  If yes, why not just create a new communicator with the ranks reordered?
>
> -- Pavan
>
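The reordering suggested above can be sketched as follows. This is only a minimal illustration, assuming each rank has already shared its core count with the others (for example with an allgather); the function name reorder_keys and the sample numbers are made up for the example. The resulting value for each old rank is what one would pass as the key argument of MPI_Comm_split (with the same color on every rank), so that the rank on the host with the most cores becomes rank 0 in the new communicator.

```python
def reorder_keys(cores_per_rank):
    """Return, for each old rank, a key that places ranks with more cores first.

    cores_per_rank[i] is the core count reported by old rank i
    (hypothetical input, e.g. the result of an allgather).
    """
    # Sort old ranks by descending core count; ties keep the original rank order.
    order = sorted(range(len(cores_per_rank)),
                   key=lambda r: (-cores_per_rank[r], r))
    keys = [0] * len(cores_per_rank)
    for new_rank, old_rank in enumerate(order):
        keys[old_rank] = new_rank
    return keys


if __name__ == "__main__":
    # Three ranks on hosts with 4, 16 and 8 cores (made-up numbers).
    print(reorder_keys([4, 16, 8]))
```

Here the old rank 1 (16 cores) receives key 0 and would become the new rank 0.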
> On 05/08/2012 05:09 PM, Yauheni Zelenko wrote:
>> Hi!
>>
>> Hydra treats the number of processes per node as the number of cores (in 1.4 and r9834) when -ppn is specified.
>>
>> In this case -order-nodes behaves inconsistently: with -ppn the order from the command line is kept, and without it the nodes are sorted.
>>
>> Our application can run in a mixed MPI/multi-threaded mode. In this case we use -ppn 1, but sorting nodes by number of CPUs still makes sense for us, so that the master process (rank 0) runs on the host with the most CPUs.

For SGE you can specify the master queue, i.e. where the jobscript should run, with the -masterq option. This will also be rank 0 for your MPI application.

If I understand you correctly, you are looking for an MPICH2 option to put rank 0 on another machine while still using the local machine running the jobscript for other ranks.
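If the aim is only to have the host with the most granted slots listed first, one possible workaround is to sort the host list in the jobscript before starting the job. The sketch below assumes the usual SGE $PE_HOSTFILE layout (hostname, slots, queue, processor range, one host per line) and writes lines in the host:count form that Hydra accepts in a host file given with -f; the function name sorted_hostfile and the sample host names are invented for the example, and whether rank 0 really lands on the first listed host depends on the Hydra version and options in use.

```python
def sorted_hostfile(pe_hostfile_text):
    """Convert SGE PE_HOSTFILE text to 'host:slots' lines, most slots first."""
    entries = []
    for line in pe_hostfile_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        entries.append((fields[0], int(fields[1])))
    # list.sort is stable, so hosts with equal slot counts keep their order.
    entries.sort(key=lambda entry: -entry[1])
    return "\n".join(f"{host}:{slots}" for host, slots in entries)


if __name__ == "__main__":
    sample = (
        "node01 2 all.q@node01 UNDEFINED\n"
        "node02 8 all.q@node02 UNDEFINED\n"
        "node03 4 all.q@node03 UNDEFINED\n"
    )
    print(sorted_hostfile(sample))
```

The output could be written to a file and passed to mpiexec with -f.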

-- Reuti



>> This is especially useful in farm environments like LSF/SGE/etc., where CPUs on hosts may be allocated randomly.
>>
>> Eugene.
>> _______________________________________________
>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>> To manage subscription options or unsubscribe:
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
