[mpich-discuss] Hydra process placement verification

Pavan Balaji balaji at mcs.anl.gov
Tue Aug 31 10:29:51 CDT 2010


Hi,

MPICH2 1.1.1p1 only had an experimental version of Hydra, and if I 
remember correctly it didn't have any hwloc support at that point. 
However, you can download the latest version of Hydra (1.3b1) and use 
that with your existing application without recompiling it.

With respect to your results, if you are running on the same node, 
MPICH2 should return the same value for all processes, unless the OS 
returns a different "hostname" for each core on the system. Is this the 
case for your system?

Another way to check whether the bindings are working correctly is to 
use "top". But for that, the application has to run for at least a few 
seconds.
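
A process can also report its own CPU affinity mask, which avoids 
having to watch "top" while the job runs. The following is only a 
minimal sketch, assuming a Linux node with Python 3 installed; it is 
not part of Hydra or MPICH2, and the names format_cpu_set and 
describe_binding are made up for illustration. It prints the hostname 
together with the set of CPUs the OS allows the calling process to run 
on.

```python
import os
import socket


def format_cpu_set(cpus):
    """Render a set of CPU ids as a compact, sorted string such as '1-3,5-7'."""
    ids = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ids):
        start = end = ids[i]
        # Extend the current run while the next id is consecutive.
        while i + 1 < len(ids) and ids[i + 1] == end + 1:
            i += 1
            end = ids[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def describe_binding():
    """Return 'host pid=<pid> allowed_cpus=<list>' for the calling process."""
    # os.sched_getaffinity is Linux-only; 0 means "the calling process".
    allowed = os.sched_getaffinity(0)
    return (f"{socket.gethostname()} pid={os.getpid()} "
            f"allowed_cpus={format_cpu_set(allowed)}")


if __name__ == "__main__":
    print(describe_binding())
```

If the script is saved as, say, check_binding.py (a hypothetical name) 
and started through the same launcher and host file as the real job, 
each process should print one line. A process that is bound should 
report a restricted CPU list, while an unbound process would typically 
list every core on the node. Note that the ids shown are the OS's CPU 
numbers, which may not match the numbering used in the binding 
specification.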

  -- Pavan

On 08/31/2010 10:22 AM, Jeffrey J. Evans wrote:
> I am trying to learn how to correctly verify hydra process placement.
>
> For example: I have an MPI program that uses 6 processes. On a node with 2 quad-core processors I set up the following host file:
>
> hpn10:8 binding=user:1,2,3,5,6,7
>
> My MPI program grabs the process information from each process using MPI_Get_processor_name()
>
> The resulting output:
> # 000: OK on hpn10/0, EA: 1, rank: 0 ptrn 1 tick: 1.000000e-06
> # 001: OK on hpn10/1, EA: 1, rank: 1 ptrn 1 tick: 1.000000e-06
> # 002: OK on hpn10/2, EA: 1, rank: 2 ptrn 1 tick: 1.000000e-06
> # 003: OK on hpn10/3, EA: 1, rank: 3 ptrn 1 tick: 1.000000e-06
> # 004: OK on hpn10/4, EA: 1, rank: 4 ptrn 1 tick: 1.000000e-06
> # 005: OK on hpn10/5, EA: 1, rank: 5 ptrn 1 tick: 1.000000e-06
>
> I was expecting to see something like:
> # 000: OK on hpn10/1, EA: 1, rank: 0 ptrn 1 tick: 1.000000e-06
> # 001: OK on hpn10/2, EA: 1, rank: 1 ptrn 1 tick: 1.000000e-06
> # 002: OK on hpn10/3, EA: 1, rank: 2 ptrn 1 tick: 1.000000e-06
> # 003: OK on hpn10/5, EA: 1, rank: 3 ptrn 1 tick: 1.000000e-06
> # 004: OK on hpn10/6, EA: 1, rank: 4 ptrn 1 tick: 1.000000e-06
> # 005: OK on hpn10/7, EA: 1, rank: 5 ptrn 1 tick: 1.000000e-06
>
> My real problem is that I cannot locate in the documentation how hydra uses hwloc to bind processes to cores. Does hydra need paths to hwloc bin and lib directories? Does mpich2 need to be rebuilt with configuration information regarding the location of hwloc binaries and libraries?
>
> To provide more info - my mpich2-1.1.1p1 build was done with the pm:hydra configuration set, but hwloc was installed later. Does mpich2 need to be rebuilt?
> The resource manager = Torque, scheduler = Maui.
>
> How can I verify core placement?
>
> Jeffrey J. Evans
> jje at purdue.edu
> http://web.ics.purdue.edu/~evans6/

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

