[MPICH] debug flag

Wei-keng Liao wkliao at ece.northwestern.edu
Sun May 27 09:03:27 CDT 2007


I just tried your approach and got the error:
     elf_check_version failed
     yod: Unable to load program(s).
      The file, /bin/hostname
              is unrecognized executable type

I guess yod only recognizes executables that are compiled by MPI.
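
A minimal sketch of a check that could guard the mpdboot step in the PBS
script quoted below, so the job stops early when the node file is not
usable. It assumes the node file holds one host name per line.
count_hosts and nodefile_ok are hypothetical helper names, not part of
yod, PBS, or MPICH2.

```shell
#!/bin/sh
# Hypothetical helper: print how many distinct host names a node file lists.
# Only the first field of each non-blank line is counted.
count_hosts() {
    # An unreadable or missing file counts as zero hosts.
    [ -r "$1" ] || { echo 0; return 0; }
    awk 'NF { seen[$1] = 1 } END { n = 0; for (h in seen) n++; print n }' "$1"
}

# Hypothetical guard: succeed only if the file lists exactly $2 distinct hosts.
nodefile_ok() {
    [ "$(count_hosts "$1")" -eq "$2" ]
}
```

In the PBS script this could be called as
nodefile_ok "${YOD_NODEFILE}" <num_of_nodes> || exit 1
just before the mpdboot line. It only counts names and does not verify
that they are real hosts, so error text captured in the file would be
miscounted.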

I also tried calling MPI_Get_processor_name() in an MPI program. With 2
nodes allocated, it returned nid00061 and nid00062, but those are not the
host names of the 2 nodes. I then called gethostname() right after
MPI_Get_processor_name(), and it printed a different name, yodjag16,
which was the same on every node.

Wei-keng


On Sun, 27 May 2007, Anthony Chan wrote:

>
> It is possible that the Cray has its own process manager, that yod is the
> only way to interact with it, and that yod is probably the only thing that
> knows which nodes were allocated.  If that is the case, you could try this
> in your PBS script.
>
>> cat <your_pbs_script>
> #!/bin/sh
> ...
> #PBS -l nodes=<num_of_nodes>
> ...
> YOD_NODEFILE=<my_node_file>
> rm -f ${YOD_NODEFILE}
> yod -size <num_of_nodes> hostname > ${YOD_NODEFILE}
> ...
> mpdboot -n <num_of_nodes> -f ${YOD_NODEFILE}
> ...
> mpiexec -n <num_of_nodes> a.out
>
>
>
> Not sure if this works, but it may be worth a try.
>
> A.Chan
>
> On Sat, 26 May 2007, Wei-keng Liao wrote:
>
>>
>> The machine is a Cray XT3/XT4. The batch system is PBS, and the command
>> to launch an MPI job is "yod", as described in the user guide.
>>
>> I tried to replace yod with MPICH's mpdboot, mpiexec, etc. I also passed
>> the PBS environment variable PBS_NODEFILE to mpdboot, but the node file
>> does not contain the host names of the nodes allocated by PBS. It
>> always contains just one yod node, no matter how many nodes I request in
>> the PBS script.
>>
>> Any idea of how to do it?
>>
>> Wei-keng
>>
>>
>> On Sat, 26 May 2007, Rajeev Thakur wrote:
>>
>>> Are you sure you can't run your own MPICH2 on the machine? It is just a
>>> user-level library. Once the scheduler assigns you a set of nodes, you can
>>> run mpd and your own mpiexec.
>>>
>>> Rajeev
>>>
>>
>>
>



