[MPICH] -nolocal for mpiexec ?

Ralph Butler rbutler at mtsu.edu
Tue Aug 1 17:24:34 CDT 2006


The -nolocal option is not supported in the new code.  However, you should
not need to build a new file every time you run mpiexec.  Hopefully, you can
build it once with all the nodes necessary for many runs and just re-use it
each time.
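
As a rough sketch of the build-it-once approach, assuming the four compute
nodes c1..c4 from the messages below: the helper name make_machinefile and
the ring_size variable are made up for illustration and are not part of
MPICH2. The mpdboot, mpdtrace, and mpiexec lines are the commands quoted
below, left as comments so the script runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper (not an MPICH2 tool): write a machinefile that cycles
# through the compute nodes a fixed number of times.
make_machinefile() {
    hosts_file=$1   # one host name per line, e.g. mpd.hosts
    repeats=$2      # how many times to cycle through the node list
    out_file=$3     # machinefile to write
    : > "$out_file" # start from an empty file
    i=0
    while [ "$i" -lt "$repeats" ]; do
        cat "$hosts_file" >> "$out_file"
        i=$((i + 1))
    done
}

# Node list from this thread; the host machine is deliberately left out.
printf '%s\n' c1 c2 c3 c4 > mpd.hosts

# Two passes over the list gives room for up to 8 processes.
make_machinefile mpd.hosts 2 machines

# mpdboot always places one mpd on the local host, so the ring needs
# (number of remote nodes + 1) mpds.
ring_size=$(( $(wc -l < mpd.hosts) + 1 ))
echo "ring size for mpdboot -n: $ring_size"

# With MPICH2 installed, the ring and job would then be started as:
#   mpdboot -n "$ring_size" -f mpd.hosts
#   mpdtrace                                  # check that c1..c4 are listed
#   mpiexec -machinefile machines -n 4 hello
```

The machines file only needs to be regenerated when the set of nodes
changes; every mpiexec run can point at the same file.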

On Tue, Aug 1, 2006, at 5:20 PM, Wei-keng Liao wrote:

>
> mpdtrace did tell me c4 was not included. I reran mpdboot with
> % mpdboot -n 5 -f mpd.hosts
> and mpiexec -machinefile works fine now. Thanks.
>
> Is there a way to set up this -nolocal behavior when starting mpdboot,
> so I don't need to specify -machinefile each time I run mpiexec?
>
> Wei-keng
>
>
>
> On Tue, 1 Aug 2006, Ralph Butler wrote:
>
>> Note that mpdboot always puts an mpd on the local host.  Thus, if you start
>> a ring with mpdboot using "-n 4", you are getting one local and 3 remote.
>> So, in the problem below, c4 is actually not in the ring and thus is not
>> found when trying to start processes via the machinefile that mentions it
>> by name.  mpdtrace should verify that c4 is not in the ring.
>> --ralph
>>
>> On Tue, Aug 1, 2006, at 4:44 PM, Wei-keng Liao wrote:
>>
>>> I just tested mpdcheck on each of the compute nodes with the host
>>> machine. They are all fine. The strange thing is that if I do not use
>>> the -machinefile option, everything runs OK without such error messages.
>>> Wei-keng
>>> On Tue, 1 Aug 2006, Rajeev Thakur wrote:
>>>> There is something wrong with the networking setup on c4 then. It says
>>>> invalid machine name. Can you ssh to it? If you can, and cannot detect
>>>> any other problem, then try running the mpdcheck utility as described
>>>> in the install guide.
>>>> Rajeev
>>>>> -----Original Message-----
>>>>> From: Wei-keng Liao [mailto:wkliao at ece.northwestern.edu]
>>>>> Sent: Tuesday, August 01, 2006 4:20 PM
>>>>> To: Rajeev Thakur
>>>>> Cc: mpich-discuss at mcs.anl.gov
>>>>> Subject: RE: [MPICH] -nolocal for mpiexec ?
>>>>> Rajeev,
>>>>> I tried that but I don't know why it is not working on my machine.
>>>>> Here is my mpd.hosts file used in mpdboot
>>>>> % cat mpd.hosts
>>>>> c1
>>>>> c2
>>>>> c3
>>>>> c4
>>>>> My host machine is not in mpd.hosts.
>>>>> % mpdboot -n 4 -f mpd.hosts
>>>>> % cat machines
>>>>> c1
>>>>> c2
>>>>> c3
>>>>> c4
>>>>> c1
>>>>> c2
>>>>> c3
>>>>> c4
>>>>> % mpiexec -machinefile machines -n 4 hello
>>>>> mpiexec: unable to start all procs; may have invalid machine names
>>>>>      remaining specified hosts:
>>>>>          192.168.1.14 (c4)
>>>>> Using 3 or fewer nodes works fine. I ran all of these on the host
>>>>> machine.
>>>>> Wei-keng
>>>>> On Tue, 1 Aug 2006, Rajeev Thakur wrote:
>>>>>> Wei-keng,
>>>>>>         Let's say you have 4 machines: host, node1, node2, node3. You
>>>>>> run mpiexec from host and want the jobs to run only on the 3 nodes.
>>>>>> Here's what you do:
>>>>>> * From host, start an MPD ring on all 4 machines using mpdboot.
>>>>>> * Create a machine file containing
>>>>>> node1
>>>>>> node2
>>>>>> node3
>>>>>> node1
>>>>>> node2
>>>>>> node3
>>>>>> (repeated as many times as needed to cover the maximum number of
>>>>>> processes you want to run).
>>>>>> * Then run the job from host as
>>>>>> mpiexec -machinefile FILE -n NPROCS a.out
>>>>>> NPROCS has to be <= the number of machines listed in the machinefile.
>>>>>> Rajeev
>>>>>>> -----Original Message-----
>>>>>>> From: Wei-keng Liao [mailto:wkliao at ece.northwestern.edu]
>>>>>>> Sent: Monday, July 31, 2006 10:58 PM
>>>>>>> To: Rajeev Thakur
>>>>>>> Cc: mpich-discuss at mcs.anl.gov
>>>>>>> Subject: RE: [MPICH] -nolocal for mpiexec ?
>>>>>>> I tried -1; it is not working.
>>>>>>> Based on the mpiexec help page, it just tries not to run the 1st
>>>>>>> proc locally. So, the local machine eventually appears as one of
>>>>>>> the MPI nodes with a higher rank.
>>>>>>> Wei-keng
>>>>>>> On Mon, 31 Jul 2006, Rajeev Thakur wrote:
>>>>>>>> Try the -1 option to mpiexec.
>>>>>>>> Rajeev
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: owner-mpich-discuss at mcs.anl.gov
>>>>>>>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Wei-keng Liao
>>>>>>>>> Sent: Monday, July 31, 2006 8:17 PM
>>>>>>>>> To: mpich-discuss at mcs.anl.gov
>>>>>>>>> Subject: [MPICH] -nolocal for mpiexec ?
>>>>>>>>> How do I run mpdboot and mpiexec so I can run MPI jobs on non-local
>>>>>>>>> machines? In mpich1, mpirun has an option -nolocal for not running
>>>>>>>>> jobs on the local machine. How do I achieve the same effect as
>>>>>>>>> -nolocal on mpich2?
>>>>>>>>> Wei-keng
>>
>



