[mpich-discuss] how to make the hosts files

Mandar Gurav mandarwce at gmail.com
Thu May 12 08:38:46 CDT 2011


Hi hyunduk!

The -n option is not the total number of processors. It is the number
of processes to create. You can create as many processes as you like,
but at any instant only as many as you have cores (in your case 2 x 6
= 12) will actually be executing on the processors. The others will be
waiting for their processor quantum (this is an operating system
concept; you can refer to any operating systems book). You can see the
same thing on your own computer: many processes (programs) appear to
run simultaneously, but only a few of them are on the processors at
any moment while the others wait for their turn. You do not notice
this because the operating system switches among the processes tens or
hundreds of times per second.

You can run your program with 20, 25, 30, ... processes, but only a
few (12 in your case) will be executing at any one time.
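
To try this outside of MPI, here is a minimal Python sketch. It is
only an illustration under my own assumptions, not how mpiexec
launches processes: the function name run_oversubscribed and the
"+ 4" are made up for the example. It starts more ordinary child
processes than the machine has cores, and all of them still finish
because the operating system time-slices them.

```python
import os
import subprocess
import sys


def run_oversubscribed(num_procs):
    """Start num_procs child processes at once and collect their output.

    Hypothetical helper for illustration only; it does not use MPI.
    """
    # Each child prints a line similar in spirit to the cpi example.
    child_code = (
        "import os, sys; "
        "print(f'Process {sys.argv[1]} of {sys.argv[2]} has pid {os.getpid()}')"
    )
    # Launch every child before waiting on any of them, so that they
    # all exist at the same time and compete for the available cores.
    children = [
        subprocess.Popen(
            [sys.executable, "-c", child_code, str(rank), str(num_procs)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for rank in range(num_procs)
    ]
    lines = []
    for child in children:
        out, _ = child.communicate()  # wait for this child to finish
        lines.append(out.strip())
    return lines


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    # Deliberately ask for more processes than there are cores.
    for line in run_oversubscribed(cores + 4):
        print(line)
```

No error is reported when the process count exceeds the core count,
which matches what you saw with "mpiexec -n 16" on 12 cores.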

-- Mandar Gurav

On Thu, May 12, 2011 at 6:41 PM, hyunduk kim <fororigin at gmail.com> wrote:
> Dear Pavan
> The hostname of My linux machine is francium.ac.kr.
> And I removed my machine as your comment.
> I received message as like
> [root@francium machine]# mpiexec -n 11 /usr/local/mpich2-1.3.2p1/examples/cpi
> Process 0 of 11 is on francium
> Process 2 of 11 is on francium
> Process 3 of 11 is on francium
> Process 4 of 11 is on francium
> Process 5 of 11 is on francium
> Process 7 of 11 is on francium
> Process 8 of 11 is on francium
> Process 9 of 11 is on francium
> Process 10 of 11 is on francium
> Process 6 of 11 is on francium
> Process 1 of 11 is on francium
> pi is approximately 3.1415926544231247, Error is 0.0000000008333316
> wall clock time = 0.000453
> In the above command, the option "-n 11" means that the program is going to
> use 11 machines.
> Then I modified my run command as below message.
> [root@francium machine]# mpiexec -n 16 /usr/local/mpich2-1.3.2p1/examples/cpi
> Process 0 of 16 is on francium
> Process 1 of 16 is on francium
> Process 2 of 16 is on francium
> Process 3 of 16 is on francium
> Process 4 of 16 is on francium
> Process 6 of 16 is on francium
> Process 7 of 16 is on francium
> Process 8 of 16 is on francium
> Process 9 of 16 is on francium
> Process 12 of 16 is on francium
> Process 14 of 16 is on francium
> Process 10 of 16 is on francium
> Process 15 of 16 is on francium
> Process 11 of 16 is on francium
> Process 13 of 16 is on francium
> Process 5 of 16 is on francium
> pi is approximately 3.1415926544231274, Error is 0.0000000008333343
> wall clock time = 0.000500
> With this command, I expected an error message because my Linux machine
> has 2 CPUs, and each CPU has 6 cores. (So my machine has just 12 cores
> for mpich2.)
> My question is what the option "-n" means in the run command.
> Thank you for your kindness.
>
> H.D., Kim
>
>
>
>
>
> 2011/5/12 Pavan Balaji <balaji at mcs.anl.gov>
>>
>> Is there an actual machine with the name "host1" or "host2" in your setup?
>>
>> If you are just running it on the local node, you should not give the
>> -machinefile or -f option.
>>
>>  -- Pavan
>>
>> On 05/12/2011 03:28 AM, hyunduk kim wrote:
>>>
>>> Thanks for your response
>>> However, my setup is not working.
>>>
>>> Here is what I have checked so far:
>>> 1) I installed mpich2 on an Intel multi-core, 2-CPU machine
>>> 2) check : /etc/hosts file
>>>     127.0.0.1               localhost.localdomain localhost
>>>     ::1                        localhost6.localdomain6 localhost6
>>>
>>> 3) made the machinefile for mpiexec :
>>> /usr/local/mpich2/machine/machinefile
>>>
>>>  host1:6
>>>  host2:6
>>>
>>> 4) run : [root@francium machine]# mpiexec -n 10 -machinefile ./machinefile /usr/local/mpich2-1.3.2p1/examples/cpi
>>>    ==> I received messages as below
>>>          ssh: connect to host host1 port 22: Connection timed out
>>>          ssh: connect to host host2 port 22: Connection timed out
>>>
>>>   Questions:
>>> 1) Why do I need to set up passwordless login between the two hosts?
>>> 2) Mpich2 was installed on just one multi-core, 2-CPU machine. Why does
>>> mpiexec try to connect to host1 and host2 on port 22?
>>> 3) Is there another method for defining the machinefile on a multi-core,
>>> 2-CPU machine?
>>>
>>>  I will attach my log files.
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> mpich-discuss mailing list
>>> mpich-discuss at mcs.anl.gov
>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>
>



-- 
Mandar Gurav
http://www.mandargurav.org

