[mpich-discuss] unable to connect ?

Gus Correa gus at ldeo.columbia.edu
Fri Feb 27 14:57:08 CST 2009


Hi K.A. Albert, list

kiss attila wrote:
> :)))))))))))
> S U C C E S S ! ! !
>  Thanks a lot. I had probably played with the installation settings and
> modified the passphrase, but now it works :).
> 
> Hmmm.... a strange thing: cpi.exe runs with the same input in
> 1.872297s on both machines, 1.869439s on 10.0.0.13 alone,
> and 1.897818s on 10.0.0.10 alone. I thought running on two machines
> would be the fastest.
> 
The cpi.c loop iterates only 100 times.
This runs so fast that most of the time you measure
is probably startup/wrap-up overhead (cf. Amdahl's law).

I think cpi.c is meant only to test whether MPICH is working properly,
not to measure how it scales with the number of processors.
You may need to increase the number of iterations substantially
to see any scaling with the number of processors.
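
To illustrate the point, here is a rough Python sketch (not the actual
cpi.c source). The first function mirrors the kind of midpoint-rule sum
that cpi.c splits across ranks. The cost model and its numbers (a fixed
1.8 s overhead, 1e-7 s per loop iteration) are assumptions I picked only
to resemble your timings; they were not measured.

```python
import math


def estimate_pi(n, rank=0, nprocs=1):
    """Partial midpoint-rule sum of 4/(1+x^2) on [0, 1] for one rank.

    Each rank takes every nprocs-th interval; adding the partial results
    of all ranks gives the estimate of pi.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(rank + 1, n + 1, nprocs):
        x = h * (i - 0.5)
        total += 4.0 / (1.0 + x * x)
    return h * total


def modeled_time(n, nprocs, overhead=1.8, per_iter=1e-7):
    """Hypothetical cost model: fixed overhead plus loop work split evenly."""
    return overhead + per_iter * n / nprocs


def modeled_speedup(n, nprocs, **kw):
    """Ratio of the modeled one-process time to the modeled nprocs time."""
    return modeled_time(n, 1, **kw) / modeled_time(n, nprocs, **kw)


if __name__ == "__main__":
    # Two "ranks" evaluated one after the other, just to check the split.
    approx = sum(estimate_pi(100, r, 2) for r in range(2))
    print("pi estimate with n=100:", approx, "error:", abs(approx - math.pi))
    for n in (100, 10**6, 10**9):
        print("n =", n, "modeled speedup on 2 processes:", modeled_speedup(n, 2))
```

Under these assumed numbers the modeled speedup on two processes stays
very close to 1 for n=100 and only approaches 2 once the loop work
dominates the fixed overhead.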

Gus Correa

> Shouldn't mpiexec with the -log switch create a logfile? I can't
> find any logfiles in the mpich2 folders.
> 
>  regards
> K.A. Albert
> 
> 2009/2/27 Jayesh Krishna <jayesh at mcs.anl.gov>:
>>  Hi,
>>   From your debug logs the problem does not appear to be a network
>> connectivity issue. It looks more like a configuration issue,
>>
>> ============== snip ========================
>> ...\smpd_state_reading_connect_result
>> ....read connect result: 'FAIL'
>> ....connection rejected, server returned - FAIL
>> ============== snip ========================
>>
>>   Your PM connection can get rejected for the following reasons:
>>
>> # There is a mismatch in the version of MPICH2 software installed on the
>> multiple machines.
>> # There is a mismatch in the passphrase used on the multiple machines (You
>> enter this "passphrase" during MPICH2 installation).
>>
>>   I would recommend the following,
>>
>> # Uninstall MPICH2 on both the machines.
>> # Download the latest stable version (1.0.8) of MPICH2 from the downloads
>> page
>> (http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads).
>> # Install MPICH2 on the machines using the installer downloaded from the
>> downloads page.
>>
>> ------- Make sure that you keep the default settings for passphrase during
>> the installation
>> ------- Also make sure that all users have access to MPICH2 (Change the
>> default option from "Just me" to "Everyone" during installation)
>>
>> # If your machine is not part of a domain, when registering the
>> username/password with mpiexec don't specify any domain name. Also validate,
>> as before, after registering the user.
>>
>>  Let us know the results.
>>
>> (PS: There is no specific configuration required, apart from the info above,
>> to get MPICH2 working across multiple windows machines)
>>
>> Regards,
>> Jayesh
>>
>> -----Original Message-----
>> From: kiss attila [mailto:kissattila2008 at gmail.com]
>> Sent: Thursday, February 26, 2009 11:45 PM
>> To: Jayesh Krishna
>> Cc: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] unable to connect ?
>>
>> Hi
>>
>>   I've tried everything now: I've created the same user and validated this
>> user ( mpiuser ), but still nothing... Can anyone send me some config files
>> from a  w o r k i n g  MPICH2 setup based on a Windows workgroup (not a domain)?
>> Until then, here is my output from the smpd -d and mpiexec commands, when I
>> try to run hostname from 10.0.0.10 on the remote computer (10.0.0.13):
>>
>> D:\Program Files\MPICH2\bin>smpd -d
>>
>> [00:2436]..\smpd_set_smpd_data
>> [00:2436]../smpd_set_smpd_data
>> [00:2436]..created a set for the listener: 1724
>> [00:2436]..smpd listening on port 8676
>> [00:2436]..\smpd_create_context
>> [00:2436]...\smpd_init_context
>> [00:2436]....\smpd_init_command
>> [00:2436]..../smpd_init_command
>> [00:2436].../smpd_init_context
>> [00:2436]../smpd_create_context
>> [00:2436]..\smpd_option_on
>> [00:2436]...\smpd_get_smpd_data
>> [00:2436]....\smpd_get_smpd_data_from_environment
>> [00:2436]..../smpd_get_smpd_data_from_environment
>> [00:2436]....\smpd_get_smpd_data_default
>> [00:2436]..../smpd_get_smpd_data_default
>> [00:2436]....Unable to get the data for the key 'no_dynamic_hosts'
>> [00:2436].../smpd_get_smpd_data
>> [00:2436]../smpd_option_on
>> [00:2436]..\smpd_insert_into_dynamic_hosts
>> [00:2436]../smpd_insert_into_dynamic_hosts
>> [00:2436]..\smpd_enter_at_state
>> [00:2436]...sock_waiting for the next event.
>> [00:2436]...SOCK_OP_ACCEPT
>> [00:2436]...\smpd_handle_op_accept
>> [00:2436]....\smpd_state_smpd_listening
>> [00:2436].....authenticating new connection
>> [00:2436].....\smpd_create_context
>> [00:2436]......\smpd_init_context
>> [00:2436].......\smpd_init_command
>> [00:2436]......./smpd_init_command
>> [00:2436]....../smpd_init_context
>> [00:2436]...../smpd_create_context
>> [00:2436].....\smpd_gen_authentication_strings
>> [00:2436]......\smpd_hash
>> [00:2436]....../smpd_hash
>> [00:2436]...../smpd_gen_authentication_strings
>> [00:2436].....posting a write of the challenge string: 1.0.8 7993
>> [00:2436]..../smpd_state_smpd_listening
>> [00:2436].../smpd_handle_op_accept
>> [00:2436]...sock_waiting for the next event.
>> [00:2436]...SOCK_OP_WRITE
>> [00:2436]...\smpd_handle_op_write
>> [00:2436]....\smpd_state_writing_challenge_string
>> [00:2436].....wrote challenge string: '1.0.8 7993'
>> [00:2436]..../smpd_state_writing_challenge_string
>> [00:2436].../smpd_handle_op_write
>> [00:2436]...sock_waiting for the next event.
>> [00:2436]...SOCK_OP_READ
>> [00:2436]...\smpd_handle_op_read
>> [00:2436]....\smpd_state_reading_challenge_response
>> [00:2436].....read challenge response: 'd6fdd96549e0c22c875ac55a2735a162'
>> [00:2436]..../smpd_state_reading_challenge_response
>> [00:2436].../smpd_handle_op_read
>> [00:2436]...sock_waiting for the next event.
>> [00:2436]...SOCK_OP_WRITE
>> [00:2436]...\smpd_handle_op_write
>> [00:2436]....\smpd_state_writing_connect_result
>> [00:2436].....wrote connect result: 'FAIL'
>> [00:2436].....connection reject string written, closing sock.
>> [00:2436]..../smpd_state_writing_connect_result
>> [00:2436].../smpd_handle_op_write
>> [00:2436]...sock_waiting for the next event.
>> [00:2436]...SOCK_OP_CLOSE
>> [00:2436]...\smpd_handle_op_close
>> [00:2436]....\smpd_get_state_string
>> [00:2436]..../smpd_get_state_string
>> [00:2436]....op_close received - SMPD_CLOSING state.
>> [00:2436]....Unaffiliated undetermined context closing.
>> [00:2436]....\smpd_free_context
>> [00:2436].....freeing undetermined context.
>> [00:2436].....\smpd_init_context
>> [00:2436]......\smpd_init_command
>> [00:2436]....../smpd_init_command
>> [00:2436]...../smpd_init_context
>> [00:2436]..../smpd_free_context
>> [00:2436].../smpd_handle_op_close
>> [00:2436]...sock_waiting for the next event.
>>
>>
>> C:\Program Files\MPICH2\bin>mpiexec -verbose -hosts 1 10.0.0.13 -user mpiuser hostname
>>
>> ..\smpd_add_host_to_default_list
>> ...\smpd_add_extended_host_to_default_list
>> .../smpd_add_extended_host_to_default_list
>> ../smpd_add_host_to_default_list
>> ..\smpd_hide_string_arg
>> ...\first_token
>> .../first_token
>> ...\compare_token
>> .../compare_token
>> ...\next_token
>> ....\first_token
>> ..../first_token
>> ....\first_token
>> ..../first_token
>> .../next_token
>> ../smpd_hide_string_arg
>> ../smpd_hide_string_arg
>> ..\smpd_hide_string_arg
>> ...\first_token
>> .../first_token
>> ...\compare_token
>> .../compare_token
>> ...\next_token
>> ....\first_token
>> ..../first_token
>> ....\first_token
>> ..../first_token
>> .../next_token
>> ../smpd_hide_string_arg
>> ../smpd_hide_string_arg
>> ..\smpd_get_full_path_name
>> ...fixing up exe name: 'hostname' -> '(null)'
>> ../smpd_get_full_path_name
>> ..handling executable:
>> hostname.exe
>> ..\smpd_get_next_host
>> ...\smpd_get_host_id
>> .../smpd_get_host_id
>> ../smpd_get_next_host
>> ..\smpd_create_cliques
>> ...\next_launch_node
>> .../next_launch_node
>> ...\next_launch_node
>> .../next_launch_node
>> ../smpd_create_cliques
>> ..\smpd_fix_up_host_tree
>> ../smpd_fix_up_host_tree
>> ./mp_parse_command_args
>> .host tree:
>> . host: 10.0.0.13, parent: 0, id: 1
>> .launch nodes:
>> . iproc: 0, id: 1, exe: hostname.exe
>> .\smpd_get_smpd_data
>> ..\smpd_get_smpd_data_from_environment
>> ../smpd_get_smpd_data_from_environment
>> ./smpd_get_smpd_data
>> .\smpd_create_context
>> ..\smpd_init_context
>> ...\smpd_init_command
>> .../smpd_init_command
>> ../smpd_init_context
>> ./smpd_create_context
>> .\smpd_make_socket_loop
>> ..\smpd_get_hostname
>> ../smpd_get_hostname
>> ./smpd_make_socket_loop
>> .\smpd_create_context
>> ..\smpd_init_context
>> ...\smpd_init_command
>> .../smpd_init_command
>> ../smpd_init_context
>> ./smpd_create_context
>> .\smpd_enter_at_state
>> ..sock_waiting for the next event.
>> ..SOCK_OP_CONNECT
>> ..\smpd_handle_op_connect
>> ...connect succeeded, posting read of the challenge string
>> ../smpd_handle_op_connect
>> ..sock_waiting for the next event.
>> ..SOCK_OP_READ
>> ..\smpd_handle_op_read
>> ...\smpd_state_reading_challenge_string
>> ....read challenge string: '1.0.8 7993'
>> ....\smpd_verify_version
>> ..../smpd_verify_version
>> ....\smpd_hash
>> ..../smpd_hash
>> .../smpd_state_reading_challenge_string
>> ../smpd_handle_op_read
>> ..sock_waiting for the next event.
>> ..SOCK_OP_WRITE
>> ..\smpd_handle_op_write
>> ...\smpd_state_writing_challenge_response
>> ....wrote challenge response: 'd6fdd96549e0c22c875ac55a2735a162'
>> .../smpd_state_writing_challenge_response
>> ../smpd_handle_op_write
>> ..sock_waiting for the next event.
>> ..SOCK_OP_READ
>> ..\smpd_handle_op_read
>> ...\smpd_state_reading_connect_result
>> ....read connect result: 'FAIL'
>> ....connection rejected, server returned - FAIL
>> ....\smpd_post_abort_command
>> .....\smpd_create_command
>> ......\smpd_init_command
>> ....../smpd_init_command
>> ...../smpd_create_command
>> .....\smpd_add_command_arg
>> ...../smpd_add_command_arg
>> .....\smpd_command_destination
>> ......0 -> 0 : returning NULL context
>> ...../smpd_command_destination
>> Aborting: unable to connect to 10.0.0.13
>> ..../smpd_post_abort_command
>> ....\smpd_exit
>> .....\smpd_kill_all_processes
>> ...../smpd_kill_all_processes
>> .....\smpd_finalize_drive_maps
>> ...../smpd_finalize_drive_maps
>> .....\smpd_dbs_finalize
>> ...../smpd_dbs_finalize
>>
>>
>> Thanks for any ideas.
>> regards
>> K.A. Albert
>>
>> 2009/2/26 Jayesh Krishna <jayesh at mcs.anl.gov>:
>>> Hi,
>>>
>>>>> .. I launch mpiexec.exe from an another windows user acount...
>>>  This could be your problem. You can try registering a
>>> username/password available on both the machines using the "-user"
>>> option (mpiexec -register -user 1) & launch your job using that user
>>> (mpiexec -n 2 -user 1 -hosts 2 10.0.0.10 10.0.0.13 hostname). You can
>>> also validate if the user credentials are capable of launching a job
>>> using the "-validate" option of mpiexec (mpiexec -validate -user 1
>>> 10.0.0.10 ; mpiexec -validate -user 1 10.0.0.13)
>>>
>>> (PS: Did you copy-paste the complete output of the mpiexec command &
>>> the command itself ? Please don't remove any part of the output. This
>>> will help us in debugging your problem.)
>>>
>>> Regards,
>>> Jayesh
>>>
>>> -----Original Message-----
>>> From: kiss attila [mailto:kissattila2008 at gmail.com]
>>> Sent: Thursday, February 26, 2009 12:26 AM
>>> To: Jayesh Krishna
>>> Subject: Re: [mpich-discuss] unable to connect ?
>>>
>>> 1. Yes, the ping works fine. With wmpiconfig.exe i can see both machines.
>>> 2. MPICH2 1.0.8 installed on both.
>>> 3. No firewalls of any kind.
>>> 4. On  smpd -status i get:
>>> smpd running on 10.0.0.10
>>> smpd running on 10.0.0.13
>>>
>>> 5. from 10.0.0.10
>>> C:\Program Files\MPICH2\bin>mpiexec -hosts 2 10.0.0.10 10.0.0.13 hostname
>>> abort: unable to connect to 10.0.0.13
>>>
>>> from 10.0.0.13
>>> C:\Program Files\MPICH2\bin>mpiexec -hosts 2 10.0.0.10 10.0.0.13 hostname
>>> abort: unable to connect to 10.0.0.10
>>>
>>> and here is the -verbose mode:
>>>
>>> ...../first_token
>>> .....\compare_token
>>> ...../compare_token
>>> .....\next_token
>>> ......\first_token
>>> ....../first_token
>>> ......\first_token
>>> ....../first_token
>>> ...../next_token
>>> ..../smpd_hide_string_arg
>>> ..../smpd_hide_string_arg
>>> .....\smpd_option_on
>>> ......\smpd_get_smpd_data
>>> .......\smpd_get_smpd_data_from_environment
>>> ......./smpd_get_smpd_data_from_environment
>>> .......\smpd_get_smpd_data_default
>>> ......./smpd_get_smpd_data_default
>>> .......Unable to get the data for the key 'nocache'
>>> ....../smpd_get_smpd_data
>>> ...../smpd_option_on
>>> ....\smpd_hide_string_arg
>>> .....\first_token
>>> ...../first_token
>>> .....\compare_token
>>> ...../compare_token
>>> .....\next_token
>>> ......\first_token
>>> ....../first_token
>>> ......\first_token
>>> ....../first_token
>>> ...../next_token
>>> ..../smpd_hide_string_arg
>>> ..../smpd_hide_string_arg
>>> .../smpd_handle_op_read
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_WRITE
>>> ...\smpd_handle_op_write
>>> ....\smpd_state_writing_cred_ack_yes
>>> .....wrote cred request yes ack.
>>> ..../smpd_state_writing_cred_ack_yes
>>> .../smpd_handle_op_write
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_WRITE
>>> ...\smpd_handle_op_write
>>> ....\smpd_state_writing_account
>>> .....wrote account: 'mpiuser'
>>> .....\smpd_encrypt_data
>>> ...../smpd_encrypt_data
>>> ..../smpd_state_writing_account
>>> .../smpd_handle_op_write
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_WRITE
>>> ...\smpd_handle_op_write
>>> ....\smpd_hide_string_arg
>>> .....\first_token
>>> ...../first_token
>>> .....\compare_token
>>> ...../compare_token
>>> .....\next_token
>>> ......\first_token
>>> ....../first_token
>>> ......\first_token
>>> ....../first_token
>>> ...../next_token
>>> ..../smpd_hide_string_arg
>>> ..../smpd_hide_string_arg
>>> .....\smpd_hide_string_arg
>>> ......\first_token
>>> ....../first_token
>>> ......\compare_token
>>> ....../compare_token
>>> ......\next_token
>>> .......\first_token
>>> ......./first_token
>>> .......\first_token
>>> ......./first_token
>>> ....../next_token
>>> ...../smpd_hide_string_arg
>>> ...../smpd_hide_string_arg
>>> ....\smpd_hide_string_arg
>>> .....\first_token
>>> ...../first_token
>>> .....\compare_token
>>> ...../compare_token
>>> .....\next_token
>>> ......\first_token
>>> ....../first_token
>>> ......\first_token
>>> ....../first_token
>>> ...../next_token
>>> ..../smpd_hide_string_arg
>>> ..../smpd_hide_string_arg
>>> .../smpd_handle_op_write
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_READ
>>> ...\smpd_handle_op_read
>>> ....\smpd_state_reading_process_result
>>> .....read process session result: 'SUCCESS'
>>> ..../smpd_state_reading_process_result
>>> .../smpd_handle_op_read
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_READ
>>> ...\smpd_handle_op_read
>>> ....\smpd_state_reading_reconnect_request
>>> .....read re-connect request: '3972'
>>> .....closing the old socket in the left context.
>>> .....MPIDU_Sock_post_close(1720)
>>> .....connecting a new socket.
>>> .....\smpd_create_context
>>> ......\smpd_init_context
>>> .......\smpd_init_command
>>> ......./smpd_init_command
>>> ....../smpd_init_context
>>> ...../smpd_create_context
>>> .....posting a re-connect to 10.0.0.10:3972 in left context.
>>> ..../smpd_state_reading_reconnect_request
>>> .../smpd_handle_op_read
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_CLOSE
>>> ...\smpd_handle_op_close
>>> ....\smpd_get_state_string
>>> ..../smpd_get_state_string
>>> ....op_close received - SMPD_CLOSING state.
>>> ....Unaffiliated left context closing.
>>> ....\smpd_free_context
>>> .....freeing left context.
>>> .....\smpd_init_context
>>> ......\smpd_init_command
>>> ....../smpd_init_command
>>> ...../smpd_init_context
>>> ..../smpd_free_context
>>> .../smpd_handle_op_close
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_CONNECT
>>> ...\smpd_handle_op_connect
>>> ....\smpd_generate_session_header
>>> .....session header: (id=1 parent=0 level=0)
>>> ..../smpd_generate_session_header
>>> .../smpd_handle_op_connect
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_WRITE
>>> ...\smpd_handle_op_write
>>> ....\smpd_state_writing_session_header
>>> .....wrote session header: 'id=1 parent=0 level=0'
>>> .....\smpd_post_read_command
>>> ......posting a read for a command header on the left context, sock 1656
>>> ...../smpd_post_read_command
>>> .....creating connect command for left node
>>> .....creating connect command to '10.0.0.13'
>>> .....\smpd_create_command
>>> ......\smpd_init_command
>>> ....../smpd_init_command
>>> ...../smpd_create_command
>>> .....\smpd_add_command_arg
>>> ...../smpd_add_command_arg
>>> .....\smpd_add_command_int_arg
>>> ...../smpd_add_command_int_arg
>>> .....\smpd_post_write_command
>>> ......\smpd_package_command
>>> ....../smpd_package_command
>>> ......smpd_post_write_command on the left context sock 1656: 65 bytes for command: "cmd=connect src=0 dest=1 tag=0 host=10.0.0.13 id=2 "
>>> ...../smpd_post_write_command
>>> .....not connected yet: 10.0.0.13 not connected
>>> ..../smpd_state_writing_session_header
>>> .../smpd_handle_op_write
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_WRITE
>>> ...\smpd_handle_op_write
>>> ....\smpd_state_writing_cmd
>>> .....wrote command
>>> .....command written to left: "cmd=connect src=0 dest=1 tag=0 host=10.0.0.13 id=2 "
>>> .....moving 'connect' command to the wait_list.
>>> ..../smpd_state_writing_cmd
>>> .../smpd_handle_op_write
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_READ
>>> ...\smpd_handle_op_read
>>> ....\smpd_state_reading_cmd_header
>>> .....read command header
>>> .....command header read, posting read for data: 69 bytes
>>> ..../smpd_state_reading_cmd_header
>>> .../smpd_handle_op_read
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_READ
>>> ...\smpd_handle_op_read
>>> ....\smpd_state_reading_cmd
>>> .....read command
>>> .....\smpd_parse_command
>>> ...../smpd_parse_command
>>> .....read command: "cmd=abort src=1 dest=0 tag=0 error="unable to connect to 10.0.0.13" "
>>> .....\smpd_handle_command
>>> ......handling command:
>>> ...... src  = 1
>>> ...... dest = 0
>>> ...... cmd  = abort
>>> ...... tag  = 0
>>> ...... ctx  = left
>>> ...... len  = 69
>>> ...... str  = cmd=abort src=1 dest=0 tag=0 error="unable to connect to 10.0.0.13"
>>> ......\smpd_command_destination
>>> .......0 -> 0 : returning NULL context
>>> ....../smpd_command_destination
>>> ......\smpd_handle_abort_command
>>> .......abort: unable to connect to 10.0.0.13
>>> ....../smpd_handle_abort_command
>>> ...../smpd_handle_command
>>> .....\smpd_post_read_command
>>> ......posting a read for a command header on the left context, sock 1656
>>> ...../smpd_post_read_command
>>> .....\smpd_create_command
>>> ......\smpd_init_command
>>> ....../smpd_init_command
>>> ...../smpd_create_command
>>> .....\smpd_post_write_command
>>> ......\smpd_package_command
>>> ....../smpd_package_command
>>> ......smpd_post_write_command on the left context sock 1656: 43 bytes for command: "cmd=close src=0 dest=1 tag=1 "
>>> ...../smpd_post_write_command
>>> ..../smpd_state_reading_cmd
>>> .../smpd_handle_op_read
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_READ
>>> ...\smpd_handle_op_read
>>> ....\smpd_state_reading_cmd_header
>>> .....read command header
>>> .....command header read, posting read for data: 31 bytes
>>> ..../smpd_state_reading_cmd_header
>>> .../smpd_handle_op_read
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_WRITE
>>> ...\smpd_handle_op_write
>>> ....\smpd_state_writing_cmd
>>> .....wrote command
>>> .....command written to left: "cmd=close src=0 dest=1 tag=1 "
>>> .....\smpd_free_command
>>> ......\smpd_init_command
>>> ....../smpd_init_command
>>> ...../smpd_free_command
>>> ..../smpd_state_writing_cmd
>>> .../smpd_handle_op_write
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_READ
>>> ...\smpd_handle_op_read
>>> ....\smpd_state_reading_cmd
>>> .....read command
>>> .....\smpd_parse_command
>>> ...../smpd_parse_command
>>> .....read command: "cmd=closed src=1 dest=0 tag=1 "
>>> .....\smpd_handle_command
>>> ......handling command:
>>> ...... src  = 1
>>> ...... dest = 0
>>> ...... cmd  = closed
>>> ...... tag  = 1
>>> ...... ctx  = left
>>> ...... len  = 31
>>> ...... str  = cmd=closed src=1 dest=0 tag=1
>>> ......\smpd_command_destination
>>> .......0 -> 0 : returning NULL context
>>> ....../smpd_command_destination
>>> ......\smpd_handle_closed_command
>>> .......closed command received from left child, closing sock.
>>> .......MPIDU_Sock_post_close(1656)
>>> .......received a closed at node with no parent context, assuming root, returning SMPD_EXITING.
>>> ....../smpd_handle_closed_command
>>> ...../smpd_handle_command
>>> .....not posting read for another command because SMPD_EXITING returned
>>> ..../smpd_state_reading_cmd
>>> .../smpd_handle_op_read
>>> ...sock_waiting for the next event.
>>> ...SOCK_OP_CLOSE
>>> ...\smpd_handle_op_close
>>> ....\smpd_get_state_string
>>> ..../smpd_get_state_string
>>> ....op_close received - SMPD_EXITING state.
>>> ....\smpd_free_context
>>> .....freeing left context.
>>> .....\smpd_init_context
>>> ......\smpd_init_command
>>> ....../smpd_init_command
>>> ...../smpd_init_context
>>> ..../smpd_free_context
>>> .../smpd_handle_op_close
>>> ../smpd_enter_at_state
>>> ./main
>>> .\smpd_exit
>>> ..\smpd_kill_all_processes
>>> ../smpd_kill_all_processes
>>> ..\smpd_finalize_drive_maps
>>> ../smpd_finalize_drive_maps
>>> ..\smpd_dbs_finalize
>>> ../smpd_dbs_finalize
>>>
>>> I have registered the same user with the same password on both
>>> computers using wmpiregister.exe, but I launch mpiexec.exe from
>>> another Windows user account; could this be a problem? Thanks
>>>
>>> regards
>>> k.a.albert
>>>
>>>
>>>
>>>
>>> 2009/2/25 Jayesh Krishna <jayesh at mcs.anl.gov>:
>>>>  Hi,
>>>>
>>>> # Can you ping the machines from each other ?
>>>> # Make sure that you have the same version of MPICH2 installed on
>>>> both the machines.
>>>> # Do you have any firewalls (windows, third-party) running on the
>>>> machines (Turn off any firewalls running on the machines)?
>>>> # Make sure that you have the MPICH2 process manager, smpd.exe,
>>>> running as a service on both the machines (To check the status of the
>>>> process manager type, smpd -status, at the command prompt).
>>>> # Before trying to execute an MPI program like cpi.exe, try executing
>>>> a non-MPI program like hostname on the machines
>>>> (mpiexec -hosts 2 10.0.0.10 10.0.0.13 hostname).
>>>>
>>>>  Let us know the results.
>>>>
>>>> (PS: In your reply please copy-paste the commands and the output)
>>>>
>>>> Regards,
>>>> Jayesh
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: mpich-discuss-bounces at mcs.anl.gov
>>>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of kiss attila
>>>> Sent: Wednesday, February 25, 2009 1:46 PM
>>>> To: mpich-discuss at mcs.anl.gov
>>>> Subject: [mpich-discuss] unable to connect ?
>>>>
>>>> Hi
>>>>
>>>>   I have two WinXp machines (10.0.0.13,10.0.0.10) with mpich2
>>>> installed, and on this command:
>>>> "D:\Program Files\MPICH2\bin\mpiexec.exe" -hosts 2 10.0.0.10
>>>> 10.0.0.13 -noprompt c:\ex\cpi.exe
>>>>
>>>> I get:
>>>>
>>>> Aborting: unable to connect to 10.0.0.10
>>>>
>>>> Somehow I can't start any process on the remote machine (10.0.0.10).
>>>> It annoys me that a few days ago it worked, but I had to reinstall
>>>> one of them, and since then I haven't been able to figure out what's
>>>> wrong with my settings.  Thanks.
>>>>
>>>> regards
>>>> K.A. Albert
>>>>


