[mpich-discuss] help-mpich2-unable to connect
Jayesh Krishna
jayesh at mcs.anl.gov
Mon Jan 17 23:57:12 CST 2011
Hi,
Can you ping each machine from the other (192.168.1.3 from 192.168.1.5, and 192.168.1.5 from 192.168.1.3)?
Do you have the same username on both machines? (From your email I am assuming that you are able to run your job locally on each machine, right?)
(PS: Please copy your response to mpich-discuss. Apart from me, other devs and users can also pitch in with their comments/solutions.)
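A successful ping only shows that ICMP gets through; it does not show that the TCP port smpd listens on is reachable. As a rough complement to the ping test, here is a minimal Python sketch that tries to open a TCP connection to a given host and port. The function name can_connect is hypothetical, and the port to test is an assumption: 8676 is, as far as I know, the default smpd port, so please check the port your installation actually uses.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration only; it is not part of MPICH2.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, running can_connect("192.168.1.3", 8676) on 192.168.1.5, and can_connect("192.168.1.5", 8676) on 192.168.1.3, would show whether each machine can reach the other on that port, assuming 8676 really is the port in use.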
Regards,
Jayesh
----- Original Message -----
From: "Sayed Zulfikar" <sayed.zulfikar at yahoo.com>
To: "jayesh MPICH2 master" <jayesh at mcs.anl.gov>
Sent: Saturday, January 15, 2011 4:24:51 PM
Subject: help-mpich2-unable to connect
Dear Jayesh,
I'm sorry for mailing you directly, but from what I found on the internet, you are the person who often helps people with MPICH2,
e.g. in this thread:
http://lists.mcs.anl.gov/pipermail/mpich-discuss/2009-February/004657.html
where you helped the poster.
I have the same problem: when I try to run MPICH2 on 2 computers connected by a LAN cable, it says
"abort : unable to connect to 192.168.1.3"
Note: I made the IPs static; my computer is 192.168.1.5 and the other one is 192.168.1.3.
I tested cpi.exe and some other parallel programs on my own computer, which has a Core 2 Duo processor, and they ran correctly.
I use MPICH2 1.3.
Both machines are running Windows 7 Professional.
I believe I have done what you suggested in that thread:
I turned off the firewall.
I made sure both machines run the same smpd version.
I made sure MPICH2 is installed correctly on both machines (run with admin privileges, for everyone, and with the default passphrase).
I ran "mpiexec -register -user 1"; I didn't enter a username, I simply pressed Enter, and then I entered the password "behappy".
I validated that user, then I ran mpiexec -hosts 2 xxxxxxx xxxxxx -user 1, which didn't work.
I gave both machines the same logon password, "behappy".
Thank you.
This is the verbose output:
..handling executable:
C:\pp\MergeSort.exe
..Processing environment variables
..Processing drive mappings
..Creating launch nodes (2)
..\smpd_get_next_host
...\smpd_get_host_id
.../smpd_get_host_id
../smpd_get_next_host
..Adding host (192.168.1.5) to launch list
..\smpd_get_next_host
...\smpd_get_host_id
.../smpd_get_host_id
../smpd_get_next_host
..Adding host (192.168.1.3) to launch list
..\smpd_create_cliques
...\prev_launch_node
.../prev_launch_node
...\prev_launch_node
.../prev_launch_node
...\prev_launch_node
.../prev_launch_node
...\prev_launch_node
.../prev_launch_node
../smpd_create_cliques
..\smpd_fix_up_host_tree
../smpd_fix_up_host_tree
./mp_parse_command_args
.host tree:
. host: 192.168.1.5, parent: 0, id: 1
. host: 192.168.1.3, parent: 1, id: 2
.launch nodes:
. iproc: 1, id: 2, exe: C:\pp\MergeSort.exe
. iproc: 0, id: 1, exe: C:\pp\MergeSort.exe
.\SMPDU_Sock_create_set
..\smpd_get_smpd_data
...\smpd_get_smpd_data_from_environment
.../smpd_get_smpd_data_from_environment
../smpd_get_smpd_data
..\smpd_create_context
...\smpd_init_context
....\smpd_init_command
..../smpd_init_command
.../smpd_init_context
../smpd_create_context
..\SMPDU_Sock_post_connect
../SMPDU_Sock_post_connect
..\SMPDU_Sock_set_user_ptr
../SMPDU_Sock_set_user_ptr
..\smpd_make_socket_loop
...\smpd_get_hostname
.../smpd_get_hostname
../smpd_make_socket_loop
..\SMPDU_Sock_native_to_sock
../SMPDU_Sock_native_to_sock
..\SMPDU_Sock_native_to_sock
../SMPDU_Sock_native_to_sock
..\smpd_create_context
...\smpd_init_context
....\smpd_init_command
..../smpd_init_command
....\SMPDU_Sock_set_user_ptr
..../SMPDU_Sock_set_user_ptr
.../smpd_init_context
../smpd_create_context
..\SMPDU_Sock_post_read
...\SMPDU_Sock_post_readv
.../SMPDU_Sock_post_readv
../SMPDU_Sock_post_read
..\smpd_enter_at_state
...sock_waiting for the next event.
...\SMPDU_Sock_wait
.../SMPDU_Sock_wait
...SOCK_OP_CONNECT event.error = 0, result = 0, context=left
...\smpd_handle_op_connect
....connect succeeded, posting read of the challenge string
....\SMPDU_Sock_post_read
.....\SMPDU_Sock_post_readv
...../SMPDU_Sock_post_readv
..../SMPDU_Sock_post_read
.../smpd_handle_op_connect
...sock_waiting for the next event.
...\SMPDU_Sock_wait
.../SMPDU_Sock_wait
...SOCK_OP_READ event.error = 0, result = 0, context=left
...\smpd_handle_op_read
....\smpd_state_reading_challenge_string
.....read challenge string: '1.3 28253'
.....\smpd_verify_version
...../smpd_verify_version
.....Verification of smpd version succeeded
.....\smpd_hash
...../smpd_hash
.....\SMPDU_Sock_post_write
......\SMPDU_Sock_post_writev
....../SMPDU_Sock_post_writev
...../SMPDU_Sock_post_write
..../smpd_state_reading_challenge_string
.../smpd_handle_op_read
...sock_waiting for the next event.
...\SMPDU_Sock_wait
.../SMPDU_Sock_wait
...SOCK_OP_WRITE event.error = 0, result = 0, context=left
...\smpd_handle_op_write
....\smpd_state_writing_challenge_response
.....wrote challenge response: 'ac829821dd2160834c62236a36b026c8'
.....\SMPDU_Sock_post_read
......\SMPDU_Sock_post_readv
....../SMPDU_Sock_post_readv
...../SMPDU_Sock_post_read
..../smpd_state_writing_challenge_response
.../smpd_handle_op_write
...sock_waiting for the next event.
...\SMPDU_Sock_wait
.../SMPDU_Sock_wait
...SOCK_OP_READ event.error = 0, result = 0, context=left
...\smpd_handle_op_read
....\smpd_state_reading_connect_result
.....read connect result: 'SUCCESS'
.....\SMPDU_Sock_post_write
......\SMPDU_Sock_post_writev
....../SMPDU_Sock_post_writev
...../SMPDU_Sock_post_write
..../smpd_state_reading_connect_result
.../smpd_handle_op_read
...sock_waiting for the next event.
...\SMPDU_Sock_wait
.../SMPDU_Sock_wait
...SOCK_OP_WRITE event.error = 0, result = 0, context=left
...\smpd_handle_op_write
....\smpd_state_writing_process_session_request
.....wrote process session request: 'process'
.....\SMPDU_Sock_post_read
......\SMPDU_Sock_post_readv
....../SMPDU_Sock_post_readv
...../SMPDU_Sock_post_read
..../smpd_state_writing_process_session_request
.../smpd_handle_op_write
...sock_waiting for the next event.
...\SMPDU_Sock_wait
.../SMPDU_Sock_wait
...SOCK_OP_READ event.error = 0, result = 0, context=left
...\smpd_handle_op_read
....\smpd_state_reading_cred_request
.....read cred request: 'credentials'
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
......\smpd_option_on
.......\smpd_get_smpd_data
........\smpd_get_smpd_data_from_environment
......../smpd_get_smpd_data_from_environment
........\smpd_get_smpd_data_default
......../smpd_get_smpd_data_default
........Unable to get the data for the key 'nocache'
......./smpd_get_smpd_data
....../smpd_option_on
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
.....\SMPDU_Sock_post_write
......\SMPDU_Sock_post_writev
....../SMPDU_Sock_post_writev
...../SMPDU_Sock_post_write
..../smpd_handle_op_read
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_WRITE event.error = 0, result = 0, context=left
....\smpd_handle_op_write
.....\smpd_state_writing_cred_ack_yes
......wrote cred request yes ack.
......\SMPDU_Sock_post_write
.......\SMPDU_Sock_post_writev
......./SMPDU_Sock_post_writev
....../SMPDU_Sock_post_write
...../smpd_state_writing_cred_ack_yes
..../smpd_handle_op_write
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_WRITE event.error = 0, result = 0, context=left
....\smpd_handle_op_write
.....\smpd_state_writing_account
......wrote account: 'morrow-PC\morrow'
......\smpd_encrypt_data
....../smpd_encrypt_data
......\SMPDU_Sock_post_write
.......\SMPDU_Sock_post_writev
......./SMPDU_Sock_post_writev
....../SMPDU_Sock_post_write
...../smpd_state_writing_account
..../smpd_handle_op_write
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_WRITE event.error = 0, result = 0, context=left
....\smpd_handle_op_write
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
......\smpd_hide_string_arg
.......\first_token
......./first_token
.......\compare_token
......./compare_token
.......\next_token
........\first_token
......../first_token
........\first_token
......../first_token
......./next_token
....../smpd_hide_string_arg
....../smpd_hide_string_arg
......\SMPDU_Sock_post_read
.......\SMPDU_Sock_post_readv
......./SMPDU_Sock_post_readv
....../SMPDU_Sock_post_read
.....\smpd_hide_string_arg
......\first_token
....../first_token
......\compare_token
....../compare_token
......\next_token
.......\first_token
......./first_token
.......\first_token
......./first_token
....../next_token
...../smpd_hide_string_arg
...../smpd_hide_string_arg
..../smpd_handle_op_write
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_READ event.error = 0, result = 0, context=left
....\smpd_handle_op_read
.....\smpd_state_reading_process_result
......read process session result: 'SUCCESS'
......\SMPDU_Sock_post_read
.......\SMPDU_Sock_post_readv
......./SMPDU_Sock_post_readv
....../SMPDU_Sock_post_read
...../smpd_state_reading_process_result
..../smpd_handle_op_read
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_READ event.error = 0, result = 0, context=left
....\smpd_handle_op_read
.....\smpd_state_reading_reconnect_request
......read re-connect request: '49432'
......closing the old socket in the left context.
......\SMPDU_Sock_get_sock_id
....../SMPDU_Sock_get_sock_id
......SMPDU_Sock_post_close(464)
......\SMPDU_Sock_post_close
.......\SMPDU_Sock_post_read
........\SMPDU_Sock_post_readv
......../SMPDU_Sock_post_readv
......./SMPDU_Sock_post_read
....../SMPDU_Sock_post_close
......connecting a new socket.
......\smpd_create_context
.......\smpd_init_context
........\smpd_init_command
......../smpd_init_command
......./smpd_init_context
....../smpd_create_context
......posting a re-connect to 192.168.1.5:49432 in left context.
......\SMPDU_Sock_post_connect
....../SMPDU_Sock_post_connect
...../smpd_state_reading_reconnect_request
..../smpd_handle_op_read
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_CLOSE event.error = 0, result = 0, context=left
....\smpd_handle_op_close
.....\smpd_get_state_string
...../smpd_get_state_string
.....op_close received - SMPD_CLOSING state.
.....Unaffiliated left context closing.
.....\smpd_free_context
......freeing left context.
......\smpd_init_context
.......\smpd_init_command
......./smpd_init_command
....../smpd_init_context
...../smpd_free_context
..../smpd_handle_op_close
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_CONNECT event.error = 0, result = 0, context=left
....\smpd_handle_op_connect
.....\smpd_generate_session_header
......session header: (id=1 parent=0 level=0)
...../smpd_generate_session_header
.....\SMPDU_Sock_post_write
......\SMPDU_Sock_post_writev
....../SMPDU_Sock_post_writev
...../SMPDU_Sock_post_write
..../smpd_handle_op_connect
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_WRITE event.error = 0, result = 0, context=left
....\smpd_handle_op_write
.....\smpd_state_writing_session_header
......wrote session header: 'id=1 parent=0 level=0'
......\smpd_post_read_command
.......\SMPDU_Sock_get_sock_id
......./SMPDU_Sock_get_sock_id
.......posting a read for a command header on the left context, sock 528
.......\SMPDU_Sock_post_read
........\SMPDU_Sock_post_readv
......../SMPDU_Sock_post_readv
......./SMPDU_Sock_post_read
....../smpd_post_read_command
......creating connect command for left node
......creating connect command to '192.168.1.3'
......\smpd_create_command
.......\smpd_init_command
......./smpd_init_command
....../smpd_create_command
......\smpd_add_command_arg
....../smpd_add_command_arg
......\smpd_add_command_int_arg
....../smpd_add_command_int_arg
......\smpd_post_write_command
.......\smpd_package_command
......./smpd_package_command
.......\SMPDU_Sock_get_sock_id
......./SMPDU_Sock_get_sock_id
.......smpd_post_write_command on the left context sock 528: 67 bytes for command: "cmd=connect src=0 dest=1 tag=0 host=192.168.1.3 id=2 "
.......\SMPDU_Sock_post_writev
......./SMPDU_Sock_post_writev
....../smpd_post_write_command
......not connected yet: 192.168.1.3 not connected
...../smpd_state_writing_session_header
..../smpd_handle_op_write
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_WRITE event.error = 0, result = 0, context=left
....\smpd_handle_op_write
.....\smpd_state_writing_cmd
......wrote command
......command written to left: "cmd=connect src=0 dest=1 tag=0 host=192.168.1.3 id=2 "
......moving 'connect' command to the wait_list.
...../smpd_state_writing_cmd
..../smpd_handle_op_write
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_READ event.error = 0, result = 0, context=left
....\smpd_handle_op_read
.....\smpd_state_reading_cmd_header
......read command header
......command header read, posting read for data: 71 bytes
......\SMPDU_Sock_post_read
.......\SMPDU_Sock_post_readv
......./SMPDU_Sock_post_readv
....../SMPDU_Sock_post_read
...../smpd_state_reading_cmd_header
..../smpd_handle_op_read
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_READ event.error = 0, result = 0, context=left
....\smpd_handle_op_read
.....\smpd_state_reading_cmd
......read command
......\smpd_parse_command
....../smpd_parse_command
......read command: "cmd=abort src=1 dest=0 tag=0 error="Unable to connect to 192.168.1.3" "
......\smpd_handle_command
.......handling command:
....... src = 1
....... dest = 0
....... cmd = abort
....... tag = 0
....... ctx = left
....... len = 71
....... str = cmd=abort src=1 dest=0 tag=0 error="Unable to connect to 192.168.1.3"
.......\smpd_command_destination
........0 -> 0 : returning NULL context
......./smpd_command_destination
.......\smpd_handle_abort_command
........abort: Unable to connect to 192.168.1.3
......./smpd_handle_abort_command
....../smpd_handle_command
......\smpd_post_read_command
.......\SMPDU_Sock_get_sock_id
......./SMPDU_Sock_get_sock_id
.......posting a read for a command header on the left context, sock 528
.......\SMPDU_Sock_post_read
........\SMPDU_Sock_post_readv
......../SMPDU_Sock_post_readv
......./SMPDU_Sock_post_read
....../smpd_post_read_command
......\smpd_create_command
.......\smpd_init_command
......./smpd_init_command
....../smpd_create_command
......\smpd_post_write_command
.......\smpd_package_command
......./smpd_package_command
.......\SMPDU_Sock_get_sock_id
......./SMPDU_Sock_get_sock_id
.......smpd_post_write_command on the left context sock 528: 43 bytes for command: "cmd=close src=0 dest=1 tag=1 "
.......\SMPDU_Sock_post_writev
......./SMPDU_Sock_post_writev
....../smpd_post_write_command
...../smpd_state_reading_cmd
..../smpd_handle_op_read
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_WRITE event.error = 0, result = 0, context=left
....\smpd_handle_op_write
.....\smpd_state_writing_cmd
......wrote command
......command written to left: "cmd=close src=0 dest=1 tag=1 "
......\smpd_free_command
.......\smpd_init_command
......./smpd_init_command
....../smpd_free_command
...../smpd_state_writing_cmd
..../smpd_handle_op_write
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_READ event.error = 0, result = 0, context=left
....\smpd_handle_op_read
.....\smpd_state_reading_cmd_header
......read command header
......command header read, posting read for data: 31 bytes
......\SMPDU_Sock_post_read
.......\SMPDU_Sock_post_readv
......./SMPDU_Sock_post_readv
....../SMPDU_Sock_post_read
...../smpd_state_reading_cmd_header
..../smpd_handle_op_read
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_READ event.error = 0, result = 0, context=left
....\smpd_handle_op_read
.....\smpd_state_reading_cmd
......read command
......\smpd_parse_command
....../smpd_parse_command
......read command: "cmd=closed src=1 dest=0 tag=1 "
......\smpd_handle_command
.......handling command:
....... src = 1
....... dest = 0
....... cmd = closed
....... tag = 1
....... ctx = left
....... len = 31
....... str = cmd=closed src=1 dest=0 tag=1
.......\smpd_command_destination
........0 -> 0 : returning NULL context
......./smpd_command_destination
.......\smpd_handle_closed_command
........closed command received from left child, closing sock.
........\SMPDU_Sock_get_sock_id
......../SMPDU_Sock_get_sock_id
........SMPDU_Sock_post_close(528)
........\SMPDU_Sock_post_close
.........\SMPDU_Sock_post_read
..........\SMPDU_Sock_post_readv
........../SMPDU_Sock_post_readv
........./SMPDU_Sock_post_read
......../SMPDU_Sock_post_close
........received a closed at node with no parent context, assuming root, returning SMPD_EXITING.
......./smpd_handle_closed_command
....../smpd_handle_command
......not posting read for another command because SMPD_EXITING returned
...../smpd_state_reading_cmd
..../smpd_handle_op_read
....sock_waiting for the next event.
....\SMPDU_Sock_wait
..../SMPDU_Sock_wait
....SOCK_OP_CLOSE event.error = 0, result = 0, context=left
....\smpd_handle_op_close
.....\smpd_get_state_string
...../smpd_get_state_string
.....op_close received - SMPD_EXITING state.
.....\smpd_free_context
......freeing left context.
......\smpd_init_context
.......\smpd_init_command
......./smpd_init_command
....../smpd_init_context
...../smpd_free_context
..../smpd_handle_op_close
.../smpd_enter_at_state
...calling SMPDU_Sock_finalize
...\SMPDU_Sock_finalize
.../SMPDU_Sock_finalize
../main
..\smpd_exit
...\smpd_kill_all_processes
.../smpd_kill_all_processes
...\smpd_finalize_drive_maps
.../smpd_finalize_drive_maps
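
Most of the trace above is function enter/exit markers (lines whose text starts with \ or / after the leading dots); the useful part is the handful of message lines. In this trace the abort command arrives with src=1, and the host tree lists id 1 as 192.168.1.5, so the failure is reported by the local smpd when it tries to reach 192.168.1.3. The following is a minimal Python sketch for pulling such abort messages out of a saved trace. The function names (message_lines, parse_command, abort_errors) are hypothetical, and the parsing rules are inferred from this one trace rather than from any smpd specification.

```python
import re

# key=value fields; a value is either double-quoted or a run of non-spaces.
_FIELD = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')
# Trace lines of the form: read command: "cmd=... "
_READ_CMD = re.compile(r'read command: "(.*)"\s*$')


def message_lines(trace):
    """Yield trace lines that are not function enter/exit markers."""
    for line in trace.splitlines():
        text = line.lstrip(". ")  # drop the nesting dots
        # Enter markers start with a backslash, exit markers with a slash.
        if not text or text[0] in "\\/":
            continue
        yield text


def parse_command(command):
    """Split an smpd command string into a dict of its key=value fields."""
    fields = {}
    for key, quoted, bare in _FIELD.findall(command):
        fields[key] = quoted if quoted else bare
    return fields


def abort_errors(trace):
    """Return the error text of every abort command read in the trace."""
    errors = []
    for text in message_lines(trace):
        match = _READ_CMD.search(text)
        if match:
            fields = parse_command(match.group(1))
            if fields.get("cmd") == "abort":
                errors.append(fields.get("error", ""))
    return errors


# A few lines copied from the trace above.
SAMPLE = r"""
....\smpd_handle_op_read
.....\smpd_state_reading_cmd
......read command
......\smpd_parse_command
....../smpd_parse_command
......read command: "cmd=abort src=1 dest=0 tag=0 error="Unable to connect to 192.168.1.3" "
"""

print(abort_errors(SAMPLE))
```

Run against the full output of a verbose mpiexec run saved to a file, abort_errors would list every abort reason without having to scroll through the enter/exit lines.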