[mpich-discuss] help-mpich2-unable to connect

Jayesh Krishna jayesh at mcs.anl.gov
Mon Jan 17 23:57:12 CST 2011


Hi,
 Can you ping each machine from the other (192.168.1.3 from 192.168.1.5, and 192.168.1.5 from 192.168.1.3)?
 Do you have the same username on both machines? (From your email I am assuming that you are able to run your job locally on each machine, right?)
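
 If ping works in both directions, it may also be worth checking that a plain TCP connection to the smpd port on the other machine succeeds. The following is only a minimal Python sketch of such a check. It assumes smpd is listening on its default port, which I believe is 8676 (change SMPD_PORT if smpd was started on another port); the helper name can_connect is purely illustrative and not part of MPICH2.

```python
import socket

# Assumption: 8676 is the default smpd listening port. Adjust this value
# if smpd was installed or started with a different port.
SMPD_PORT = 8676


def can_connect(host, port=SMPD_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

 Running can_connect("192.168.1.3") on 192.168.1.5 (and can_connect("192.168.1.5") on 192.168.1.3) would show whether something between the machines is still blocking the port even though ping gets through.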

(PS: Please copy your response to mpich-discuss. Apart from me, other devs and users can also pitch in with their comments/solutions.)

Regards,
Jayesh

----- Original Message -----
From: "Sayed Zulfikar" <sayed.zulfikar at yahoo.com>
To: "jayesh MPICH2 master" <jayesh at mcs.anl.gov>
Sent: Saturday, January 15, 2011 4:24:51 PM
Subject: help-mpich2-unable to connect


Dear Jayesh,

I'm sorry for mailing you, but from what I found on the internet, you are the person who often helps people with MPICH2, e.g. in this thread:

http://lists.mcs.anl.gov/pipermail/mpich-discuss/2009-February/004657.html

I have the same problem as the poster you helped there. When I try to run MPICH2 on 2 computers connected by a LAN cable, it says
"abort : unable to connect to 192.168.1.3"

Note: I made the IPs static; my computer is 192.168.1.5 and the other one is 192.168.1.3.
I tested cpi.exe and some other parallel programs on my single computer, which has a Core 2 Duo processor, and they ran correctly.
I use MPICH2 1.3.
Both machines are running Windows 7 Professional.

It seems I have done what you suggested in that thread:
I turned off the firewall.
I made sure both machines run the same smpd version.
I made sure MPICH2 is installed correctly on both machines (run with admin privileges, for everyone, and with the default passphrase).
I used "mpiexec -register -user 1"; I didn't enter a username, I simply pressed Enter, and then I entered the password "behappy".
I validated that user, then I ran mpiexec -hosts 2 xxxxxxx xxxxxx -user 1, which didn't work.
I gave both machines the same logon password, "behappy".

Thank you,

This is the verbose output:

..handling executable: 
C:\pp\MergeSort.exe 
..Processing environment variables 
..Processing drive mappings 
..Creating launch nodes (2) 
..\smpd_get_next_host 
...\smpd_get_host_id 
.../smpd_get_host_id 
../smpd_get_next_host 
..Adding host (192.168.1.5) to launch list 
..\smpd_get_next_host 
...\smpd_get_host_id 
.../smpd_get_host_id 
../smpd_get_next_host 
..Adding host (192.168.1.3) to launch list 
..\smpd_create_cliques 
...\prev_launch_node 
.../prev_launch_node 
...\prev_launch_node 
.../prev_launch_node 
...\prev_launch_node 
.../prev_launch_node 
...\prev_launch_node 
.../prev_launch_node 
../smpd_create_cliques 
..\smpd_fix_up_host_tree 
../smpd_fix_up_host_tree 
./mp_parse_command_args 
.host tree: 
. host: 192.168.1.5, parent: 0, id: 1 
. host: 192.168.1.3, parent: 1, id: 2 
.launch nodes: 
. iproc: 1, id: 2, exe: C:\pp\MergeSort.exe 
. iproc: 0, id: 1, exe: C:\pp\MergeSort.exe 
.\SMPDU_Sock_create_set 
..\smpd_get_smpd_data 
...\smpd_get_smpd_data_from_environment 
.../smpd_get_smpd_data_from_environment 
../smpd_get_smpd_data 
..\smpd_create_context 
...\smpd_init_context 
....\smpd_init_command 
..../smpd_init_command 
.../smpd_init_context 
../smpd_create_context 
..\SMPDU_Sock_post_connect 
../SMPDU_Sock_post_connect 
..\SMPDU_Sock_set_user_ptr 
../SMPDU_Sock_set_user_ptr 
..\smpd_make_socket_loop 
...\smpd_get_hostname 
.../smpd_get_hostname 
../smpd_make_socket_loop 
..\SMPDU_Sock_native_to_sock 
../SMPDU_Sock_native_to_sock 
..\SMPDU_Sock_native_to_sock 
../SMPDU_Sock_native_to_sock 
..\smpd_create_context 
...\smpd_init_context 
....\smpd_init_command 
..../smpd_init_command 
....\SMPDU_Sock_set_user_ptr 
..../SMPDU_Sock_set_user_ptr 
.../smpd_init_context 
../smpd_create_context 
..\SMPDU_Sock_post_read 
...\SMPDU_Sock_post_readv 
.../SMPDU_Sock_post_readv 
../SMPDU_Sock_post_read 
..\smpd_enter_at_state 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_CONNECT event.error = 0, result = 0, context=left 
...\smpd_handle_op_connect 
....connect succeeded, posting read of the challenge string 
....\SMPDU_Sock_post_read 
.....\SMPDU_Sock_post_readv 
...../SMPDU_Sock_post_readv 
..../SMPDU_Sock_post_read 
.../smpd_handle_op_connect 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_READ event.error = 0, result = 0, context=left 
...\smpd_handle_op_read 
....\smpd_state_reading_challenge_string 
.....read challenge string: '1.3 28253' 
.....\smpd_verify_version 
...../smpd_verify_version 
.....Verification of smpd version succeeded 
.....\smpd_hash 
...../smpd_hash 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_state_reading_challenge_string 
.../smpd_handle_op_read 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_WRITE event.error = 0, result = 0, context=left 
...\smpd_handle_op_write 
....\smpd_state_writing_challenge_response 
.....wrote challenge response: 'ac829821dd2160834c62236a36b026c8' 
.....\SMPDU_Sock_post_read 
......\SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_readv 
...../SMPDU_Sock_post_read 
..../smpd_state_writing_challenge_response 
.../smpd_handle_op_write 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_READ event.error = 0, result = 0, context=left 
...\smpd_handle_op_read 
....\smpd_state_reading_connect_result 
.....read connect result: 'SUCCESS' 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_state_reading_connect_result 
.../smpd_handle_op_read 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_WRITE event.error = 0, result = 0, context=left 
...\smpd_handle_op_write 
....\smpd_state_writing_process_session_request 
.....wrote process session request: 'process' 
.....\SMPDU_Sock_post_read 
......\SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_readv 
...../SMPDU_Sock_post_read 
..../smpd_state_writing_process_session_request 
.../smpd_handle_op_write 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_READ event.error = 0, result = 0, context=left 
...\smpd_handle_op_read 
....\smpd_state_reading_cred_request 
.....read cred request: 'credentials' 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
......\smpd_option_on 
.......\smpd_get_smpd_data 
........\smpd_get_smpd_data_from_environment 
......../smpd_get_smpd_data_from_environment 
........\smpd_get_smpd_data_default 
......../smpd_get_smpd_data_default 
........Unable to get the data for the key 'nocache' 
......./smpd_get_smpd_data 
....../smpd_option_on 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_cred_ack_yes 
......wrote cred request yes ack. 
......\SMPDU_Sock_post_write 
.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_write 
...../smpd_state_writing_cred_ack_yes 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_account 
......wrote account: 'morrow-PC\morrow' 
......\smpd_encrypt_data 
....../smpd_encrypt_data 
......\SMPDU_Sock_post_write 
.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_write 
...../smpd_state_writing_account 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
......\smpd_hide_string_arg 
.......\first_token 
......./first_token 
.......\compare_token 
......./compare_token 
.......\next_token 
........\first_token 
......../first_token 
........\first_token 
......../first_token 
......./next_token 
....../smpd_hide_string_arg 
....../smpd_hide_string_arg 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_process_result 
......read process session result: 'SUCCESS' 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
...../smpd_state_reading_process_result 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_reconnect_request 
......read re-connect request: '49432' 
......closing the old socket in the left context. 
......\SMPDU_Sock_get_sock_id 
....../SMPDU_Sock_get_sock_id 
......SMPDU_Sock_post_close(464) 
......\SMPDU_Sock_post_close 
.......\SMPDU_Sock_post_read 
........\SMPDU_Sock_post_readv 
......../SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_read 
....../SMPDU_Sock_post_close 
......connecting a new socket. 
......\smpd_create_context 
.......\smpd_init_context 
........\smpd_init_command 
......../smpd_init_command 
......./smpd_init_context 
....../smpd_create_context 
......posting a re-connect to 192.168.1.5:49432 in left context. 
......\SMPDU_Sock_post_connect 
....../SMPDU_Sock_post_connect 
...../smpd_state_reading_reconnect_request 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_CLOSE event.error = 0, result = 0, context=left 
....\smpd_handle_op_close 
.....\smpd_get_state_string 
...../smpd_get_state_string 
.....op_close received - SMPD_CLOSING state. 
.....Unaffiliated left context closing. 
.....\smpd_free_context 
......freeing left context. 
......\smpd_init_context 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_init_context 
...../smpd_free_context 
..../smpd_handle_op_close 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_CONNECT event.error = 0, result = 0, context=left 
....\smpd_handle_op_connect 
.....\smpd_generate_session_header 
......session header: (id=1 parent=0 level=0) 
...../smpd_generate_session_header 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_handle_op_connect 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_session_header 
......wrote session header: 'id=1 parent=0 level=0' 
......\smpd_post_read_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......posting a read for a command header on the left context, sock 528 
.......\SMPDU_Sock_post_read 
........\SMPDU_Sock_post_readv 
......../SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_read 
....../smpd_post_read_command 
......creating connect command for left node 
......creating connect command to '192.168.1.3' 
......\smpd_create_command 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_create_command 
......\smpd_add_command_arg 
....../smpd_add_command_arg 
......\smpd_add_command_int_arg 
....../smpd_add_command_int_arg 
......\smpd_post_write_command 
.......\smpd_package_command 
......./smpd_package_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......smpd_post_write_command on the left context sock 528: 67 bytes for command: "cmd=connect src=0 dest=1 tag=0 host=192.168.1.3 id=2 " 
.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../smpd_post_write_command 
......not connected yet: 192.168.1.3 not connected 
...../smpd_state_writing_session_header 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_cmd 
......wrote command 
......command written to left: "cmd=connect src=0 dest=1 tag=0 host=192.168.1.3 id=2 " 
......moving 'connect' command to the wait_list. 
...../smpd_state_writing_cmd 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd_header 
......read command header 
......command header read, posting read for data: 71 bytes 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
...../smpd_state_reading_cmd_header 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd 
......read command 
......\smpd_parse_command 
....../smpd_parse_command 
......read command: "cmd=abort src=1 dest=0 tag=0 error="Unable to connect to 192.168.1.3" " 
......\smpd_handle_command 
.......handling command: 
....... src = 1 
....... dest = 0 
....... cmd = abort 
....... tag = 0 
....... ctx = left 
....... len = 71 
....... str = cmd=abort src=1 dest=0 tag=0 error="Unable to connect to 192.168.1.3" 
.......\smpd_command_destination 
........0 -> 0 : returning NULL context 
......./smpd_command_destination 
.......\smpd_handle_abort_command 
........abort: Unable to connect to 192.168.1.3 
......./smpd_handle_abort_command 
....../smpd_handle_command 
......\smpd_post_read_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......posting a read for a command header on the left context, sock 528 
.......\SMPDU_Sock_post_read 
........\SMPDU_Sock_post_readv 
......../SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_read 
....../smpd_post_read_command 
......\smpd_create_command 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_create_command 
......\smpd_post_write_command 
.......\smpd_package_command 
......./smpd_package_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......smpd_post_write_command on the left context sock 528: 43 bytes for command: "cmd=close src=0 dest=1 tag=1 " 
.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../smpd_post_write_command 
...../smpd_state_reading_cmd 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_cmd 
......wrote command 
......command written to left: "cmd=close src=0 dest=1 tag=1 " 
......\smpd_free_command 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_free_command 
...../smpd_state_writing_cmd 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd_header 
......read command header 
......command header read, posting read for data: 31 bytes 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
...../smpd_state_reading_cmd_header 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd 
......read command 
......\smpd_parse_command 
....../smpd_parse_command 
......read command: "cmd=closed src=1 dest=0 tag=1 " 
......\smpd_handle_command 
.......handling command: 
....... src = 1 
....... dest = 0 
....... cmd = closed 
....... tag = 1 
....... ctx = left 
....... len = 31 
....... str = cmd=closed src=1 dest=0 tag=1 
.......\smpd_command_destination 
........0 -> 0 : returning NULL context 
......./smpd_command_destination 
.......\smpd_handle_closed_command 
........closed command received from left child, closing sock. 
........\SMPDU_Sock_get_sock_id 
......../SMPDU_Sock_get_sock_id 
........SMPDU_Sock_post_close(528) 
........\SMPDU_Sock_post_close 
.........\SMPDU_Sock_post_read 
..........\SMPDU_Sock_post_readv 
........../SMPDU_Sock_post_readv 
........./SMPDU_Sock_post_read 
......../SMPDU_Sock_post_close 
........received a closed at node with no parent context, assuming root, returning SMPD_EXITING. 
......./smpd_handle_closed_command 
....../smpd_handle_command 
......not posting read for another command because SMPD_EXITING returned 
...../smpd_state_reading_cmd 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_CLOSE event.error = 0, result = 0, context=left 
....\smpd_handle_op_close 
.....\smpd_get_state_string 
...../smpd_get_state_string 
.....op_close received - SMPD_EXITING state. 
.....\smpd_free_context 
......freeing left context. 
......\smpd_init_context 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_init_context 
...../smpd_free_context 
..../smpd_handle_op_close 
.../smpd_enter_at_state 
...calling SMPDU_Sock_finalize 
...\SMPDU_Sock_finalize 
.../SMPDU_Sock_finalize 
../main 
..\smpd_exit 
...\smpd_kill_all_processes 
.../smpd_kill_all_processes 
...\smpd_finalize_drive_maps 
.../smpd_finalize_drive_maps 
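
In a trace this long, the informative lines are the few "read command" entries that carry cmd=abort. The sketch below is one way to pull them out with Python. The function name find_aborts and the regular expression are illustrative assumptions based only on the line format visible in the trace above; they are not part of smpd or mpiexec.

```python
import re

# Matches lines such as:
#   ......read command: "cmd=abort src=1 dest=0 tag=0 error="Unable to ..." "
# The pattern is derived from the trace above and may need adjusting for
# other smpd versions.
ABORT_RE = re.compile(
    r'read command: "cmd=abort src=(\d+) dest=(\d+) .*?error="([^"]*)"'
)


def find_aborts(log_text):
    """Return a list of (src, dest, error) tuples for each abort command read."""
    results = []
    for line in log_text.splitlines():
        match = ABORT_RE.search(line)
        if match:
            results.append(
                (int(match.group(1)), int(match.group(2)), match.group(3))
            )
    return results
```

Applied to the trace above, this reports a single abort with src=1 and dest=0. In the host tree printed near the top of the trace, id 1 is 192.168.1.5, so the message appears to come from the local smpd after it was asked to connect to 192.168.1.3, which suggests the failure is on the path from 192.168.1.5 to the smpd on 192.168.1.3 rather than between mpiexec and the local smpd.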





