[mpich-discuss] help-mpich2-unable to connect

Sayed Zulfikar sayed.zulfikar at yahoo.com
Mon Jan 24 11:09:35 CST 2011


Sorry, I forgot to send this to mpich-discuss, my bad...

Can you ping one machine from the other (192.168.1.3 from 192.168.1.5 and 
192.168.1.5 from 192.168.1.3) ?
=> Yes, I can. Both machines can ping each other.

 
Do you have the same username on both the machines (And from your email I am 
assuming that you are able to run your job locally on each machine, right ? )?
=> No, the usernames are different, but the password and passphrase are the same.
Yes, I am able to run my job locally on each machine.
The error message says "unable to connect to xxxx".
I've tried from both machines, but the error message is the same.
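
A successful ping only shows that ICMP gets through; it does not show that the TCP port used by smpd is reachable. A minimal sketch of a TCP-level check, assuming smpd listens on port 8676 (believed to be the usual default, but not stated anywhere in this thread) and using a hypothetical helper named `can_connect`:

```python
import socket

# Assumption: 8676 is the port smpd listens on; change it if your
# installation uses a different one.
SMPD_PORT = 8676


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # The two addresses from this thread; run the script on each machine.
    for host in ("192.168.1.3", "192.168.1.5"):
        state = "reachable" if can_connect(host, SMPD_PORT) else "NOT reachable"
        print(f"{host}:{SMPD_PORT} is {state}")
```

If the port is not reachable from one side while ping works, a firewall rule or a stopped smpd service on the target machine would be the first things to look at.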

Can anybody help me with this problem?

Thanks



________________________________
From: Jayesh Krishna <jayesh at mcs.anl.gov>
To: Sayed Zulfikar <sayed.zulfikar at yahoo.com>
Cc: mpich-discuss at mcs.anl.gov
Sent: Mon, January 17, 2011 9:57:12 PM
Subject: Re: help-mpich2-unable to connect

Hi,
Can you ping one machine from the other (192.168.1.3 from 192.168.1.5 and 
192.168.1.5 from 192.168.1.3) ?
Do you have the same username on both the machines (And from your email I am 
assuming that you are able to run your job locally on each machine, right ? )?

(PS: Please copy your response to mpich-discuss. Apart from me, other devs and 
users can also pitch in with their comments/solutions.)

Regards,
Jayesh

----- Original Message -----
From: "Sayed Zulfikar" <sayed.zulfikar at yahoo.com>
To: "jayesh MPICH2 master" <jayesh at mcs.anl.gov>
Sent: Saturday, January 15, 2011 4:24:51 PM
Subject: help-mpich2-unable to connect


Dear Jayesh,

I'm sorry for mailing you directly, but from what I found on the internet, you are
the person who often helps people with MPICH2, e.g. in this thread:

http://lists.mcs.anl.gov/pipermail/mpich-discuss/2009-February/004657.html

I have the same problem as the person you helped there. When I try to run MPICH2 on
2 computers connected by a LAN cable, it says:

"abort : unable to connect to 192.168.1.3" 

Note: I made the IPs static. My computer is 192.168.1.5 and the other one is 
192.168.1.3.

I tested cpi.exe and some other parallel programs on my single computer, 
which has a Core 2 Duo processor, and they ran correctly.

I use MPICH2 1.3.
Both machines are running Windows 7 Professional.

It seems I have done what you suggested in that link:
I turned off the firewall.
I made sure both machines run the same smpd version.
I made sure MPICH2 is installed correctly on both machines (run with admin privileges, 
for everyone, and with the default passphrase).

I used "mpiexec -register -user 1"; I didn't enter a username, I just 
pressed Enter, and then I entered the password "behappy".

I validated that user, then I ran mpiexec -hosts2 xxxxxxx xxxxxx - user 1, which 
didn't work.

I gave both machines the same logon password, "behappy".

Thank you.

This is the verbose output:

..handling executable: 
C:\pp\MergeSort.exe 
..Processing environment variables 
..Processing drive mappings 
..Creating launch nodes (2) 
..\smpd_get_next_host 
...\smpd_get_host_id 
.../smpd_get_host_id 
../smpd_get_next_host 
..Adding host (192.168.1.5) to launch list 
..\smpd_get_next_host 
...\smpd_get_host_id 
.../smpd_get_host_id 
../smpd_get_next_host 
..Adding host (192.168.1.3) to launch list 
..\smpd_create_cliques 
...\prev_launch_node 
.../prev_launch_node 
...\prev_launch_node 
.../prev_launch_node 
...\prev_launch_node 
.../prev_launch_node 
...\prev_launch_node 
.../prev_launch_node 
../smpd_create_cliques 
..\smpd_fix_up_host_tree 
../smpd_fix_up_host_tree 
./mp_parse_command_args 
.host tree: 
. host: 192.168.1.5, parent: 0, id: 1 
. host: 192.168.1.3, parent: 1, id: 2 
.launch nodes: 
. iproc: 1, id: 2, exe: C:\pp\MergeSort.exe 
. iproc: 0, id: 1, exe: C:\pp\MergeSort.exe 
.\SMPDU_Sock_create_set 
..\smpd_get_smpd_data 
...\smpd_get_smpd_data_from_environment 
.../smpd_get_smpd_data_from_environment 
../smpd_get_smpd_data 
..\smpd_create_context 
...\smpd_init_context 
....\smpd_init_command 
..../smpd_init_command 
.../smpd_init_context 
../smpd_create_context 
..\SMPDU_Sock_post_connect 
../SMPDU_Sock_post_connect 
..\SMPDU_Sock_set_user_ptr 
../SMPDU_Sock_set_user_ptr 
..\smpd_make_socket_loop 
...\smpd_get_hostname 
.../smpd_get_hostname 
../smpd_make_socket_loop 
..\SMPDU_Sock_native_to_sock 
../SMPDU_Sock_native_to_sock 
..\SMPDU_Sock_native_to_sock 
../SMPDU_Sock_native_to_sock 
..\smpd_create_context 
...\smpd_init_context 
....\smpd_init_command 
..../smpd_init_command 
....\SMPDU_Sock_set_user_ptr 
..../SMPDU_Sock_set_user_ptr 
.../smpd_init_context 
../smpd_create_context 
..\SMPDU_Sock_post_read 
...\SMPDU_Sock_post_readv 
.../SMPDU_Sock_post_readv 
../SMPDU_Sock_post_read 
..\smpd_enter_at_state 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_CONNECT event.error = 0, result = 0, context=left 
...\smpd_handle_op_connect 
....connect succeeded, posting read of the challenge string 
....\SMPDU_Sock_post_read 
.....\SMPDU_Sock_post_readv 
...../SMPDU_Sock_post_readv 
..../SMPDU_Sock_post_read 
.../smpd_handle_op_connect 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_READ event.error = 0, result = 0, context=left 
...\smpd_handle_op_read 
....\smpd_state_reading_challenge_string 
.....read challenge string: '1.3 28253' 
.....\smpd_verify_version 
...../smpd_verify_version 
.....Verification of smpd version succeeded 
.....\smpd_hash 
...../smpd_hash 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_state_reading_challenge_string 
.../smpd_handle_op_read 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_WRITE event.error = 0, result = 0, context=left 
...\smpd_handle_op_write 
....\smpd_state_writing_challenge_response 
.....wrote challenge response: 'ac829821dd2160834c62236a36b026c8' 
.....\SMPDU_Sock_post_read 
......\SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_readv 
...../SMPDU_Sock_post_read 
..../smpd_state_writing_challenge_response 
.../smpd_handle_op_write 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_READ event.error = 0, result = 0, context=left 
...\smpd_handle_op_read 
....\smpd_state_reading_connect_result 
.....read connect result: 'SUCCESS' 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_state_reading_connect_result 
.../smpd_handle_op_read 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_WRITE event.error = 0, result = 0, context=left 
...\smpd_handle_op_write 
....\smpd_state_writing_process_session_request 
.....wrote process session request: 'process' 
.....\SMPDU_Sock_post_read 
......\SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_readv 
...../SMPDU_Sock_post_read 
..../smpd_state_writing_process_session_request 
.../smpd_handle_op_write 
...sock_waiting for the next event. 
...\SMPDU_Sock_wait 
.../SMPDU_Sock_wait 
...SOCK_OP_READ event.error = 0, result = 0, context=left 
...\smpd_handle_op_read 
....\smpd_state_reading_cred_request 
.....read cred request: 'credentials' 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
......\smpd_option_on 
.......\smpd_get_smpd_data 
........\smpd_get_smpd_data_from_environment 
......../smpd_get_smpd_data_from_environment 
........\smpd_get_smpd_data_default 
......../smpd_get_smpd_data_default 
........Unable to get the data for the key 'nocache' 
......./smpd_get_smpd_data 
....../smpd_option_on 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_cred_ack_yes 
......wrote cred request yes ack. 
......\SMPDU_Sock_post_write 
.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_write 
...../smpd_state_writing_cred_ack_yes 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_account 
......wrote account: 'morrow-PC\morrow' 
......\smpd_encrypt_data 
....../smpd_encrypt_data 
......\SMPDU_Sock_post_write 
.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_write 
...../smpd_state_writing_account 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
......\smpd_hide_string_arg 
.......\first_token 
......./first_token 
.......\compare_token 
......./compare_token 
.......\next_token 
........\first_token 
......../first_token 
........\first_token 
......../first_token 
......./next_token 
....../smpd_hide_string_arg 
....../smpd_hide_string_arg 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
.....\smpd_hide_string_arg 
......\first_token 
....../first_token 
......\compare_token 
....../compare_token 
......\next_token 
.......\first_token 
......./first_token 
.......\first_token 
......./first_token 
....../next_token 
...../smpd_hide_string_arg 
...../smpd_hide_string_arg 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_process_result 
......read process session result: 'SUCCESS' 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
...../smpd_state_reading_process_result 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_reconnect_request 
......read re-connect request: '49432' 
......closing the old socket in the left context. 
......\SMPDU_Sock_get_sock_id 
....../SMPDU_Sock_get_sock_id 
......SMPDU_Sock_post_close(464) 
......\SMPDU_Sock_post_close 
.......\SMPDU_Sock_post_read 
........\SMPDU_Sock_post_readv 
......../SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_read 
....../SMPDU_Sock_post_close 
......connecting a new socket. 
......\smpd_create_context 
.......\smpd_init_context 
........\smpd_init_command 
......../smpd_init_command 
......./smpd_init_context 
....../smpd_create_context 
......posting a re-connect to 192.168.1.5:49432 in left context. 
......\SMPDU_Sock_post_connect 
....../SMPDU_Sock_post_connect 
...../smpd_state_reading_reconnect_request 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_CLOSE event.error = 0, result = 0, context=left 
....\smpd_handle_op_close 
.....\smpd_get_state_string 
...../smpd_get_state_string 
.....op_close received - SMPD_CLOSING state. 
.....Unaffiliated left context closing. 
.....\smpd_free_context 
......freeing left context. 
......\smpd_init_context 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_init_context 
...../smpd_free_context 
..../smpd_handle_op_close 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_CONNECT event.error = 0, result = 0, context=left 
....\smpd_handle_op_connect 
.....\smpd_generate_session_header 
......session header: (id=1 parent=0 level=0) 
...../smpd_generate_session_header 
.....\SMPDU_Sock_post_write 
......\SMPDU_Sock_post_writev 
....../SMPDU_Sock_post_writev 
...../SMPDU_Sock_post_write 
..../smpd_handle_op_connect 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_session_header 
......wrote session header: 'id=1 parent=0 level=0' 
......\smpd_post_read_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......posting a read for a command header on the left context, sock 528 
.......\SMPDU_Sock_post_read 
........\SMPDU_Sock_post_readv 
......../SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_read 
....../smpd_post_read_command 
......creating connect command for left node 
......creating connect command to '192.168.1.3' 
......\smpd_create_command 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_create_command 
......\smpd_add_command_arg 
....../smpd_add_command_arg 
......\smpd_add_command_int_arg 
....../smpd_add_command_int_arg 
......\smpd_post_write_command 
.......\smpd_package_command 
......./smpd_package_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......smpd_post_write_command on the left context sock 528: 67 bytes for command: "cmd=connect src=0 dest=1 tag=0 host=192.168.1.3 id=2 " 

.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../smpd_post_write_command 
......not connected yet: 192.168.1.3 not connected 
...../smpd_state_writing_session_header 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_cmd 
......wrote command 
......command written to left: "cmd=connect src=0 dest=1 tag=0 host=192.168.1.3 id=2 " 

......moving 'connect' command to the wait_list. 
...../smpd_state_writing_cmd 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd_header 
......read command header 
......command header read, posting read for data: 71 bytes 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
...../smpd_state_reading_cmd_header 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd 
......read command 
......\smpd_parse_command 
....../smpd_parse_command 
......read command: "cmd=abort src=1 dest=0 tag=0 error="Unable to connect to 192.168.1.3" " 

......\smpd_handle_command 
.......handling command: 
....... src = 1 
....... dest = 0 
....... cmd = abort 
....... tag = 0 
....... ctx = left 
....... len = 71 
....... str = cmd=abort src=1 dest=0 tag=0 error="Unable to connect to 192.168.1.3" 

.......\smpd_command_destination 
........0 -> 0 : returning NULL context 
......./smpd_command_destination 
.......\smpd_handle_abort_command 
........abort: Unable to connect to 192.168.1.3 
......./smpd_handle_abort_command 
....../smpd_handle_command 
......\smpd_post_read_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......posting a read for a command header on the left context, sock 528 
.......\SMPDU_Sock_post_read 
........\SMPDU_Sock_post_readv 
......../SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_read 
....../smpd_post_read_command 
......\smpd_create_command 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_create_command 
......\smpd_post_write_command 
.......\smpd_package_command 
......./smpd_package_command 
.......\SMPDU_Sock_get_sock_id 
......./SMPDU_Sock_get_sock_id 
.......smpd_post_write_command on the left context sock 528: 43 bytes for command: "cmd=close src=0 dest=1 tag=1 " 

.......\SMPDU_Sock_post_writev 
......./SMPDU_Sock_post_writev 
....../smpd_post_write_command 
...../smpd_state_reading_cmd 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_WRITE event.error = 0, result = 0, context=left 
....\smpd_handle_op_write 
.....\smpd_state_writing_cmd 
......wrote command 
......command written to left: "cmd=close src=0 dest=1 tag=1 " 
......\smpd_free_command 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_free_command 
...../smpd_state_writing_cmd 
..../smpd_handle_op_write 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd_header 
......read command header 
......command header read, posting read for data: 31 bytes 
......\SMPDU_Sock_post_read 
.......\SMPDU_Sock_post_readv 
......./SMPDU_Sock_post_readv 
....../SMPDU_Sock_post_read 
...../smpd_state_reading_cmd_header 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_READ event.error = 0, result = 0, context=left 
....\smpd_handle_op_read 
.....\smpd_state_reading_cmd 
......read command 
......\smpd_parse_command 
....../smpd_parse_command 
......read command: "cmd=closed src=1 dest=0 tag=1 " 
......\smpd_handle_command 
.......handling command: 
....... src = 1 
....... dest = 0 
....... cmd = closed 
....... tag = 1 
....... ctx = left 
....... len = 31 
....... str = cmd=closed src=1 dest=0 tag=1 
.......\smpd_command_destination 
........0 -> 0 : returning NULL context 
......./smpd_command_destination 
.......\smpd_handle_closed_command 
........closed command received from left child, closing sock. 
........\SMPDU_Sock_get_sock_id 
......../SMPDU_Sock_get_sock_id 
........SMPDU_Sock_post_close(528) 
........\SMPDU_Sock_post_close 
.........\SMPDU_Sock_post_read 
..........\SMPDU_Sock_post_readv 
........../SMPDU_Sock_post_readv 
........./SMPDU_Sock_post_read 
......../SMPDU_Sock_post_close 
........received a closed at node with no parent context, assuming root, returning SMPD_EXITING. 

......./smpd_handle_closed_command 
....../smpd_handle_command 
......not posting read for another command because SMPD_EXITING returned 
...../smpd_state_reading_cmd 
..../smpd_handle_op_read 
....sock_waiting for the next event. 
....\SMPDU_Sock_wait 
..../SMPDU_Sock_wait 
....SOCK_OP_CLOSE event.error = 0, result = 0, context=left 
....\smpd_handle_op_close 
.....\smpd_get_state_string 
...../smpd_get_state_string 
.....op_close received - SMPD_EXITING state. 
.....\smpd_free_context 
......freeing left context. 
......\smpd_init_context 
.......\smpd_init_command 
......./smpd_init_command 
....../smpd_init_context 
...../smpd_free_context 
..../smpd_handle_op_close 
.../smpd_enter_at_state 
...calling SMPDU_Sock_finalize 
...\SMPDU_Sock_finalize 
.../SMPDU_Sock_finalize 
../main 
..\smpd_exit 
...\smpd_kill_all_processes 
.../smpd_kill_all_processes 
...\smpd_finalize_drive_maps 
.../smpd_finalize_drive_maps 
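
Most lines in the trace above are function enter (`\name`) and exit (`/name`) markers, which makes the few informative messages hard to spot. The following is a rough sketch for filtering them out, assuming the layout seen here: leading dots give the call depth, and anything after the dots that does not start with `\` or `/` is a message. The name `trace_messages` is hypothetical and not part of MPICH2.

```python
def trace_messages(lines):
    """Yield (depth, text) for trace lines that carry a message.

    Lines whose text starts with a backslash or a slash are call
    enter/exit markers and are skipped. Lines without a leading dot
    (blank lines, wrapped continuations) are skipped as well.
    """
    for raw in lines:
        line = raw.rstrip()
        body = line.lstrip(".")
        depth = len(line) - len(body)  # number of leading dots
        if depth == 0 or not body or body[0] in "\\/":
            continue
        yield depth, body.strip()


if __name__ == "__main__":
    # A few lines copied from the trace above.
    sample = [
        "....\\smpd_handle_op_read",
        ".....\\smpd_state_reading_cmd",
        "......read command",
        ".......\\smpd_handle_abort_command",
        "........abort: Unable to connect to 192.168.1.3",
        "......./smpd_handle_abort_command",
    ]
    for depth, text in trace_messages(sample):
        print(depth, text)
```

Run over the full output, this leaves only the state messages, such as the successful handshake with 192.168.1.5 and the later abort reported for 192.168.1.3.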


      