<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7036.0">
<TITLE>RE: [mpich-discuss] unable to connect ?</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2> Hi,<BR>
Good to know MPICH2 is working for you now.<BR>
<BR>
&gt;&gt; ...cpi.exe runs with the same input in 1.872297s on both machines ...<BR>
You must consider the cost of network communication between the two machines; for such a short run it can easily cancel out any parallel speedup. Try increasing the number of iterations so that the computation dominates the communication.<BR>
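As a rough illustration (a simplified sketch of a cpi-style pi calculation, not the cpi.c that ships with MPICH2), the interval count n below controls how much purely local work each process does between a single MPI_Bcast and a single MPI_Reduce; when n is small, those two network operations dominate the runtime:<BR>
<BR>
/* pi_sketch.c - hypothetical illustration; build with an MPI C compiler wrapper */<BR>
#include <mpi.h><BR>
#include <stdio.h><BR>
<BR>
int main(int argc, char *argv[])<BR>
{<BR>
    int rank, size, i, n = 100000000;  /* larger n = more local computation per process */<BR>
    double h, x, local = 0.0, pi = 0.0;<BR>
<BR>
    MPI_Init(&argc, &argv);<BR>
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);<BR>
    MPI_Comm_size(MPI_COMM_WORLD, &size);<BR>
<BR>
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* one trip over the network ...     */<BR>
    h = 1.0 / (double)n;<BR>
    for (i = rank; i < n; i += size) {              /* ... a lot of local arithmetic ... */<BR>
        x = h * ((double)i + 0.5);<BR>
        local += 4.0 / (1.0 + x * x);<BR>
    }<BR>
    local *= h;<BR>
    MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);  /* ... one trip back */<BR>
<BR>
    if (rank == 0) printf("pi is approximately %.16f\n", pi);<BR>
    MPI_Finalize();<BR>
    return 0;<BR>
}<BR>
<BR>
With only a few thousand intervals the broadcast and the reduction cost roughly as much as the loop itself, so the single-machine and two-machine timings come out almost identical, as you observed; with n in the hundreds of millions the loop dominates and the two-machine run should pull ahead.<BR>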
<BR>
&gt;&gt; Shouldn't mpiexec with the -log switch create some logfile ...<BR>
Are you trying to use Jumpshot to analyze your code? Make sure that you add the "-log" option before the name of the MPI program (mpiexec -log -n 2 cpi.exe); anything placed after the executable name is passed as an argument to the MPI program itself rather than to mpiexec.<BR>
<BR>
Regards,<BR>
Jayesh<BR>
<BR>
-----Original Message-----<BR>
From: kiss attila [<A HREF="mailto:kissattila2008@gmail.com">mailto:kissattila2008@gmail.com</A>]<BR>
Sent: Friday, February 27, 2009 11:16 AM<BR>
To: Jayesh Krishna<BR>
Cc: mpich-discuss@mcs.anl.gov<BR>
Subject: Re: [mpich-discuss] unable to connect ?<BR>
<BR>
:)))))))))))<BR>
S U C C E S S ! ! !<BR>
Thanks a lot. Probably I've played with the installation settings and modified the passphrase, but now it works :).<BR>
<BR>
Hmmm... a strange thing: cpi.exe runs with the same input in 1.872297s running on both machines, 1.869439s running on 10.0.0.13 and 1.897818s running on 10.0.0.10. I thought running on two machines would be the fastest.<BR>
<BR>
Shouldn't mpiexec with the -log switch create some logfile? I can't find any logfiles in the MPICH2 folders.<BR>
<BR>
regards<BR>
K.A. Albert<BR>
<BR>
2009/2/27 Jayesh Krishna <jayesh@mcs.anl.gov>:<BR>
> Hi,<BR>
> From your debug logs the problem does not appear to be a network<BR>
> connectivity issue. It looks more like a configuration issue,<BR>
><BR>
> ============== snip ========================<BR>
> ...\smpd_state_reading_connect_result<BR>
> ....read connect result: 'FAIL'<BR>
> ....connection rejected, server returned - FAIL<BR>
> ============== snip ========================<BR>
><BR>
> Your process manager (PM) connection can get rejected for the following reasons,<BR>
><BR>
> # There is a mismatch in the version of MPICH2 software installed on<BR>
> the multiple machines.<BR>
> # There is a mismatch in the passphrase used on the multiple machines<BR>
> (You enter this "passphrase" during MPICH2 installation).<BR>
><BR>
> I would recommend the following,<BR>
><BR>
> # Uninstall MPICH2 on both the machines.<BR>
> # Download the latest stable version (1.0.8) of MPICH2 from the<BR>
> downloads page<BR>
> (<A HREF="http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads">http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads</A>).<BR>
> # Install MPICH2 on the machines using the installer downloaded from<BR>
> the downloads page.<BR>
><BR>
> ------- Make sure that you keep the default settings for passphrase<BR>
> during the installation<BR>
> ------- Also make sure that all users have access to MPICH2 (Change<BR>
> the default option from "Just me" to "Everyone" during installation)<BR>
><BR>
> # If your machine is not part of a domain, when registering the<BR>
> username/password with mpiexec don't specify any domain name. Also<BR>
> validate, as before, after registering the user.<BR>
><BR>
> Let us know the results.<BR>
><BR>
> (PS: There is no specific configuration required, apart from the info<BR>
> above, to get MPICH2 working across multiple windows machines)<BR>
><BR>
> Regards,<BR>
> Jayesh<BR>
><BR>
> -----Original Message-----<BR>
> From: kiss attila [<A HREF="mailto:kissattila2008@gmail.com">mailto:kissattila2008@gmail.com</A>]<BR>
> Sent: Thursday, February 26, 2009 11:45 PM<BR>
> To: Jayesh Krishna<BR>
> Cc: mpich-discuss@mcs.anl.gov<BR>
> Subject: Re: [mpich-discuss] unable to connect ?<BR>
><BR>
> Hi<BR>
><BR>
> I've now tried everything: I've created the same user, I've validated<BR>
> this user (mpiuser), but still nothing... Can anyone send me some<BR>
> config files from a w o r k i n g MPICH2 setup based on a Windows workgroup (not a domain)?<BR>
> Until then, here is the output of the smpd -d and mpiexec commands<BR>
> when I try to run hostname from 10.0.0.10 on the remote computer<BR>
> (10.0.0.13):<BR>
><BR>
> D:\Program Files\MPICH2\bin>smpd -d<BR>
><BR>
> [00:2436]..\smpd_set_smpd_data<BR>
> [00:2436]../smpd_set_smpd_data<BR>
> [00:2436]..created a set for the listener: 1724<BR>
> [00:2436]..smpd listening on port 8676<BR>
> [00:2436]..\smpd_create_context<BR>
> [00:2436]...\smpd_init_context<BR>
> [00:2436]....\smpd_init_command<BR>
> [00:2436]..../smpd_init_command<BR>
> [00:2436].../smpd_init_context<BR>
> [00:2436]../smpd_create_context<BR>
> [00:2436]..\smpd_option_on<BR>
> [00:2436]...\smpd_get_smpd_data<BR>
> [00:2436]....\smpd_get_smpd_data_from_environment<BR>
> [00:2436]..../smpd_get_smpd_data_from_environment<BR>
> [00:2436]....\smpd_get_smpd_data_default<BR>
> [00:2436]..../smpd_get_smpd_data_default<BR>
> [00:2436]....Unable to get the data for the key 'no_dynamic_hosts'<BR>
> [00:2436].../smpd_get_smpd_data<BR>
> [00:2436]../smpd_option_on<BR>
> [00:2436]..\smpd_insert_into_dynamic_hosts<BR>
> [00:2436]../smpd_insert_into_dynamic_hosts<BR>
> [00:2436]..\smpd_enter_at_state<BR>
> [00:2436]...sock_waiting for the next event.<BR>
> [00:2436]...SOCK_OP_ACCEPT<BR>
> [00:2436]...\smpd_handle_op_accept<BR>
> [00:2436]....\smpd_state_smpd_listening<BR>
> [00:2436].....authenticating new connection<BR>
> [00:2436].....\smpd_create_context<BR>
> [00:2436]......\smpd_init_context<BR>
> [00:2436].......\smpd_init_command<BR>
> [00:2436]......./smpd_init_command<BR>
> [00:2436]....../smpd_init_context<BR>
> [00:2436]...../smpd_create_context<BR>
> [00:2436].....\smpd_gen_authentication_strings<BR>
> [00:2436]......\smpd_hash<BR>
> [00:2436]....../smpd_hash<BR>
> [00:2436]...../smpd_gen_authentication_strings<BR>
> [00:2436].....posting a write of the challenge string: 1.0.8 7993<BR>
> [00:2436]..../smpd_state_smpd_listening<BR>
> [00:2436].../smpd_handle_op_accept<BR>
> [00:2436]...sock_waiting for the next event.<BR>
> [00:2436]...SOCK_OP_WRITE<BR>
> [00:2436]...\smpd_handle_op_write<BR>
> [00:2436]....\smpd_state_writing_challenge_string<BR>
> [00:2436].....wrote challenge string: '1.0.8 7993'<BR>
> [00:2436]..../smpd_state_writing_challenge_string<BR>
> [00:2436].../smpd_handle_op_write<BR>
> [00:2436]...sock_waiting for the next event.<BR>
> [00:2436]...SOCK_OP_READ<BR>
> [00:2436]...\smpd_handle_op_read<BR>
> [00:2436]....\smpd_state_reading_challenge_response<BR>
> [00:2436].....read challenge response: 'd6fdd96549e0c22c875ac55a2735a162'<BR>
> [00:2436]..../smpd_state_reading_challenge_response<BR>
> [00:2436].../smpd_handle_op_read<BR>
> [00:2436]...sock_waiting for the next event.<BR>
> [00:2436]...SOCK_OP_WRITE<BR>
> [00:2436]...\smpd_handle_op_write<BR>
> [00:2436]....\smpd_state_writing_connect_result<BR>
> [00:2436].....wrote connect result: 'FAIL'<BR>
> [00:2436].....connection reject string written, closing sock.<BR>
> [00:2436]..../smpd_state_writing_connect_result<BR>
> [00:2436].../smpd_handle_op_write<BR>
> [00:2436]...sock_waiting for the next event.<BR>
> [00:2436]...SOCK_OP_CLOSE<BR>
> [00:2436]...\smpd_handle_op_close<BR>
> [00:2436]....\smpd_get_state_string<BR>
> [00:2436]..../smpd_get_state_string<BR>
> [00:2436]....op_close received - SMPD_CLOSING state.<BR>
> [00:2436]....Unaffiliated undetermined context closing.<BR>
> [00:2436]....\smpd_free_context<BR>
> [00:2436].....freeing undetermined context.<BR>
> [00:2436].....\smpd_init_context<BR>
> [00:2436]......\smpd_init_command<BR>
> [00:2436]....../smpd_init_command<BR>
> [00:2436]...../smpd_init_context<BR>
> [00:2436]..../smpd_free_context<BR>
> [00:2436].../smpd_handle_op_close<BR>
> [00:2436]...sock_waiting for the next event.<BR>
><BR>
><BR>
> C:\Program Files\MPICH2\bin>mpiexec -verbose -hosts 1 10.0.0.13 -user<BR>
> mpiuser hostname<BR>
><BR>
> ..\smpd_add_host_to_default_list<BR>
> ...\smpd_add_extended_host_to_default_list<BR>
> .../smpd_add_extended_host_to_default_list<BR>
> ../smpd_add_host_to_default_list<BR>
> ..\smpd_hide_string_arg<BR>
> ...\first_token<BR>
> .../first_token<BR>
> ...\compare_token<BR>
> .../compare_token<BR>
> ...\next_token<BR>
> ....\first_token<BR>
> ..../first_token<BR>
> ....\first_token<BR>
> ..../first_token<BR>
> .../next_token<BR>
> ../smpd_hide_string_arg<BR>
> ../smpd_hide_string_arg<BR>
> ..\smpd_hide_string_arg<BR>
> ...\first_token<BR>
> .../first_token<BR>
> ...\compare_token<BR>
> .../compare_token<BR>
> ...\next_token<BR>
> ....\first_token<BR>
> ..../first_token<BR>
> ....\first_token<BR>
> ..../first_token<BR>
> .../next_token<BR>
> ../smpd_hide_string_arg<BR>
> ../smpd_hide_string_arg<BR>
> ..\smpd_get_full_path_name<BR>
> ...fixing up exe name: 'hostname' -> '(null)'<BR>
> ../smpd_get_full_path_name<BR>
> ..handling executable:<BR>
> hostname.exe<BR>
> ..\smpd_get_next_host<BR>
> ...\smpd_get_host_id<BR>
> .../smpd_get_host_id<BR>
> ../smpd_get_next_host<BR>
> ..\smpd_create_cliques<BR>
> ...\next_launch_node<BR>
> .../next_launch_node<BR>
> ...\next_launch_node<BR>
> .../next_launch_node<BR>
> ../smpd_create_cliques<BR>
> ..\smpd_fix_up_host_tree<BR>
> ../smpd_fix_up_host_tree<BR>
> ./mp_parse_command_args<BR>
> .host tree:<BR>
> . host: 10.0.0.13, parent: 0, id: 1<BR>
> .launch nodes:<BR>
> . iproc: 0, id: 1, exe: hostname.exe<BR>
> .\smpd_get_smpd_data<BR>
> ..\smpd_get_smpd_data_from_environment<BR>
> ../smpd_get_smpd_data_from_environment<BR>
> ./smpd_get_smpd_data<BR>
> .\smpd_create_context<BR>
> ..\smpd_init_context<BR>
> ...\smpd_init_command<BR>
> .../smpd_init_command<BR>
> ../smpd_init_context<BR>
> ./smpd_create_context<BR>
> .\smpd_make_socket_loop<BR>
> ..\smpd_get_hostname<BR>
> ../smpd_get_hostname<BR>
> ./smpd_make_socket_loop<BR>
> .\smpd_create_context<BR>
> ..\smpd_init_context<BR>
> ...\smpd_init_command<BR>
> .../smpd_init_command<BR>
> ../smpd_init_context<BR>
> ./smpd_create_context<BR>
> .\smpd_enter_at_state<BR>
> ..sock_waiting for the next event.<BR>
> ..SOCK_OP_CONNECT<BR>
> ..\smpd_handle_op_connect<BR>
> ...connect succeeded, posting read of the challenge string<BR>
> ../smpd_handle_op_connect<BR>
> ..sock_waiting for the next event.<BR>
> ..SOCK_OP_READ<BR>
> ..\smpd_handle_op_read<BR>
> ...\smpd_state_reading_challenge_string<BR>
> ....read challenge string: '1.0.8 7993'<BR>
> ....\smpd_verify_version<BR>
> ..../smpd_verify_version<BR>
> ....\smpd_hash<BR>
> ..../smpd_hash<BR>
> .../smpd_state_reading_challenge_string<BR>
> ../smpd_handle_op_read<BR>
> ..sock_waiting for the next event.<BR>
> ..SOCK_OP_WRITE<BR>
> ..\smpd_handle_op_write<BR>
> ...\smpd_state_writing_challenge_response<BR>
> ....wrote challenge response: 'd6fdd96549e0c22c875ac55a2735a162'<BR>
> .../smpd_state_writing_challenge_response<BR>
> ../smpd_handle_op_write<BR>
> ..sock_waiting for the next event.<BR>
> ..SOCK_OP_READ<BR>
> ..\smpd_handle_op_read<BR>
> ...\smpd_state_reading_connect_result<BR>
> ....read connect result: 'FAIL'<BR>
> ....connection rejected, server returned - FAIL<BR>
> ....\smpd_post_abort_command<BR>
> .....\smpd_create_command<BR>
> ......\smpd_init_command<BR>
> ....../smpd_init_command<BR>
> ...../smpd_create_command<BR>
> .....\smpd_add_command_arg<BR>
> ...../smpd_add_command_arg<BR>
> .....\smpd_command_destination<BR>
> ......0 -> 0 : returning NULL context<BR>
> ...../smpd_command_destination<BR>
> Aborting: unable to connect to 10.0.0.13<BR>
> ..../smpd_post_abort_command<BR>
> ....\smpd_exit<BR>
> .....\smpd_kill_all_processes<BR>
> ...../smpd_kill_all_processes<BR>
> .....\smpd_finalize_drive_maps<BR>
> ...../smpd_finalize_drive_maps<BR>
> .....\smpd_dbs_finalize<BR>
> ...../smpd_dbs_finalize<BR>
><BR>
><BR>
> Thanks for any ideas.<BR>
> regards<BR>
> K.A. Albert<BR>
><BR>
> 2009/2/26 Jayesh Krishna <jayesh@mcs.anl.gov>:<BR>
>> Hi,<BR>
>><BR>
&gt;&gt;&gt;&gt;.. I launch mpiexec.exe from another Windows user account...<BR>
>><BR>
>> This could be your problem. You can try registering a<BR>
>> username/password available on both the machines using the "-user"<BR>
>> option (mpiexec -register -user 1) & launch your job using that user<BR>
>> (mpiexec -n 2 -user 1 -hosts 2 10.0.0.10 10.0.0.13 hostname). You can<BR>
>> also validate if the user credentials are capable of launching a job<BR>
>> using the "-validate" option of mpiexec (mpiexec -validate -user 1<BR>
>> 10.0.0.10 ; mpiexec -validate -user 1 10.0.0.13)<BR>
>><BR>
>> (PS: Did you copy-paste the complete output of the mpiexec command &<BR>
>> the command itself ? Please don't remove any part of the output. This<BR>
>> will help us in debugging your problem.)<BR>
>><BR>
>> Regards,<BR>
>> Jayesh<BR>
>><BR>
>> -----Original Message-----<BR>
>> From: kiss attila [<A HREF="mailto:kissattila2008@gmail.com">mailto:kissattila2008@gmail.com</A>]<BR>
>> Sent: Thursday, February 26, 2009 12:26 AM<BR>
>> To: Jayesh Krishna<BR>
>> Subject: Re: [mpich-discuss] unable to connect ?<BR>
>><BR>
&gt;&gt; 1. Yes, the ping works fine. With wmpiconfig.exe I can see both machines.<BR>
>> 2. MPICH2 1.0.8 installed on both.<BR>
>> 3. No firewalls of any kind.<BR>
&gt;&gt; 4. On smpd -status I get:<BR>
>> smpd running on 10.0.0.10<BR>
>> smpd running on 10.0.0.13<BR>
>><BR>
>> 5. from 10.0.0.10<BR>
>> C:\Program Files\MPICH2\bin>mpiexec -hosts 2 10.0.0.10 10.0.0.13<BR>
>> hostname<BR>
>> abort: unable to connect to 10.0.0.13<BR>
>><BR>
>> from 10.0.0.13<BR>
>> C:\Program Files\MPICH2\bin>mpiexec -hosts 2 10.0.0.10 10.0.0.13<BR>
>> hostname<BR>
>> abort: unable to connect to 10.0.0.10<BR>
>><BR>
>> and here is the -verbose mode:<BR>
>><BR>
>> ...../first_token<BR>
>> .....\compare_token<BR>
>> ...../compare_token<BR>
>> .....\next_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ...../next_token<BR>
>> ..../smpd_hide_string_arg<BR>
>> ..../smpd_hide_string_arg<BR>
>> .....\smpd_option_on<BR>
>> ......\smpd_get_smpd_data<BR>
>> .......\smpd_get_smpd_data_from_environment<BR>
>> ......./smpd_get_smpd_data_from_environment<BR>
>> .......\smpd_get_smpd_data_default<BR>
>> ......./smpd_get_smpd_data_default<BR>
>> .......Unable to get the data for the key 'nocache'<BR>
>> ....../smpd_get_smpd_data<BR>
>> ...../smpd_option_on<BR>
>> ....\smpd_hide_string_arg<BR>
>> .....\first_token<BR>
>> ...../first_token<BR>
>> .....\compare_token<BR>
>> ...../compare_token<BR>
>> .....\next_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ...../next_token<BR>
>> ..../smpd_hide_string_arg<BR>
>> ..../smpd_hide_string_arg<BR>
>> .../smpd_handle_op_read<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_WRITE<BR>
>> ...\smpd_handle_op_write<BR>
>> ....\smpd_state_writing_cred_ack_yes<BR>
>> .....wrote cred request yes ack.<BR>
>> ..../smpd_state_writing_cred_ack_yes<BR>
>> .../smpd_handle_op_write<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_WRITE<BR>
>> ...\smpd_handle_op_write<BR>
>> ....\smpd_state_writing_account<BR>
>> .....wrote account: 'mpiuser'<BR>
>> .....\smpd_encrypt_data<BR>
>> ...../smpd_encrypt_data<BR>
>> ..../smpd_state_writing_account<BR>
>> .../smpd_handle_op_write<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_WRITE<BR>
>> ...\smpd_handle_op_write<BR>
>> ....\smpd_hide_string_arg<BR>
>> .....\first_token<BR>
>> ...../first_token<BR>
>> .....\compare_token<BR>
>> ...../compare_token<BR>
>> .....\next_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ...../next_token<BR>
>> ..../smpd_hide_string_arg<BR>
>> ..../smpd_hide_string_arg<BR>
>> .....\smpd_hide_string_arg<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ......\compare_token<BR>
>> ....../compare_token<BR>
>> ......\next_token<BR>
>> .......\first_token<BR>
>> ......./first_token<BR>
>> .......\first_token<BR>
>> ......./first_token<BR>
>> ....../next_token<BR>
>> ...../smpd_hide_string_arg<BR>
>> ...../smpd_hide_string_arg<BR>
>> ....\smpd_hide_string_arg<BR>
>> .....\first_token<BR>
>> ...../first_token<BR>
>> .....\compare_token<BR>
>> ...../compare_token<BR>
>> .....\next_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ......\first_token<BR>
>> ....../first_token<BR>
>> ...../next_token<BR>
>> ..../smpd_hide_string_arg<BR>
>> ..../smpd_hide_string_arg<BR>
>> .../smpd_handle_op_write<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_READ<BR>
>> ...\smpd_handle_op_read<BR>
>> ....\smpd_state_reading_process_result<BR>
>> .....read process session result: 'SUCCESS'<BR>
>> ..../smpd_state_reading_process_result<BR>
>> .../smpd_handle_op_read<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_READ<BR>
>> ...\smpd_handle_op_read<BR>
>> ....\smpd_state_reading_reconnect_request<BR>
>> .....read re-connect request: '3972'<BR>
>> .....closing the old socket in the left context.<BR>
>> .....MPIDU_Sock_post_close(1720)<BR>
>> .....connecting a new socket.<BR>
>> .....\smpd_create_context<BR>
>> ......\smpd_init_context<BR>
>> .......\smpd_init_command<BR>
>> ......./smpd_init_command<BR>
>> ....../smpd_init_context<BR>
>> ...../smpd_create_context<BR>
>> .....posting a re-connect to 10.0.0.10:3972 in left context.<BR>
>> ..../smpd_state_reading_reconnect_request<BR>
>> .../smpd_handle_op_read<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_CLOSE<BR>
>> ...\smpd_handle_op_close<BR>
>> ....\smpd_get_state_string<BR>
>> ..../smpd_get_state_string<BR>
>> ....op_close received - SMPD_CLOSING state.<BR>
>> ....Unaffiliated left context closing.<BR>
>> ....\smpd_free_context<BR>
>> .....freeing left context.<BR>
>> .....\smpd_init_context<BR>
>> ......\smpd_init_command<BR>
>> ....../smpd_init_command<BR>
>> ...../smpd_init_context<BR>
>> ..../smpd_free_context<BR>
>> .../smpd_handle_op_close<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_CONNECT<BR>
>> ...\smpd_handle_op_connect<BR>
>> ....\smpd_generate_session_header<BR>
>> .....session header: (id=1 parent=0 level=0)<BR>
&gt;&gt; ..../smpd_generate_session_header<BR>
&gt;&gt; .../smpd_handle_op_connect<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_WRITE<BR>
>> ...\smpd_handle_op_write<BR>
>> ....\smpd_state_writing_session_header<BR>
>> .....wrote session header: 'id=1 parent=0 level=0'<BR>
>> .....\smpd_post_read_command<BR>
&gt;&gt; ......posting a read for a command header on the left context, sock 1656<BR>
&gt;&gt; ...../smpd_post_read_command<BR>
&gt;&gt; .....creating connect command for left node<BR>
&gt;&gt; .....creating connect command to '10.0.0.13'<BR>
>> .....\smpd_create_command<BR>
>> ......\smpd_init_command<BR>
>> ....../smpd_init_command<BR>
>> ...../smpd_create_command<BR>
>> .....\smpd_add_command_arg<BR>
>> ...../smpd_add_command_arg<BR>
>> .....\smpd_add_command_int_arg<BR>
>> ...../smpd_add_command_int_arg<BR>
>> .....\smpd_post_write_command<BR>
>> ......\smpd_package_command<BR>
>> ....../smpd_package_command<BR>
>> ......smpd_post_write_command on the left context sock 1656: 65 bytes<BR>
>> for<BR>
>> command: "cmd=connect src=0 dest=1 tag=0 host=10.0.0.13 id=2 "<BR>
>> ...../smpd_post_write_command<BR>
>> .....not connected yet: 10.0.0.13 not connected<BR>
>> ..../smpd_state_writing_session_header<BR>
>> .../smpd_handle_op_write<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_WRITE<BR>
>> ...\smpd_handle_op_write<BR>
>> ....\smpd_state_writing_cmd<BR>
>> .....wrote command<BR>
>> .....command written to left: "cmd=connect src=0 dest=1 tag=0<BR>
>> host=10.0.0.13 id=2 "<BR>
>> .....moving 'connect' command to the wait_list.<BR>
>> ..../smpd_state_writing_cmd<BR>
>> .../smpd_handle_op_write<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_READ<BR>
>> ...\smpd_handle_op_read<BR>
>> ....\smpd_state_reading_cmd_header<BR>
>> .....read command header<BR>
>> .....command header read, posting read for data: 69 bytes<BR>
>> ..../smpd_state_reading_cmd_header<BR>
>> .../smpd_handle_op_read<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_READ<BR>
>> ...\smpd_handle_op_read<BR>
>> ....\smpd_state_reading_cmd<BR>
>> .....read command<BR>
>> .....\smpd_parse_command<BR>
>> ...../smpd_parse_command<BR>
>> .....read command: "cmd=abort src=1 dest=0 tag=0 error="unable to<BR>
>> connect to 10.0.0.13" "<BR>
>> .....\smpd_handle_command<BR>
>> ......handling command:<BR>
>> ...... src = 1<BR>
>> ...... dest = 0<BR>
>> ...... cmd = abort<BR>
>> ...... tag = 0<BR>
>> ...... ctx = left<BR>
>> ...... len = 69<BR>
>> ...... str = cmd=abort src=1 dest=0 tag=0 error="unable to connect<BR>
>> to 10.0.0.13"<BR>
>> ......\smpd_command_destination<BR>
>> .......0 -> 0 : returning NULL context<BR>
&gt;&gt; ....../smpd_command_destination<BR>
&gt;&gt; ......\smpd_handle_abort_command<BR>
&gt;&gt; .......abort: unable to connect to 10.0.0.13<BR>
&gt;&gt; ....../smpd_handle_abort_command<BR>
&gt;&gt; ...../smpd_handle_command<BR>
&gt;&gt; .....\smpd_post_read_command<BR>
&gt;&gt; ......posting a read for a command header on the left context, sock 1656<BR>
&gt;&gt; ...../smpd_post_read_command<BR>
&gt;&gt; .....\smpd_create_command<BR>
&gt;&gt; ......\smpd_init_command<BR>
&gt;&gt; ....../smpd_init_command<BR>
&gt;&gt; ...../smpd_create_command<BR>
&gt;&gt; .....\smpd_post_write_command<BR>
&gt;&gt; ......\smpd_package_command<BR>
&gt;&gt; ....../smpd_package_command<BR>
&gt;&gt; ......smpd_post_write_command on the left context sock 1656: 43 bytes for command: "cmd=close src=0 dest=1 tag=1 "<BR>
&gt;&gt; ...../smpd_post_write_command<BR>
>> ..../smpd_state_reading_cmd<BR>
>> .../smpd_handle_op_read<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_READ<BR>
>> ...\smpd_handle_op_read<BR>
>> ....\smpd_state_reading_cmd_header<BR>
>> .....read command header<BR>
>> .....command header read, posting read for data: 31 bytes<BR>
>> ..../smpd_state_reading_cmd_header<BR>
>> .../smpd_handle_op_read<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_WRITE<BR>
>> ...\smpd_handle_op_write<BR>
>> ....\smpd_state_writing_cmd<BR>
>> .....wrote command<BR>
>> .....command written to left: "cmd=close src=0 dest=1 tag=1 "<BR>
>> .....\smpd_free_command<BR>
>> ......\smpd_init_command<BR>
>> ....../smpd_init_command<BR>
>> ...../smpd_free_command<BR>
>> ..../smpd_state_writing_cmd<BR>
>> .../smpd_handle_op_write<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_READ<BR>
>> ...\smpd_handle_op_read<BR>
>> ....\smpd_state_reading_cmd<BR>
>> .....read command<BR>
>> .....\smpd_parse_command<BR>
>> ...../smpd_parse_command<BR>
>> .....read command: "cmd=closed src=1 dest=0 tag=1 "<BR>
>> .....\smpd_handle_command<BR>
>> ......handling command:<BR>
>> ...... src = 1<BR>
>> ...... dest = 0<BR>
>> ...... cmd = closed<BR>
>> ...... tag = 1<BR>
>> ...... ctx = left<BR>
>> ...... len = 31<BR>
>> ...... str = cmd=closed src=1 dest=0 tag=1<BR>
&gt;&gt; ......\smpd_command_destination<BR>
&gt;&gt; .......0 -> 0 : returning NULL context<BR>
&gt;&gt; ....../smpd_command_destination<BR>
&gt;&gt; ......\smpd_handle_closed_command<BR>
&gt;&gt; .......closed command received from left child, closing sock.<BR>
&gt;&gt; .......MPIDU_Sock_post_close(1656)<BR>
&gt;&gt; .......received a closed at node with no parent context, assuming root, returning SMPD_EXITING.<BR>
&gt;&gt; ....../smpd_handle_closed_command<BR>
&gt;&gt; ...../smpd_handle_command<BR>
&gt;&gt; .....not posting read for another command because SMPD_EXITING returned<BR>
&gt;&gt; ..../smpd_state_reading_cmd<BR>
&gt;&gt; .../smpd_handle_op_read<BR>
>> ...sock_waiting for the next event.<BR>
>> ...SOCK_OP_CLOSE<BR>
>> ...\smpd_handle_op_close<BR>
>> ....\smpd_get_state_string<BR>
>> ..../smpd_get_state_string<BR>
>> ....op_close received - SMPD_EXITING state.<BR>
>> ....\smpd_free_context<BR>
>> .....freeing left context.<BR>
>> .....\smpd_init_context<BR>
>> ......\smpd_init_command<BR>
>> ....../smpd_init_command<BR>
>> ...../smpd_init_context<BR>
>> ..../smpd_free_context<BR>
>> .../smpd_handle_op_close<BR>
>> ../smpd_enter_at_state<BR>
>> ./main<BR>
>> .\smpd_exit<BR>
>> ..\smpd_kill_all_processes<BR>
>> ../smpd_kill_all_processes<BR>
>> ..\smpd_finalize_drive_maps<BR>
>> ../smpd_finalize_drive_maps<BR>
>> ..\smpd_dbs_finalize<BR>
>> ../smpd_dbs_finalize<BR>
>><BR>
&gt;&gt; I have registered with wmpiregister.exe the same user with the same<BR>
&gt;&gt; password on both computers, but I launch mpiexec.exe from another<BR>
&gt;&gt; Windows user account; could this be a problem? Thanks<BR>
>><BR>
>> regards<BR>
>> k.a.albert<BR>
>><BR>
>><BR>
>><BR>
>><BR>
>> 2009/2/25 Jayesh Krishna <jayesh@mcs.anl.gov>:<BR>
>>> Hi,<BR>
>>><BR>
>>> # Can you ping the machines from each other ?<BR>
>>> # Make sure that you have the same version of MPICH2 installed on<BR>
>>> both the machines.<BR>
>>> # Do you have any firewalls (windows, third-party) running on the<BR>
>>> machines (Turn off any firewalls running on the machines)?<BR>
>>> # Make sure that you have the MPICH2 process manager, smpd.exe,<BR>
>>> running as a service on both the machines (To check the status of<BR>
>>> the process manager type, smpd -status, at the command prompt).<BR>
>>> # Before trying to execute an MPI program like cpi.exe, try<BR>
>>> executing a non-MPI program like hostname on the machines (mpiexec<BR>
>>> -hosts 2 10.0.0.10<BR>
>>> 10.0.0.13 hostname).<BR>
>>><BR>
>>> Let us know the results.<BR>
>>><BR>
>>> (PS: In your reply please copy-paste the commands and the output)<BR>
>>> Regards, Jayesh<BR>
>>><BR>
>>><BR>
>>><BR>
>>> -----Original Message-----<BR>
>>> From: mpich-discuss-bounces@mcs.anl.gov<BR>
>>> [<A HREF="mailto:mpich-discuss-bounces@mcs.anl.gov">mailto:mpich-discuss-bounces@mcs.anl.gov</A>] On Behalf Of kiss attila<BR>
>>> Sent: Wednesday, February 25, 2009 1:46 PM<BR>
>>> To: mpich-discuss@mcs.anl.gov<BR>
>>> Subject: [mpich-discuss] unable to connect ?<BR>
>>><BR>
>>> Hi<BR>
>>><BR>
&gt;&gt;&gt; I have two WinXP machines (10.0.0.13, 10.0.0.10) with MPICH2<BR>
&gt;&gt;&gt; installed, and when I run this command:<BR>
&gt;&gt;&gt; "D:\Program Files\MPICH2\bin\mpiexec.exe" -hosts 2 10.0.0.10<BR>
&gt;&gt;&gt; 10.0.0.13 -noprompt c:\ex\cpi.exe<BR>
>>><BR>
>>> I get:<BR>
>>><BR>
>>> Aborting: unable to connect to 10.0.0.10<BR>
>>><BR>
&gt;&gt;&gt; Somehow I can't start any process on the remote machine (10.0.0.10).<BR>
&gt;&gt;&gt; It annoys me that it worked a few days ago, but I had to reinstall<BR>
&gt;&gt;&gt; one of the machines, and since then I couldn't figure out what's wrong<BR>
&gt;&gt;&gt; with my settings. Thanks.<BR>
>>><BR>
>>> regards<BR>
>>> K.A. Albert<BR>
>>><BR>
>><BR>
><BR>
</FONT>
</P>
</BODY>
</HTML>