[MPICH] communication between windows and linux
Jayesh Krishna
jayesh at mcs.anl.gov
Fri Dec 21 12:17:54 CST 2007
Hi,
You can use either the 32-bit or the 64-bit version of MPICH2 to run
processes spanning linux and windows machines. You have to make sure that
all the machines involved have the same version installed, either all
32-bit or all 64-bit (as I mentioned before, the only configuration not
supported is communication across 32-bit and 64-bit machines).
An early version of heterogeneous support for MPICH2 is tentatively
slated for summer next year.
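If it is unclear which word size a given toolchain targets, a small check like the one below can be compiled on each machine with the same compiler used for the MPI programs. This is only a sketch: pointer_bits and describe_build are hypothetical helper names, not part of MPICH2, and the check assumes the compiler's target matches the MPICH2 build installed on that machine.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an MPICH2 function): pointer width, in bits,
   of the target this file was compiled for (typically 32 or 64). */
int pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Hypothetical helper: writes a one-line summary of the build into buf
   so the lines produced on different machines can be compared.
   Returns the value of snprintf (characters needed, excluding '\0'). */
int describe_build(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "pointer=%d bits, long=%d bits",
                    pointer_bits(), (int)(sizeof(long) * CHAR_BIT));
}
```

Printing the result of describe_build on every host and comparing the pointer width is one way to spot a 32-bit/64-bit mismatch before starting a job. The width of long is shown only for information, since it can differ between platforms even when the pointer widths agree.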
Regards,
Jayesh
-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
Sent: Friday, December 21, 2007 11:50 AM
To: 'mpich2'
Subject: RE: [MPICH] communication between windows and linux
Hi!
On both sides I have started smpd (on linux compiled with --with-pm=smpd). I
will try a homogeneous config. Should I choose the 64-bit or the 32-bit
configuration (for both linux and windows)? First of all, is MPICH2 working
and usable on 64-bit Windows (XP or Vista), or should I use 32-bit linux?
When do you expect heterogeneous MPICH2?
Thanks for the answer!
Jarek!
On Fri, 2007-12-21 at 10:54 -0600, Jayesh Krishna wrote:
> Hi,
> The only process manager available on MPICH2 for windows is SMPD.
> You should configure MPICH2 on the linux machine to use SMPD
> (./configure ... --with-pm=smpd ...).
> Also note that we currently don't have support for heterogeneous
> machines in MPICH2 (you cannot run your program across 32-bit and 64-bit
> machines).
> Heterogeneous support is slated for a future release (sometime next year).
>
> (PS: The default process manager on the linux side is MPD.)
>
> Regards,
> Jayesh
>
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
> Sent: Friday, December 21, 2007 10:11 AM
> To: mpich2
> Subject: [MPICH] communication between windows and linux
>
> Hi!
>
> I cannot establish communication between linux and windows. On both
> sides I have MPICH2-1.0.6p1 with smpd. In order to test communication,
> I wrote a simple ping-pong-like program (two programs, C sources in
> attachment).
> When both run on the linux side, or both on the windows side, everything
> is correct. When one program is executed on linux and the second on
> windows, the ring crashes during the first message passing. Below are
> some debug outputs:
>
> Both programs (ssimple_sender and ssimple_receiver) are running on a
> windows box (149.156.197.238); however, they are started from a linux
> box (the mpiexec command is executed in a linux environment, on a
> different computer with a different IP). Debug output from both programs
> is mixed and appears on the linux box. In this case everything is correct.
>
> mpiexec -n 1 -host 149.156.197.238 -path "C:\Documents and Settings
> \kwant\Desktop\workspace\ssimple_sender\Debug"
> ssimple_sender/Debug/ssimple_sender : -n 1 -host 149.156.197.238 -path
> "C:\Documents and Settings\kwant\Desktop\workspace\ssimple_receiver
> \Debug" ssimple_receiver
> REC: init_1
> SEND: init_1
> REC: init_2
> REC: init_3 numprocs (2): 2
> REC: init_4 myid (1): 1
> REC: loop 1
> SEND: init_2
> SEND: init_3 numprocs (2): 2
> SEND: init_4 myid (0): 0
> SEND: loop 1
> REC: loop 2_Irecv
> REC: loop 3_Wait
> SEND: loop 2_Irecv
> SEND: loop 3_Send, data: 0
> SEND: loop 4_Sent
> SEND: loop 5_Wait
> SEND: loop 6_Rec, data: 0
> ------ SEND -------- SEND -------- SEND ---------
> SEND: loop 1
> REC: loop 4_Send, data: 0
> REC: loop 4_Sent
> REC: loop 1
> REC: loop 2_Irecv
> REC: loop 3_Wait
> SEND: loop 2_Irecv
> SEND: loop 3_Send, data: 1
> SEND: loop 4_Sent
> SEND: loop 5_Wait
> SEND: loop 6_Rec, data: 1
> ------ SEND -------- SEND -------- SEND --------- etc.....
>
>
> simple_sender is started locally (linux - zm203.zmet.agh.edu.pl) and
> simple_receiver remotely (windows - 149.156.197.238). The ring is started
> from linux, and both processes die before the first MPI_Send(...)
> finishes. Below are the debug output and exit code:
>
> mpiexec -n 1 -localonly ssimple_sender/Debug/ssimple_sender : -n 1
> -host
> 149.156.197.238 -path "C:\Documents and
> Settings\kwant\Desktop\workspace \ssimple_receiver\Debug" -channel
> sock ssimple_receiver
> SEND: init_1
> REC: init_1
> SEND: init_2
> SEND: init_3 numprocs (2): 2
> SEND: init_4 myid (0): 0
> SEND: loop 1
> REC: init_2
> REC: init_3 numprocs (2): 2
> REC: init_4 myid (1): 1
> REC: loop 1
> REC: loop 2_Irecv
> REC: loop 3_Wait
> SEND: loop 2_Irecv
> SEND: loop 3_Send, data: 0
>
> job aborted:
> rank: node: exit code[: error message]
> 0: zm203.zmet.agh.edu.pl: -2
> 1: 149.156.197.238: 1: Fatal error in MPI_Wait: Other MPI error, error
> stack:
> MPI_Wait(156)................................:
> MPI_Wait(request=003E3DB0, status003E2A08) failed
> MPIDI_CH3i_Progress_wait(215)................: an error occurred while
> handling an event returned by MPIDU_Sock_Wait()
> MPIDI_CH3I_Progress_handle_sock_event(640)...:
> MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the
> process group structure with id <>
>
>
> Both computers are x86_64 boxes; linux is native 64-bit, windows is the
> 32-bit version. I'm using MinGW on windows (g++ -v: gcc version 3.4.5
> (mingw special)), and gcc version 4.1.2 (Gentoo 4.1.2) on Linux.
>
> Could anyone help me with this broken communication?
>
>
> Jarek!
>