[MPICH] communication between windows and linux

Jayesh Krishna jayesh at mcs.anl.gov
Fri Dec 21 12:17:54 CST 2007


 Hi,
  You can use either the 32-bit or the 64-bit version of MPICH2 to run
processes spanning linux and windows machines. You have to make sure that
all the machines involved have the same version installed, either all 32-bit
or all 64-bit (as I mentioned before, the only config not supported is
communication across 32-bit and 64-bit machines).
  An early version of heterogeneous support for MPICH2 is tentatively
slated for summer next year.

Regards,
Jayesh

-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
Sent: Friday, December 21, 2007 11:50 AM
To: 'mpich2'
Subject: RE: [MPICH] communication between windows and linux

Hi!

On both sides I have started smpd (on linux, compiled with --with-pm=smpd).
I will try a homogeneous config. Should I choose the 64-bit or the 32-bit
configuration (for both linux and windows)? First of all, is MPICH2 working
and usable on 64-bit Windows (XP or Vista), or should I use 32-bit linux?

When do you expect heterogeneous MPICH2? 

Thanks for your answer!


Jarek!


On Fri, 2007-12-21 at 10:54 -0600, Jayesh Krishna wrote:
> Hi,
>   The only process manager available on MPICH2 for windows is SMPD. 
> You should configure MPICH2 on the linux machine to use SMPD
> (./configure ... --with-pm=smpd ...).
>   Also note that we currently don't have support for heterogeneous 
> machines in MPICH2 (You cannot run your program across 32-bit and 64-bit
> machines).
> Heterogeneous support is slated for a future release (sometime next year).
> 
> (PS: The default process manager on the linux side is MPD.)
> 
> Regards,
> Jayesh
> 
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Jaroslaw Bulat
> Sent: Friday, December 21, 2007 10:11 AM
> To: mpich2
> Subject: [MPICH] communication between windows and linux
> 
> Hi!
> 
> I cannot establish communication between linux and windows. On both 
> sides I have MPICH2-1.0.6p1 with smpd. In order to test communication, 
> I wrote a simple ping-pong-like program (two programs, C sources in the
> attachment).
> When both are running on either the linux or the windows side, everything 
> is correct. When one program is executed on linux and the second on 
> windows, the ring crashes during the first message passing. Below are some 
> debug outputs:
> 
> Both programs (ssimple_sender and ssimple_receiver) are running on a 
> windows box (149.156.197.238); however, they are started from a linux 
> box (the mpiexec command is executed in a linux environment: a different 
> computer, with a different IP). Debug output from both programs is mixed 
> and appears on the linux box. In this case everything is correct.
> 
> mpiexec -n 1 -host 149.156.197.238
>   -path "C:\Documents and Settings\kwant\Desktop\workspace\ssimple_sender\Debug"
>   ssimple_sender/Debug/ssimple_sender :
>   -n 1 -host 149.156.197.238
>   -path "C:\Documents and Settings\kwant\Desktop\workspace\ssimple_receiver\Debug"
>   ssimple_receiver
>     REC: init_1
> SEND: init_1
>     REC: init_2
>     REC: init_3 numprocs (2): 2
>     REC: init_4 myid (1): 1
>     REC: loop 1
> SEND: init_2
> SEND: init_3 numprocs (2): 2
> SEND: init_4 myid (0): 0
> SEND: loop 1
>     REC: loop 2_Irecv
>     REC: loop 3_Wait
> SEND: loop 2_Irecv
> SEND: loop 3_Send, data: 0
> SEND: loop 4_Sent
> SEND: loop 5_Wait
> SEND: loop 6_Rec, data: 0
> ------ SEND -------- SEND -------- SEND ---------
> SEND: loop 1
>     REC: loop 4_Send, data: 0
>     REC: loop 4_Sent
>     REC: loop 1
>     REC: loop 2_Irecv
>     REC: loop 3_Wait
> SEND: loop 2_Irecv
> SEND: loop 3_Send, data: 1
> SEND: loop 4_Sent
> SEND: loop 5_Wait
> SEND: loop 6_Rec, data: 1
> ------ SEND -------- SEND -------- SEND --------- etc.....
> 
> 
> ssimple_sender is started locally (linux - zm203.zmet.agh.edu.pl) and 
> ssimple_receiver remotely (windows - 149.156.197.238). The ring is started 
> from linux, and both die before the first MPI_Send(...) finishes. Below are 
> the debug output and exit code:
> 
> mpiexec -n 1 -localonly ssimple_sender/Debug/ssimple_sender :
>   -n 1 -host 149.156.197.238
>   -path "C:\Documents and Settings\kwant\Desktop\workspace\ssimple_receiver\Debug"
>   -channel sock ssimple_receiver
> SEND: init_1
>     REC: init_1
> SEND: init_2
> SEND: init_3 numprocs (2): 2
> SEND: init_4 myid (0): 0
> SEND: loop 1
>     REC: init_2
>     REC: init_3 numprocs (2): 2
>     REC: init_4 myid (1): 1
>     REC: loop 1
>     REC: loop 2_Irecv
>     REC: loop 3_Wait
> SEND: loop 2_Irecv
> SEND: loop 3_Send, data: 0
> 
> job aborted:
> rank: node: exit code[: error message]
> 0: zm203.zmet.agh.edu.pl: -2
> 1: 149.156.197.238: 1: Fatal error in MPI_Wait: Other MPI error, error stack:
> MPI_Wait(156)................................:
> MPI_Wait(request=003E3DB0, status003E2A08) failed
> MPIDI_CH3i_Progress_wait(215)................: an error occurred while 
> handling an event returned by MPIDU_Sock_Wait()
> MPIDI_CH3I_Progress_handle_sock_event(640)...: 
> MPIDI_CH3_Sockconn_handle_connopen_event(887): unable to find the 
> process group structure with id <>
> 
> 
> Both computers are x86_64 boxes; linux is native 64-bit, windows is the 
> 32-bit version. I'm using MinGW on windows (g++ -v: gcc version 3.4.5 
> (mingw special)), and gcc version 4.1.2 (Gentoo 4.1.2) on Linux.
> 
> Could anyone help me with this broken communication?
> 
> 
> Jarek!
> 




