[MPICH] MPICH between Playstation3 and Intel PC?
William Gropp
gropp at mcs.anl.gov
Mon Mar 12 09:22:17 CDT 2007
This most likely means that the p4 code got the byte order wrong for
this platform. The p4 part of the MPICH1 code predates tools like
configure and portable operating systems, and uses a simple
identification with the OS name to determine the architectural
parameters. You might need to add a new machine type for the
Playstation so that MPICH1 gets the right byte order. Let us know if
you need help finding that code (it's in the ch_p4/p4/lib directory).
Bill
On Mar 8, 2007, at 6:56 AM, Jan Wagner wrote:
>
> On Thu, 8 Mar 2007, Jan Wagner wrote:
>> I am trying to get an MPICH 1.2.7 program to work on Playstation,
>> and execute an MPICH 1.2.5 program on an Intel PC (named "warp"
>> below). However, after a short while the Init gives me a
>> net_conn_to_listener error.
>>
>> It's the same when executing the remote exec command on the
>> command line:
>>
>> [jwagner at ps3-001 ~]$ ssh warp -l jwagner -n /usr/bin/mpifxcorr
>> ps3-001.kurp.hut.fi 56995 \-p4amslave \-p4yourname warp
>> DiFX Intel IPP Version
>> About to run MPIInit
>> rm_29828: p4_error: rm_start: net_conn_to_listener failed: 56995
>
> Well, almost there now. I had to shut down iptables on the
> Playstation; the random port forwarding that MPI wants to set up
> did not really work for some reason. Without the firewall it works.
>
> But now it crashes with:
>
> DiFX Generic CPU Version
> About to run MPIInit
> p2_859: p4_error: Could not allocate memory for commandline args:
> 889192448
> rm_l_2_871: (1.829863) net_send: could not write to fd=5, errno = 32
> 0.16user 0.39system 0:03.69elapsed 15%CPU (0ap1_4309: p4_error:
> net_recv read: probable EOF on socket: 1
> p3_875: p4_error: Could not allocate memory for commandline args:
> 889192448
> rm_l_3_886: (1.346863) net_send: could not write to fd=5, errno = 32
>
> Any ideas?
>
> I'm using p4, by the way. And starting the process with
> $ declare -x P4_RSHCOMMAND="ssh"
> $ declare -x RSHCOMMAND="ssh"
> $ mpirun -v -np 6 -machinefile /nfs/mpitest/machines.CLUSTER \
>     /usr/bin/mpitest /nfs/mpitest/data.in.cfg
>
> Wasn't MPICH-1 supposed to be for heterogeneous clusters too?
>
> - Jan
>