[mpich-discuss] FW: MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
Pavan Balaji
balaji at mcs.anl.gov
Fri Nov 4 09:57:12 CDT 2011
On 11/04/2011 03:57 AM, Miguel Angel Fernández wrote:
> I thought I had answered your email... anyway, I'm doing too many things at
> the same time ;-)
If it was sent to me directly instead of the mpich-discuss mailing list,
it was probably ignored. Please don't do that.
> Yes, all machines can communicate with each other.
Do you mean that they can ssh to each other, or that they can communicate
over any port?
> Attached, you have the output of the commands "configure", "make" and
> "make install" for users "mpi" and "root".
It doesn't matter which user you are doing this as, i.e., "mpi" or
"root". Let's just pick one to avoid confusion. The build seems to have
gone through fine.
So far, as I understand it, the following works correctly:
% mpiexec -f machinefile hostname
But the following does not:
% mpiexec -f machinefile ./mpi_application
Assuming the above is true, my guess is that there is a firewall issue
between the nodes. Running hostname does not make the processes connect
to each other the way an MPI application does. Note that many firewalls
allow traffic on port 22, which ssh uses, so you won't notice the problem
with ssh.
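One rough way to check whether two nodes can reach each other on a port
other than 22 is sketched below. This is only an illustration, not part of
MPICH: the helper names open_listener and can_connect are made up for this
example, and "node1" in the usage comments is a placeholder hostname. It
assumes Python is available on both nodes.

```python
import socket


def open_listener(port=0):
    """Listen for TCP connections on all interfaces.

    port=0 asks the operating system to pick a free port.
    Returns the listening socket and the port number actually used.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(1)
    return srv, srv.getsockname()[1]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


# Hypothetical usage, run by hand in an interactive Python session:
#   on the first node:   srv, port = open_listener(); print(port)
#   on the second node:  can_connect("node1", <port printed above>)
```

If can_connect returns False for such a port while ssh between the same
two nodes works, that points to a firewall blocking the other ports.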
> I sent to your personal email "balaji at mcs.anl.gov" one document with the
> configuration I am using. Maybe you can find the thing I am doing wrong.
Please send all emails to the mailing list.
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji