[mpich2-dev] configuration problem
Pavan Balaji
balaji at mcs.anl.gov
Fri Apr 16 16:30:50 CDT 2010
It looks like one of the application processes died for some reason.
You can pass the -print-all-exitcodes option to mpiexec to see the exit
code of each process. Once you know which process is the culprit, you
can run it under a debugger to figure out what's going on.
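As a rough sketch of how to read the reported codes: the helper below is
hypothetical (describe_exit is not part of MPICH2) and assumes the codes
follow the usual shell convention, where a value above 128 means the
process was killed by signal (value - 128). The exact format mpiexec
prints may differ between versions, so treat this as an illustration
only.

```shell
# Run the job with exit-code reporting first, for example:
#   mpiexec -f host -np 3 -print-all-exitcodes ./abinit < tnlo.files

# Hypothetical helper: classify a single exit code, assuming the
# shell's 128+N convention for a process killed by signal N.
describe_exit() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "exited normally"
    elif [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
    else
        echo "exited with error code $code"
    fi
}

# 143 = 128 + 15, i.e. SIGTERM, matching "Terminated (signal 15)".
describe_exit 143
```

A process that was merely terminated with signal 15 is usually a victim
of the cleanup, not the cause; the one that exited first with a
different code is the better candidate for the debugger.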
For example, if you find that the second process is behaving badly, you
can use:
% mpiexec -f host -np 1 ./abinit : -np 1 ddd ./abinit : -np 1 ./abinit < tnlo.files
This will bring up a ddd debugger window only for the second process.
Alternatively, you can open a separate debugger window for each process
using:
% mpiexec -f host -np 3 ddd ./abinit < tnlo.files
-- Pavan
On 04/16/2010 04:22 PM, lagoun brahim wrote:
>
> HELLO :-)
> Thank you, Pavan, for your reply.
> My problem was with the firewall.
> Now I have another problem:
> I created the host file (br:2/dft:2). When I start the calculation with
> the following command: mpiexec -f host -np 3(4) apl, I get the following
> error message:
> mpiexec -f host -np 3 abinit<tnlo.files
> ABINIT
>
> Give name for formatted input file:
> tnlo2.in
> Give name for formatted output file:
> tnlo2.out
> Give root name for generic input files:
> tnlo2i
> Give root name for generic output files:
> tnlo2o
> Give root name for generic temporary files:
> tnlo2
> Fatal error in PMPI_Barrier: Other MPI error, error stack:
> PMPI_Barrier(476).................: MPI_Barrier(MPI_COMM_WORLD) failed
> MPIR_Barrier(82)..................:
> MPIC_Sendrecv(161)................:
> MPIC_Wait(513)....................:
> MPIDI_CH3I_Progress(150)..........:
> MPID_nem_mpich2_blocking_recv(948):
> MPID_nem_tcp_connpoll(1709).......: Communication error
> Fatal error in PMPI_Barrier: Other MPI error, error stack:
> PMPI_Barrier(476).................: MPI_Barrier(MPI_COMM_WORLD) failed
> MPIR_Barrier(82)..................:
> MPIC_Sendrecv(161)................:
> MPIC_Wait(513)....................:
> MPIDI_CH3I_Progress(150)..........:
> MPID_nem_mpich2_blocking_recv(948):
> MPID_nem_tcp_connpoll(1709).......: Communication error
> Terminated (signal 15)
>
> thank you again for your help
>
>
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji