[mpich-discuss] MPI Fatal error, but only with more cluster nodes!

Rajeev Thakur thakur at mcs.anl.gov
Thu Sep 2 12:57:49 CDT 2010


Try running the cpi example from the MPICH2 examples directory across two machines. There could be a connection issue between the two machines.

Rajeev

On Sep 2, 2010, at 8:20 AM, Fabio F.Gervasi wrote:

> Hi,
> 
> I have a "strange" MPI problem when I run a WRF-NMM model compiled with Intel v11.1.072 (the GNU build runs fine).
> MPICH2-1.2 is also compiled with Intel.
> 
> If I run on a single quad-core machine everything is OK, but when I try two or more quad-core machines,
> the wrf.exe processes initially seem to start on every PC, but after a few seconds wrf stops and I get the error:
> Fatal error in MPI_Allreduce other mpi error error stack.. and so on...
> 
> I just set "ulimit -s unlimited"; otherwise wrf crashes even on a single machine...
> 
> This is probably an MPI problem, but how can I fix it?
> 
> Thank you very much
> Fabio.


