[MPICH] RE: rm_3422: p4_error: interrupt SIGSEGV: 11
Rajeev Thakur
thakur at mcs.anl.gov
Mon Feb 18 18:00:36 CST 2008
Nothing needs to be on the child nodes except the executables. When
compiling, make sure that no mpif.h files already exist in any of the
application directories. Some Fortran applications come with those files,
and they are not compatible across implementations. Other than that I don't
know what the problem might be. Or you can check with the wrf application
developers.
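A minimal sketch of the mpif.h check, assuming the application source tree is passed as the first argument. The function name find_stray_mpif is hypothetical and is not part of MPICH or WRF:

```shell
# List every mpif.h found under an application source tree.
# Any copy printed here may shadow the mpif.h that belongs to the MPI
# installation used for compiling, so it is a candidate for removal or
# renaming before rebuilding.
find_stray_mpif() {
    # $1: top of the application source tree (defaults to the current directory)
    find "${1:-.}" -type f -name mpif.h -print
}
```

For example, `find_stray_mpif /d7/user/Dir7` would search the working directory shown in the original post; no output means no stray copies were found there.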
Rajeev
_____
From: Mina Azer [mailto:azer at envsci.rutgers.edu]
Sent: Monday, February 18, 2008 5:19 PM
To: Rajeev Thakur
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: rm_3422: p4_error: interrupt SIGSEGV: 11
The entire application is compiled using the same MPI. The program does run
on all 11 other child nodes; only the recently reinstalled CentOS 5 node
fails. Which files need to be on the child nodes for mpirun to work correctly?
Rajeev Thakur wrote:
Make sure that there are no mpif.h files lying around in any of the
application directories and make sure that the entire application is
compiled using the same MPI implementation.
Rajeev
_____
From: Mina Azer [mailto:azer at envsci.rutgers.edu]
Sent: Monday, February 18, 2008 4:00 PM
To: mpich-discuss at mcs.anl.gov
Subject: rm_3422: p4_error: interrupt SIGSEGV: 11
Hello all,
I am new to MPICH. I have read and googled about my question but
couldn't find a solution.
We have a cluster consisting of a head node and 12 child nodes, all running
Red Hat AS v3.
Head node:
running PGI + MPICH 1.2.6.2 + WRF
NFS-shared directory exported to all child nodes, where we keep the programs
we want to run with MPICH
we use rsh to log in, since the system is on a private network
a node list with the node names is created and saved in the shared NFS directory
Recently I reinstalled one node with CentOS 5, and since then we have not been
able to run mpirun on it:
/usr/local/mpich-1.2.6-2/bin/mpirun -p4pg nodes.list wrf.exe
rm_3422: p4_error: interrupt SIGSEGV: 11
Segmentation fault
p0_22242: p4_error: Child process exited while making connection to remote
process on c3cn12: 0
/usr/local/mpich-1.2.6-2/bin/mpirun: line 1: 22242 Broken pipe
/d7/user/Dir7/wrf.exe -p4pg nodes.list -p4wd /d7/user/Dir7
P4 procgroup file is nodes.list.
I have mounted the PGI directory on the child node and am still getting the
error message.
Is the error related to the fact that this child node runs CentOS 5 while
everything else runs Red Hat? Or is it that the Red Hat version is old?
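One way to probe whether the operating system difference matters is to check that the executable's shared libraries resolve on the reinstalled node. This is only a sketch, assuming wrf.exe is dynamically linked and that ldd is available on the node; the function name missing_libs is hypothetical:

```shell
# Print the shared libraries that the dynamic linker cannot resolve for
# the given executable. Prints nothing when every library is found
# (or when the file is not a dynamic executable).
missing_libs() {
    # $1: path to the executable to inspect
    ldd "$1" 2>/dev/null | grep 'not found' || true
}
```

Run on the node itself, `missing_libs /d7/user/Dir7/wrf.exe` would list any unresolved libraries. An empty result does not rule out other incompatibilities between the two systems.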
Thanks