[MPICH] RE: rm_3422: p4_error: interrupt SIGSEGV: 11

Rajeev Thakur thakur at mcs.anl.gov
Mon Feb 18 17:06:11 CST 2008


Make sure that there are no mpif.h files lying around in any of the
application directories and make sure that the entire application is
compiled using the same MPI implementation.
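
One way to run the first check is sketched below. It is only a minimal
illustration: the function name find_stray_mpif_headers and both of its
parameters are hypothetical, and the location of the MPI include directory
is an assumption that depends on how MPICH was installed. The function walks
an application tree and reports every mpif.h that does not live in the
include directory of the MPI installation you intend to build against.

```python
import os


def find_stray_mpif_headers(app_root, mpi_include_dir):
    """Return sorted paths of mpif.h files under app_root that are not
    located directly in mpi_include_dir.

    Both arguments are illustrative: app_root is the top of the application
    source/run tree, mpi_include_dir is the include directory of the MPI
    installation the application is supposed to be compiled with.
    """
    # Resolve symlinks so a link pointing at the real header is not flagged.
    mpi_include_dir = os.path.realpath(mpi_include_dir)
    stray = []
    for dirpath, _dirnames, filenames in os.walk(app_root):
        for name in filenames:
            if name != "mpif.h":
                continue
            path = os.path.realpath(os.path.join(dirpath, name))
            # Any copy outside the chosen MPI include directory is suspect.
            if os.path.dirname(path) != mpi_include_dir:
                stray.append(path)
    return sorted(stray)
```

For the setup described below, a call might look like
find_stray_mpif_headers("/d7/user/Dir7", "/usr/local/mpich-1.2.6-2/include"),
where the include subdirectory is assumed rather than taken from the report.
Any path it returns is a candidate for removal before recompiling the whole
application with a single set of MPI compiler wrappers.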
 
Rajeev


  _____  

From: Mina Azer [mailto:azer at envsci.rutgers.edu] 
Sent: Monday, February 18, 2008 4:00 PM
To: mpich-discuss at mcs.anl.gov
Subject: rm_3422: p4_error: interrupt SIGSEGV: 11


Hello all,

I am new to MPICH. I have read and googled about my question but
couldn't find a solution.
We have a cluster consisting of a head node and 12 child nodes, all running
Red Hat AS v3.
The head node runs PGI + MPICH 1.2.6.2 + WRF.
An NFS directory is shared with all child nodes; we keep the programs we want
to run with MPICH there.
We use rsh to log in, since the system is on a private network.
A node list with the node names is created and saved to the shared NFS
directory.

Recently I reinstalled a node with CentOS 5, and since then we have not been
able to run mpirun on it:

/usr/local/mpich-1.2.6-2/bin/mpirun -p4pg nodes.list wrf.exe
rm_3422:  p4_error: interrupt SIGSEGV: 11
Segmentation fault
p0_22242:  p4_error: Child process exited while making connection to remote process on c3cn12: 0
/usr/local/mpich-1.2.6-2/bin/mpirun: line 1: 22242 Broken pipe /d7/user/Dir7/wrf.exe -p4pg nodes.list -p4wd /d7/user/Dir7
P4 procgroup file is nodes.list.

I have mounted the PGI directory on the child node and am still getting the
error message.

Is the error related to the fact that this child node runs CentOS 5 while
everything else runs Red Hat, or is it that the Red Hat version is old?

Thanks

