<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
The entire application is compiled using the same MPI implementation. The
program runs on the other 11 child nodes; it fails only on the recently
reinstalled CentOS 5 node. Which files need to be on the child nodes for
mpirun to work correctly?<br>
<br>
Rajeev Thakur wrote:
<blockquote cite="mid00d201c87282$daa599e0$860add8c@mcs.anl.gov"
type="cite">
<div dir="ltr" align="left"><span class="884160323-18022008"><font
color="#0000ff" face="Arial" size="2">Make sure that there are no
mpif.h files lying around in any of the application directories and
make sure that the entire application is compiled using the same MPI
implementation.</font></span></div>
<div dir="ltr" align="left"><span class="884160323-18022008"></span> </div>
<div dir="ltr" align="left"><span class="884160323-18022008"><font
color="#0000ff" face="Arial" size="2">Rajeev</font></span></div>
<br>
<blockquote
style="border-left: 2px solid rgb(0, 0, 255); padding-left: 5px; margin-left: 5px; margin-right: 0px;">
<div class="OutlookMessageHeader" dir="ltr" align="left"
lang="en-us">
<hr tabindex="-1"> <font face="Tahoma" size="2"><b>From:</b> Mina
Azer [<a class="moz-txt-link-freetext" href="mailto:azer@envsci.rutgers.edu">mailto:azer@envsci.rutgers.edu</a>] <br>
<b>Sent:</b> Monday, February 18, 2008 4:00 PM<br>
<b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
<b>Subject:</b> rm_3422: p4_error: interrupt SIGSEGV: 11<br>
</font><br>
</div>
Hello all,<br>
<br>
I am new to MPICH. I have read and googled about my question
but couldn't find a solution. <br>
We have a cluster consisting of a head node and 12 child nodes, all
running Red Hat AS v3. <br>
Head node: <br>
running PGI + MPICH 1.2.6.2 + WRF<br>
an NFS-shared directory, exported to all child nodes, where we keep the programs we want
to run with MPICH<br>
we use rsh to log in, since the system is on a private network<br>
a node list with the node names is created and saved to the shared NFS
directory<br>
<br>
I recently reinstalled one node with CentOS 5, and since then we have not
been able to run mpirun on it:<br>
<br>
<i>/usr/local/mpich-1.2.6-2/bin/mpirun -p4pg nodes.list wrf.exe<br>
rm_3422: p4_error: interrupt SIGSEGV: 11<br>
Segmentation fault<br>
p0_22242: p4_error: Child process exited while making connection to
remote process on c3cn12: 0<br>
/usr/local/mpich-1.2.6-2/bin/mpirun: line 1: 22242 Broken
pipe /d7/user/Dir7/wrf.exe -p4pg nodes.list -p4wd
/d7/user/Dir7<br>
P4 procgroup file is nodes.list.</i><br>
<br>
I have mounted the PGI directory on the child node and am still getting
the error message. <br>
<br>
Is the error related to the fact that this child node runs CentOS 5 while
everything else runs Red Hat? Or is it that the Red Hat version is too old?<br>
<br>
Thanks<br>
</blockquote>
</blockquote>
<br>
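<div>The check suggested in the quoted reply (no stray mpif.h files in the
application directories) can be sketched as a small script. This is only an
illustration, not part of MPICH: the function name find_stray_headers is
hypothetical, and the directory to search is assumed to be passed as the first
command-line argument (the current directory is used otherwise). Any mpif.h it
prints that does not belong to the MPI installation used for the build would
be worth inspecting.</div>
<pre>
```python
import os
import sys


def find_stray_headers(root, name="mpif.h"):
    """Return the sorted paths of every file called `name` under `root`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Assumption: the application tree is given as the first argument;
    # fall back to the current directory when no argument is supplied.
    root = sys.argv[1] if sys.argv[1:] else "."
    for path in find_stray_headers(root):
        print(path)
```
</pre>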
</body>
</html>