[MPICH] MPICH2 startup w/ PBS
Adam Hock
ahock at ittc.ku.edu
Tue Apr 4 11:49:32 CDT 2006
I have spent some time developing the following scripts. They are not
perfect, but they do get the job done. I use them as prologue and epilogue
scripts. These scripts need to be put in /var/spool/PBS/mom_priv on each
node and have permissions of 755.
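A minimal sketch of that installation step, assuming both scripts sit in
one source directory under the names prologue and epilogue. The helper name
install_hooks is made up for illustration, and it would need to be run as
root on each node.

```shell
#!/bin/sh
# Hypothetical helper: copy the prologue and epilogue into a mom_priv
# directory and give them mode 755, as described above.
install_hooks() {
    src_dir=$1      # directory holding the two scripts
    mom_priv=$2     # e.g. /var/spool/PBS/mom_priv
    for name in prologue epilogue; do
        cp "$src_dir/$name" "$mom_priv/$name" || return 1
        chmod 755 "$mom_priv/$name" || return 1
    done
}
```

For example: install_hooks . /var/spool/PBS/mom_priv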
Here is a list of variables you will need to change to make it work on
your system.
$mpich = "/bio/tools/mpich/mpich2-1.0.2-PGI-6.0/bin";
$HOME = "/bio/users";
$log = "/bio/tools/admin/report/log/epilogue.$date";
The log only records information about jobs and is optional.
Feel free to contact me if you have any questions.
Here is the prologue
## prologue script
## for PBSpro to play nice and work with mpich2 daemon system
## Adam Hock - University of Kansas
## 02-07-2006
## Version 1.6
# This script is run as root and needs to have permissions of 755
# It also needs to be called prologue
# Put this file in ~PBS_INSTALL/mom_priv of all nodes
# The 10 arguments PBS passes in.
$arg1 = shift; # the job id
$arg2 = shift; # the user name under which the job executes
$arg3 = shift; # the group name under which the job executes
$arg4 = shift; # the job name / not given in prologue
$arg5 = shift; # the session id / not given in prologue
$arg6 = shift; # the requested resource limits (list) / not given in prologue
$arg7 = shift; # the list of resources used / not given in prologue
$arg8 = shift; # the name of the queue in which the job resides / not given in prologue
$arg9 = shift; # the account string, if one exists / not given in prologue
$arg10 = shift; # the exit status of the job.
#set to pbs_pro or torque
#pbs_pro = 1
#torque = 0
$mode = 1;
#set divider
$set = 0;
$divide = 0;
#path variables
$mpich = "/bio/tools/mpich/mpich2-1.0.2-PGI-6.0/bin";
$pbs = "/usr/pbs/bin";
$qstat_cm = "qstat -n -1 -u";
#files to make this all work nice and neat
$output = "> /tmp/qstat.$arg1";
$input = "/tmp/qstat.$arg1";
$node_list = "> /tmp/node_list.$arg1";
$list = "/tmp/node_list.$arg1";
$session_list = ">>/tmp/$arg2.session_list";
# $temp_debug =">/tmp/debug";
# open(debug,$temp_debug);
#needed commands
# have to use sudo so that the command is run as the user and not root
$mpdboot = "/usr/bin/sudo -u $arg2 $mpich/mpdboot --remcons";
# Environment variable that needs to be set so that mpdboot can find
# .mpd.conf in the user's home directory
$HOME = "/bio/users";
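# get a queue status; this way we can grab the nodes the queue is going to use
# and write them out to a tmp file
system "$pbs/$qstat_cm $arg2 $output";
# read the saved qstat output back into @qstat
# (assumed step: the archived copy never fills @qstat)
open(IN, $input);
@qstat = <IN>;
close(IN);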
# get the job id, so we start daemons for this job only.
@id = split(/\./,$arg1);
# get the list of nodes after you find the right job
if ( $mode == 1) {
    foreach $line (@qstat) {
        chomp($line);
        # test the match itself so a stale $1 is never compared
        if ($line =~ m/(\d+)\./ && "$1" eq "$id[0]") {
            @nodes = split(/\+/,$line);
            $size = @nodes;
            # get first node; don't need it for node list
            @first = split(/ /,$nodes[0]);
            $size_first = @first;
            # checking for multiples of same node.
            foreach $elem (@nodes) {
                if("$first[$size_first-1]" eq "$elem") {
                    #$size=$size/2; # need to add divider because diff node types
                    $divide = $divide + 1;
                }
            }
        } #end if ("$1" eq "$id[0]")
    } #end foreach $line (@qstat)
} #end if ( $mode == 1)
if ( $mode == 0) {
    $c = 0;
    foreach $line (@qstat) {
        chomp($line);
        # test the match itself so a stale $1 is never compared
        if ($line =~ m/(\d+)\./ && "$1" eq "$id[0]") {
            if ( $line =~ /compute\-\d+\-\d+/ ) {
                @temp_nodes = split(/\+/,$line);
                $temp_size = @temp_nodes;
                for ($i=0;$i<$temp_size;$i++) {
                    if( $temp_nodes[$i] =~ /compute\-\d+\-\d+/) {
                        if($i == $temp_size-1) {
                            $nodes[$c] = $temp_nodes[$i];
                            $c = $c + 1;
                        }
                    } #end if ( $temp_nodes[$i] =~ /compute\-\d+\-\d+/)
                } #end for ($i=0;$i<$temp_size;$i++)
            } #end if ( $line =~ /compute\-\d+\-\d+/ )
        } #end if for if ("$1" eq "$id[0]")
        $size = @nodes;
    } #end foreach $line (@qstat)
    # checking for multiples of same node.
    foreach $elem (@nodes) {
        if("$first[$size_first-1]" eq "$elem" ) {
            $divide = $divide + 1;
        }
    }
} #end if ( $mode == "torque")
# create a nodes list for mpdboot to use to start the nodes
# some fanciness here to get the first node right -- NOT NEEDED
#print A "$first[$size_first-1]\n" if ($mode == 1);
open(A, $node_list); # assumed step: the archived copy never opens A
if($size > 1) {
    for($count=$mode;$count<$size;$count++) {
        print A "$nodes[$count]\n";
    }
}
close(A);
# need to change the permissions so the user submitting the job can read the
# file just created
chmod 0644, "$list";
# set HOME to the user's home directory so mpdboot can find .mpd.conf
$ENV{HOME} = "$HOME/$arg2";
# boot the nodes
`$mpdboot -n $size -f $list`;
# record this job in the user's session list
open(B, $session_list); # assumed step: the archived copy never opens B
print B "$id[0]\n";
close(B);
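The node-list step can also be sketched in shell. This assumes the job's
hosts arrive as a single '+'-separated string, as in the split(/\+/, ...)
calls in the prologue. The helper name split_host_list and the sample host
names are made up for illustration. Unlike the prologue, which only counts
repeats in $divide, this sketch drops repeated hosts.

```shell
#!/bin/sh
# Hypothetical helper: turn a '+'-separated host list into one host per
# line, dropping repeats while keeping first-seen order.
split_host_list() {
    printf '%s\n' "$1" | tr '+' '\n' | awk 'NF && !seen[$0]++'
}

# Demo with made-up host names; prints compute-0-1 and compute-0-2,
# one per line.
split_host_list "compute-0-1+compute-0-1+compute-0-2"
```

The output could be redirected to a file and handed to mpdboot with -f, as
the prologue does with its own node list.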
Here is the epilogue
## epilogue script
## for PBSpro to play nice and work with mpich2 daemon system
## Adam Hock
## 02-07-2006
## Version 1.6
# This script is run as root and needs to have permissions of 755
# It also needs to be called epilogue
# Put this file in ~PBS_INSTALL/mom_priv of all nodes
# The 10 arguments PBS passes to the script
$arg1 = shift; # the job id
$arg2 = shift; # the user name under which the job executes
$arg3 = shift; # the group name under which the job executes
$arg4 = shift; # the job name
$arg5 = shift; # the session id
$arg6 = shift; # the requested resource limits (list)
$arg7 = shift; # the list of resources used
$arg8 = shift; # the name of the queue in which the job resides
$arg9 = shift; # the account string, if one exists
$arg10 = shift; # the exit status of the job.
#time stamp
$date = time();
#path variables
$mpich = "/bio/tools/mpich/mpich2-1.0.2-PGI-6.0/bin";
$log = "/bio/tools/admin/report/log/epilogue.$date";
# Files needed to make everything work, created by the prologue script
$input = "/tmp/qstat.$arg1";
$list = "/tmp/node_list.$arg1";
$session_list = "/tmp/$arg2.session_list";
$session_write = ">/tmp/$arg2.session_list";
# Commands that need to be run when the job finishes
$mpdexit = "/usr/bin/sudo -u $arg2 $mpich/mpdallexit";
$rm = "/bin/rm -f";
# Where home directories are found
$HOME = "/bio/users";
# Make sure the environment variables are the user's, especially HOME
$ENV{HOME} = "$HOME/$arg2";
# Shut the daemons off for that user
@id = split(/\./,$arg1);
# determine how many sessions are on this node
# (assumed step: the archived copy never reads the session list)
open(S, $session_list);
@sessions = <S>;
close(S);
$num_sessions = @sessions;
# if there is more than one, we don't want to kill the daemons:
# remove my session and update the sessions file
if($num_sessions > 1) {
    open(B, $session_write);
    foreach $line (@sessions) {
        chomp($line);
        if("$line" ne "$id[0]") {
            print B "$line\n";
        }
    }
    close(B);
    # Remove all those files used to start the daemons. Clean up
    `$rm -f $input $list`;
# If there is only one session, shut down the daemons
} elsif ($num_sessions == 1) {
    # Shut down daemons
    `$mpdexit`;
    # Remove files
    `$rm -f $input $list $session_list`;
}
# update usage database
# if a log with this timestamp already exists, take a fresh timestamp
if ( -e $log ) {
    $date = time();
    $log = "/bio/tools/admin/report/log/epilogue.$date";
}
`echo $arg1 >> $log`;
`echo $arg2 >> $log`;
`echo $arg3 >> $log`;
`echo $arg4 >> $log`;
`echo $arg5 >> $log`;
`echo $arg6 >> $log`;
`echo $arg7 >> $log`;
`echo $arg8 >> $log`;
`echo $arg9 >> $log`;
`echo $arg10 >> $log`;
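The epilogue's session bookkeeping can be sketched in shell as well. This
assumes the session list holds one job id per line. The helper name
release_session and its printed keywords (keep, shutdown, none) are made up
for illustration. The "none" case for a missing or empty list is an
addition that the epilogue does not handle.

```shell
#!/bin/sh
# Hypothetical helper: drop one job id from a per-user session list.
# Prints "keep" if other sessions remain (daemons stay up), "shutdown" if
# this was the only session, and "none" if there is no usable list.
release_session() {
    list=$1
    job=$2
    [ -f "$list" ] || { echo none; return 0; }
    count=$(wc -l < "$list" | awk '{print $1}')
    if [ "$count" -gt 1 ]; then
        # rewrite the list without this job's line
        grep -v -x -F "$job" "$list" > "$list.tmp"
        mv "$list.tmp" "$list"
        echo keep
    elif [ "$count" -eq 1 ]; then
        rm -f "$list"
        echo shutdown
    else
        echo none
    fi
}
```

A caller would run mpdallexit only when the function prints "shutdown".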
Jeffrey B. Layton wrote:
> Darius Buntinas wrote:
>> What's "screaming"? mpdboot or mpiexec?
> I'm pretty sure it's mpdboot.
> I'll try the method below to see what happens. I'm also going
> to try Pete's mpiexec based on some recommendations to see
> if that reduces the pain.
> Thanks!
> Jeff
>> Try:
>> mpdboot -n ${NP} -f ${PBS_NODEFILE}
>> mpiexec -n ${NP} ./${EXE}
>> You don't need a machinefile with mpiexec unless you want to execute
>> on a subset of the nodes in your mpd ring, or you want control of the
>> process-to-node mapping.
>> I think that mpdboot should only start one mpd per node, even if the
>> node is specified more than one time in the file (you really only ever
>> need one mpd per node). If mpdboot is having trouble because you're
>> asking for ${NP} mpds but there are only ${NP}/2 unique nodes in the
>> file, you can try something like:
>> NUM_NODES=`sort -u ${PBS_NODEFILE} | wc -l | awk '{print $1}'`
>> mpdboot -n ${NUM_NODES} -f ${PBS_NODEFILE}
>> mpiexec -n ${NP} ./${EXE}
>> I'm not a PBS expert, so there might be an easier way to do that, but
>> give it a try.
>> If you are concerned about your process-to-node mapping and want to
>> check what it is try:
>> mpiexec -l -n ${NP} hostname
>> -d
>> On Tue, 4 Apr 2006, Jeffrey B. Layton wrote:
>>> No joy. It always screams about not having enough hosts:
>>> totalnum=16 numhosts=8
>>> there are not enough hosts on which to start all processes
>>> I think this is because we have two processors per node (ppn=2).
>>> Consequently PBS_NODEFILE has the hosts repeated. I've
>>> tried using --totalnum=${NP} --ncpus=2 and this didn't work
>>> either (same error message).
>>> Thanks!
>>> Jeff
>>>> How about the following 3 lines in your script:
>>>> mpdboot -n ${NP} -f ${PBS_NODEFILE}
>>>> mpiexec -machinefile ${PBS_NODEFILE} -n ${NP} ./${EXE}
>>>> mpdallexit
>>>> Wei-keng
>>>> On Tue, 4 Apr 2006, Jeffrey B. Layton wrote:
>>>>> Good morning,
>>>>> I hate to bother everyone early in the morning, but I'm
>>>>> looking for some advice on MPICH2 startup. I've been starting
>>>>> an mpd on each node in the cluster via,
>>>>> mpdboot -n 25 -f /home/jlayton/mpd.hosts
>>>>> where the file mpd.hosts contains a list of all possible hosts.
>>>>> So I'm basically starting mpd on every node. Then I run the
>>>>> code using mpiexec
>>>>> mpiexec -machinefile ${PBS_NODEFILE} -n ${NP} ./${EXE}
>>>>> and run mpdallexit after the code is finished to stop all of the
>>>>> mpds. Notice that I'm using PBS for queuing/scheduling.
>>>>> This is something of a pain, because we lose nodes for
>>>>> various projects or training so I'm constantly having to go into
>>>>> the list of hosts and edit it. I also have to change the count on
>>>>> the mpdboot command.
>>>>> Is there a better way to start up MPICH2 codes using PBS?
>>>>> Thanks!
>>>>> Jeff
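The suggestion in the quoted thread, counting unique hosts before calling
mpdboot, can be sketched as a small shell function. The helper name
count_unique_hosts and the sample host names are made up for illustration.
The demo uses a throwaway file in place of a real ${PBS_NODEFILE}.

```shell
#!/bin/sh
# Hypothetical helper: count distinct host names in a PBS-style node file,
# where a host is listed once per allocated processor (twice with ppn=2).
count_unique_hosts() {
    sort -u "$1" | wc -l | awk '{print $1}'
}

# Demo with a temporary node file; prints "unique hosts: 2".
nodefile=$(mktemp)
printf '%s\n' node1 node1 node2 node2 > "$nodefile"
echo "unique hosts: $(count_unique_hosts "$nodefile")"
rm -f "$nodefile"
```

Inside a job script the result would feed mpdboot, for example:
mpdboot -n "$(count_unique_hosts "${PBS_NODEFILE}")" -f "${PBS_NODEFILE}"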
+ Adam Hock Senior Network Systems Administrator +
+ for Bioinformatics +
+ +
+ Office 241 Nichols Hall +
+ Office Telephone (785) 864-7728 +
+ Email ahock at ittc.ku.edu +