[Swift-devel] swift changing walltime of prews-gram jobs
Allan Espinosa
aespinosa at cs.uchicago.edu
Sat Jan 24 17:03:42 CST 2009
Hi,
I am using Swift 0.8rc1. The same thing also happens with v0.7.
I tried submitting a job from communicado to tp-grid1 (teraport) using
coasters. The Swift runtime does not report any error, but it never
finishes either. Looking through the files received by the teraport
head node, I observed that Swift keeps submitting GRAM jobs. It looks
like the submitted PBS scripts keep finishing or failing.
Digging through ~/.globus/jobs/tp-grid1.uchicago.edu/*/scheduler*, we
see that maxwalltime became 101:00 instead of the 00:10:00 set in
sites.xml:
/usr/bin/perl "/home/aespinosa/.globus/coasters/cscript63266.pl"
"http://128.135.125.118:50001" "1728236079"
#! /bin/sh
# PBS batch job script built by Globus job manager
#
#PBS -S /bin/sh
#PBS -m n
#PBS -q fast
#PBS -l walltime=101:00
#PBS -o /dev/null
#PBS -e /dev/null
#PBS -l nodes=1
HOME="/home/aespinosa";
export HOME;
OSG_DATA="/gpfs1/osg/data";
...
...
counter=0
exit_code=0
while test $counter -lt 1; do
/bin/touch /home/aespinosa/.globus/job/tp-grid1.ci.uchicago.edu/7432.1232837576/exit.$counter;
read tmp_exit_code < /home/aespinosa/.globus/job/tp-grid1.ci.uchicago.edu/7432.1232837576/exit.$counter
if [ $exit_code = 0 -a $tmp_exit_code != 0 ]; then
exit_code=$tmp_exit_code
fi
counter=`expr $counter + 1`
done
exit $exit_code
qsub: Job exceeds queue resource limits MSG=cannot satisfy queue max
walltime requirement
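
To make the mismatch concrete, here is a minimal sketch that converts
both values to seconds. It assumes PBS reads walltime as [[HH:]MM:]SS,
so that 101:00 means 101 minutes; the helper name pbs_walltime_seconds
is hypothetical and not part of Swift or Globus.

```python
def pbs_walltime_seconds(value: str) -> int:
    """Convert a walltime string to seconds.

    Assumption: the string follows the PBS [[HH:]MM:]SS convention.
    """
    parts = [int(p) for p in value.split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected walltime format: {value!r}")
    seconds = 0
    for part in parts:
        # Each field to the left is worth 60x the one to its right.
        seconds = seconds * 60 + part
    return seconds


configured = pbs_walltime_seconds("00:10:00")  # value from sites.xml
submitted = pbs_walltime_seconds("101:00")     # value in the PBS script

print(configured // 60, "minutes configured")
print(submitted // 60, "minutes submitted")
```

Under that reading, the generated script asks for far more time than
sites.xml specifies, which would be consistent with the qsub rejection
above if the fast queue caps walltime below 101 minutes.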
Below is my sites.xml:
<config>
<pool handle="Teraport" sysinfo="INTEL32::LINUX">
<profile namespace="globus" key="queue">fast</profile>
<profile namespace="globus" key="maxwalltime">00:10:00</profile>
<gridftp url="gsiftp://tp-grid1.ci.uchicago.edu/disks/tp-gpfs/scratch/aespinosa"
storage="/opt/osg/data/aespinosa" major="2" minor="2" patch="4">
</gridftp>
<execution provider="coaster" url="tp-grid1.uchicago.edu"
jobmanager="gt2:gt2:pbs" />
<filesystem provider="coaster" url="gt2://tp-grid1.uchicago.edu" />
<workdirectory >/disks/tp-gpfs/scratch/aespinosa</workdirectory>
</pool>
</config>
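
As a quick sanity check that the configured value really is what the
file says, the globus profile entries can be read back with a short
script. This is only a sketch: the XML is a trimmed inline copy of the
file above, and the function name globus_profiles is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sites.xml shown above (assumption: only the
# elements relevant to the walltime question are kept).
SITES_XML = """\
<config>
  <pool handle="Teraport" sysinfo="INTEL32::LINUX">
    <profile namespace="globus" key="queue">fast</profile>
    <profile namespace="globus" key="maxwalltime">00:10:00</profile>
    <execution provider="coaster" url="tp-grid1.uchicago.edu"
               jobmanager="gt2:gt2:pbs" />
  </pool>
</config>
"""


def globus_profiles(xml_text: str) -> dict:
    """Map each pool handle to its globus-namespace profile entries."""
    root = ET.fromstring(xml_text)
    result = {}
    for pool in root.iter("pool"):
        result[pool.get("handle")] = {
            p.get("key"): (p.text or "").strip()
            for p in pool.iter("profile")
            if p.get("namespace") == "globus"
        }
    return result


profiles = globus_profiles(SITES_XML)
print(profiles["Teraport"]["maxwalltime"])
```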
This does not happen if I use "local:pbs" as the jobmanager for the
coaster; with that setting, jobs ran successfully.
-Allan
--
Allan M. Espinosa <http://allan.88-mph.net/blog>
PhD student, Computer Science
University of Chicago <http://people.cs.uchicago.edu/~aespinosa>