[Swift-devel] cleanup fails on abe

Sarah Kenny skenny at uchicago.edu
Mon Aug 16 11:28:33 CDT 2010


here's the entirety of the gram log for the rm job:

8/16 10:58:31 JM: Security context imported
8/16 10:58:31 Pre-parsed RSL string: &( directory =
"/scratch/users/skenny" )( arguments = "-rf"
"rmx-20100816-1055-k1z90mf7" )( maxnodes = "16" )( executable =
"/bin/rm" )( maxwalltime = "30" )( project = "TG-DBS080004N" )( queue
= "normal" )( slots = "10" )( nodegranularity = "16" )( name =
"cleantest" )( workerspernode = "1" )
8/16 10:58:31
<<<<<Job Request RSL
&("directory" = "/scratch/users/skenny" )("arguments" = "-rf"
"rmx-20100816-1055-k1z90mf7" )("maxnodes" = "16" )("executable" =
"/bin/rm" )("maxwalltime" = "30" )("project" = "TG-DBS080004N"
)("queue" = "normal" )("slots" = "10" )("nodegranularity" = "16"
)("name" = "cleantest" )("workerspernode" = "1" )
>>>>>Job Request RSL
8/16 10:58:31
<<<<<Job Request RSL (canonical)
&("directory" = "/scratch/users/skenny" )("arguments" = "-rf"
"rmx-20100816-1055-k1z90mf7" )("maxnodes" = "16" )("executable" =
"/bin/rm" )("maxwalltime" = "30" )("project" = "TG-DBS080004N"
)("queue" = "normal" )("slots" = "10" )("nodegranularity" = "16"
)("name" = "cleantest" )("workerspernode" = "1" )
>>>>>Job Request RSL (canonical)
8/16 10:58:31
<<<<<Job RSL
&("environment" = ("HOME" "/u/ac/skenny" ) ("LOGNAME" "skenny" )
)("directory" = "/scratch/users/skenny" )("arguments" = "-rf"
"rmx-20100816-1055-k1z90mf7")("maxnodes" = "16" )("executable" =
"/bin/rm" )("maxwalltime" = "30" )("project" = "TG-DBS080004N"
)("queue" = "normal" )("slots" = "10" )("nodegranularity" = "16"
)("name" = "cleantest" )("workerspernode" = "1" )
>>>>>Job RSL
8/16 10:58:31
<<<<<Job RSL (post-eval)
&("environment" = ("HOME" "/u/ac/skenny" ) ("LOGNAME" "skenny" )
)("directory" = "/scratch/users/skenny" )("arguments" = "-rf"
"rmx-20100816-1055-k1z90mf7" )("maxnodes" = "16" )("executable" =
"/bin/rm" )("maxwalltime" = "30" )("project" = "TG-DBS080004N"
)("queue" = "normal" )("slots" = "10" )("nodegranularity" = "16"
)("name" = "cleantest" )("workerspernode" = "1" )
>>>>>Job RSL (post-eval)
8/16 10:58:31 JMI: testing job manager scripts for type pbs exist and
permissions are ok.
8/16 10:58:31 JMI: completed script validation: job manager type is pbs.
8/16 10:58:31 JMI: cmd = cache_cleanup
Mon Aug 16 10:58:31 2010 JM_SCRIPT: New Perl JobManager created.
Mon Aug 16 10:58:31 2010 JM_SCRIPT: Using jm supplied job dir:
/u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/2408.1281974311
Mon Aug 16 10:58:31 2010 JM_SCRIPT: Using jm supplied job dir:
/u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/2408.1281974311
Mon Aug 16 10:58:31 2010 JM_SCRIPT: cache_cleanup(enter)
Mon Aug 16 10:58:31 2010 JM_SCRIPT: Cleaning files in job dir
/u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/2408.1281974311
Mon Aug 16 10:58:31 2010 JM_SCRIPT: Removed 1 files from
/u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/2408.1281974311
Mon Aug 16 10:58:31 2010 JM_SCRIPT: cache_cleanup(exit)
8/16 10:58:31 JM: before sending to client: rc=0 (Success)
8/16 10:58:31 JM: in globus_gram_job_manager_reporting_file_remove()
8/16 10:58:31 JM: in globus_gram_job_manager_reporting_file_remove()
8/16 10:58:31 JM: exiting globus_gram_job_manager.
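
since the GramException in the original report reads "Parameter not supported", one thing that might be worth checking is which attribute names in the RSL above a plain job manager would not recognize. below is a minimal python sketch of that check. the ASSUMED_STANDARD set is my own illustrative guess at commonly accepted GRAM2 attributes (canonical form: lowercase, underscores removed), not an authoritative list, and the helper names are made up for this example. it only handles flat RSL like the request logged above.

```python
import re

# Illustrative guess at attributes a stock GRAM2 job manager accepts,
# in canonical form (lowercase, no underscores). NOT an authoritative list.
ASSUMED_STANDARD = {
    "arguments", "count", "directory", "environment", "executable",
    "hostcount", "jobtype", "maxcputime", "maxmemory", "maxtime",
    "maxwalltime", "minmemory", "project", "queue",
    "stdin", "stdout", "stderr",
}

# The pre-parsed RSL string exactly as it appears in the gram log.
RSL = '''&( directory = "/scratch/users/skenny" )( arguments = "-rf"
"rmx-20100816-1055-k1z90mf7" )( maxnodes = "16" )( executable =
"/bin/rm" )( maxwalltime = "30" )( project = "TG-DBS080004N" )( queue
= "normal" )( slots = "10" )( nodegranularity = "16" )( name =
"cleantest" )( workerspernode = "1" )'''


def rsl_attributes(rsl):
    """Return the attribute names of a flat RSL string, in order of appearance."""
    # An attribute is an opening paren, an optionally quoted name, then '='.
    names = re.findall(r'\(\s*"?([A-Za-z_]+)"?\s*=', rsl)
    # Canonicalize the way the log does: lowercase, underscores dropped.
    return [n.lower().replace("_", "") for n in names]


def unrecognized(rsl, known=ASSUMED_STANDARD):
    """Return the attributes that are not in the assumed standard set."""
    return [n for n in rsl_attributes(rsl) if n not in known]


print(unrecognized(RSL))
```

whatever this prints is only a list of candidates to compare against what the pbs job manager on abe actually accepts. it does not by itself show which parameter gram rejected.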

as far as i can tell i'm not at quota on my work or home dirs on abe.
and yeah, we were able to run fine before. we haven't changed our config
since then, so maybe something changed on their end.
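
separately, the pbs email quoted further down fails on copying stdout/stderr into a job dir under ~/.globus/job. a small python sketch like the one below could pull the destination paths out of such an email and report which parent directories are missing on the machine where it is run. the function names are hypothetical and the sample text is just the two "Unable to copy file" entries from that email.

```python
import os
import re

# The two copy failures from the forwarded PBS email, quote markers included.
MAIL = """>> Unable to copy file
>> /u/ac/skenny/.pbs_spool//3000582.abem5.ncsa.uiuc.edu.OU to
>> /u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/15575.1281721892/stdout
>> *** error from copy
>>
>> Unable to copy file
>> /u/ac/skenny/.pbs_spool//3000582.abem5.ncsa.uiuc.edu.ER to
>> /u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/15575.1281721892/stderr
>> *** error from copy
"""


def failed_copy_targets(mail_text):
    """Return destination paths from 'Unable to copy file SRC to DST' entries."""
    # Strip leading quote markers ('>', '>>', ...) so wrapped lines can be joined.
    lines = [re.sub(r'^(>\s?)+', '', ln) for ln in mail_text.splitlines()]
    flat = " ".join(lines)
    return re.findall(r'Unable to copy file\s+\S+\s+to\s+(\S+)', flat)


def missing_parent_dirs(paths):
    """Return the distinct parent directories that do not exist on this machine."""
    missing = []
    for path in paths:
        parent = os.path.dirname(path)
        if parent not in missing and not os.path.isdir(parent):
            missing.append(parent)
    return missing


targets = failed_copy_targets(MAIL)
print(targets)
print(missing_parent_dirs(targets))
```

this only makes sense when run on a host that sees the same home filesystem as the compute nodes. a missing directory would fit the "No such file or directory" message, but it would not say what removed the directory.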


On Fri, Aug 13, 2010 at 1:19 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> Can you find the gram log for the cleanup job (it's a /bin/rm)?
>
> Also, I remember you being able to run things just fine on Abe. Are you
> aware of any configuration changes there? Any disks full?
>
> On Fri, 2010-08-13 at 13:11 -0500, Sarah Kenny wrote:
>> hi all, not sure if anyone else is running on abe, but for some reason
>> cleanup seems to fail on there very consistently. swift throws a
>> warning:
>>
>> The following warnings have occurred:
>> 1. Cleanup on ABE failed
>> Caused by:
>>
>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
>> Cannot submit job
>>         at org.globus.cog.abstraction.impl.execution.gt2.JobSubmissionTaskHandler.submitSingleJob(JobSubmissionTaskHandler.java:146)
>>         at org.globus.cog.abstraction.impl.execution.gt2.JobSubmissionTaskHandler.submit(JobSubmissionTaskHandler.java:100)
>>         at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.submit(AbstractTaskHandler.java:46)
>>         at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.submit(ExecutionTaskHandler.java:50)
>>         at org.globus.cog.abstraction.coaster.service.job.manager.LocalQueueProcessor.run(LocalQueueProcessor.java:40)
>> Caused by: org.globus.gram.GramException: Parameter not supported
>>         at org.globus.gram.Gram.request(Gram.java:358)
>>         at org.globus.gram.GramJob.request(GramJob.java:262)
>>         at org.globus.cog.abstraction.impl.execution.gt2.JobSubmissionTaskHandler.submitSingleJob(JobSubmissionTaskHandler.java:134)
>>         ... 4 more
>>
>> if i shut off cleanup, i don't get the warning and the workflow
>> 'appears' to have completed successfully, however even with cleanup
>> shut off pbs still generates the email below giving the error:
>>
>>
>> i'm still poking around to see if i can figure out what's up, but
>> thought i would throw this out there in case someone else has come
>> across it.
>>
>> swift, coaster and gram logs attached.
>>
>> ~sk
>>
>> ---------- Forwarded message ----------
>> From: adm <adm at ncsa.uiuc.edu>
>> Date: Fri, Aug 13, 2010 at 12:53 PM
>> Subject: PBS JOB 3000582.abem5.ncsa.uiuc.edu
>> To: skenny at abe1196.ncsa.uiuc.edu
>>
>>
>> PBS Job Id: 3000582.abem5.ncsa.uiuc.edu
>> Job Name:   configtester
>> Exec host:  abe0553/0+abe0314/0+abe0313/0+abe0311/0+abe0310/0+abe0307/0+abe0294/0+abe0290/0+abe0287/0+abe0286/0+abe0285/0+abe0284/0+abe0283/0+abe0279/0+abe0278/0+abe0277/0+abe0275/0+abe0273/0+abe0272/0+abe0271/0+abe0256/0+abe0254/0+abe0174/0+abe0173/0+abe0166/0+abe0165/0+abe0163/0+abe0087/0+abe0085/0+abe0084/0+abe0010/0+abe0387/0
>> An error has occurred processing your job, see below.
>> Post job file processing error; job 3000582.abem5.ncsa.uiuc.edu on
>> host abe0553/0+abe0314/0+abe0313/0+abe0311/0+abe0310/0+abe0307/0+abe0294/0+abe0290/0+abe0287/0+abe0286/0+abe0285/0+abe0284/0+abe0283/0+abe0279/0+abe0278/0+abe0277/0+abe0275/0+abe0273/0+abe0272/0+abe0271/0+abe0256/0+abe0254/0+abe0174/0+abe0173/0+abe0166/0+abe0165/0+abe0163/0+abe0087/0+abe0085/0+abe0084/0+abe0010/0+abe0387/0
>>
>> Unable to copy file
>> /u/ac/skenny/.pbs_spool//3000582.abem5.ncsa.uiuc.edu.OU to
>> /u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/15575.1281721892/stdout
>> *** error from copy
>> /bin/cp: cannot create regular file
>> `/u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/15575.1281721892/stdout':
>> No such file or directory
>> *** end error output
>>
>> Unable to copy file
>> /u/ac/skenny/.pbs_spool//3000582.abem5.ncsa.uiuc.edu.ER to
>> /u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/15575.1281721892/stderr
>> *** error from copy
>> /bin/cp: cannot create regular file
>> `/u/ac/skenny/.globus/job/abe1196.ncsa.uiuc.edu/15575.1281721892/stderr':
>> No such file or directory
>> *** end error output
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>
>
>


