[Swift-user] pbs failure on pads

Zhao Zhang zhaozhang at uchicago.edu
Tue Mar 2 11:56:08 CST 2010


Hi,

I am currently seeing the following failure on PADS when using coasters. 
It happens only occasionally, but it is unexpected.
I am not sure what the output below means. Could someone explain it? 
Thanks.

[zzhang@login2 final]$ cat pbs.xml
<config>

  <pool handle="pbs">
    <execution provider="coaster" url="none" jobManager="local:pbs"/>
    <profile namespace="globus" key="queue">extended</profile>

    <profile namespace="globus" key="maxtime">3600</profile>
    <profile namespace="globus" key="maxwalltime">00:40:00</profile>   
    <profile namespace="globus" key="workersPerNode">8</profile>
    <profile namespace="globus" key="maxnodes">8</profile>
    <profile namespace="karajan" key="initialScore">10000</profile>
    <profile namespace="karajan" key="jobThrottle">.63</profile>

    <gridftp  url="local://localhost" />
    <workdirectory >/home/zzhang/swiftwork</workdirectory>
  </pool>
</config>


[zzhang@login2 final]$ swift -tc.file tc -sites.file pbs.xml movie.swift
Swift svn swift-r3255 (swift modified locally) cog-r2723

RunID: 20100302-1151-1tu5u5ac
Progress:
Progress:
Progress:
Progress:  uninitialized:1
Progress:  Initializing:16325  Selecting site:58
Progress:  Selecting site:16382  Initializing site shared directory:1
Progress:  Selecting site:16319  Stage in:63  Submitting:1
Progress:  Selecting site:16319  Stage in:46  Submitting:2  Submitted:16
Progress:  Selecting site:16319  Stage in:13  Submitting:1  Submitted:50
Progress:  Selecting site:16319  Submitted:63  Active:1
Progress:  Selecting site:16319  Submitted:60  Active:4
Progress:  Selecting site:16319  Submitted:45  Active:18  Checking status:1  Finished successfully:3
Progress:  Selecting site:16319  Submitting:1  Submitted:39  Active:21  Checking status:1  Stage out:2  Finished successfully:13
Progress:  Selecting site:16319  Stage in:3  Submitted:33  Active:24  Stage out:3  Finished successfully:17
Progress:  Selecting site:16317  Stage in:3  Submitted:35  Active:18  Checking status:1  Stage out:7  Finished successfully:29
Progress:  Selecting site:16318  Stage in:11  Submitted:29  Active:24  Finished successfully:42
Progress:  Selecting site:16318  Stage in:4  Submitted:34  Active:22  Checking status:2  Stage out:2  Finished successfully:47
Progress:  Selecting site:16316  Stage in:6  Submitted:28  Active:23  Checking status:1  Stage out:6  Finished successfully:55
Worker task failed: 0302-521133-000002 Block task ended prematurely
----------------------------------------
Begin PBS Prologue Tue Mar  2 11:52:38 CST 2010
Job ID:         6870.svc.pads.ci.uchicago.edu
Username:       zzhang
Group:          ci-users
Nodes:          c05.pads.ci.uchicago.edu,c15.pads.ci.uchicago.edu,c42.pads.ci.uchicago.edu,c43.pads.ci.uchicago.edu
End PBS Prologue Tue Mar  2 11:52:38 CST 2010
----------------------------------------
----------------------------------------
Begin PBS Epilogue Tue Mar  2 11:52:41 CST 2010
Job ID:         6870.svc.pads.ci.uchicago.edu
Username:       zzhang
Group:          ci-users
Job Name:       null
Session:        7051
Limits:         nodes=4,walltime=00:59:00
Resources:      cput=00:00:00,mem=700kb,vmem=8400kb,walltime=00:00:02
Nodes:          c05.pads.ci.uchicago.edu,c15.pads.ci.uchicago.edu,c42.pads.ci.uchicago.edu,c43.pads.ci.uchicago.edu
End PBS Epilogue Tue Mar  2 11:52:41 CST 2010

Progress:  Selecting site:16316  Stage in:6  Submitted:27  Active:23  Stage out:7  Finished successfully:64 Failed but can retry:1
Failed to transfer wrapper log from movie-20100302-1151-1tu5u5ac/info/3 on pbs
Execution failed:
        Exception in transform:
Arguments: [training_set/mv_0002679.txt]
Host: pbs
Directory: movie-20100302-1151-1tu5u5ac/jobs/3/transform-35lvrioj
stderr.txt:

stdout.txt:

----

Caused by:
        Task failed: 0302-521133-000002 Block task ended prematurely



