[Swift-user] Success with fork, but exception in getFile with condor

Jing Tie tiejing at gmail.com
Tue Sep 11 00:09:07 CDT 2007


Hi,

Thanks! Is it possible that the status file was generated in an
unexpected directory?
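
One rough way to check this would be to search the work directory tree
for anything named like a status file. The sketch below is only an
illustration, not part of Swift: the function name find_status_files is
hypothetical, and it assumes only the <jobid>-success / <jobid>-error
naming described in the quoted reply below. It would have to be run
somewhere that can see the site's shared filesystem.

```python
import os
import re
import sys

# Assumption: status files are named "<jobid>-success" or "<jobid>-error",
# as described in the quoted reply. Nothing else about the layout is assumed.
STATUS_RE = re.compile(r"^(?P<jobid>.+)-(?P<state>success|error)$")


def find_status_files(root):
    """Walk `root` and return a sorted list of (jobid, state, path) tuples."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            match = STATUS_RE.match(name)
            if match:
                hits.append(
                    (match.group("jobid"),
                     match.group("state"),
                     os.path.join(dirpath, name))
                )
    return sorted(hits)


if __name__ == "__main__":
    # Usage: python find_status.py /path/to/work/directory
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for jobid, state, path in find_status_files(root):
        print(f"{jobid}\t{state}\t{path}")
```

If this prints a status file in a directory other than the one the
client looks in, that would point to a path problem; if it prints
nothing at all, the file was probably either never written or is not
visible from the machine where the search ran.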

I ran the SID application on another site, atlas.dpcc.uta.edu
(jobmanager-pbs), and it succeeded! But on the site
u2-grid.ccr.buffalo.edu (jobmanager-pbs), there was an execution error
after task submission: "FileResourceCache Maximum idle time exceeded.
Removing resource for gsiftp://u2-grid.ccr.buffalo.edu". Logs are
attached (sid*.log is from u2-grid.ccr.buffalo.edu, simple*.log is from
cmsgrid01.hep.wisc.edu).

Thanks,
Jing

On 9/10/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> The wrapper produces exactly one status file: <jobid>-success or
> <jobid>-error. If neither is present, it means either that the wrapper
> didn't write one (very unlikely, and due to some weird thing I'm
> missing), or that GridFTP on the head node doesn't see what the
> wrapper has written.
>
> On Mon, 2007-09-10 at 16:42 -0500, Jing Tie wrote:
> > Hi,
> >
> > I think there is a problem running Swift scripts with
> > jobmanager-condor on some OSG sites. I ran simple-wf.dtm (a very
> > simple Swift script that copies the content of an input file to an
> > output file) and the SID script separately on the GLOW site.
> > Everything works when running with jobmanager-fork, but "exception
> > in getFile" occurs with jobmanager-condor. The log from the Swift
> > client is attached. However, no log/info/output files were generated
> > in the Swift work cache, nor was any duplicate-*** directory, though
> > according to the log file the directory seemed to have been created.
> >
> > The GLOW site (cmsgrid01.hep.wisc.edu) can successfully run
> > globus-url-copy to copy files between OSG_DATA and OSG_WN_TMP.
> >
> > Exception:
> > Task(type=2, identity=urn:0-0-1189455037519) setting status to Failed
> > Exception in getFile
> > File transfer failed
> > duplicate failed
> > The following errors have occurred:
> > 1. Application "duplicate" failed (No status file was found. Check the
> > shared filesystem on GLOW)
> >         Arguments: "simpleFile.txt"
> >         Host: GLOW
> >         Directory: simple-wf-7l8vqstrkud90/duplicate-7niqt1hi
> >         STDERR:
> >         STDOUT:
> >
> > Thanks,
> > Jing
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sid-wf1-ryatce3d38vg1.log
Type: application/octet-stream
Size: 216808 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20070911/614d922d/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: simple-wf-8fzfp19rn1in0.log
Type: application/octet-stream
Size: 77141 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-user/attachments/20070911/614d922d/attachment-0001.obj>

