[Swift-user] Errors in 13-site OSG run: lazy error question

Mihael Hategan hategan at mcs.anl.gov
Thu Aug 26 23:15:44 CDT 2010


Wait, wait, wait. Is this a new "invalid path (..logfile)" error?

On Thu, 2010-08-26 at 22:11 -0600, Michael Wilde wrote:
> Glen, I wonder if what's happening here is that Swift will retry and lazily run past *job* errors, but the error below (a mapping error) is maybe being treated as an error in Swift's interpretation of the script itself, which causes an immediate halt to execution?
> 
> Can anyone confirm that this is what's happening, and whether it is the expected behavior?
> 
> Also, Glen, 2 questions:
> 
> 1) Isn't the error below the one that was fixed by Mihael in a recent revision - the same one I looked at earlier in the week?
> 
> 2) Do you know what errors the "Failed but can retry:8" message is referring to?
> 
> Where is the log/run directory for this run?  How long did it take to get the 589 jobs finished?  It would be good to start plotting these large multi-site runs to get a sense of how the scheduler is doing.
> 
> - Mike
> 
> 
> ----- "Glen Hocky" <hockyg at uchicago.edu> wrote:
> 
> > Here's the result of my 13-site run that ran while I was out this
> > evening. It did pretty well,
> > but it seems to have that problem of not-quite-lazy errors:
> > ........
> > Progress: Submitting:3 Submitted:262 Active:147 Checking status:3 Stage out:1 Finished successfully:586
> > Progress: Submitting:3 Submitted:262 Active:144 Checking status:4 Stage out:2 Finished successfully:587
> > Progress: Submitting:3 Submitted:262 Active:142 Stage out:2 Finished successfully:587 Failed but can retry:6
> > Progress: Submitting:3 Submitted:262 Active:140 Finished successfully:589 Failed but can retry:8
> > Failed to transfer wrapper log from glassRunCavities-20100826-1718-7gi0dzs1/info/5 on UCHC_CBG_vdgateway.vcell.uchc.edu
> > Execution failed:
> > org.griphyn.vdl.mapping.InvalidPathException: Invalid path (..logfile) for org.griphyn.vdl.mapping.DataNode identifier tag:benc at ci.uchicago.edu,2008:swift:dataset:20100826-1718-sznq1qr2:720000002968 type GlassOut with no value at dataset=modelOut path=[3][1][11] (not closed)
> 
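
On the plotting idea: a minimal Python sketch for turning one "Progress:" line of the kind quoted above into per-state counts might look like the following. It is not part of Swift; parse_progress is a hypothetical helper name, and the sketch assumes each progress line has already been re-joined after mail wrapping and stripped of its "> " quote markers.

```python
import re

# Matches "Label:count" pairs such as "Finished successfully:589".
# Labels may contain spaces, so a label is a run of letters and spaces
# that starts with a letter and ends right before ":<digits>".
PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?):(\d+)")

def parse_progress(line):
    """Return {state: count} for one unwrapped 'Progress:' line, else None."""
    prefix = "Progress:"
    if not line.startswith(prefix):
        return None  # not a progress line (e.g. an error message)
    body = line[len(prefix):]
    return {label.strip(): int(count) for label, count in PAIR.findall(body)}

if __name__ == "__main__":
    # Sample line copied from the run output quoted in this thread.
    sample = ("Progress: Submitting:3 Submitted:262 Active:140 "
              "Finished successfully:589 Failed but can retry:8")
    print(parse_progress(sample))
```

Collecting these dictionaries in order, one per progress line, gives a simple series of state counts that could be plotted against line number or, if the log carries timestamps, against time.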




