[Swift-devel] Update on Teraport problems with wavelet workflow
    Ben Clifford 
    benc at hawaga.org.uk
       
    Wed Feb 28 12:24:13 CST 2007
    
    
  
do you have kickstart records for the nodes that *do* run?
On Wed, 28 Feb 2007, Tiberiu Stef-Praun wrote:
> Nothing gets generated in the individual job's temporary directories.
> There is no kickstart record.
> It would be really useful to find out the hostname of the node on
> which these jobs ran.
> 
> Let me retry some more workflow runs.
> 
> On 2/28/07, Ben Clifford <benc at hawaga.org.uk> wrote:
> > 
> > 
> > On Wed, 28 Feb 2007, Ben Clifford wrote:
> > 
> > > do you have kickstart records for the jobs that are failing?
> > 
> > if you do, then:
> > 
> > > > Summary/Speculation: bad teraport node causes job to be declared as
> > > > done even though the execution failed
> > 
> > this speculation can be investigated further by:
> > 
> > finding a job that breaks. finding the node name from its kickstart
> > record. grepping all the kickstart records to find the other records
> > from that node. looking to see if they all fail, or if some work and
> > some fail. then reporting back findings here.
> > 
> > --
> > 
> 
> 
> 
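The investigation quoted above (take the node name from a kickstart record, grep the other records for that node, then see whether they all fail) could be sketched roughly as below. This is only a sketch under assumptions: it assumes each kickstart record is a text/XML file containing a hostname="..." attribute and an exitcode="..." attribute, and the function names are made up for illustration. It can only say something about jobs that actually left a record behind.

```python
import re
import sys
from collections import defaultdict

# ASSUMPTION: a record carries hostname="..." and exitcode="..." attributes.
# Adjust these patterns to whatever the records on Teraport really contain.
HOST_RE = re.compile(r'hostname="([^"]+)"')
EXIT_RE = re.compile(r'exitcode="(-?\d+)"')


def parse_record(text):
    """Return (hostname, exitcode) for one record; either may be None."""
    host = HOST_RE.search(text)
    code = EXIT_RE.search(text)
    return (host.group(1) if host else None,
            int(code.group(1)) if code else None)


def outcomes_by_host(records):
    """Map hostname -> list of (record name, succeeded) pairs.

    `records` maps a record name (for example a file path) to its text.
    A record counts as succeeded only if its exit code is 0.
    """
    by_host = defaultdict(list)
    for name, text in sorted(records.items()):
        host, code = parse_record(text)
        by_host[host].append((name, code == 0))
    return dict(by_host)


def suspect_hosts(by_host):
    """Hosts on which every recorded job failed."""
    return sorted(host for host, runs in by_host.items()
                  if host is not None and not any(ok for _, ok in runs))


if __name__ == "__main__":
    # Usage (hypothetical): python script.py path/to/record1 path/to/record2 ...
    records = {}
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            records[path] = handle.read()
    by_host = outcomes_by_host(records)
    for host, runs in sorted(by_host.items(), key=lambda item: str(item[0])):
        failed = sum(1 for _, ok in runs if not ok)
        print(f"{host}: {len(runs)} records, {failed} failed")
    print("hosts where everything failed:", suspect_hosts(by_host))
```

A host that shows both successes and failures points away from the "bad node" theory; a host where every record failed is worth reporting back here.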