[Swift-devel] Persistent coasters on OSG Swift not getting started cores

Mihael Hategan hategan at mcs.anl.gov
Fri Sep 9 21:38:34 CDT 2011


There seem to be lots of errors in that log, but a lot of them have to
do with workers failing for unknown reasons.

This is no different from what you mentioned before, so we really need
to troubleshoot that. Please enable worker logging and collect the
worker logs.
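
Once the worker logs are collected, a first pass over them could be
automated. The following is only a minimal sketch, not part of Swift: the
function name summarize_worker_logs, the "*.log" file pattern, and the
keyword list are all assumptions and would need adjusting to the actual
worker log format. It counts problem keywords per log file so that the
failing workers stand out.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: problem lines contain one of these keywords. The real worker
# log format may differ, so adjust the pattern as needed.
PROBLEM = re.compile(r"\b(ERROR|WARN|FATAL|failed|timeout)\b", re.IGNORECASE)


def summarize_worker_logs(log_dir, pattern="*.log"):
    """Return {relative log path: Counter of matched keywords}.

    Only files with at least one matching line are included.
    """
    root = Path(log_dir)
    summary = {}
    for path in sorted(root.rglob(pattern)):
        counts = Counter()
        # errors="replace" keeps the scan going on undecodable bytes.
        with open(path, errors="replace") as handle:
            for line in handle:
                for match in PROBLEM.finditer(line):
                    counts[match.group(1).upper()] += 1
        if counts:
            summary[str(path.relative_to(root))] = counts
    return summary
```

Calling summarize_worker_logs on the directory holding the collected logs
returns one entry per log that contains matches, which gives a quick list
of the workers worth reading in full.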

On Fri, 2011-09-09 at 11:52 -0500, Ketan Maheshwari wrote:
> Hi Mihael, All,
> 
> 
> I am trying to run the DSSAT workflow, a simple, one-process,
> catsn-like loop.
> 
> 
> The setup on OSG is based on persistent coasters, with the following
> elements:
> 
> 
> 1. A coaster service is started on the head node
> 2. Workers are started on OSG sites. I am using 11 OSG sites.
> 3. The workers are submitted in the form of condor jobs which connect
> back to the service running at the headnode.
> 4. In the current instance that I am running, 500 workers are
> submitted to start, out of which 280 workers are in running state as
> of now.
> 
> 
> My throttles (jobthrottle and the foreach throttle) are set to run
> 500 tasks at a time.
> 
> 
> However, I am seeing a see-saw pattern of active tasks whose peak is
> very low: the number of active tasks rises gradually from 0 to about
> 30, then falls from 30 back to 0, and then climbs to 30 again.
> 
> 
> The logs and sources are at:
> http://ci.uchicago.edu/~ketan/DSSAT-logs.tgz
> 
> 
> This tarball contains the following:
> 
> 
> DSSAT-logs/sites.grid-ps.xml
> DSSAT-logs/tc-provider-staging
> DSSAT-logs/cf.ps
> DSSAT-logs/RunDSSAT.swift
> 
> 
> Condor and Swift logs:
> 
> 
> DSSAT-logs/condor.log
> DSSAT-logs/swift.log
> 
> 
> Service and worker stdout:
> 
> 
> DSSAT-logs/service-0.out
> DSSAT-logs/swift-workers.out
> 
> 
> Three run logs, since the run was resumed twice:
> 
> 
> DSSAT-logs/RunDSSAT-20110909-1025-hjcelum9.log
> DSSAT-logs/RunDSSAT-20110909-1030-jjefp0sb.log
> DSSAT-logs/RunDSSAT-20110909-0918-0hk7ign5.log
> 
> 
> Any insights would be helpful.
> 
> 
> Regards,
> -- 
> Ketan
> 
> 
> 




