[Swift-devel] Persistent coasters on OSG Swift not getting started cores
Ketan Maheshwari
ketancmaheshwari at gmail.com
Fri Sep 9 11:52:17 CDT 2011
Hi Mihael, All,
I am trying to run the DSSAT workflow, a simple one-process, catsn-like loop.
The setup on OSG is based on persistent coasters, with the following elements:
1. A coaster service is started on the head node
2. Workers are started on OSG sites. I am using 11 OSG sites.
3. The workers are submitted as Condor jobs, which connect back
to the service running on the head node.
4. In the current run, 500 workers have been submitted, of which
280 are in the running state as of now.
My throttles (jobthrottle and the foreach throttle) are set to run 500 tasks
at a time.
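For reference, a rough sketch of how a jobthrottle value maps to a
concurrency limit. It assumes the relation given in the Swift user guide,
max concurrent jobs = jobThrottle * 100 + 1; the helper names below are
hypothetical and are not part of Swift or of my actual configuration.

```python
# Sketch only: assumes max_jobs = jobThrottle * 100 + 1 (Swift user guide).
# Function names are illustrative, not a Swift API.

def job_throttle_for(max_jobs: int) -> float:
    """Return the jobThrottle value that allows max_jobs concurrent jobs."""
    if max_jobs < 1:
        raise ValueError("max_jobs must be at least 1")
    return (max_jobs - 1) / 100.0


def max_jobs_for(job_throttle: float) -> int:
    """Return the concurrent-job limit implied by a jobThrottle value."""
    return int(round(job_throttle * 100)) + 1


if __name__ == "__main__":
    # Throttle needed for 500 concurrent tasks under the assumed relation.
    print(job_throttle_for(500))
    # Round trip back to the job limit.
    print(max_jobs_for(job_throttle_for(500)))
```

Under that assumption, a limit of 500 tasks corresponds to a jobthrottle of
4.99, which is well above the peak of about 30 active tasks described below.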
However, I am seeing a see-saw pattern of active tasks whose peak is very
low: the number of active tasks rises gradually from 0 to about 30, then
falls from 30 back to 0, and then climbs to 30 again.
The logs and sources are at : http://ci.uchicago.edu/~ketan/DSSAT-logs.tgz
This tarball contains the following:
DSSAT-logs/sites.grid-ps.xml
DSSAT-logs/tc-provider-staging
DSSAT-logs/cf.ps
DSSAT-logs/RunDSSAT.swift
Condor and Swift logs:
DSSAT-logs/condor.log
DSSAT-logs/swift.log
Service and worker stdouts:
DSSAT-logs/service-0.out
DSSAT-logs/swift-workers.out
Three runlogs since the run was resumed twice:
DSSAT-logs/RunDSSAT-20110909-1025-hjcelum9.log
DSSAT-logs/RunDSSAT-20110909-1030-jjefp0sb.log
DSSAT-logs/RunDSSAT-20110909-0918-0hk7ign5.log
Any insights would be helpful.
Regards,
--
Ketan