[Swift-devel] Swift gsiftp staging issues on OSG

Mihael Hategan hategan at mcs.anl.gov
Sun Oct 16 15:06:39 CDT 2011


There are craploads of errors in there of all kinds and sorts, but very
few of them are actual transfer problems. They look more like
gridftp/filesystem configuration issues.

I attached a sorted list of exceptions.

However, this is irrelevant. It looks like so far we keep running this
test that clearly doesn't work and hope that it will work. That's silly.
We need to figure out each problem individually and fix things one by
one.

So here's my proposal. We list all the problems that can be seen in that
log and try to fix them in order. We do not re-run the whole thing
unless we have actually solved at least one problem. We also sync
periodically on what was done (i.e. we keep a list that we update
immediately after something is done about an item). And before doing
an integration test after a problem is fixed, we run a test for that
specific problem, on a specific site only. There is way too much noise in
these big runs, and that makes it very hard to see what is happening.

So here's a first list:
http://www.ci.uchicago.edu/wiki/bin/view/SWFT/OSGTesting
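
For anyone who wants to rebuild a sorted list of exceptions from a log
themselves, here is a rough sketch in Python. It is not how the attached
list was produced. It assumes that exceptions appear in the log as fully
qualified Java class names ending in "Exception" or "Error"; the actual
log layout may differ, and the function names are made up for
illustration.

```python
import re
from collections import Counter

# Assumption: exceptions appear as fully qualified Java class names
# (lowercase package segments, then a class name ending in "Exception"
# or "Error"). The real log layout may differ.
EXC_RE = re.compile(
    r"\b((?:[a-z_][\w$]*\.)+[A-Z][\w$]*(?:Exception|Error))\b"
)


def tally_exceptions(lines):
    """Count how often each exception class name occurs in the lines."""
    counts = Counter()
    for line in lines:
        counts.update(EXC_RE.findall(line))
    return counts


def sorted_report(counts):
    """Return (name, count) pairs, most frequent first, ties by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Passing an open log file to tally_exceptions() and printing the result
of sorted_report() gives one line per exception class, which can serve
as a starting point for the per-problem list.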



On Sun, 2011-10-16 at 09:54 -0500, Ketan Maheshwari wrote:
> Hello,
> 
> 
> While running an Extenci workflow on OSG with persistent coasters
> (multiple coaster services, one per OSG site) and gsiftp staging, I am
> facing some gridftp-related issues. Following are some details of the
> run:
> 
> 
> A set of 15 OSG sites was selected after testing them for being
> responsive ('greensites'). I performed a separate guc test on these
> sites, which seemed to succeed for each site (200MB roundtrip
> transfer in 7 mins for all sites).
> 
> 
> However, while running my workflow from Swift, many of these transfers
> fail showing a variety of errors, most pertaining to the data
> transfers.
> 
> 
> I noticed that these transfers fail irrespective of data size (250K
> - 150M) and also seem to fail intermittently for different sites.
> 
> 
> The log for this run is
> here: http://www.mcs.anl.gov/~ketan/postproc-gridftp-20111013-2324-5qzebq16.log
> 
> 
> I am providing 7G of heap space on the Swift command line, and the host
> has 50G of total memory.
> 
> 
> Any ideas?
> 
> 
> 
> 
> Regards,
> -- 
> Ketan
> 
> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: err2.txt.gz
Type: application/x-gzip
Size: 10001 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-devel/attachments/20111016/316a1c70/attachment.bin>

