[Swift-devel] [Bug 357] New: Script hangs in staging on OSG

bugzilla-daemon at mcs.anl.gov bugzilla-daemon at mcs.anl.gov
Thu Apr 14 15:38:49 CDT 2011


https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=357

           Summary: Script hangs in staging on OSG
           Product: Swift
           Version: 0.92
          Platform: All
        OS/Version: Linux
            Status: ASSIGNED
          Severity: major
          Priority: P1
         Component: Providers
        AssignedTo: hategan at mcs.anl.gov
        ReportedBy: wilde at mcs.anl.gov
                CC: aespinosa at cs.uchicago.edu


Allan's SCEC script is hanging after several hours of successful execution on
approx. 10 OSG sites.

Staging is via the gridftp provider. Execution is via coasters.

It appears that staging for a single job never completes; then a short time
later all staging hangs.

There is info in recent email threads from Allan to swift-devel with replies
from Mihael on the problem.

This has now happened in 4 multi-hour runs, always after several hours of
execution.

Two attached images show stage-in and stage-out events.

Allan is trying to pin this down to a single transfer that may have triggered
the hang.

We have one log of the hang (1.8GB):

dir: /home/aespinosa/workflows/cybershake/archive-runs/test

-rw-r--r-- 1 aespinosa ci-users 1868896768 Apr  8 14:46 postproc-20110407-1438-i90jepr3.log
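
One way to narrow a log of this size down to a single suspect transfer is to
stream it line by line and keep only the transfers that started but never
finished. The sketch below assumes that approach. The marker strings
TRANSFER_START and TRANSFER_END and the id= field are hypothetical
placeholders, not Swift's actual log format, and would need to be replaced
with whatever the real log emits for stage-in and stage-out events.

```python
import re
from typing import Dict, Iterable

# ASSUMPTION: hypothetical event markers and "id=" field. These are NOT the
# real Swift log format; adjust both patterns to match the actual log lines.
START_RE = re.compile(r"^(?P<ts>\S+ \S+).*\bTRANSFER_START\b.*\bid=(?P<id>\S+)")
END_RE = re.compile(r"\bTRANSFER_END\b.*\bid=(?P<id>\S+)")


def unfinished_transfers(lines: Iterable[str]) -> Dict[str, str]:
    """Map transfer id -> start timestamp for starts with no matching end."""
    pending: Dict[str, str] = {}
    for line in lines:
        m = START_RE.search(line)
        if m:
            pending[m.group("id")] = m.group("ts")
            continue
        m = END_RE.search(line)
        if m:
            # Transfer completed, so it is no longer a suspect.
            pending.pop(m.group("id"), None)
    return pending


def scan_file(path: str) -> Dict[str, str]:
    """Stream a large log from disk without loading it into memory."""
    with open(path, "r", errors="replace") as fh:
        return unfinished_transfers(fh)
```

Because the dictionary keeps insertion order, the first entry returned is the
earliest transfer that never completed, which is the most likely candidate for
the one that triggered the hang.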



