[Swift-devel] [Bug 357] New: Script hangs in staging on OSG
    bugzilla-daemon at mcs.anl.gov 
       
    Thu Apr 14 15:38:49 CDT 2011
    
    
  
https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=357
           Summary: Script hangs in staging on OSG
           Product: Swift
           Version: 0.92
          Platform: All
        OS/Version: Linux
            Status: ASSIGNED
          Severity: major
          Priority: P1
         Component: Providers
        AssignedTo: hategan at mcs.anl.gov
        ReportedBy: wilde at mcs.anl.gov
                CC: aespinosa at cs.uchicago.edu
Allan's SCEC script is hanging after several hours of successful execution on
approximately 10 OSG sites.
Staging is via the gridftp provider. Execution is via coasters.
It appears that staging for a single job never completes; then a short time
later all staging hangs.
There is info in recent email threads from Allan to swift-devel with replies
from Mihael on the problem.
This has now happened to 4 multi-hour runs, always after several hours of
execution.
Two attached images show stage-in and stage-out events.
Allan is trying to pin this down to a single transfer that may have triggered
the hang.
We have one log of the hang (1.8GB):
dir: /home/aespinosa/workflows/cybershake/archive-runs/test
-rw-r--r-- 1 aespinosa ci-users 1868896768 Apr  8 14:46 postproc-20110407-1438-i90jepr3.log