[Swift-devel] Coaster socket issue

David Kelly davidk at ci.uchicago.edu
Wed Mar 28 20:49:21 CDT 2012


Strange, I just ran into a similar issue tonight while running on ibicluster (SGE). I saw the "too many open files" error after sitting in the queue waiting for a job to start. I restarted the job and then periodically ran 'lsof', which showed the number of java pipes increasing over time. I thought at first this might be SGE specific, but perhaps it is something else. (This was with 0.93)
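
For anyone who wants to watch the descriptor count without rerunning 'lsof' by hand, here is a minimal sketch, assuming a Linux node with /proc mounted. The function names are made up for illustration and nothing here is part of Swift or Coasters. It counts the entries in /proc/<pid>/fd and reports the soft limit that 'ulimit -n' would show.

```python
import os
import resource


def count_open_fds(pid="self"):
    """Return the number of entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage(pid="self"):
    """Return (open_fds, soft_limit).

    Assumption: the soft limit is read for the *calling* process, so it
    only matches the target pid if both were started with the same
    'ulimit -n' setting.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return count_open_fds(pid), soft


if __name__ == "__main__":
    used, limit = fd_usage()
    print(f"{used} descriptors open, soft limit {limit}")
```

Calling count_open_fds with the pid of the java process every minute or so should show whether the count keeps climbing toward the limit. Reading another user's /proc/<pid>/fd needs the appropriate permissions.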

----- Original Message -----
> From: "Jonathan Monette" <jonmon at mcs.anl.gov>
> To: "swift-devel at ci.uchicago.edu Devel" <swift-devel at ci.uchicago.edu>
> Sent: Wednesday, March 28, 2012 8:30:52 PM
> Subject: [Swift-devel] Coaster socket issue
> Hello,
> While running the SciColSim app on raven (a cluster similar to
> Beagle) I noticed that the app hung. The hang checker had not kicked
> in, but Swift was waiting for jobs to become active and
> none had been submitted to PBS. I took a look at the log file and
> noticed that a java.io.IOException had been thrown for "too many open
> files". Since I killed it I couldn't probe the run, but I had the same
> run running on Beagle. At Mike's suggestion I took a look at the
> /proc/<pid>/fd directory. There were over 2000 sockets in the
> CLOSE_WAIT state with a single message in the receive queue. Raven has
> a limit of 1024 open files at a time, while Beagle has a limit of around
> 60K open files. I got these limits using ulimit -n.
> 
> So my question is, why are there so many sockets waiting to be closed?
> I did some reading about the CLOSE_WAIT state and it seems this
> happens when one end closes its socket but the other does
> not. Is Coaster not closing the socket when a worker shuts down? What
> other information should I be looking for to help debug the issue?
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
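
The CLOSE_WAIT count mentioned in the quoted message can be tallied from the kernel's socket table. The following is a minimal sketch, assuming Linux and the /proc/net/tcp text format, where the fourth column is a hex state code and 08 means CLOSE_WAIT. The function name is hypothetical and the table covers the whole node, not a single process.

```python
from collections import Counter

# Hex state codes used in the "st" column of /proc/net/tcp and /proc/net/tcp6.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def tcp_state_counts(lines):
    """Count sockets per TCP state from /proc/net/tcp-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row ("sl local_address ...") and short lines.
        if len(fields) < 4 or fields[0] == "sl":
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


if __name__ == "__main__":
    # Java often opens dual-stack sockets, so tcp6 may need checking too.
    with open("/proc/net/tcp") as table:
        print(tcp_state_counts(table)["CLOSE_WAIT"], "sockets in CLOSE_WAIT")
```

A CLOSE_WAIT total that grows over the life of a run would be consistent with the local side never calling close() after the peer has shut down its end.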


