[Swift-devel] Re: [Fwd: TeraGrid News: UC/ANL GPFS and PVFS file systems unavailable]

Ti Leggett leggett at ci.uchicago.edu
Mon Sep 10 10:42:07 CDT 2007


For the last few weeks a particular user has been running nodes out of
memory, causing the kernel to start randomly killing processes. This
leaves the nodes in quite weird states, but in many cases the
resource manager still thinks they're OK to schedule jobs on. The
best thing to do when you find this or any other problem on the TG is
to file a ticket with help at teragrid.org giving at least your userid,
the jobid, and the nodes you were running on and, if you can
determine it, the particular problematic node.

On Sep 10, 2007, at 9:58 AM, Michael Wilde wrote:

> Ti, just an fyi:
>
> as you work on this, I'd like to mention that Ioan, Nika and I have  
> been plagued for quite a while now (4 weeks or more I think for  
> Ioan and Nika) by an *occasional* node that seems to return "Stale
> NFS file handle" when accessing scratchgpfs1 files.
>
> We've been working around this, but I wonder if something that  
> occasionally knocks out a few nodes' access to gpfs has now  
> happened en masse?
>
> All: Moving forward, everyone who encounters this (or any other  
> system problems) should file a trouble ticket right away so the bad  
> nodes can be fixed.
>
> Thanks,
>
> Mike
>
>
> -------- Original Message --------
> Subject: TeraGrid News: UC/ANL GPFS and PVFS file systems unavailable
> Date: Mon, 10 Sep 2007 06:48:21 -0700 (PDT)
> From: news at teragrid.org
>
>
> UC/ANL GPFS and PVFS file systems unavailable
>
> Systems: UC/ANL
> Posted on Sep 10 2007, 13:46:33 (GMT/UTC) by Ti Leggett
>
> The GPFS (local and WAN) and PVFS scratch file systems are currently
> unavailable. We are working to re-establish them.
>
> _______________________________________________________________
> This message can also be found at
> http://news.teragrid.org/announcements/20070910_01.php.  To unsubscribe
> or change the categories to which you are subscribed, go to
> http://news.teragrid.org/user.php#manage.
>
>



