[Swift-devel] Index out of bounds

Michael Wilde wilde at mcs.anl.gov
Sat Aug 20 21:03:35 CDT 2011


Jon, the list you want for Beagle issue notifications is beagle-users. You can subscribe via the link: 



https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/beagle-users 


- Mike 



----- Forwarded Message ----- 
From: "Greg Cross" <grog at ci.uchicago.edu> 
To: beagle-users at ci.uchicago.edu 
Sent: Saturday, August 20, 2011 2:12:45 PM 
Subject: [beagle-users] Outage update 


Lustre is mounting properly but there is a communication failure between the Moab and ALPS scheduler components. This issue is under investigation and has been escalated to Cray. 


As a reminder, please DO NOT attempt to log into the system during this or any other maintenance period. While logins should be denied at this time, any user processes found running on login or sandbox nodes will be terminated without warning. Users who do not respect this may be contacted individually. 


Definitive notification will be sent to this mailing list when the system is available for use. 




_______________________________________________ 
beagle-users mailing list 
beagle-users at ci.uchicago.edu 
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/beagle-users 


----- Original Message -----


From: "Jonathan Monette" <jonmon at mcs.anl.gov> 
To: "Daniel S. Katz" <dsk at ci.uchicago.edu> 
Cc: swift-devel at ci.uchicago.edu 
Sent: Saturday, August 20, 2011 4:20:35 PM 
Subject: Re: [Swift-devel] Index out of bounds 

Thanks. In the meantime could someone let me know when beagle is back in production so I can check my run? 


----- Reply message ----- 
From: "Daniel S. Katz" <dsk at ci.uchicago.edu> 
Date: Sat, Aug 20, 2011 3:14 pm 
Subject: [Swift-devel] Index out of bounds 
To: "Jonathan Monette" <jonmon at mcs.anl.gov> 
Cc: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>, "swift-devel at ci.uchicago.edu" <swift-devel at ci.uchicago.edu> 



Yes, write to beagle-support. 

On Aug 20, 2011, at 14:52, "Jonathan Monette" < jonmon at mcs.anl.gov > wrote: 





Ok thanks. It seems that I was not added to the beagle-notify list. Could someone point me to a link I can subscribe to? Or do I subscribe by sending mail to beagle-support? 


----- Reply message ----- 
From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > 
Date: Sat, Aug 20, 2011 7:45 am 
Subject: [Swift-devel] Index out of bounds 
To: "Jonathan Monette" < jonmon at mcs.anl.gov > 
Cc: < swift-devel at ci.uchicago.edu > 


Yes, Beagle went down yesterday. There was a notice. 


Current status as of Aug 19, 5:30 PM: 


== 
At this time, Lustre is not starting properly on Beagle. This may be related to a configuration change that was made during the last outage. The effort to restore system availability is still in active progress. 
== 




Ketan 


On Sat, Aug 20, 2011 at 12:03 AM, Jonathan Monette < jonmon at mcs.anl.gov > wrote: 


I updated and rebuilt and added that line to my log4j properties. Does anyone know if Beagle is down? showq says there is no service listening to sdb:<number>. qstat shows that I have a job sitting in the queue but it doesn't look like jobs are running. 

I am using both PADS and Beagle for this execution. In this case, where jobs are not executing on Beagle, shouldn't Swift start submitting jobs to PADS? I do not see that behavior. 

This run is still executing. But if you would like to look at the log it is at www.ci.uchicago.edu/~jonmon/logs/montage-2.log . Only 23 tasks have finished; it now just sits there waiting for Beagle to run. 
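
For reference, the setting requested earlier in this thread (log level DEBUG for org.globus.cog.abstraction.impl.file.local) would typically be a single line in a log4j properties file. This is only a sketch, assuming standard log4j 1.x properties syntax; which properties file Swift reads, and where it lives, is not stated in this thread.

```properties
# Assumes log4j 1.x properties syntax; the file to edit is not named in this thread.
log4j.logger.org.globus.cog.abstraction.impl.file.local=DEBUG
```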



On Aug 19, 2011, at 2:46 PM, Jonathan Monette wrote: 

> Sure can. Do I add that line to the log4j file or to a different properties file? 
> 
> ----- Reply message ----- 
> From: "Mihael Hategan" < hategan at mcs.anl.gov > 
> Date: Fri, Aug 19, 2011 2:03 pm 
> Subject: Index out of bounds 
> To: "Jonathan Monette" < jonmon at mcs.anl.gov > 
> Cc: < swift-devel at ci.uchicago.edu > 
> 
> 
> Hmm. So I can't see how this manages to happen. 
> 
> I added some checks and debugging statements. Can you update, set log 
> level of org.globus.cog.abstraction.impl.file.local to DEBUG, re-run and 
> then post the log when the exception pops up? 
> 
> Mihael 
> 
> On Thu, 2011-08-18 at 23:14 -0500, Jonathan Monette wrote: 
> > Ok. The log is at 
> > www.ci.uchicago.edu/~jonmon/logs/montage-1.log 
> > On Aug 18, 2011, at 5:56 PM, Mihael Hategan wrote: 
> > 
> > > It's probably a good idea to post the stack trace of that exception now 
> > > rather than later. 
> > > 
> > > On Thu, 2011-08-18 at 13:09 -0500, Jonathan Monette wrote: 
> > >> Hello, 
> > >> I was running 0.93 with a relatively small run, a 350-task run. 
> > >> The run failed on one of the final tasks. I checked the log file and 
> > >> saw some index out of bounds errors. I tried with a smaller run and 
> > >> didn't see the error. 
> > >> 
> > >> This run was using beagle, pads, and communicado. I was also using 
> > >> cdm. I will post the log in a bit. I am seeing if I can replicate it 
> > >> without using cdm and with a smaller site pool. 
> > >> 
> > > 
> > > 
> > 
> 
> 
> 
> 



> _______________________________________________ 
> Swift-devel mailing list 
> Swift-devel at ci.uchicago.edu 
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel 





-- 
Ketan 









-- 
Michael Wilde 
Computation Institute, University of Chicago 
Mathematics and Computer Science Division 
Argonne National Laboratory 
