[Swift-devel] Q about MolDyn

Mihael Hategan hategan at mcs.anl.gov
Mon Aug 6 15:52:31 CDT 2007


On Mon, 2007-08-06 at 15:17 -0500, Veronika Nefedova wrote:
> OK. There is something weird happening. I've got several such entries  
> in my swift log:
> 
> 2007-08-06 14:46:58,565 DEBUG vdl:execute2 Application exception: Task failed
>          task:execute @ vdl-int.k, line: 332
>          vdl:execute2 @ execute-default.k, line: 22
>          vdl:execute @ MolDyn-244-loops.kml, line: 20
>          antchmbr @ MolDyn-244-loops.kml, line: 2845
>          vdl:mains @ MolDyn-244-loops.kml, line: 2267

That doesn't say much. Any more details in the logs?
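
One way to pull more detail out of the log is to collect the identities of the tasks that were set to Failed and then print every log line that mentions them. Below is a minimal Python sketch, assuming the TaskImpl lines look like the ones quoted further down once the mail wrapping is undone. The helper names (failed_tasks, lines_mentioning) are hypothetical and not part of Swift.

```python
import re

# Matches TaskImpl status lines of the form seen in the quoted log excerpt.
# Whitespace after "urn:" is tolerated because the mail client wrapped there.
FAILED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) DEBUG TaskImpl "
    r"Task\(type=\d+, identity=urn:\s*(?P<id>[\w-]+)\) "
    r"setting status to Failed"
)


def failed_tasks(lines):
    """Return (timestamp, identity) pairs for tasks set to Failed."""
    found = []
    for line in lines:
        match = FAILED_RE.match(line.strip())
        if match:
            found.append((match.group("ts"), "urn:" + match.group("id")))
    return found


def lines_mentioning(lines, identity):
    """Return every log line that contains the given task identity."""
    return [line for line in lines if identity in line]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python failed_tasks.py <swift-log>
    with open(sys.argv[1]) as handle:
        log_lines = handle.read().splitlines()
    for timestamp, identity in failed_tasks(log_lines):
        print(timestamp, identity)
        for line in lines_mentioning(log_lines, identity):
            print("    " + line)
```

Whatever the log shows for those identities just before the status change is probably the part worth posting.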

> 
> 
> Looks like antechamber has failed (?). And the failure is only on the
> Swift side; it never made it across to Falkon (there are no remote
> directories created). But I see that some of the antechamber jobs have
> finished (in shared).
> 
> Yuqing -- could the changes you've made be responsible for these  
> failures (I do not see how it could though) ?
> 
> Ioan, what do you see in your logs on these tasks:
> 
> 2007-08-06 14:46:58,555 DEBUG TaskImpl Task(type=1, identity=urn:0-1-56-0-1186429255786) setting status to Failed
> 2007-08-06 14:46:58,556 DEBUG TaskImpl Task(type=1, identity=urn:0-1-57-0-1186429255798) setting status to Failed
> 2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-59-0-1186429255800) setting status to Failed
> 2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-60-0-1186429255805) setting status to Failed
> 2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-61-0-1186429255811) setting status to Failed
> 2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-58-0-1186429255814) setting status to Failed
> 
> Nika
> 
> On Aug 6, 2007, at 2:29 PM, Ioan Raicu wrote:
> 
> > OK!
> > Why don't we do one last run from my allocation, as everything is  
> > set up already and ready to go!  Make sure to enable all debug  
> > logging.  Falkon is up and running with all debug enabled!
> >
> > Falkon location is unchanged from the last experiment.
> > Falkon Factory Service: http://tg-viz-login2:50010/wsrf/services/GenericPortal/core/WS/GPFactoryService
> > Web Server (graphs): http://tg-viz-login2.uc.teragrid.org:51000/index.htm
> >
> > ANL/UC is not quite so idle as it was earlier, but I bet we could  
> > still get 150~200 processors!
> >
> > Ioan
> >
> > Veronika Nefedova wrote:
> >> m050 and m179 finished just fine now via GRAM (thanks to Yuqing
> >> who fixed m179 just in time!). We could start the 244-molecule
> >> run again to verify that nothing is wrong with the whole system.
> >>
> >> Nika
> >>
> >> On Aug 6, 2007, at 12:20 PM, Veronika Nefedova wrote:
> >>
> >>>
> >>> On Aug 6, 2007, at 11:51 AM, Ioan Raicu wrote:
> >>>
> >>>
> >>> I started those 2 molecules via GRAM. I have no trust in m179  
> >>> finishing completely since I didn't change anything. I hope for  
> >>> m050 to finish though...
> >>> You can watch the swift log on viper in
> >>> ~nefedova/alamines/MolDyn-2-loops-be9484k93kk21.log
> >>>
> >>> Nika
> >>>
> >>>> Then, let's try another run with 244 molecules soon, as most of  
> >>>> ANL/UC is free!
> >>>>
> >>>> Ioan
> >>>>
> >>
> >>
> >
> 