[Swift-devel] Q about MolDyn

Veronika Nefedova nefedova at mcs.anl.gov
Mon Aug 6 15:17:05 CDT 2007


OK. There is something weird happening. I've got several entries like
this in my Swift log:

2007-08-06 14:46:58,565 DEBUG vdl:execute2 Application exception: Task failed
         task:execute @ vdl-int.k, line: 332
         vdl:execute2 @ execute-default.k, line: 22
         vdl:execute @ MolDyn-244-loops.kml, line: 20
         antchmbr @ MolDyn-244-loops.kml, line: 2845
         vdl:mains @ MolDyn-244-loops.kml, line: 2267


Looks like antechamber has failed (?). And the failure is only on the
Swift side; it never made it across to Falkon (there are no remote
directories created). But I see that some of the antechamber jobs have
finished (in shared).

Yuqing -- could the changes you've made be responsible for these
failures (I do not see how they could, though)?

Ioan, what do you see in your logs for these tasks:

2007-08-06 14:46:58,555 DEBUG TaskImpl Task(type=1, identity=urn:0-1-56-0-1186429255786) setting status to Failed
2007-08-06 14:46:58,556 DEBUG TaskImpl Task(type=1, identity=urn:0-1-57-0-1186429255798) setting status to Failed
2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-59-0-1186429255800) setting status to Failed
2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-60-0-1186429255805) setting status to Failed
2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-61-0-1186429255811) setting status to Failed
2007-08-06 14:46:58,558 DEBUG TaskImpl Task(type=1, identity=urn:0-1-58-0-1186429255814) setting status to Failed
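
To cross-check these against the Falkon side, the failed task identities
could be pulled out of the Swift log with something like the sketch below.
This is only an illustration: the function name failed_tasks is made up
here, it is not part of Swift or Falkon, and it assumes each "setting
status to Failed" entry sits on a single line in the actual log file (the
line breaks above come from mail wrapping).

```python
import re

# Matches TaskImpl entries of the form shown above, for example:
#   2007-08-06 14:46:58,555 DEBUG TaskImpl Task(type=1, identity=urn:0-1-56-0-1186429255786) setting status to Failed
# The optional whitespace after "urn:" tolerates copies mangled by mail wrapping.
FAILED_TASK = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) DEBUG TaskImpl "
    r"Task\(type=\d+, identity=(?P<identity>urn:\s*[\w-]+)\) setting status to Failed"
)


def failed_tasks(lines):
    """Return (timestamp, identity) pairs for tasks whose status was set to Failed.

    Hypothetical helper for illustration; 'lines' is any iterable of log lines.
    """
    found = []
    for line in lines:
        match = FAILED_TASK.match(line)
        if match:
            # Drop any stray whitespace inside the URN.
            identity = re.sub(r"\s+", "", match.group("identity"))
            found.append((match.group("timestamp"), identity))
    return found
```

Passing an open log file to failed_tasks() would give a list of timestamps
and URNs that can be searched for in the Falkon logs.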

Nika

On Aug 6, 2007, at 2:29 PM, Ioan Raicu wrote:

> OK!
> Why don't we do one last run from my allocation, as everything is  
> set up already and ready to go!  Make sure to enable all debug  
> logging.  Falkon is up and running with all debug enabled!
>
> Falkon location is unchanged from the last experiment.
> Falkon Factory Service: http://tg-viz-login2:50010/wsrf/services/GenericPortal/core/WS/GPFactoryService
> Web Server (graphs): http://tg-viz-login2.uc.teragrid.org:51000/index.htm
>
> ANL/UC is not quite so idle as it was earlier, but I bet we could  
> still get 150~200 processors!
>
> Ioan
>
> Veronika Nefedova wrote:
>> m050 and m179 finished just fine now via GRAM (thanks to Yuqing,
>> who fixed m179 just in time!). We could start the 244-molecule
>> run again to verify that nothing is wrong with the whole system.
>>
>> Nika
>>
>> On Aug 6, 2007, at 12:20 PM, Veronika Nefedova wrote:
>>
>>>
>>> On Aug 6, 2007, at 11:51 AM, Ioan Raicu wrote:
>>>
>>>
>>> I started those 2 molecules via GRAM. I have no trust in m179  
>>> finishing completely since I didn't change anything. I hope for  
>>> m050 to finish though...
>>> You can watch the Swift log on viper in
>>> ~nefedova/alamines/MolDyn-2-loops-be9484k93kk21.log
>>>
>>> Nika
>>>
>>>> Then, let's try another run with 244 molecules soon, as most of  
>>>> ANL/UC is free!
>>>>
>>>> Ioan
>>>>
>>
>>
>



