[Swift-devel] Workers fail on some OSG sites

Yadu Nand Babuji yadunand at uchicago.edu
Tue Oct 14 20:30:40 CDT 2014


Mike, Mihael,

I've tried to match the errors in the Condor*stderr files, which contain the
block ids of the workers, with the entries in the swift log. In every case
where a worker failed with the error string "Failed to process data", that
worker had connected to the right coaster service and was live for some
time before it died.

Each section of the log starts with the name of the Coaster*stderr file
and the grepped worker block id. This is followed by a grep of the block id
in the whole run folder:

http://users.rcc.uchicago.edu/~yadunand/worker_mia.log
Complete run folder : http://users.rcc.uchicago.edu/~yadunand/run011/
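
For anyone who wants to repeat the matching, the steps above could be
sketched roughly as follows in Python. This is only an illustration, not the
script used to produce worker_mia.log: the function names are made up, and the
regular expression assumes the block id appears in the
"No such block: <id>" form seen in the worker errors quoted below.

```python
import re
from pathlib import Path

# Assumed error format, taken from the quoted worker errors:
# "... (service returned error: No such block: 1013-5205390-000018) ..."
BLOCK_ID_RE = re.compile(r"No such block: (\d+-\d+-\d+)")


def failed_block_ids(stderr_text):
    """Return the unique block ids named in 'No such block' errors,
    in the order they first appear (hypothetical helper)."""
    seen = []
    for match in BLOCK_ID_RE.finditer(stderr_text):
        block_id = match.group(1)
        if block_id not in seen:
            seen.append(block_id)
    return seen


def grep_run_folder(run_dir, block_id):
    """Yield (path, line_number, line) for every line in any file under
    run_dir that mentions block_id (hypothetical helper)."""
    for path in sorted(Path(run_dir).rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on non-UTF-8 log bytes.
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if block_id in line:
                    yield path, number, line.rstrip("\n")
```

Calling failed_block_ids() on the contents of one stderr file and then
grep_run_folder() on the run folder for each id returned gives the same kind
of per-block listing as the log linked above.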

I remember David had reported channel timeouts from workers on midway with
0.95, and I want to be sure that there isn't some deeper problem being
overlooked. I see several channel timeouts in the logs.

Thanks,
Yadu


On 10/14/2014 12:16 PM, Michael Wilde wrote:
> On 10/14/14 12:10 PM, Mihael Hategan wrote:
>> On Tue, 2014-10-14 at 11:54 -0500, Michael Wilde wrote:
>>
>>> But on several sites, worker.pl encounters errors like the ones below:
>> What do you mean by "like"?
> I meant that these were an actual example of the error messages being seen.
>>> send: Cannot determine peer address at cscript3076873223245775853.pl
>>> line 490
>>> Failed to process data: Failed to register (service returned error: No
>>> such block: 1013-5205390-000018) at cscript3076873223245775853.pl line 1101.
>>> Failed to process data: Failed to register (service returned error: No
>>> such block: 1013-5205390-000051) at cscript3076873223245775853.pl line 1101.
>>> Failed to process data: Failed to register (service returned error: No
>>> such block: 1013-5205390-000043) at cscript3076873223245775853.pl line 1101.
>> This occurs when a worker started by one service tries to connect to a
>> different service.
> That's a good clue. Yadu, is that possible in this case?  Please provide
> details of the configuration.
>
> Thanks,
>
> - Mike
>
>> Mihael
>>



