[Swift-devel] IO overheads of swift wrapper scripts on BlueGene/P
Michael Wilde
wilde at mcs.anl.gov
Sat Oct 17 12:29:38 CDT 2009
Remember that any situation in which multiple IONs modify the same file
or directory (i.e., by creating files or directories in the same parent
directory) will cause severe contention and performance degradation on
any GPFS filesystem.
In addition to creating many directories, you need to ensure that no
single file or directory is likely to ever be written to from multiple
client nodes (e.g., IONs on the BG/P) concurrently.
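
As a rough illustration of the layout I mean, here is a minimal Python
sketch. It is not what the Swift wrapper does today. The names
(node_dir, precreate_node_dirs, output_path), the "node-NNNN" and
"bucket-NNNN" naming, and the cap of 100 files per directory are all
hypothetical choices for the example, not measured GPFS limits. The
idea is that one process creates the per-node directories up front, so
the shared parent is never modified by several clients at once, and
each node then writes only beneath its own directory.

```python
import os
import tempfile

# Assumed cap on files per leaf directory; illustrative, not a tuned value.
FILES_PER_DIR = 100


def node_dir(root, node_id):
    """Directory that exactly one client node (e.g. one ION) writes to."""
    return os.path.join(root, f"node-{node_id:04d}")


def precreate_node_dirs(root, node_ids):
    """Create every per-node directory from a SINGLE process before the run.

    Creating these entries modifies the shared parent `root`, so it should
    not be done concurrently from many client nodes.
    """
    for node_id in node_ids:
        os.makedirs(node_dir(root, node_id), exist_ok=True)


def output_path(root, node_id, file_index, name):
    """Return the path where `node_id` should write its file_index-th output.

    Bucket directories are created lazily, but only inside the node's own
    directory, so no other client node touches the same parent.
    """
    bucket = file_index // FILES_PER_DIR
    leaf = os.path.join(node_dir(root, node_id), f"bucket-{bucket:04d}")
    os.makedirs(leaf, exist_ok=True)
    return os.path.join(leaf, name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        precreate_node_dirs(root, range(4))          # done once, by one process
        print(output_path(root, 2, 250, "job.out"))  # called on node 2 only
```

With a layout like this, the number of files per directory and the
number of writers per directory can be varied independently, which
should make it easier to tell which of the two is hurting.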
Have you done that in this workload, Allan?
- Mike
On 10/17/09 2:59 AM, Allan Espinosa wrote:
> I was using 1000 files (or was it 3000?) per directory. It looks like
> I need to lower my ratio...
>
> -Allan
>
> 2009/10/16 Mihael Hategan <hategan at mcs.anl.gov>:
>> On Fri, 2009-10-16 at 21:07 -0500, Allan Espinosa wrote:
>>> Progress 2009-10-16 18:00:33.756364000-0500 COPYING_OUTPUTS
>>> Progress 2009-10-16 18:08:19.970449000-0500 RM_JOBDIR
>> Grr. 8 minutes spent COPYING_OUTPUTS.
>>
>> What would be useful is to aggregate all the accesses that happened on
>> that FS from all the relevant jobs, to see exactly what causes the
>> contention. I strongly suspect it's
>> home/espinosa/workflows/jgi_blastp/test3.4.7_3cpn.32ifs.192cpu/output/
>>
>> Pretty much all the outputs seem to go to that directory.
>>
>> I'm afraid however that the information in the logs is insufficient.
>> Strace with relevant options (for fs calls only) may be useful if you
>> want to try.
>>
>> Alternatively, you could try to spread your output over multiple
>> directories and see what the difference is.
>>
>> Also, it may be interesting to see the dependence between the delay and
>> the number of contending processes. That is so that we know the limit of
>> how many processes we can allow to compete for a shared resource without
>> causing too much trouble.
>>
>> Mihael