[Swift-user] Problems in complex mapping situation
Michael Wilde
wilde at mcs.anl.gov
Fri Mar 20 11:13:36 CDT 2009
Very nice. This script:
--
type file;
type struct {
  file mem1;
  file mem2;
}
app (file o) echo (string s) { echo s stdout=@o; }
file a[];
a[0] = echo("0");
a[49] = echo("49");
a[1000] = echo("1000");
a[20000] = echo("20000");
struct s[][];
s[123][456].mem1 = echo("s[123][456].mem1");
s[12300][987].mem1 = echo("s[12300][987].mem1");
--
gives:
sur$ find _concurrent
_concurrent
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12/h14
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12/h14/elt-987.-field
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12/h14/elt-987.-field/mem1
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array/h6
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array/h6/elt-456.-field
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array/h6/elt-456.-field/mem1
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h0
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h0/h7
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h0/h7/elt-20000
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h15
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h15/elt-1000
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h24
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h24/elt-49
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/elt-0
sur$
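
Reading the listing back, the h-directories look like the array index written in base 25 (the DIRECTORY_LOAD_FACTOR from the source comment quoted below), low digit first, with the leading digit dropped. This is only inferred from the eight indices in this one run, not from the mapper source, and the layout is deliberately unspecified, so treat the sketch below as a guess that happens to fit. The name bucket_dirs is made up for illustration.

```python
# Guess at the h-directory pattern seen in the listing above.
# Inferred from one run only; not taken from the concurrent mapper source.
LOAD_FACTOR = 25  # value of DIRECTORY_LOAD_FACTOR quoted in this thread


def bucket_dirs(index):
    """Return the h-directories that appear before elt-<index> (hypothetical helper)."""
    dirs = []
    # Peel off base-25 digits, low digit first, until the remainder is small.
    # Whether the real cutoff is >= or > LOAD_FACTOR cannot be told from this run.
    while index >= LOAD_FACTOR:
        dirs.append("h%d" % (index % LOAD_FACTOR))
        index //= LOAD_FACTOR
    return dirs


# The eight indices used in the script above.
for i in (0, 49, 1000, 20000, 123, 456, 12300, 987):
    print(i, "/".join(bucket_dirs(i) + ["elt-%d" % i]))
```

If the guess holds, the path depends only on the index and not on how many elements have been stored, so an element at 1000 lands in h0/h15 even when it is one of the first written. The s listing also shows that arrays of structures are mapped, with one such subtree per array dimension.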
On 3/20/09 11:05 AM, Michael Wilde wrote:
> Cool. I forgot about this. I can certainly try it for perf measurements.
>
> A mapper-like external script can walk the output tree of the concurrent
> mapper and create links with the desired filenames. I'll try that.
>
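
A rough sketch of such a link-making script, in Python. The name patterns (NAME-UUID--array, hNN, elt-N with an optional suffix, then the member name) are copied from a single observed _concurrent listing and are not a documented format, so they may change. The flat naming scheme and the names flat_name and link_tree are made up for illustration.

```python
import os
import re

# Assumed name patterns, taken from one observed _concurrent listing.
VAR_RE = re.compile(r"^(\w+?)-[0-9a-f-]{36}--array$")  # e.g. s-<uuid>--array
ELT_RE = re.compile(r"^elt-(\d+)")                     # e.g. elt-987.-field, elt-49
HASH_RE = re.compile(r"^h\d+$")                        # e.g. h0, h17


def flat_name(rel_path):
    """Turn a path relative to _concurrent into a flat name like s_12300_987_mem1."""
    parts = []
    for comp in rel_path.replace(os.sep, "/").split("/"):
        if HASH_RE.match(comp):
            continue  # hash bucket directory (a member named like h1 would be lost too)
        m = VAR_RE.match(comp)
        if m:
            parts.append(m.group(1))  # variable name
            continue
        m = ELT_RE.match(comp)
        if m:
            parts.append(m.group(1))  # array index
            continue
        parts.append(comp)  # anything else, such as a struct member name
    return "_".join(parts)


def link_tree(root, dest):
    """Symlink every file under root into dest under its flat name."""
    os.makedirs(dest, exist_ok=True)
    made = []
    for dirpath, _dirs, files in os.walk(root):
        for f in files:
            src = os.path.join(dirpath, f)
            name = flat_name(os.path.relpath(src, root))
            # os.symlink raises FileExistsError if the link name is already taken.
            os.symlink(os.path.abspath(src), os.path.join(dest, name))
            made.append(name)
    return sorted(made)
```

Calling link_tree("_concurrent", "linked") after a run would leave one symlink per mapped file in linked/, assuming the destination is outside the tree being walked.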
> Some questions (but I'll find this out in a moment when I try it):
>
> - does this behavior start as soon as the directory hits 50 elements?
> - if the first element I store is, say 1000, is it triggered?
> - does it map arrays of structures?
>
> I'm assuming yes to all these, and will experiment. Thanks!
>
> On 3/20/09 4:15 AM, Ben Clifford wrote:
>> On Fri, 20 Mar 2009, Michael Wilde wrote:
>>
>>> This script was working fine, and I was trying to improve its performance
>>> on large datasets on the BG/P by using an ext mapper instead of
>>> simple_mapper to map a large 2D array of structures (OOPSOut result) so
>>> that it spreads across multiple directories (to avoid the GPFS locking
>>> issue). (I have 4K cores writing 14,000 files to one directory, as I see
>>> no way with simple mapper to use the array index to, for example, insert
>>> more directory entries in the prefix or suffix.)
>>
>> The concurrent mapper (used to map variables that have no explicitly
>> declared mapper) makes a tree of directories for large arrays.
>>
>> This is done in a deliberately unspecified manner, but at present
>> (according to the source):
>>
>>   /** determines how many directories and element files are permitted
>>       in each directory. There will be no more than
>>       DIRECTORY_LOAD_FACTOR element files and no more than
>>       DIRECTORY_LOAD_FACTOR directories, so there could be up to
>>       2 * DIRECTORY_LOAD_FACTOR elements. */
>>   public final static int DIRECTORY_LOAD_FACTOR=25;
>>
>> so if you declare an array as myarray foo[]; with no mapper, you'll
>> end up with a tree of directories, each directory having no more than
>> 50 entries in it.
>>
>> You lose the ability to specify the filenames at all here, which may
>> or may not be a problem for your applications.
>>
>> It might be that this functionality could be made more general so
>> that you can (for example) specify some option to the simple mapper to
>> get automatically-computed hierarchical directories. It seems like a
>> common enough use case.
>>