[Swift-user] Problems in complex mapping situation

Michael Wilde wilde at mcs.anl.gov
Fri Mar 20 11:13:36 CDT 2009


Very nice.  This script:

--

type file;

type struct {
   file mem1;
   file mem2;
}

app (file o) echo (string s) { echo s stdout=@o; }

file a[];
a[0] = echo("0");
a[49] = echo("49");
a[1000] = echo("1000");
a[20000] = echo("20000");

struct s[][];

s[123][456].mem1 = echo("s[123][456].mem1");
s[12300][987].mem1 = echo("s[12300][987].mem1");

--

gives:

sur$ find _concurrent
_concurrent
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12/h14
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12/h14/elt-987.-field
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h0/h17/elt-12300_-array/h12/h14/elt-987.-field/mem1
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array/h6
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array/h6/elt-456.-field
_concurrent/s-3b2cbe1c-4a65-4452-ba62-11becc80935c--array/h23/elt-123_-array/h6/elt-456.-field/mem1
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h0
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h0/h7
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h0/h7/elt-20000
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h15
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h0/h15/elt-1000
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h24
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/h24/elt-49
_concurrent/a-cfcdbe0b-ed93-47fe-865f-5d27efb33f3f--array/elt-0
sur$
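
The h<N> directories in this listing look consistent with a simple base-25 scheme: while the index is at least DIRECTORY_LOAD_FACTOR (25), emit a directory named h<index mod 25> and divide the index by 25. This is inferred only from the listing above, not from the mapper source, and the layout is deliberately unspecified, so treat the sketch below as a guess that happens to reproduce these paths. The function name hashed_dirs is made up for illustration.

```python
DIRECTORY_LOAD_FACTOR = 25  # value quoted from the mapper source


def hashed_dirs(index, factor=DIRECTORY_LOAD_FACTOR):
    """Guess at the chain of h<N> directories for one array index.

    Hypothetical helper: reverse-engineered from the find output,
    not taken from the concurrent mapper itself.
    """
    parts = []
    n = index
    while n >= factor:
        parts.append("h%d" % (n % factor))  # low-order base-25 digit first
        n //= factor
    return parts


# Indices used in the script, for both a[] and the two levels of s[][].
for i in (0, 49, 1000, 20000, 123, 456, 987, 12300):
    print(i, "/".join(hashed_dirs(i)) or "(top level)")
```

If the guess holds, nesting depends on the index value rather than on how many elements have been stored, which matches the listing: a[] has only four elements, yet a[1000] already sits two directories down.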


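The link-making script mentioned in the quoted message below could look roughly like this. It is a minimal sketch: it walks the _concurrent tree, drops the h<N> components, and symlinks each file under a flat name. The "__" separator, the "links" directory, and the function names link_name and make_links are all assumptions for illustration; a real script would apply whatever filenames the application wants.

```python
import os
import re

HASH_DIR = re.compile(r"^h\d+$")  # the h<N> directories seen in the listing


def link_name(rel_path):
    """Build a flat link name by dropping h<N> components (naming is an assumption)."""
    parts = [p for p in rel_path.split(os.sep) if not HASH_DIR.match(p)]
    return "__".join(parts)


def make_links(tree_root, link_dir):
    """Symlink every file under tree_root into link_dir; return the names created."""
    tree_root = os.path.abspath(tree_root)
    os.makedirs(link_dir, exist_ok=True)
    created = []
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            target = os.path.join(dirpath, name)
            rel = os.path.relpath(target, tree_root)
            link = os.path.join(link_dir, link_name(rel))
            if not os.path.lexists(link):  # leave existing links alone
                os.symlink(target, link)
                created.append(os.path.basename(link))
    return sorted(created)


if __name__ == "__main__":
    # Only runs when a _concurrent directory exists in the current directory.
    if os.path.isdir("_concurrent"):
        for name in make_links("_concurrent", "links"):
            print(name)
```

Note that a flat link directory puts every name back into one directory, so this only makes sense if the links are created by a single process after the run rather than by the workers themselves.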

On 3/20/09 11:05 AM, Michael Wilde wrote:
> Cool. I forgot about this. I can certainly try it for perf measurements.
> 
> A mapper-like external script can walk the output tree of the concurrent 
> mapper and create links with the desired filenames.  I'll try that.
> 
> Some questions (but I'll find this out in a moment when I try it):
> 
> - does this behavior start as soon as the directory hits 50 elements?
> - if the first element I store is, say 1000, is it triggered?
> - does it map arrays of structures?
> 
> I'm assuming yes to all these, and will experiment. Thanks!
> 
> On 3/20/09 4:15 AM, Ben Clifford wrote:
>> On Fri, 20 Mar 2009, Michael Wilde wrote:
>>
>>> This script was working fine, and I was trying to improve its performance on
>>> large datasets on the BG/P by using an ext mapper instead of simple_mapper to
>>> map a large 2D array of structures (OOPSOut result) so that it spreads across
>>> multiple directories (to avoid the GPFS locking issue). (I have 4K cores
>>> writing 14,000 files to one directory, as I see no way with simple mapper to
>>> use the array index to, for example, insert more directory entries in the
>>> prefix or suffix.)
>>
>> The concurrent mapper (used to map variables that have no explicitly 
>> declared mapper) makes a tree of directories for large arrays.
>>
>> This is done in a deliberately unspecified manner, but at present 
>> (according to the source):
>>
>>     /** determines how many directories and element files are permitted
>>         in each directory. There will be no more than
>>         DIRECTORY_LOAD_FACTOR element files and no more than
>>         DIRECTORY_LOAD_FACTOR directories, so there could be up to
>>         2 * DIRECTORY_LOAD_FACTOR elements. */
>>     public final static int DIRECTORY_LOAD_FACTOR=25;
>>
>> so if you declare an array as  myarray foo[]; with no mapper, you'll 
>> end up with a tree of directories, each directory having no more than 
>> 50 entries in it.
>>
>> You lose the ability to specify the filenames at all here, which may 
>> or may not be a problem for your applications.
>>
>> It might be that this functionality could be made more general so 
>> that you can (for example) specify some option to the simple mapper to 
>> get automatically-computed hierarchical directories. It seems like a 
>> common enough use case.
>>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user


