[Swift-user] Problems in complex mapping situation

Ben Clifford benc at hawaga.org.uk
Fri Mar 20 04:15:36 CDT 2009


On Fri, 20 Mar 2009, Michael Wilde wrote:

> This script was working fine, and I was trying to improve its performance on
> large datasets on the BG/P by using an ext mapper instead of simple_mapper to
> map a large 2D array of structures (OOPSOut result) so that it spreads across
> multiple directories (to avoid the GPFS locking issue). (I have 4K cores
> writing 14,000 files to one directory, as I see no way with simple mapper to
> use the array index to, for example, insert more directory entries in the
> prefix or suffix.)

The concurrent mapper (used to map variables that have no explicitly 
declared mapper) makes a tree of directories for large arrays.

This is done in a deliberately unspecified manner, but at present
(according to the source):

        /** determines how many directories and element files are permitted
            in each directory. There will be no more than
            DIRECTORY_LOAD_FACTOR element files and no more than
            DIRECTORY_LOAD_FACTOR directories, so there could be up to
            2 * DIRECTORY_LOAD_FACTOR elements. */
        public final static int DIRECTORY_LOAD_FACTOR=25;

so if you declare an array as  myarray foo[]; with no mapper, you'll end 
up with a tree of directories, each directory having no more than 50 
entries in it.

You lose the ability to specify the filenames at all here, which may or 
may not be a problem for your applications.

It might be that this functionality could be made more general so 
that you can (for example) specify some option to the simple mapper to get 
automatically-computed hierarchical directories. It seems like a common 
enough use case.
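
As a rough illustration of what "automatically-computed hierarchical
directories" could look like, here is a minimal Java sketch that turns an
array index into a nested path. It is not the concurrent mapper's actual
algorithm (that layout is deliberately unspecified); the class name, the
pathFor helper and the d<n>/elt-<n> naming are all made up for this
example. Only the value 25 comes from the source quoted above. With this
layout no directory holds more than 25 element files and 25
subdirectories.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class HierarchicalPathSketch {
    // Same value as the constant quoted above; the layout itself is hypothetical.
    static final int DIRECTORY_LOAD_FACTOR = 25;

    /** Returns a relative path for element `index` of an array named `prefix`. */
    static String pathFor(String prefix, int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative");
        }
        // Group elements into buckets of DIRECTORY_LOAD_FACTOR files each.
        int bucket = index / DIRECTORY_LOAD_FACTOR;
        // Write the bucket number in base DIRECTORY_LOAD_FACTOR; each digit
        // becomes one directory level, most significant digit first.
        Deque<String> dirs = new ArrayDeque<>();
        do {
            dirs.addFirst("d" + (bucket % DIRECTORY_LOAD_FACTOR));
            bucket /= DIRECTORY_LOAD_FACTOR;
        } while (bucket > 0);
        return prefix + "/" + String.join("/", dirs) + "/elt-" + index;
    }

    public static void main(String[] args) {
        for (int i : new int[] {0, 24, 25, 624, 625, 13999}) {
            System.out.println(i + " -> " + pathFor("foo", i));
        }
    }
}
```

For a 14,000-element array this never goes deeper than two directory
levels below the prefix, so the files end up spread over several hundred
small directories rather than one large one.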
