[Swift-user] Problems in complex mapping situation
Ben Clifford
benc at hawaga.org.uk
Fri Mar 20 04:15:36 CDT 2009
On Fri, 20 Mar 2009, Michael Wilde wrote:
> This script was working fine, and I was trying to improve its performance on
> large datasets on the BG/P by using an ext mapper instead of simple_mapper to
> map a large 2D array of structures (OOPSOut result) so that it spreads across
> multiple directories (to avoid the GPFS locking issue). (I have 4K cores
> writing 14,000 files to one directory, as I see no way with simple mapper to
> use the array index to, for example, insert more directory entries in the
> prefix or suffix.)
The concurrent mapper (used to map variables that have no explicitly
declared mapper) makes a tree of directories for large arrays.
This is done in a deliberately unspecified manner, but at present
(according to the source):

    /** determines how many directories and element files are permitted
        in each directory. There will be no more than
        DIRECTORY_LOAD_FACTOR element files and no more than
        DIRECTORY_LOAD_FACTOR directories, so there could be up to
        2 * DIRECTORY_LOAD_FACTOR elements. */
    public final static int DIRECTORY_LOAD_FACTOR=25;

So if you declare an array as myarray foo[]; with no mapper, you'll end
up with a tree of directories, each directory having no more than 50
entries in it (at most 25 element files plus at most 25 subdirectories).
You lose the ability to specify the filenames at all here, which may or
may not be a problem for your applications.
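
To make the 25-files-plus-25-subdirectories bound concrete, here is a
minimal sketch of one way an array index could be turned into a path in
such a tree. It is not taken from the Swift source, whose layout is
deliberately unspecified. The class name DirectoryTreeSketch, the method
pathFor, and the "dN" / "prefix-index" naming are all hypothetical; only
the constant's name and value come from the source quoted above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: NOT the concurrent mapper's actual algorithm. */
final class DirectoryTreeSketch {
    // Same name and value as the constant quoted from the Swift source.
    static final int DIRECTORY_LOAD_FACTOR = 25;

    /**
     * Hypothetical layout: directories are numbered like a 25-ary heap.
     * Directory 0 is the root, and directory n has subdirectories
     * n*25+1 .. n*25+25. Element i is stored in directory i / 25, so each
     * directory holds at most 25 files and 25 subdirectories.
     */
    static String pathFor(String prefix, int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative");
        }
        final int f = DIRECTORY_LOAD_FACTOR;
        Deque<String> parts = new ArrayDeque<>();
        parts.addFirst(prefix + "-" + index);   // assumed file naming
        int node = index / f;                   // directory holding this element
        while (node > 0) {
            int slot = (node - 1) % f;          // position under the parent
            parts.addFirst("d" + slot);         // assumed directory naming
            node = (node - 1) / f;              // move up to the parent
        }
        return String.join("/", parts);
    }
}
```

Under these assumptions, indices 0 to 24 land in the top-level directory,
index 25 is the first to go into a subdirectory (d0), and the last of
14,000 elements (index 13999) sits two levels down, so no directory ever
has more than 50 entries.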
It might be that this functionality could be made more general so
that you can (for example) specify some option to the simple mapper to get
automatically-computed hierarchical directories. It seems like a common
enough use case.