[Swift-user] What system calls do the mappers use?

Michael Wilde wilde at mcs.anl.gov
Wed Mar 20 15:37:53 CDT 2013


The best approach, if you want fine-grained control over how the mapper operates (other than writing your own Java mapper), is to do the mapping in an app(), return an array of mapped strings via readData(), and then map the strings to the file array using array_mapper (from an array of strings) or fixed_array_mapper (from one string).
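
As a rough sketch of what such an app() could run (this is only an illustration, not something from the Swift distribution; the script name list_files.py, the function mapped_paths, and the suffix argument are all made up here), a small Python script can walk a directory and print one path per line, which is a shape readData() can read into an array of strings:

```python
#!/usr/bin/env python3
"""Hypothetical list_files.py: print one matching path per line."""
import os
import sys


def mapped_paths(root, suffix=""):
    """Return the sorted paths under root whose names end with suffix.

    Only names are examined here; no sizes or timestamps are requested.
    Where the filesystem reports entry types in the directory read,
    os.walk can usually avoid a separate stat() per file (an assumption
    worth checking on your own system).
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


if __name__ == "__main__":
    # Usage (hypothetical): list_files.py <root> [suffix]
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    suffix = sys.argv[2] if len(sys.argv) > 2 else ""
    for path in mapped_paths(root, suffix):
        print(path)
```

The app() would capture this script's stdout in a file, and the script side of the workflow would then hand the resulting strings to array_mapper as described above.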

You can also use an ext_mapper to get the degree of control you want.

The difference between traversing a directory structure with find vs. ls is that a simple "find ." with no other filters just reads directories, while most ls calls and find filters need to do a stat() on each file's inode to look at its metadata.
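
To make the distinction concrete, here is a minimal Python sketch (the function names are invented for illustration, and this is not what the Swift mappers themselves do). The first function only reads the directory, much like a bare "find ."; the second issues one extra stat() per entry, much like "ls -l":

```python
import os


def names_only(directory):
    """Read the directory and return its entry names; no per-file stat()."""
    return sorted(os.listdir(directory))


def names_with_sizes(directory):
    """Return {name: size}; this costs one os.stat() call per entry."""
    return {
        name: os.stat(os.path.join(directory, name)).st_size
        for name in os.listdir(directory)
    }
```

With N entries in a directory, the second form makes N additional metadata requests, and on a shared file system each of those is a round trip to the metadata service.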

On a heavily loaded shared file server like GPFS or Lustre, these metadata operations are what typically cause significant overhead and slowdown, and they may be slowed further by locks held by apps doing metadata updates.

- Mike


----- Original Message -----
> From: "Lorenzo Pesce" <lpesce at uchicago.edu>
> To: "Swift User Discussion List" <swift-user at ci.uchicago.edu>
> Sent: Wednesday, March 20, 2013 3:27:05 PM
> Subject: [Swift-user] What system calls do the mappers use?
> 
> Hi --
> 
> I am working with mappers that might be repeated thousands of times
> in each workflow run.
> Lustre doesn't like that type of search when it is based on
> approaches similar to "ls"; on the other hand, "find" works fine.
> 
> I could conceivably find a workaround, but I would rather not have
> to do it.
> 
> Lorenzo
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> 


