Use of MPI derived types in the Flexible API

Rob Latham robl at mcs.anl.gov
Wed Sep 24 20:14:15 CDT 2014



On 09/24/2014 07:44 PM, Wei-keng Liao wrote:
>
> If the data is contiguous in memory, then there is no need to use varm or flexible APIs.
>
> There is a new set of APIs named varn (available in PnetCDF version 1.4.0 and later), eg.
>      ncmpi_put_varn_float_all()
> It allows a single API call to write a contiguous buffer to a set of noncontiguous places in file.
> Each noncontiguous place is specified by a (start, count) pair. The start-count pairs may
> refer to arbitrary file offsets (i.e., they need not be sorted by offset).
> Please note this API family is blocking. There is no nonblocking counterpart.
>
> In terms of performance, this call is equivalent to making multiple iput or bput calls.
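
(A minimal sketch of the varn usage described above, assuming a 1-D float
variable; ncid, varid, and the offset values are hypothetical:)

    /* write one contiguous 8-element buffer to two noncontiguous
     * regions of a 1-D float variable with a single collective call */
    #include <pnetcdf.h>

    #define NUM_REQS 2
    MPI_Offset start_vals[NUM_REQS] = {100, 10};  /* unsorted offsets are fine */
    MPI_Offset count_vals[NUM_REQS] = {3, 5};
    MPI_Offset *starts[NUM_REQS], *counts[NUM_REQS];
    float buf[8];                                 /* 3 + 5 values, contiguous */
    int i, err;

    for (i = 0; i < NUM_REQS; i++) {
        starts[i] = &start_vals[i];   /* one offset per dimension (ndims = 1) */
        counts[i] = &count_vals[i];
    }
    err = ncmpi_put_varn_float_all(ncid, varid, NUM_REQS, starts, counts, buf);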

What is the aggregate in-file layout like?  If, taken as a whole, the 
processes need to read/write all of the data, then you'll probably be 
fine.  If the data is sparse, then we'll probably need to look at some 
MPI-IO tuning, like disabling data sieving in two-phase I/O.
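
(Should that tuning become necessary, hints are normally passed to the
underlying MPI-IO library through the MPI_Info argument of
ncmpi_create/ncmpi_open; a sketch, assuming the ROMIO hint names apply
to your MPI implementation:)

    /* disable ROMIO data sieving for reads and writes at file creation
     * time; hint names assume a ROMIO-based MPI-IO library */
    #include <mpi.h>
    #include <pnetcdf.h>

    MPI_Info info;
    int ncid, err;

    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_ds_write", "disable");
    MPI_Info_set(info, "romio_ds_read",  "disable");

    err = ncmpi_create(MPI_COMM_WORLD, "out.nc", NC_CLOBBER, info, &ncid);
    MPI_Info_free(&info);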

==rob

> Wei-keng
>
> On Sep 24, 2014, at 6:58 PM, Jim Edwards wrote:
>
>> Data is contiguous in memory, but data on a given task maps to various noncontiguous points in the file. I can guarantee that the data in memory on a given MPI task is in monotonically increasing order with respect to offsets into the file, but not more than that.
>>
>> On Wed, Sep 24, 2014 at 3:43 PM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
>> Hi, Jim
>>
>> Do you mean the local I/O buffer contains a list of non-contiguous data in memory?
>> Or do you mean "distributed" as data is partitioned across multiple MPI processes?
>>
>> The varm APIs and the "flexible" APIs that take an MPI derived datatype argument
>> are for users to describe non-contiguous data in the local I/O buffer. The imap
>> and MPI datatype arguments have no effect on the data access in files. So, I need
>> to know which case you are referring to first.
>>
>> Thanks for pointing out the error in the user guide. It is fixed.
>>
>> Wei-keng
>>
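
(To illustrate the point about the flexible APIs: a sketch in which an MPI
derived datatype describes a strided layout in memory, while start/count
alone determine the file access. The 1-D variable, the sizes, and
ncid/varid are hypothetical.)

    /* write 4 floats taken from every other element of an 8-element
     * buffer into the contiguous file region [start, start+count) */
    #include <mpi.h>
    #include <pnetcdf.h>

    float buf[8];
    MPI_Datatype strided;
    MPI_Offset start[1] = {0}, count[1] = {4};
    int err;

    MPI_Type_vector(4, 1, 2, MPI_FLOAT, &strided);  /* every other element */
    MPI_Type_commit(&strided);

    err = ncmpi_put_vara_all(ncid, varid, start, count, buf, 1, strided);
    MPI_Type_free(&strided);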
>> On Sep 24, 2014, at 2:30 PM, Jim Edwards wrote:
>>
>>> I want to write a distributed variable to a file and the way the
>>> data is distributed is fairly random with respect to the ordering on the file.
>>>
>>> It seems like I can do several things from each task in order to write the data -
>>>
>>>        • I can specify several blocks using start and count and make multiple calls on each task to ncmpi_bput_vara_all
>>>        • I can define an MPI derived type and make a single call to ncmpi_bput_var_all on each task
>>>        • I (think I) can use ncmpi_bput_varm_all and specify an imap (btw: the PnetCDF user's guide has this interface wrong)
>>> Are any of these better from a performance standpoint?
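
(A sketch of the first option above: several (start, count) blocks posted as
buffered nonblocking puts and flushed with one collective wait. The 1-D
variable, the offsets, and ncid/varid are hypothetical; note that the
buffered nonblocking routines themselves carry no _all suffix, the
collective step being the ncmpi_wait_all.)

    /* attach a buffer, post one bput per (start, count) block,
     * then flush everything with a single collective wait */
    #include <pnetcdf.h>

    #define NBLOCKS 2
    int reqs[NBLOCKS], stats[NBLOCKS], err, i;
    MPI_Offset start[NBLOCKS] = {10, 100}, count[NBLOCKS] = {5, 3};
    float block0[5], block1[3];
    float *bufs[NBLOCKS] = {block0, block1};

    err = ncmpi_buffer_attach(ncid, 8 * sizeof(float)); /* room for all pending data */
    for (i = 0; i < NBLOCKS; i++)
        err = ncmpi_bput_vara_float(ncid, varid, &start[i], &count[i],
                                    bufs[i], &reqs[i]);
    err = ncmpi_wait_all(ncid, NBLOCKS, reqs, stats);   /* collective flush */
    err = ncmpi_buffer_detach(ncid);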
>>>
>>> Thanks,
>>>
>>>
>>>
>>>
>>> --
>>> Jim Edwards
>>>
>>> CESM Software Engineer
>>> National Center for Atmospheric Research
>>> Boulder, CO
>>
>>
>>
>>
>> --
>> Jim Edwards
>>
>> CESM Software Engineer
>> National Center for Atmospheric Research
>> Boulder, CO
>

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

