collective write with 1 dimension being global
Wei-keng Liao
wkliao at ece.northwestern.edu
Thu Mar 17 17:55:51 CDT 2011
Hi, Mark,
Based on your I/O description, I wrote this simple program.
The first half creates a file, writes a 4D array, and closes the file.
The second half opens the file and reads it back using the same partitioning setting.
Please let us know if this is similar to your I/O requests.
I tested it and the data seems OK.
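Since the attached program was scrubbed from the archive (it is only reachable through the link below), here is a minimal illustrative sketch of the kind of test described above. It is not the actual 4d.f90; the file name, variable names, and sizes are made up. It splits the first dimension across ranks, keeps the fourth dimension global on every rank, writes the variable collectively, then reopens the file and reads it back with the same partitioning. Error checking is omitted for brevity.

    ! Illustrative sketch (not the actual 4d.f90): write and read back a
    ! 4-D variable whose fourth dimension is kept global on every rank.
    program sketch_4d
        use mpi
        implicit none
        include 'pnetcdf.inc'

        integer, parameter :: nx = 8, ny = 4, nz = 4, nv = 3
        integer :: err, rank, nprocs, ncid, varid, dimid(4), nx_local
        integer(kind=MPI_OFFSET_KIND) :: start(4), count(4), dlen(4)
        real(kind=8), allocatable :: wbuf(:,:,:,:), rbuf(:,:,:,:)

        call MPI_Init(err)
        call MPI_Comm_rank(MPI_COMM_WORLD, rank, err)
        call MPI_Comm_size(MPI_COMM_WORLD, nprocs, err)

        nx_local = nx / nprocs                    ! assume nprocs divides nx
        allocate( wbuf(nx_local,ny,nz,nv), rbuf(nx_local,ny,nz,nv) )
        wbuf = dble(rank)

        start = (/ rank*nx_local + 1, 1, 1, 1 /)  ! 1-based starts
        count = (/ nx_local, ny, nz, nv /)        ! fourth count = full nv

        ! first half: create the file, define the 4-D variable, write it
        err = nfmpi_create( MPI_COMM_WORLD, "4d_sketch.nc", nf_clobber, &
                            MPI_INFO_NULL, ncid )
        dlen = (/ nx, ny, nz, nv /)
        err = nfmpi_def_dim( ncid, "x", dlen(1), dimid(1) )
        err = nfmpi_def_dim( ncid, "y", dlen(2), dimid(2) )
        err = nfmpi_def_dim( ncid, "z", dlen(3), dimid(3) )
        err = nfmpi_def_dim( ncid, "v", dlen(4), dimid(4) )
        err = nfmpi_def_var( ncid, "var4d", nf_double, 4, dimid, varid )
        err = nfmpi_enddef( ncid )
        err = nfmpi_put_vara_double_all( ncid, varid, start, count, wbuf )
        err = nfmpi_close( ncid )

        ! second half: reopen and read back with the same partitioning
        err = nfmpi_open( MPI_COMM_WORLD, "4d_sketch.nc", nf_nowrite, &
                          MPI_INFO_NULL, ncid )
        err = nfmpi_inq_varid( ncid, "var4d", varid )
        err = nfmpi_get_vara_double_all( ncid, varid, start, count, rbuf )
        err = nfmpi_close( ncid )

        if ( any(rbuf /= wbuf) ) print *, "rank", rank, ": data mismatch"

        deallocate( wbuf, rbuf )
        call MPI_Finalize(err)
    end program sketch_4d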
[Attachment: 4d.f90 (application/octet-stream, 2590 bytes)
URL: <http://lists.mcs.anl.gov/pipermail/parallel-netcdf/attachments/20110317/74d90fa6/attachment.obj>]
Wei-keng
On Mar 17, 2011, at 4:22 PM, Rob Latham wrote:
> OK, I'm having a hard time mentally visualizing 4D, so let me make
> sure I have a good understanding of the 3D version of this problem:
>
> - Face-wise decomposition should work fine.
> - Splitting up the big 3D cube into N smaller cubes should work fine
> (at least, that's a workload we've seen many times: there would be a
> lot of bug reports if it did not).
>
> - The problem, though, is when one dimension is the same for all
> processors. In 3D space, that would mean... that all the sub-cubes end
> up jammed against one face?
>
> If there's an (offset, count) tuple that's the same for every process,
> then I guess that means the decomposition overlaps. For writes,
> overlapping decompositions result in undefined behavior. For reads,
> overlapping decompositions should just get sorted out in the MPI-IO
> layer.
>
> If that's the crux of your problem, I can verify with a test case.
> Let me know if I understand your application correctly.
>
> ==rob
>
> On Thu, Mar 10, 2011 at 05:08:39PM +0300, Nicholas K Allsopp wrote:
>> Hi Rob,
>>
>> Below is the section of code which Mark is describing.
>>
>> Thanks
>> Nick
>>
>> use param, only: f_now, nv
>> use comms, only: die
>> implicit none
>>
>> integer :: status, ncid, varID
>> integer(kind=MPI_OFFSET_KIND) :: count(4), offset(4), tmp(1)
>> real(kind=8) :: tmp2(1)
>> real(kind=8), dimension(:,:,:,:), allocatable :: val
>> logical :: here=.false.
>>
>> status = nfmpi_open( cart_comm, "restart.nc", nf_nowrite, &
>> MPI_INFO_NULL, ncid )
>>
>> status = nfmpi_inq_dimlen( ncid, 1, tmp(1) )
>>
>> ! Read in the initial model time
>> !------------------------------------------------------------------
>> status = nfmpi_get_att_double( ncid, nf_global, "Model_Time", &
>> tmp2(1) )
>> model_time = tmp2(1)
>>
>> ! Read in the initial ion distribution field
>> !------------------------------------------------------------------
>> count = (/nx_local,ny_local,nz_local,nv/)
>> offset(1) = global_start(1)
>> offset(2) = global_start(2)
>> offset(3) = global_start(3)
>> offset(4) = 1
>>
>> allocate( val(nx_local,ny_local,nz_local,nv) )
>>
>> status = nfmpi_inq_varid( ncid, "Ion_Distribution", varID )
>> status = nfmpi_get_vara_double_all( ncid, varID, offset, count, val )
>> f_now = 0.0d0
>> f_now( 1:nx_local,1:ny_local,1:nz_local,1:nv ) = val
>> deallocate( val )
>>
>> status = nfmpi_close( ncid )
>> return
>>
>>
>>
>> On 3/10/11 5:00 PM, "Mark P Cheeseman" <mark.cheeseman at kaust.edu.sa> wrote:
>>
>>> Hi Nick,
>>>
>>> Could you please put together a code snippet of the read_restart
>>> subroutine in the io.f90 source file for Rob? I do not currently have
>>> access to the KSL_Drift source (I purposely did not bring my laptop,
>>> to keep myself from doing work).
>>>
>>> Thanks,
>>> Mark
>>>
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Rob Latham <robl at mcs.anl.gov>
>>> Date: Wednesday, March 9, 2011
>>> Subject: collective write with 1 dimension being global
>>> To: Mark Cheeseman <mark.cheeseman at kaust.edu.sa>
>>> Cc: parallel-netcdf at mcs.anl.gov
>>>
>>>
>>> On Sun, Mar 06, 2011 at 01:47:27PM +0300, Mark Cheeseman wrote:
>>>> Hello,
>>>>
>>>> I have a 4D variable inside a NetCDF file that I wish to distribute over a
>>>> number of MPI tasks. The variable will be decomposed over the first 3
>>>> dimensions but not the fourth (i.e. the fourth dimension is kept global for
>>>> all MPI tasks). In other words:
>>>>
>>>> GLOBAL_FIELD[nx,ny,nz,nv] ==>
>>>> LOCAL_FIELD[nx_local,ny_local,nz_local,nv]
>>>>
>>>> I am trying to achieve this via an nfmpi_get_vara_double_all call, but the
>>>> data keeps getting corrupted. I am sure that my offsets and local domain
>>>> sizes are correct. If I modify my code to read only a single 3D slice (i.e.
>>>> along one point in the fourth dimension), the code works and the input data are correct.
>>>>
>>>> Can parallel-netcdf handle a local dimension being equal to a global
>>>> dimension? Or should I be using another call?
>>>
>>> Hi: sorry for the delay. Several of us are on travel this week.
>>>
>>> I think what you are trying to do is legal.
>>>
>>> Do you have a test case you could share? Does writing exhibit the
>>> same bug? Does the C interface (either reading or writing)?
>>>
>>> ==rob
>>>
>>> --
>>> Rob Latham
>>> Mathematics and Computer Science Division
>>> Argonne National Lab, IL USA
>>>
>>>
>>
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>
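As a companion to the discussion above: a minimal sketch (a hypothetical helper, not code from the thread) of how per-rank (offset, count) tuples are typically computed for a 3-D block decomposition that keeps the fourth dimension global. The fourth pair is identical on every rank, yet the subarrays are disjoint because the first three pairs differ from rank to rank.

    ! Hypothetical helper: tile dimensions 1-3 over a 3-D process grid
    ! and keep dimension 4 global; the resulting subarrays do not overlap.
    subroutine block_decomp( comm, nx, ny, nz, nv, offset, count )
        use mpi
        implicit none
        integer, intent(in) :: comm, nx, ny, nz, nv
        integer(kind=MPI_OFFSET_KIND), intent(out) :: offset(4), count(4)
        integer :: err, nprocs, rank, cart, i
        integer :: dims(3), coords(3), glob(3)
        logical :: periods(3)

        call MPI_Comm_size( comm, nprocs, err )
        dims = 0
        call MPI_Dims_create( nprocs, 3, dims, err )   ! pick a 3-D grid
        periods = .false.
        call MPI_Cart_create( comm, 3, dims, periods, .false., cart, err )
        call MPI_Comm_rank( cart, rank, err )
        call MPI_Cart_coords( cart, rank, 3, coords, err )

        glob = (/ nx, ny, nz /)
        do i = 1, 3                            ! assume dims(i) divides glob(i)
            count(i)  = glob(i) / dims(i)
            offset(i) = coords(i) * count(i) + 1   ! 1-based starts
        end do
        count(4)  = nv                         ! fourth dimension kept global:
        offset(4) = 1                          ! same tuple on every rank
        call MPI_Comm_free( cart, err )
    end subroutine block_decomp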