collective write with 1 dimension being global
Wei-keng Liao
wkliao at ece.northwestern.edu
Thu Mar 17 17:55:51 CDT 2011
Hi, Mark,
Based on your I/O description, I wrote this simple program.
The first half creates a file, writes a 4D array, and closes the file.
The second half opens the file and reads it back using the same partitioning setting.
Please let us know if this is similar to your I/O requests.
I tested it and the data seems OK.
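Since the attached program was scrubbed from the archive (it is only reachable through the link below), here is a minimal illustrative sketch of the kind of test described above. It is not the actual 4d.f90; the file name, variable names, and sizes are made up. It splits the first dimension across ranks, keeps the fourth dimension global on every rank, writes the variable collectively, then reopens the file and reads it back with the same partitioning. Error checking is omitted for brevity.

    ! Illustrative sketch (not the actual 4d.f90): write and read back a
    ! 4-D variable whose fourth dimension is kept global on every rank.
    program sketch_4d
        use mpi
        implicit none
        include 'pnetcdf.inc'

        integer, parameter :: nx = 8, ny = 4, nz = 4, nv = 3
        integer :: err, rank, nprocs, ncid, varid, dimid(4), nx_local
        integer(kind=MPI_OFFSET_KIND) :: start(4), count(4), dlen(4)
        real(kind=8), allocatable :: wbuf(:,:,:,:), rbuf(:,:,:,:)

        call MPI_Init(err)
        call MPI_Comm_rank(MPI_COMM_WORLD, rank, err)
        call MPI_Comm_size(MPI_COMM_WORLD, nprocs, err)

        nx_local = nx / nprocs                    ! assume nprocs divides nx
        allocate( wbuf(nx_local,ny,nz,nv), rbuf(nx_local,ny,nz,nv) )
        wbuf = dble(rank)

        start = (/ rank*nx_local + 1, 1, 1, 1 /)  ! 1-based starts
        count = (/ nx_local, ny, nz, nv /)        ! fourth count = full nv

        ! first half: create the file, define the 4-D variable, write it
        err = nfmpi_create( MPI_COMM_WORLD, "4d_sketch.nc", nf_clobber, &
                            MPI_INFO_NULL, ncid )
        dlen = (/ nx, ny, nz, nv /)
        err = nfmpi_def_dim( ncid, "x", dlen(1), dimid(1) )
        err = nfmpi_def_dim( ncid, "y", dlen(2), dimid(2) )
        err = nfmpi_def_dim( ncid, "z", dlen(3), dimid(3) )
        err = nfmpi_def_dim( ncid, "v", dlen(4), dimid(4) )
        err = nfmpi_def_var( ncid, "var4d", nf_double, 4, dimid, varid )
        err = nfmpi_enddef( ncid )
        err = nfmpi_put_vara_double_all( ncid, varid, start, count, wbuf )
        err = nfmpi_close( ncid )

        ! second half: reopen and read back with the same partitioning
        err = nfmpi_open( MPI_COMM_WORLD, "4d_sketch.nc", nf_nowrite, &
                          MPI_INFO_NULL, ncid )
        err = nfmpi_inq_varid( ncid, "var4d", varid )
        err = nfmpi_get_vara_double_all( ncid, varid, start, count, rbuf )
        err = nfmpi_close( ncid )

        if ( any(rbuf /= wbuf) ) print *, "rank", rank, ": data mismatch"

        deallocate( wbuf, rbuf )
        call MPI_Finalize(err)
    end program sketch_4d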
[Attachment: 4d.f90 (application/octet-stream, 2590 bytes)
URL: <http://lists.mcs.anl.gov/pipermail/parallel-netcdf/attachments/20110317/74d90fa6/attachment.obj>]
Wei-keng
On Mar 17, 2011, at 4:22 PM, Rob Latham wrote:
> OK, I'm having a hard time mentally visualizing 4D, so let me make
> sure I have a good understanding of the 3D version of this problem:
>
> - Face-wise decomposition should work fine.
> - Splitting up the big 3D cube into N smaller cubes should work fine
> (at least, that's a workload we've seen many times: there would be a
> lot of bug reports if it did not).
>
> - The problem, though, is when one dimension is the same for all
> processors. In 3D space, that would mean... that all the sub-cubes end
> up jammed against one face?
>
> If there's an (offset, count) tuple that's the same for every process,
> then I guess that means the decomposition overlaps. For writes,
> overlapping decompositions result in undefined behavior. For reads,
> overlapping decompositions should just get sorted out in the MPI-IO
> layer.
>
> If that's the crux of your problem, I can verify with a test case.
> Let me know if I understand your application correctly.
>
> ==rob
>
> On Thu, Mar 10, 2011 at 05:08:39PM +0300, Nicholas K Allsopp wrote:
>> Hi Rob,
>>
>> Below is the section of code which Mark is describing.
>>
>> Thanks
>> Nick
>>
>> use param, only: f_now, nv
>> use comms, only: die
>> implicit none
>>
>> integer :: status, ncid, varID
>> integer(kind=MPI_OFFSET_KIND) :: count(4), offset(4), tmp(1)
>> real(kind=8) :: tmp2(1)
>> real(kind=8), dimension(:,:,:,:), allocatable :: val
>> logical :: here=.false.
>>
>> status = nfmpi_open( cart_comm, "restart.nc", nf_nowrite, &
>> MPI_INFO_NULL, ncid )
>>
>> status = nfmpi_inq_dimlen( ncid, 1, tmp(1) )
>>
>> ! Read in the initial model time
>> !------------------------------------------------------------------
>> status = nfmpi_get_att_double( ncid, nf_global, "Model_Time", &
>> tmp2(1) )
>> model_time = tmp2(1)
>>
>> ! Read in the initial ion distribution field
>> !------------------------------------------------------------------
>> count = (/nx_local,ny_local,nz_local,nv/)
>> offset(1) = global_start(1)
>> offset(2) = global_start(2)
>> offset(3) = global_start(3)
>> offset(4) = 1
>>
>> allocate( val(nx_local,ny_local,nz_local,nv) )
>>
>> status = nfmpi_inq_varid( ncid, "Ion_Distribution", varID )
>> status = nfmpi_get_vara_double_all( ncid, varID, offset, count, val )
>> f_now = 0.0d0
>> f_now( 1:nx_local,1:ny_local,1:nz_local,1:nv ) = val
>> deallocate( val )
>>
>> status = nfmpi_close( ncid )
>> return
>>
>>
>>
>> On 3/10/11 5:00 PM, "Mark P Cheeseman" <mark.cheeseman at kaust.edu.sa> wrote:
>>
>>> Hi Nick,
>>>
>>> Could you please put together a code snippet of the read_restart
>>> subroutine in the io.f90 source file for Rob? I do not currently have
>>> access to the KSL_Drift source (I purposely did not bring my laptop,
>>> to keep myself from doing work).
>>>
>>> Thanks,
>>> Mark
>>>
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Rob Latham <robl at mcs.anl.gov>
>>> Date: Wednesday, March 9, 2011
>>> Subject: collective write with 1 dimension being global
>>> To: Mark Cheeseman <mark.cheeseman at kaust.edu.sa>
>>> Cc: parallel-netcdf at mcs.anl.gov
>>>
>>>
>>> On Sun, Mar 06, 2011 at 01:47:27PM +0300, Mark Cheeseman wrote:
>>>> Hello,
>>>>
>>>> I have a 4D variable inside a NetCDF file that I wish to distribute over a
>>>> number of MPI tasks. The variable will be decomposed over the first 3
>>>> dimensions but not the fourth (i.e. the fourth dimension is kept global for
>>>> all MPI tasks). In other words:
>>>>
>>>> GLOBAL_FIELD[nx,ny,nz,nv] ==>
>>>> LOCAL_FIELD[nx_local,ny_local,nz_local,nv]
>>>>
>>>> I am trying to achieve this via an nfmpi_get_vara_double_all call, but the
>>>> data keeps getting corrupted. I am sure that my offsets and local domain
>>>> sizes are correct. If I modify my code to read only a single 3D slice (i.e.
>>>> along one point in the fourth dimension), the code works and the input data are correct.
>>>>
>>>> Can parallel-netcdf handle a local dimension being equal to a global
>>>> dimension? Or should I be using another call?
>>>
>>> Hi: sorry for the delay. Several of us are on travel this week.
>>>
>>> I think what you are trying to do is legal.
>>>
>>> Do you have a test case you could share? Does writing exhibit the
>>> same bug? Does the C interface (either reading or writing)?
>>>
>>> ==rob
>>>
>>> --
>>> Rob Latham
>>> Mathematics and Computer Science Division
>>> Argonne National Lab, IL USA
>>>
>>>
>>
>
> --
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>
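As a companion to the discussion above: a minimal sketch (a hypothetical helper, not code from the thread) of how per-rank (offset, count) tuples are typically computed for a 3-D block decomposition that keeps the fourth dimension global. The fourth pair is identical on every rank, yet the subarrays are disjoint because the first three pairs differ from rank to rank.

    ! Hypothetical helper: tile dimensions 1-3 over a 3-D process grid
    ! and keep dimension 4 global; the resulting subarrays do not overlap.
    subroutine block_decomp( comm, nx, ny, nz, nv, offset, count )
        use mpi
        implicit none
        integer, intent(in) :: comm, nx, ny, nz, nv
        integer(kind=MPI_OFFSET_KIND), intent(out) :: offset(4), count(4)
        integer :: err, nprocs, rank, cart, i
        integer :: dims(3), coords(3), glob(3)
        logical :: periods(3)

        call MPI_Comm_size( comm, nprocs, err )
        dims = 0
        call MPI_Dims_create( nprocs, 3, dims, err )   ! pick a 3-D grid
        periods = .false.
        call MPI_Cart_create( comm, 3, dims, periods, .false., cart, err )
        call MPI_Comm_rank( cart, rank, err )
        call MPI_Cart_coords( cart, rank, 3, coords, err )

        glob = (/ nx, ny, nz /)
        do i = 1, 3                            ! assume dims(i) divides glob(i)
            count(i)  = glob(i) / dims(i)
            offset(i) = coords(i) * count(i) + 1   ! 1-based starts
        end do
        count(4)  = nv                         ! fourth dimension kept global:
        offset(4) = 1                          ! same tuple on every rank
        call MPI_Comm_free( cart, err )
    end subroutine block_decomp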