how to do nonblocking collective i/o

Mon Jan 28 09:03:43 CST 2013

On Mon, Jan 28, 2013 at 06:32:54AM +0000, Liu, Jaln wrote:
> Hi,
> 
> I want to test the nonblocking i/o of PnetCDF, is there an
> implementation of non-blocking version's two-phase collective I/O?

You're close. I bet by the time I finish writing this email Wei-keng
will have already responded.

> Here are the codes I wrote:
> 
>         float ** nb_temp_in=malloc(numcalls*sizeof(float *));
>         int * request=calloc(numcalls, sizeof(int));
>         int * status=calloc(numcalls,sizeof(int));
>         int varasize;
>         for(j=0;j<numcalls;j++)
>         {
>           mpi_count[1]=(j>NLVL)?NLVL:j+1;
>           varasize=mpi_count[0]*mpi_count[1]*NLAT*NLON;
>           nb_temp_in[j]=calloc(varasize,sizeof(float));
>           ret = ncmpi_iget_vara(ncid, temp_varid,
>                mpi_start,mpi_count,nb_temp_in[j],
>                varasize,MPI_FLOAT,&(request[j]));
>           if (ret != NC_NOERR) handle_error(ret);
>         }
> 
>         ret = ncmpi_wait_all(ncid, numcalls, request, status);
>         for (j=0; j<numcalls; j++)
>          if (status[j] != NC_NOERR) handle_error(status[j]);
>       }
> 
> I have two questions,
> 1, in the above code, what is right way to parallelize the program?
> by decomposing the for loop " for(j=0;j<numcalls;j++)"?

There is no single "right" way, really. It depends on what the reader
needs. Decomposing over numcalls is definitely one way. Or you can
decompose over 'mpi_start' and 'mpi_count' -- though I personally have
to wrestle with block decomposition for a while before it's correct.
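
Since block decomposition is the part that is easy to get wrong, here
is a minimal sketch of one common scheme. The helper name
block_decompose is hypothetical and not part of PnetCDF; it assumes
the items are split into contiguous blocks and that any remainder goes
to the lowest-numbered ranks. The same arithmetic could plausibly be
used either for the j loop over numcalls or for one dimension of
'mpi_start' and 'mpi_count'.

```c
#include <assert.h>

/* Hypothetical helper (not a PnetCDF call): split n items into nprocs
 * contiguous blocks and report the block owned by `rank`.
 * The first (n % nprocs) ranks each get one extra item. */
void block_decompose(long long n, int nprocs, int rank,
                     long long *start, long long *count)
{
    assert(n >= 0 && nprocs > 0);
    assert(rank >= 0 && rank < nprocs);

    long long base  = n / nprocs;   /* items every rank gets */
    long long extra = n % nprocs;   /* leftover items to spread out */

    /* Ranks below `extra` own one more item than the rest. */
    *count = base + (rank < extra ? 1 : 0);

    /* Offset = full blocks before this rank plus the extra items
     * already handed to lower-numbered ranks. */
    *start = (long long)rank * base + (rank < extra ? rank : extra);
}
```

With the rank and size from MPI_Comm_rank and MPI_Comm_size, the
resulting start and count could be stored in the decomposed dimension
of 'mpi_start' and 'mpi_count', or used as the bounds of the j loop.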

> 2, how to do non-blocking collective I/O? is there a function like
> 'ncmpi_iget_vara_all'?

You already did it.

We've iterated over a few nonblocking-pnetcdf approaches over the
years, but settled on this way: 
- operations are posted independently.
- One can collectively wait for completion with "ncmpi_wait_all", as
  you did.  
- If the nature of the application requires waiting for completion
  locally, "ncmpi_wait" is still there for independent I/O
  completion, though one might not get the best performance that way.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA