program not scalable

Rob Latham robl at mcs.anl.gov
Mon Sep 24 09:59:57 CDT 2012


On Sat, Sep 22, 2012 at 03:31:50PM +0000, Liu, Jaln wrote:
> Hi,
> 
> I found that the problem only happens when requesting small noncontiguous I/O along the slow dimensions.
> For example, given a 4D dataset, Temperature[time][level][lat][lon], suppose we read only one 2D plane per time step and per level, as shown in the following code:
> 
> mpi_count[0]=1;//read length in time dimension
> mpi_count[1]=1;//read length in level dimension
> mpi_count[2]=NLAT;//read length in lat dimension
> mpi_count[3]=NLON;//read length in lon dimension
> mpi_start[2]=0;//start position in lat dimension
> mpi_start[3]=0;//start position in lon dimension
> 
> loop_start=(int)(TIME_STEPS/nprocs)*rank;
> loop_length=(int)(TIME_STEPS/nprocs);
> for (time=loop_start;time< loop_start+loop_length;time++)
>     for(level=0;level<LEVEL_STEPS;level++){
>        mpi_start[0]=time;
>        mpi_start[1]=level;
> 
>        //read 2D plane per time step and per level step
>        ncmpi_get_vara_float_all(ncid, temp_varid, mpi_start, mpi_count, temp_in);
> }
> 
> I have no idea why the program does not scale well. If I have made a
> mistake somewhere, any help is appreciated. Thanks,

Wei-keng has provided some ways you might try to improve performance
of an individual request.  

You might want to additionally consider adapting your code to use the
non-blocking interface.  Through that interface, you would build up a
list of requests, then tell pnetcdf "go take care of these".  

At the end of your for loops you would have posted
N=(TIME_STEPS*LEVEL_STEPS/nprocs) requests instead of making N
blocking calls.  Combined, those N requests cover, in the aggregate,
all of that process's share of the 'time' and 'level' dimensions.
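
A rough sketch of how the loop above might be restructured, not a
tested drop-in.  The helper names (planes_per_rank, plane_start,
read_planes_nonblocking) and the USE_PNETCDF guard are made up for
illustration; only ncmpi_iget_vara_float and ncmpi_wait_all are actual
pnetcdf calls.  One change from the blocking version: every posted
request needs its own destination buffer until the wait returns, so
the single temp_in plane becomes a buffer of N planes.

```c
#include <stddef.h>

#ifdef USE_PNETCDF
#include <stdlib.h>
#include <mpi.h>
#include <pnetcdf.h>
#else
/* Stand-in so the planning helpers compile without an MPI install. */
typedef long long MPI_Offset;
#endif

/* Number of (time, level) planes one rank reads: the N discussed above.
 * Like the original loop, integer division drops any remainder, so this
 * assumes nprocs divides time_steps evenly. */
int planes_per_rank(int time_steps, int level_steps, int nprocs)
{
    return (time_steps / nprocs) * level_steps;
}

/* Fill start[] for the i-th request of this rank (0 <= i < N),
 * visiting planes in the same order as the original nested loops. */
void plane_start(int i, int rank, int time_steps, int level_steps,
                 int nprocs, MPI_Offset start[4])
{
    int loop_start = (time_steps / nprocs) * rank;
    start[0] = loop_start + i / level_steps;  /* time  */
    start[1] = i % level_steps;               /* level */
    start[2] = 0;                             /* lat   */
    start[3] = 0;                             /* lon   */
}

#ifdef USE_PNETCDF
/* Post one non-blocking read per plane, then let pnetcdf service all of
 * them in a single collective wait.  buf must hold N*nlat*nlon floats. */
int read_planes_nonblocking(int ncid, int varid, int rank, int nprocs,
                            int time_steps, int level_steps,
                            MPI_Offset nlat, MPI_Offset nlon, float *buf)
{
    int n = planes_per_rank(time_steps, level_steps, nprocs);
    size_t plane = (size_t)nlat * (size_t)nlon;
    MPI_Offset count[4] = {1, 1, nlat, nlon};
    int *reqs  = malloc((n > 0 ? n : 1) * sizeof *reqs);
    int *stats = malloc((n > 0 ? n : 1) * sizeof *stats);
    int posted = 0, err = NC_NOERR, i;

    if (reqs == NULL || stats == NULL) {  /* treated as fatal by the caller */
        free(reqs);
        free(stats);
        return NC_ENOMEM;
    }

    for (i = 0; i < n; i++) {
        MPI_Offset start[4];
        plane_start(i, rank, time_steps, level_steps, nprocs, start);
        err = ncmpi_iget_vara_float(ncid, varid, start, count,
                                    buf + (size_t)i * plane, &reqs[posted]);
        if (err != NC_NOERR)
            break;
        posted++;
    }

    /* Collective: every rank calls this, even if posting stopped early. */
    int werr = ncmpi_wait_all(ncid, posted, reqs, stats);
    if (err == NC_NOERR)
        err = werr;
    for (i = 0; i < posted && err == NC_NOERR; i++)
        err = stats[i];

    free(reqs);
    free(stats);
    return err;
}
#endif
```

Built with mpicc and -DUSE_PNETCDF, read_planes_nonblocking would
replace the two nested loops; without the flag only the two planning
helpers are compiled, which is enough to check the request layout.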

We probably need more documentation, but perhaps this is a start?

https://trac.mcs.anl.gov/projects/parallel-netcdf/wiki/CombiningOperations
https://trac.mcs.anl.gov/projects/parallel-netcdf/wiki/QuickTutorial#Non-blockinginterface

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

