pnetcdf performance question

Robert Latham robl at mcs.anl.gov
Tue Jan 13 17:04:14 CST 2009


On Mon, Jan 12, 2009 at 11:35:34AM -0500, Wong.David-C at epamail.epa.gov wrote:
>     The first column indicates the processor configuration since the code 
> is based on domain decomposition. For example 4x2 means 4 processors 
> assigned to the column dimension and 2 to row dimension. The domain size 
> is fixed. The code has read and write but the read portion uses regular 
> netcdf operation and the write portion goes through pnetcdf function. The 
> timing is the wall clock time to finish the execution.

> >        regular    pnetcdf
> > 1x1      40513      39954
> > 2x1      23325      23418
> > 2x2      13021      13198
> > 4x2       9734       8542
> > 4x4       4578       4751
> > 8x4       2810       3342
> > 8x8       1821       2755
> > 12x8      1445       2426
> > 14x8      1372       2599

OK, great, thanks for clarifying.  I have one more question about your
I/O.  Is this "one file per processor" or is it I/O to a single file,
or is it a mix? 

Let me put up a strawman, and you can tell me how it really works:

- rank 0 reads in a netcdf dataset
- rank 0 broadcasts the data to the other N processes
- N processes each write out their own netcdf dataset (serial mode)
OR
- N processes do a parallel-netcdf write to a netcdf dataset (pnetcdf
  mode)

One thing to note is that up to 16 processes, pnetcdf and serial
netcdf perform essentially the same.  I wonder if maybe at larger
processor counts each process is working on smaller and smaller
amounts of I/O?
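
To put numbers on that, here is a small sketch in Python (my own
arithmetic on the table you posted, not part of your code; the helper
names are made up for illustration).  It computes the process count for
each decomposition and the ratio of pnetcdf time to regular netcdf time:

```python
# Wall-clock timings copied from the table quoted above,
# as (regular netcdf, pnetcdf) per processor configuration.
TIMINGS = {
    "1x1": (40513, 39954),
    "2x1": (23325, 23418),
    "2x2": (13021, 13198),
    "4x2": (9734, 8542),
    "4x4": (4578, 4751),
    "8x4": (2810, 3342),
    "8x8": (1821, 2755),
    "12x8": (1445, 2426),
    "14x8": (1372, 2599),
}


def process_count(config):
    """Total processes for a 'COLSxROWS' decomposition, e.g. '4x2' -> 8."""
    cols, rows = (int(n) for n in config.split("x"))
    return cols * rows


def slowdown(config):
    """pnetcdf time divided by regular time (> 1 means pnetcdf is slower)."""
    regular, pnetcdf = TIMINGS[config]
    return pnetcdf / regular


for config in TIMINGS:
    print(f"{config:>5} {process_count(config):4d} procs  "
          f"pnetcdf/regular = {slowdown(config):.2f}")
```

By this measure the ratio stays within about 15% of 1.0 through 4x4 and
climbs to nearly 1.9 at 14x8.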

Do you know how much of this runtime is in the reading stage and how
much is in writing?

Also, what is the file system on this Altix machine?  Since you only
have one or two links to the file system, it's certainly possible that
once more than a few processors do I/O, things slow down as nodes get
starved for I/O resources.  We have some ways to address that,
depending on your I/O access pattern.

thanks
==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B


