Performance when reading many small variables

Schlottke-Lakemper, Michael m.schlottke-lakemper at aia.rwth-aachen.de
Mon Nov 30 23:53:23 CST 2015


Dear all,

We recently converted all of our code to use the Parallel netCDF (PnetCDF) library instead of the netCDF library (before, we had a mix of both), using PnetCDF even for non-parallel file access. We did not have any issues until one user notified us of a performance regression in a particular case.

He is trying to read many (O(2000)) variables from a single file in a loop, each variable holding just one entry. Since this is very old code and usually only a few variables are involved, each process reads the same data individually. Before, the netCDF library was used for this task, and during refactoring it was replaced by PnetCDF with MPI_COMM_SELF. When running the code on a moderate number of MPI ranks (~500), the user noticed a severe performance degradation after switching to PnetCDF:

Before, reading all 2000 variables cumulatively took ~0.6 s. After switching to PnetCDF (using ncmpi_get_vara_int_all), this increased to ~300 s. Switching from MPI_COMM_SELF to MPI_COMM_WORLD reduced it to ~30 s, which is still high in comparison.
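For illustration, the read loop is essentially the following (a minimal sketch, not our actual code: the variable naming scheme is a placeholder, and I assume each variable is 1-D with a single element):

    #include <mpi.h>
    #include <pnetcdf.h>
    #include <stdio.h>

    void read_all_scalars(const char *path, int nvars, int *values)
    {
        int ncid, varid, err, i;
        MPI_Offset start[1] = {0}, count[1] = {1};
        char name[32];

        /* every rank opens the same file independently */
        err = ncmpi_open(MPI_COMM_SELF, path, NC_NOWRITE, MPI_INFO_NULL, &ncid);
        if (err != NC_NOERR) {
            fprintf(stderr, "ncmpi_open: %s\n", ncmpi_strerror(err));
            return;
        }

        for (i = 0; i < nvars; i++) {
            /* "var_%04d" is a placeholder naming scheme */
            snprintf(name, sizeof(name), "var_%04d", i);
            ncmpi_inq_varid(ncid, name, &varid);
            /* one collective read per one-element variable */
            ncmpi_get_vara_int_all(ncid, varid, start, count, &values[i]);
        }

        ncmpi_close(ncid);
    }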

What, if anything, can we do to get similar performance when using PnetCDF in this particular case? I know this is a rather degenerate case, and one possible fix would be to change the file layout to a single variable with 2000 entries (see the sketch below), but I was hoping that someone here has a suggestion for what we could try anyway.
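For reference, that consolidated layout would reduce the loop to a single collective read, roughly like this (again only a sketch; the variable name "all_values" is a placeholder):

    #include <mpi.h>
    #include <pnetcdf.h>

    int read_consolidated(MPI_Comm comm, const char *path, int nvals, int *values)
    {
        int ncid, varid, err;
        MPI_Offset start[1] = {0};
        MPI_Offset count[1];

        count[0] = nvals;

        err = ncmpi_open(comm, path, NC_NOWRITE, MPI_INFO_NULL, &ncid);
        if (err != NC_NOERR) return err;

        /* "all_values" stands for a single 1-D variable of length nvals */
        err = ncmpi_inq_varid(ncid, "all_values", &varid);
        if (err == NC_NOERR)
            /* all nvals entries in one collective read */
            err = ncmpi_get_vara_int_all(ncid, varid, start, count, values);

        ncmpi_close(ncid);
        return err;
    }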

Thanks a lot in advance

Michael


--
Michael Schlottke-Lakemper

SimLab Highly Scalable Fluids & Solids Engineering
Jülich Aachen Research Alliance (JARA-HPC)
RWTH Aachen University
Wüllnerstraße 5a
52062 Aachen
Germany

Phone: +49 (241) 80 95188
Fax: +49 (241) 80 92257
Mail: m.schlottke-lakemper at aia.rwth-aachen.de
Web: http://www.jara.org/jara-hpc
