PnetCDF: MPI error with large number of processes

Tue Jul 12 16:11:38 CDT 2022

Using NFS for parallel I/O is not recommended. NFS's aggressive
client-side caching can cause various kinds of errors, and
performance is often poor as well.
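
To check which file system a run directory sits on, something like the following may help. It is only a sketch and assumes GNU coreutils `stat` is available; the `fs_type` helper name is made up for illustration. On an NFS mount it typically reports `nfs`.

```shell
# Hypothetical helper: print the file system type of a directory.
# Assumes GNU coreutils stat (-f: file system info, -c %T: type name).
fs_type() {
    stat -f -c %T "${1:-.}"
}

# Report the type for the current directory (e.g. the WRF run directory).
fs_type .
```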

Wei-keng

On Jul 12, 2022, at 3:41 AM, Lukas Umek <lukas.umek at gmail.com> wrote:

Hello Wei-keng,
Thanks for pointing this out! At the moment we are running WRF on an NFS file system, but we are planning to try out Lustre soon.
I did set ROMIO_TUNEGATHER=0 (via export), but this had no obvious effect.

Best,
Lukas

On Thu, Jul 7, 2022 at 17:37, Wei-Keng Liao <wkliao at northwestern.edu> wrote:
Hi, Lukas

The error message points to MPI_Allgather, but PnetCDF does not call MPI_Allgather
internally. The call most likely came from the MPI-IO library. What file system
are you using when running WRF? Knowing that can help narrow down the relevant
source code in MPI-IO. In the meantime, can you try setting the following
environment variable?

export ROMIO_TUNEGATHER=0
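
An exported variable only helps if it reaches every MPI rank. A minimal sketch for checking that it is at least visible to child processes of the current shell; the `mpirun -genv` line and the `./wrf.exe` name in the comment are assumptions about a Hydra-based launcher setup, not something confirmed in this thread.

```shell
export ROMIO_TUNEGATHER=0

# Confirm the variable is exported: a child shell should see the value.
sh -c 'echo "ROMIO_TUNEGATHER=${ROMIO_TUNEGATHER:-unset}"'

# Assumption: with a Hydra-based mpirun the variable can also be passed
# explicitly to all ranks, for example:
#   mpirun -genv ROMIO_TUNEGATHER 0 -n 4200 ./wrf.exe
```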

Wei-keng

> On Jul 7, 2022, at 7:17 AM, Lukas Umek <lukas.umek at gmail.com> wrote:
>
> Hi,
> I am using PnetCDF v1.12.2 to read and write large netCDF files (64-bit offset and CDF-5 formats, > 10 GB per file) with the WRF model. This works fine up to a certain number of MPI processes: a run on 4080 MPI processes works, but a job with 4200 MPI processes fails during I/O. An example of the error message I get is below:
>
> Invalid error code (-1) (error ring index 127 invalid)
> INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in MPIDI_NM_mpi_allgather:202
> Abort(873534479) on node 1450 (rank 1450 in comm 0): Fatal error in PMPI_Allgather: Other MPI error, error stack:
> PMPI_Allgather(401)..........................: MPI_Allgather(sbuf=0x7ffc94b87a48, scount=1, MPI_LONG_LONG_INT, rbuf=0xd1bba70, rcount=1, datatype=MPI_LONG_LONG_INT, comm=comm=0xc400001a) failed
> MPIDI_Allgather_intra_composition_alpha(1844):
> MPIDI_NM_mpi_allgather(202)..................:
>
> This happens with Intel MPI 2019.9 and 2021.2. When I use mvapich2-2.3.5,
> I am able to write files with PnetCDF using more MPI processes (e.g. I tried up to 5760 MPI processes and that worked). However, performance is much worse with MVAPICH2, so this is not really an option (the time to write to disk more than triples compared to jobs using Intel MPI with the same core count and data).
>
> My problem sounds similar to some threads I found online:
> - https://lists.mcs.anl.gov/pipermail/parallel-netcdf/2013-August/001519.html
> - https://lists.mcs.anl.gov/pipermail/parallel-netcdf/2010-October/001143.html
> (Setting MPI_TYPE_MAX, as suggested in the second post, did not help with my problem.)
>
> Is anybody aware of any limitations that Intel MPI imposes when used with PnetCDF?
>
> cheers,
> Lukas
