error in enddef

Wei-Keng Liao wkliao at northwestern.edu
Tue Jun 21 20:25:00 CDT 2022


Hi, Jim

Is this ncmpi_enddef the first enddef call after file creation,
or does it follow an ncmpi_redef?

In the former case, PnetCDF performs no MPI communication other than
an MPI_Barrier. In the latter case, if the file header size expands,
existing variables need to be moved to higher file offsets, which requires
PnetCDF to call MPI collective reads and writes and thus leads to MPI_Issend.

Can you try to get a core dump so that we can trace the call stack?

You can also enable PnetCDF safe mode, which makes additional MPI
communication calls for debugging purposes. It sometimes helps narrow
down the cause of a problem. To enable it, set the environment
variable PNETCDF_SAFE_MODE to 1.
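
As a minimal sketch, both suggestions can go in the job script before
the MPI launch. PNETCDF_SAFE_MODE is the variable named above. The
"ulimit -c unlimited" line is the usual shell way to permit core files;
whether the MPI library and the system actually write one is an
assumption to verify. The launcher line is only a placeholder comment.

```shell
# Enable PnetCDF safe mode (additional MPI calls made for debugging).
export PNETCDF_SAFE_MODE=1

# Allow core files to be written; this may be refused if the hard limit is lower.
ulimit -c unlimited || echo "could not raise the core file size limit" >&2

# Launch the application as usual after this point, e.g. (placeholder):
#   <mpi launcher> ./your_app
```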

Wei-keng

On Jun 21, 2022, at 5:03 PM, Jim Edwards <jedwards at ucar.edu> wrote:

I am using pnetcdf 1.12.3, compiled with intel/19.1.1 and impi/19.0.9 on the TACC Frontera system, and I am getting an error.
I am getting very little information to guide me in debugging it.

[785] Abort(634628) on node 785 (rank 785 in comm 0): Fatal error in PMPI_Issend: Invalid tag, error stack:
[785] PMPI_Issend(156): MPI_Issend(buf=0x2b5c81edf40f, count=1025120, MPI_BYTE, dest=0, tag=1048814, comm=0xc40000d7, request=0x7f2002783540) failed
[785] PMPI_Issend(95).: Invalid tag, value is 1048814
TACC:  MPI job exited with code: 4
TACC:  Shutdown complete. Exiting.


I can tell that I am in a call to ncmpi_enddef, but I am not getting anything beyond that. Any ideas?

--
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO


