error in enddef
Wei-Keng Liao
wkliao at northwestern.edu
Tue Jun 21 20:25:00 CDT 2022
Hi, Jim
Is this ncmpi_enddef call the first enddef after the file creation,
or does it follow an ncmpi_redef?
In the former case, there is no MPI communication in PnetCDF, except
for an MPI_Barrier. In the latter case, if the file header size expands,
existing variables need to be moved to higher offsets, which requires
PnetCDF to call MPI collective reads and writes and thus leads to MPI_Issend.
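To make the second case concrete, here is a rough sketch in plain C of the
bookkeeping involved when a header grows. It is only a toy model and not
PnetCDF's actual code: the var_extent struct and the
relocate_after_header_growth function are hypothetical names, and the model
assumes variables are laid out directly after the header with no alignment
padding or free space.

```c
#include <stddef.h>

/* Hypothetical, simplified record of where one variable lives in the file. */
typedef struct {
    long long begin; /* byte offset of the variable's first data byte */
    long long size;  /* number of data bytes the variable occupies */
} var_extent;

/*
 * Toy model (assumption: no alignment padding or free space after the header).
 * If the header grows from old_hdr_size to new_hdr_size, shift every
 * variable's starting offset by the difference and return the total number
 * of data bytes that would have to be read and rewritten at the new offsets.
 * Returns 0 and changes nothing when the header did not grow.
 */
long long relocate_after_header_growth(var_extent *vars, size_t nvars,
                                       long long old_hdr_size,
                                       long long new_hdr_size)
{
    long long shift = new_hdr_size - old_hdr_size;
    long long moved = 0;

    if (shift <= 0)
        return 0; /* header fits in its old space: nothing moves */

    for (size_t i = 0; i < nvars; i++) {
        vars[i].begin += shift; /* variable now starts at a higher offset */
        moved += vars[i].size;  /* its data must be copied to that offset */
    }
    return moved;
}
```

In this model a nonzero return value corresponds to the situation where data
has to be moved, which is the case in which reads, writes, and their
underlying MPI sends come into play.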
Can you try to get a core dump so we can trace the call stacks?
You can also enable PnetCDF safe mode, which makes additional MPI
communication calls for debugging purposes. Sometimes it helps narrow
down the cause of the problem. It can be enabled by setting the environment
variable PNETCDF_SAFE_MODE to 1.
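Since job launchers do not always forward environment variables to every
rank, it may be worth confirming that the setting actually reached the
application. The helper below is a minimal sketch for that purpose; the
function name safe_mode_requested is made up for illustration and is not
part of PnetCDF. It only inspects the calling process's environment and
does not query PnetCDF itself.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a PnetCDF API): returns 1 if the environment of
 * the calling process has PNETCDF_SAFE_MODE set to exactly "1", else 0.
 */
int safe_mode_requested(void)
{
    const char *val = getenv("PNETCDF_SAFE_MODE");
    return val != NULL && strcmp(val, "1") == 0;
}
```

Calling this early in the program on each rank and printing the result
would show whether the variable was propagated as intended.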
Wei-keng
On Jun 21, 2022, at 5:03 PM, Jim Edwards <jedwards at ucar.edu> wrote:
I am using PnetCDF 1.12.3 and getting an error when compiled with intel/19.1.1 and impi/19.0.9 on the TACC Frontera system.
I am getting very little information to guide me in debugging the error.
[785] Abort(634628) on node 785 (rank 785 in comm 0): Fatal error in PMPI_Issend: Invalid tag, error stack:
[785] PMPI_Issend(156): MPI_Issend(buf=0x2b5c81edf40f, count=1025120, MPI_BYTE, dest=0, tag=1048814, comm=0xc40000d7, request=0x7f2002783540) failed
[785] PMPI_Issend(95).: Invalid tag, value is 1048814
TACC: MPI job exited with code: 4
TACC: Shutdown complete. Exiting.
I can tell that I am in a call to ncmpi_enddef, but I am not getting anything beyond that. Any ideas?
--
Jim Edwards
CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO