Error when leaving the define mode

Jin-De Huang s85012921804 at gmail.com
Thu Jul 15 03:17:42 CDT 2021


Hi Robert and Wei-Keng,

I have solved this problem by setting the environment variable
PNETCDF_HINTS='nc_header_align_size=40960'.
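For reference, the fix above amounts to exporting the hint before launching the job. A minimal sketch (the executable name and process count are placeholders for your own job script; multiple hints can be joined with semicolons):

```shell
# Pass the PnetCDF header-alignment hint through the environment
# so it applies to every file opened by the run.
export PNETCDF_HINTS="nc_header_align_size=40960"
echo "$PNETCDF_HINTS"
# mpiexec -n 2304 ./model    # placeholder launch command
```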

Thanks for your help.

Jin-De



Wei-Keng Liao <wkliao at northwestern.edu> wrote on Thu, Jul 15, 2021 at 6:21 AM:

> Hi, Jin-De
>
> You can turn on the "safe mode" by setting the environment
> variable PNETCDF_SAFE_MODE to 1.
>
> This mode will check the consistency of arguments passed
> to all PnetCDF functions. It will print out more error messages
> that may be related to the error you are seeing.
>
> Wei-keng
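[Editorial note: Wei-Keng's safe-mode suggestion, as a minimal shell sketch; the echo is only for illustration.]

```shell
# Enable PnetCDF safe mode for the whole run. It adds consistency
# checks on arguments across ranks and prints extra diagnostics,
# at some performance cost.
export PNETCDF_SAFE_MODE=1
echo "$PNETCDF_SAFE_MODE"
```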
>
> > On Jul 14, 2021, at 4:42 PM, Latham, Robert J. <robl at mcs.anl.gov> wrote:
> >
> > On Wed, 2021-07-14 at 22:00 +0800, Jin-De Huang wrote:
> >> I am testing my model with 2304 processes on a supercluster, using the
> >> Fujitsu Fortran compiler and PnetCDF 1.12.1. The model halted when
> >> leaving define mode. The error message appeared only in the log
> >> files of MPI ranks greater than 2047.
> >>
> >> MPI error (MPI_File_read_at_all) : MPI_ERR_ARG: invalid argument of
> >> some other kind
> >>
> >> Something went wrong in these processes, but the error codes returned
> >> by each PnetCDF function were 0 until the above error message appeared.
> >> With fewer than 2048 processes, the model worked normally. I have no
> >> idea how to solve this problem. Is there any way to identify the cause
> >> of this problem?
> >
> > My first guess might be a "too many open files" problem, though I would
> > have hoped the MPI-IO implementation would have said that instead of
> > "some error happened".
> >
> > If it is open files, then there is a 'ulimit' setting you can
> > raise:  `ulimit -a` will show you what limits are in place now, and
> > `ulimit -n <N>` sets the "open files" limit.  Try doubling whatever it
> > is set to now.
> >
> > ==rob
>
>
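[Editorial note: a quick sketch of the `ulimit` check Rob describes above; actual limit values vary by system, and raising the hard limit may require administrator privileges.]

```shell
# Show all current resource limits for this shell.
ulimit -a
# Show just the "open files" (descriptor) limit.
ulimit -n
# To double a soft limit of e.g. 1024, one would run:
# ulimit -n 2048
```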

