problem with using impi 19.0.5
Wei-keng Liao
wkliao at eecs.northwestern.edu
Thu Sep 5 21:18:51 CDT 2019
Does your I/O pattern result in some of the MPI processes making
zero-length read requests? That is the condition that triggers this bug.
I suggest running the test program given in that GitHub issue on Frontera.
If a similar error message beginning with "ADIOI_LUSTRE_IOCONTIG(228)" appears,
then you have your confirmation.
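
To answer the first question without touching the file system, one rough
check is to replay your decomposition offline and list the ranks that end up
with nothing to read. The following is only a minimal sketch, assuming a
simple one-dimensional block decomposition; `block_partition` and
`zero_length_ranks` are hypothetical helper names, not part of MPI or
PnetCDF, and your application's real decomposition may differ.

```python
def block_partition(total, nprocs):
    """Split `total` elements over `nprocs` ranks as evenly as possible.

    Returns a list of (start, count) pairs, one per rank.
    Assumption: a plain block decomposition, used here for illustration only.
    """
    base, extra = divmod(total, nprocs)
    parts = []
    start = 0
    for rank in range(nprocs):
        # The first `extra` ranks take one additional element.
        count = base + (1 if rank < extra else 0)
        parts.append((start, count))
        start += count
    return parts


def zero_length_ranks(parts):
    """Return the ranks whose read request would have zero length."""
    return [rank for rank, (_, count) in enumerate(parts) if count == 0]


if __name__ == "__main__":
    # Example: 6 elements spread over 8 ranks leaves the last two ranks empty.
    parts = block_partition(total=6, nprocs=8)
    print(zero_length_ranks(parts))  # prints [6, 7]
```

If the list is non-empty for the variable and process count you use, those
ranks would enter the collective read with a zero-length request, which is
the condition described above.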
Wei-keng
> On Sep 5, 2019, at 8:17 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>
> Hi Si,
>
> Can you contact the impi developers to see whether this could be the issue and, if it is, whether we could get a fix quickly?
>
> Wei-keng - thanks!
>
> On Thu, Sep 5, 2019 at 7:06 PM Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
> I guess you are probably hitting the same issue reported in
> https://github.com/pmodels/mpich/pull/3634
>
> It has been fixed in the MPICH master branch.
>
> Wei-keng
>
> > On Sep 5, 2019, at 7:43 PM, Jim Edwards via parallel-netcdf <parallel-netcdf at lists.mcs.anl.gov> wrote:
> >
> > I'm using a new system, TACC Frontera, and having problems on the first read. I get:
> > [15065] ADIOI_LUSTRE_IOCONTIG(228): Other I/O error Cannot allocate memory
> > : Unknown error occurs in reading file
> >
> > Just wondering if anyone has seen this before?
> >
> > --
> > Jim Edwards
> >
> > CESM Software Engineer
> > National Center for Atmospheric Research
> > Boulder, CO
>
>
>
> --
> Jim Edwards
>
> CESM Software Engineer
> National Center for Atmospheric Research
> Boulder, CO