problem with using impi 19.0.5

Wei-keng Liao wkliao at eecs.northwestern.edu
Thu Sep 5 21:18:51 CDT 2019


Does your I/O pattern result in some of the MPI processes making
zero-length read requests? That is the condition that triggers this bug.
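
If you are not sure, one rough way to check is to work out the per-rank
request sizes offline. The sketch below is only an illustration and assumes
a simple 1-D block decomposition, which may not match what your application
does. The function names block_partition and zero_length_ranks are
hypothetical and are not part of PnetCDF, MPICH, or Intel MPI.

```python
def block_partition(total, nprocs):
    """Split `total` elements across `nprocs` ranks in contiguous blocks.

    Assumed decomposition (not taken from any library): the first
    `total % nprocs` ranks get one extra element.
    Returns a list of (start, count) pairs, one per rank.
    """
    base, extra = divmod(total, nprocs)
    parts = []
    start = 0
    for rank in range(nprocs):
        count = base + (1 if rank < extra else 0)
        parts.append((start, count))
        start += count
    return parts


def zero_length_ranks(total, nprocs):
    """Return the ranks whose read request would have zero length."""
    parts = block_partition(total, nprocs)
    return [rank for rank, (_, count) in enumerate(parts) if count == 0]


if __name__ == "__main__":
    # 10 elements read by 16 ranks: the last 6 ranks have nothing to read.
    print(zero_length_ranks(10, 16))
```

Whenever a variable has fewer elements along the decomposed dimension than
there are ranks, a decomposition like this leaves some ranks with a count
of 0, and those ranks would issue zero-length reads.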

I suggest you run the test program given in that GitHub issue on Frontera.
If a similar error message beginning with "ADIOI_LUSTRE_IOCONTIG(228)" appears,
then you have your confirmation.

Wei-keng

> On Sep 5, 2019, at 8:17 PM, Jim Edwards <jedwards at ucar.edu> wrote:
> 
> Hi Si, 
> 
> Can you contact the impi developers and see if this could be the issue and, if it is, whether we could get a fix quickly?
> 
> Wei-keng - thanks!  
> 
> On Thu, Sep 5, 2019 at 7:06 PM Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
> I guess you are probably hitting the same issue reported in
> https://github.com/pmodels/mpich/pull/3634
> 
> It has been fixed in the MPICH master branch.
> 
> Wei-keng
> 
> > On Sep 5, 2019, at 7:43 PM, Jim Edwards via parallel-netcdf <parallel-netcdf at lists.mcs.anl.gov> wrote:
> > 
> > I'm using a new system, TACC Frontera, and having problems on the first read. I get:
> > [15065] ADIOI_LUSTRE_IOCONTIG(228): Other I/O error Cannot allocate memory
> >  : Unknown error occurs in reading file
> > 
> > Just wondering if anyone has seen this before?
> > 
> > -- 
> > Jim Edwards
> > 
> > CESM Software Engineer
> > National Center for Atmospheric Research
> > Boulder, CO 
> 
> 
> 
> -- 
> Jim Edwards
> 
> CESM Software Engineer
> National Center for Atmospheric Research
> Boulder, CO 


