problem with using impi 19.0.5

Jim Edwards jedwards at ucar.edu
Fri Sep 6 07:30:03 CDT 2019


Hi Si,

I have confirmed using the test case in

/scratch1/02503/edwardsj/impi_bugtest

that this is indeed the problem.
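
For anyone following along, the condition discussed in the quoted thread below (some MPI processes making zero-length read requests) can come from something as simple as having more ranks than elements. Here is a minimal Python sketch, assuming an even 1-D block decomposition. The names `block_decomposition` and `zero_length_ranks` are hypothetical and purely illustrative; nothing here calls MPI or PnetCDF, and it is not the decomposition used by the actual application.

```python
def block_decomposition(total, nprocs):
    """Split `total` elements across `nprocs` ranks as evenly as possible.

    Returns a list of (start, count) pairs, one per rank.
    Illustrative assumption only: the real application may decompose differently.
    """
    base, rem = divmod(total, nprocs)
    layout = []
    start = 0
    for rank in range(nprocs):
        # The first `rem` ranks take one extra element each.
        count = base + (1 if rank < rem else 0)
        layout.append((start, count))
        start += count
    return layout


def zero_length_ranks(layout):
    """Return the ranks whose request would have a count of zero."""
    return [rank for rank, (_, count) in enumerate(layout) if count == 0]


if __name__ == "__main__":
    # 10 elements spread over 16 ranks: the trailing ranks get nothing to read.
    layout = block_decomposition(total=10, nprocs=16)
    print(zero_length_ranks(layout))  # prints [10, 11, 12, 13, 14, 15]
```

In this example ranks 10 through 15 end up with a count of 0, so in a collective read they would still have to make the call but would request no data.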

On Thu, Sep 5, 2019 at 8:33 PM Jim Edwards <jedwards at ucar.edu> wrote:

> Thanks, Wei-keng. I'll try the test program, but I know I have the
> zero-length arrays; this is the same app in which we found the bug in MPICH.
>
> On Thu, Sep 5, 2019 at 8:18 PM Wei-keng Liao <wkliao at eecs.northwestern.edu>
> wrote:
>
>> Does your I/O pattern result in some of the MPI processes making
>> zero-length read requests? That is the condition to run into this bug.
>>
>> I suggest you run the test program given in that GitHub issue on
>> Frontera.
>> If a similar error message beginning with "ADIOI_LUSTRE_IOCONTIG(228)"
>> appears, then you have your confirmation.
>>
>> Wei-keng
>>
>> > On Sep 5, 2019, at 8:17 PM, Jim Edwards <jedwards at ucar.edu> wrote:
>> >
>> > Hi Si,
>> >
>> > Can you contact the impi developers and see if this could be the issue,
>> > and if it is, whether we could get a fix quickly?
>> >
>> > Wei-keng - thanks!
>> >
>> > On Thu, Sep 5, 2019 at 7:06 PM Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
>> > I guess you have probably hit the same issue reported in
>> > https://github.com/pmodels/mpich/pull/3634
>> >
>> > It has been fixed in the MPICH master branch.
>> >
>> > Wei-keng
>> >
>> > > On Sep 5, 2019, at 7:43 PM, Jim Edwards via parallel-netcdf <parallel-netcdf at lists.mcs.anl.gov> wrote:
>> > >
>> > > I'm using a new system, TACC Frontera, and having problems on the first
>> > > read. I get:
>> > > [15065] ADIOI_LUSTRE_IOCONTIG(228): Other I/O error Cannot allocate memory
>> > >  : Unknown error occurs in reading file
>> > >
>> > > Just wondering if anyone has seen this before?
>> > >
>> > > --
>> > > Jim Edwards
>> > >
>> > > CESM Software Engineer
>> > > National Center for Atmospheric Research
>> > > Boulder, CO
>> >
>> >
>> >
>> > --
>> > Jim Edwards
>> >
>> > CESM Software Engineer
>> > National Center for Atmospheric Research
>> > Boulder, CO
>>
>>
>
> --
> Jim Edwards
>
> CESM Software Engineer
> National Center for Atmospheric Research
> Boulder, CO
>


-- 
Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO