pnetcdf-1.7.0 and MPT
Wei-keng Liao
wkliao at eecs.northwestern.edu
Wed Sep 21 18:16:29 CDT 2016
Hi, Jim,
If Eric's approach does not solve your case, please try the following.
From the error messages, I suspect the cause is an internal MPI error: the
library fails to create zero-length MPI derived datatypes. I found this
problem in OpenMPI, and it was fixed in the latest release.
https://github.com/open-mpi/ompi/issues/1611
FYI, MPICH does not have this problem, but I don't know about SGI MPT.
If the error message you got came from a PnetCDF test program, then there
is a patch to avoid the error.
http://lists.mcs.anl.gov/pipermail/parallel-netcdf/2016-June/001859.html
Please note that this patch only avoids the error. It does not solve the
fundamental problem, which still lies in MPI.
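For application code that hits the same error, one way to sidestep it is to never hand MPI a zero-length derived datatype. The following is only a minimal sketch of that idea in C, not the PnetCDF patch itself. The helper name compact_nonempty_blocks is hypothetical, and ptrdiff_t stands in for MPI_Aint so that the snippet compiles without mpi.h. It drops zero-length blocks from the block-length and displacement arrays and returns how many blocks remain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PnetCDF or MPI).
 * Removes zero-length blocks in place from the parallel arrays
 * blocklens[] and disps[], preserving their order, and returns the
 * number of blocks that remain. ptrdiff_t stands in for MPI_Aint. */
int compact_nonempty_blocks(int count, int *blocklens, ptrdiff_t *disps)
{
    int i, kept = 0;

    /* A zero count (possibly with NULL arrays) is the case suspected
     * in this thread; there is nothing to compact. */
    if (count <= 0)
        return 0;

    assert(blocklens != NULL && disps != NULL);

    for (i = 0; i < count; i++) {
        if (blocklens[i] > 0) {
            /* Keep this block, shifting it down over dropped ones. */
            blocklens[kept] = blocklens[i];
            disps[kept] = disps[i];
            kept++;
        }
    }
    return kept;
}
```

A caller would call MPI_Type_create_hindexed only when the returned count is greater than zero. When it is zero, one option is to skip the derived datatype entirely and use a count of 0 with a predefined type such as MPI_BYTE.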
Wei-keng
On Sep 21, 2016, at 3:46 PM, Kemp, Eric M. (GSFC-606.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] wrote:
>
> Hi Jim:
>
> In my case, README.SGI gave me the clue about setting the “LIBS” variable on the command line when running configure:
>
> ./configure --prefix=$PREFIX \
> --disable-cxx MPICC="$MPICC" MPIF77="$MPIF77" \
> MPIF90="$MPIF90" CC="$CC" F77="$F77" F90="$F90" \
> FC="$FC" TEST_SEQRUN="$TEST_SEQRUN" \
> TEST_MPIRUN="$TEST_MPIRUN" \
> --enable-large-file-test \
> LIBS="-lmpi"
>
> It’s not clear to me why that is necessary (I have MPICC, MPIF77, and MPIF90 set to the SGI MPT wrappers), but it is necessary in my case.
>
> -Eric
>
> Eric M. Kemp (SSAI)
> NASA/GSFC
> Mail Code: 606
> Greenbelt, MD 20771
> 301.286.9768
> eric.kemp at nasa.gov
> eric.kemp at ssaihq.com
>
>
> From: <parallel-netcdf-bounces at lists.mcs.anl.gov> on behalf of Jim Edwards <jedwards at ucar.edu>
> Date: Wednesday, September 21, 2016 at 4:28 PM
> To: Wei-keng Liao <wkliao at eecs.northwestern.edu>
> Cc: "parallel-netcdf at mcs.anl.gov" <parallel-netcdf at mcs.anl.gov>
> Subject: Re: pnetcdf-1.7.0 and MPT
>
> There wasn't anything useful in the README.SGI as far as I could tell. I am exploring getting an update to MPT/2.15 which may solve the problem.
>
> On Wed, Sep 21, 2016 at 11:53 AM, Jim Edwards <jedwards at ucar.edu> wrote:
>> Thanks - I'll give that a try and let you know.
>>
>> On Wed, Sep 21, 2016 at 11:50 AM, Wei-keng Liao <wkliao at eecs.northwestern.edu> wrote:
>>> Hi, Jim, Michael
>>>
>>> Eric Kemp @ NASA/GSFC also encountered a similar error message.
>>> http://lists.mcs.anl.gov/pipermail/parallel-netcdf/2016-June/001854.html
>>>
>>> It seems like he was able to solve the problem by trying the build recipe
>>> in README.SGI. Let me know whether this works for you.
>>>
>>> Wei-keng
>>>
>>> On Sep 21, 2016, at 12:36 PM, Michael Raymond wrote:
>>>
>>> > Are you passing a count of 0? That’s probably what MPT is getting caught on. I can have a fix for you to try in a few minutes if so.
>>> >
>>> > Michael A. Raymond
>>> > SGI MPT Team Leader
>>> > 1 (651) 683-7523
>>> >
>>> >
>>> >
>>> >> On Sep 21, 2016, at 12:26, Jim Edwards <jedwards at ucar.edu> wrote:
>>> >>
>>> >> Trying to use parallel netcdf on an SGI system with mpi/2.14 I am getting the following error:
>>> >>
>>> >> MPT ERROR: rank:10, function:MPI_TYPE_CREATE_HVECTOR, Invalid argument
>>> >>
>>> >> with a traceback:
>>> >>
>>> >> MPT: #7 0x00002aaaaf46dd2a in PMPI_Type_create_hindexed (count=<optimized out>,
>>> >> MPT: blocklens=0x0, indices=0xbf71470, oldtype=27, newtype=0x7fffffff5588)
>>> >> MPT: at type_create_hindexed.c:23
>>> >> MPT: #8 0x0000000000cc01ac in fillerup_aggregate (ncp=0x4ff9,
>>> >> MPT: old_ncp=0x7fffffff4990) at fill.c:727
>>> >> MPT: #9 0x0000000000cb5141 in ncmpii_NC_enddef (ncp=0x4ff9,
>>> >> MPT: h_align=140737488308624, h_minfree=0, v_align=-1, v_minfree=20438,
>>> >> MPT: r_align=0) at nc.c:1187
>>> >> MPT: #10 0x0000000000cb42fb in ncmpii_enddef (ncp=0x4ff9) at nc.c:1318
>>> >> MPT: #11 0x0000000000ca8d7f in ncmpi_enddef (ncid=20473) at mpinetcdf.c:806
>>> >>
>>> >>
>>> >>
>>> >> Have you seen this before or have an idea of a fix?
>>> >>
>>> >>
>>> >>
>>> >> Thanks,
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Jim Edwards
>>> >>
>>> >> CESM Software Engineer
>>> >> National Center for Atmospheric Research
>>> >> Boulder, CO