Problems with testing PNETCDF 1.6.1

Kemp, Eric M. (GSFC-606.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] eric.kemp at nasa.gov
Tue Feb 2 15:53:53 CST 2016


Hi Wei-keng:

I tried rerunning the entire installation with PNETCDF_SAFE_MODE=1.  FLASH-IO still hangs with SGI MPT (with no error message), but it works fine with Intel MPI.

-Eric


From: Wei-keng Liao <wkliao at eecs.northwestern.edu>
Date: Tuesday, February 2, 2016 12:39 PM
To: Eric Kemp <eric.kemp at nasa.gov>
Cc: "parallel-netcdf at mcs.anl.gov" <parallel-netcdf at mcs.anl.gov>
Subject: Re: Problems with testing PNETCDF 1.6.1

Hi, Eric

Sorry for sending the wrong file. The correct one is attached, in case you would like
to use it.

I checked your config.log file but could not find anything fishy.
I just now tested it with Intel compiler 16.0.0.109 without a problem.
Could you try running FLASH-IO in safe mode, i.e., with the environment
variable PNETCDF_SAFE_MODE set to 1? This enables internal checking for
data inconsistency.
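
The suggestion above could be scripted roughly as follows. This is only a sketch, assuming GNU coreutils `timeout` is installed: it sets PNETCDF_SAFE_MODE=1 for a single command and bounds its run time, so a hang like the one reported with SGI MPT shows up as exit status 124 instead of a stuck terminal. The helper name run_safe_mode and the time limit are illustrative and not part of PnetCDF.

```shell
#!/bin/sh
# Run a command with PNETCDF_SAFE_MODE=1 and a time limit.
# Usage: run_safe_mode SECONDS COMMAND [ARGS...]
# (run_safe_mode is a hypothetical helper name, not a PnetCDF tool.)
run_safe_mode() {
    limit=$1
    shift
    # The variable assignment applies only to this one command.
    # GNU timeout exits with status 124 when the time limit is reached.
    PNETCDF_SAFE_MODE=1 timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: '$*' did not finish within ${limit}s" >&2
    fi
    return "$status"
}
```

For example, from the top of the build tree: `run_safe_mode 600 make -C benchmarks/FLASH-IO ptest`. The 600-second limit is an arbitrary choice, and MPI processes left behind after a timeout may still need to be cleaned up by hand.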

I just want to confirm that, for 1.7.0.pre1, your "make ptest" failed only on FLASH-IO.
Because FLASH-IO is the last test program, that would mean all other tests have passed.
Let me know. Thanks.

Wei-keng


On Feb 2, 2016, at 8:27 AM, Kemp, Eric M. (GSFC-606.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] wrote:

>
> Hi Wei-keng:
>
> I think you sent me the wrong copy of that file — it was identical to what is in 1.7.0.pre1.  But I went ahead and added "cmd" as an argument to subroutine check_err, and that test code compiles and runs.
>
> The large file tests pass in 1.7.0.pre1 as you indicated. However, FLASH-IO still hangs with SGI MPT.  I took your suggestion and tried running this test separately (cd benchmarks/FLASH-IO ; make ptest) but the code still hangs.
>
> I've attached the (gzipped) config.log file from the 1.7.0.pre1 installations.
>
> Thanks,
>
> -Eric
>
> Eric M. Kemp (SSAI)
> NASA/GSFC
> Mail Code: 606
> Greenbelt, MD 20771
> 301.286.9768
> eric.kemp at nasa.gov
> eric.kemp at ssaihq.com
>
>
> From: Wei-keng Liao <wkliao at eecs.northwestern.edu>
> Date: Monday, February 1, 2016 5:12 PM
> To: Eric Kemp <eric.kemp at nasa.gov>
> Cc: "parallel-netcdf at mcs.anl.gov" <parallel-netcdf at mcs.anl.gov>
> Subject: Re: Problems with testing PNETCDF 1.6.1
>
> Hi, Eric,
>
> Thanks for reporting the error. This is another oversight, sorry.
> The fixed file, bigrecords.f, is attached.
>
>
> Wei-keng
>
>
>
> On Feb 1, 2016, at 2:41 PM, Kemp, Eric M. (GSFC-606.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] wrote:
>
> >
> > Hi Wei-keng:
> >
> > Thanks for your quick response. I tried installing 1.7.0.pre1 but I get a
> > different error when compiling the tests:
> >
> > /usr/local/intel/2016/impi/5.1.2.150/bin64/mpif90    -I../../src/lib -I./../common   -I../../src/libf -I../../src/libf90 -fpic -O2 -fp-model strict  -c bigrecords.f
> > bigrecords.f(333): error #6514: A substring must be of type CHARACTER.   [CMD]
> >          msg = '*** TESTING F77 '//cmd(1:XTRIM(cmd))//
> > ------------------------------------^
> > bigrecords.f(333): error #6054: A CHARACTER data type is required in this context.   [CMD]
> >          msg = '*** TESTING F77 '//cmd(1:XTRIM(cmd))//
> > ------------------------------------^
> > compilation aborted for bigrecords.f (code 1)
> >
> >
> >
> > This appears to be a legitimate syntax error in the test program, in
> > subroutine check_err.  "cmd" is not defined in that subroutine, nor is it
> > a global variable.
> >
> > I will try patching 1.6.1 with the NC_64BIT_DATA constant instead.
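
The one-line change described in the quoted message below (line 81 of large_files.c, NC_64BIT_OFFSET replaced by NC_64BIT_DATA) might be applied with a small shell helper along these lines. This is a sketch under assumptions: the helper name patch_large_files is hypothetical, and the path to large_files.c is left to the caller because the thread does not say where the file sits in the 1.6.1 source tree.

```shell
#!/bin/sh
# Replace NC_64BIT_OFFSET with NC_64BIT_DATA on line 81 of the given file,
# keeping the original as FILE.orig.
# Usage: patch_large_files FILE
# (patch_large_files is a hypothetical helper name.)
patch_large_files() {
    src=$1
    # Refuse to patch if line 81 does not contain the old flag, e.g. when
    # the file is from a different release or was already patched.
    if ! sed -n '81p' "$src" | grep -q 'NC_64BIT_OFFSET'; then
        echo "line 81 of $src does not contain NC_64BIT_OFFSET; not patching" >&2
        return 1
    fi
    cp "$src" "$src.orig"
    # Read from the backup and rewrite the file; this avoids 'sed -i',
    # which is not portable across sed implementations.
    sed '81s/NC_64BIT_OFFSET/NC_64BIT_DATA/' "$src.orig" > "$src"
}
```

It would be called as `patch_large_files path/to/large_files.c`, with the placeholder path replaced by the file's actual location, followed by rebuilding and rerunning the tests.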
> >
> > -Eric
> >
> >
> >
> >
> > On 2/1/16 12:02 PM, "Wei-keng Liao" <wkliao at eecs.northwestern.edu> wrote:
> >
> >> Hi, Eric
> >>
> >> For the large file tests, the error is caused by an oversight: the wrong
> >> flag was used. Line 81 of file large_files.c should have used
> >> NC_64BIT_DATA instead of NC_64BIT_OFFSET. This error has been fixed in
> >> the 1.7.0.pre1 pre-release. Could you give it a try?
> >> http://cucis.ece.northwestern.edu/projects/PnetCDF/download.html
> >>
> >> As for the FLASH-IO test, could you try running it alone? I.e., cd to the
> >> folder benchmarks/FLASH-IO and run "make ptest" there. In the meantime,
> >> please send me the file config.log.
> >>
> >>
> >> Wei-keng
> >>
> >> On Feb 1, 2016, at 7:32 AM, Kemp, Eric M. (GSFC-606.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] wrote:
> >>
> >>>
> >>> Dear PNETCDF developers:
> >>>
> >>> I'm attempting to install PNETCDF 1.6.1 on a Linux cluster running SLES
> >>> 11.3.  I'm using Intel 15 Fortran and C compilers (no C++), and I'm
> >>> trying to install for two separate MPI implementations (SGI MPT 2.12 and
> >>> Intel MPI 5.1.2).
> >>>
> >>> I'm encountering two problems when I run 'make ptest'.
> >>>
> >>> 1)  For both MPI implementations, the large file tests fail with an
> >>> integer overflow.  The error message is:
> >>>
> >>> *** Testing large files, slowly.
> >>> line 116 of large_files.c: Overflow when type cast to 4-byte integer.
> >>> *** Creating large file ./testfile.nc...srun.slurm: error: borgo018: task 0: Exited with exit code 1
> >>>
> >>> I reviewed the README.large_files for guidance, and I can confirm that
> >>> both 'MPI_Offset' and 'off_t' are 8 bytes.
> >>>
> >>> 2) For SGI MPT only, if I disable support for large file tests, 'make
> >>> ptest' hangs when testing FLASH-IO:
> >>>
> >>> make -w -C FLASH-IO ptest
> >>> make[2]: Entering directory `/gpfsm/dnb32/emkemp/NUWRFLIB/svn/trunk/builds/parallel-netcdf-1.6.1/benchmarks/FLASH-IO'
> >>> mpiexec_mpt -n 4 ./flash_benchmark_io ./flash_io_test_
> >>> srun.slurm: cluster configuration lacks support for cpu binding
> >>>
> >>> The earlier tests with both single and multiple processes work for SGI
> >>> MPT. And all tests (again, excluding large file tests) work for Intel
> >>> MPI.
> >>>
> >>> I can provide more information (e.g., output from the configure script)
> >>> upon request.
> >>>
> >>> Thanks,
> >>>
> >>> -Eric
> >>>
> >>> Eric M. Kemp (SSAI)
> >>> NASA/GSFC
> >>> Mail Code: 606
> >>> Greenbelt, MD 20771
> >>> 301.286.9768
> >>> eric.kemp at nasa.gov
> >>> eric.kemp at ssaihq.com
> >>>
> >>
> >
>
> <config.log.gz>
