Problem with testing PNETCDF 1.7.0

Kemp, Eric M. (GSFC-606.0)[SCIENCE SYSTEMS AND APPLICATIONS INC] eric.kemp at nasa.gov
Mon Jun 20 15:38:57 CDT 2016


Dear PnetCDF developers:

I’m attempting to install PnetCDF 1.7.0 on a Linux cluster running SLES 11.3 and GPFS, using SGI MPT 2.12 and the Intel 15.0.3.187 compilers.  When ‘make testing’ runs the tst_f90 program, I get the following error:

mpiexec_mpt -n 1 ./tst_f90      ./testfile.nc
srun.slurm: cluster configuration lacks support for cpu binding
MPT ERROR: rank:0, function:MPI_TYPE_CREATE_HVECTOR, Invalid argument
MPT: Global rank 0 is aborting with error code 0.
     Process ID: 17524, Host: borgo015, Program: /gpfsm/dnb32/emkemp/NUWRFLIB/svn/branches/features/external_upgrades/builds/parallel-netcdf-1.7.0/test/F90/tst_f90

MPT: --------stack traceback-------
MPT: Attaching to program: /proc/17524/exe, process 17524
MPT: Try: zypper install -C "debuginfo(build-id)=48172710254f4e2549684d7d3e9f9622272d6c66"
MPT: (no debugging symbols found)...done.
MPT: [Thread debugging using libthread_db enabled]
MPT: Using host libthread_db library "/lib64/libthread_db.so.1".
MPT: Try: zypper install -C "debuginfo(build-id)=f0721cb50ab9fbdf06314a53bff5af581bbefe64"
MPT: (no debugging symbols found)...done.
MPT: Try: zypper install -C "debuginfo(build-id)=e2cab3c95cb1189420734b4af264b047355be2e5"
MPT: (no debugging symbols found)...done.
MPT: Try: zypper install -C "debuginfo(build-id)=732292820e69f70459cb927ade5b49bc56d32b0f"
MPT: (no debugging symbols found)...done.
MPT: Try: zypper install -C "debuginfo(build-id)=9fdc592b21682a31f460f6f043f50eea8c8b6821"
MPT: (no debugging symbols found)...done.
MPT: Try: zypper install -C "debuginfo(build-id)=e1a13ecb56367b69b89d1c9ca1a4c42167336030"
MPT: (no debugging symbols found)...done.
MPT: Try: zypper install -C "debuginfo(build-id)=719375f80fd84b85b905db2c20ec70e8805b36e5"
MPT: (no debugging symbols found)...done.
MPT: Try: zypper install -C "debuginfo(build-id)=c4ce7f7c226abce4cec56fdbb4ed87e49024868d"
MPT: (no debugging symbols found)...done.
MPT: 0x00002aaaaacdc3bf in waitpid () from /lib64/libpthread.so.0
MPT: (gdb) #0  0x00002aaaaacdc3bf in waitpid () from /lib64/libpthread.so.0
MPT: #1  0x00002aaaab3e40ec in mpi_sgi_system (command=<optimized out>) at sig.c:99
MPT: #2  MPI_SGI_stacktraceback (header=<optimized out>) at sig.c:319
MPT: #3  0x00002aaaab337aea in print_traceback (ecode=0) at abort.c:197
MPT: #4  0x00002aaaab337bde in MPI_SGI_abort () at abort.c:85
MPT: #5  0x00002aaaab36f042 in errors_are_fatal (comm=<optimized out>,
MPT:     code=<optimized out>) at errhandler.c:220
MPT: #6  0x00002aaaab36f2f1 in MPI_SGI_error (comm=1, code=13) at errhandler.c:56
MPT: #7  0x00002aaaab3edd1d in PMPI_Type_create_hindexed (count=0, blocklens=0x0,
MPT:     indices=0x2872760, oldtype=27, newtype=0x7fffffff8a98)
MPT:     at type_create_hindexed.c:25
MPT: #8  0x00000000006f40fc in fillerup_aggregate..0 ()
MPT: #9  0x00000000006ea9fb in ncmpii_NC_enddef ()
MPT: #10 0x00000000006eb349 in ncmpii_enddef ()
MPT: #11 0x00000000006dedef in ncmpi_enddef ()
MPT: #12 0x0000000000406877 in pnetcdf_mp_nf90mpi_enddef_ ()
MPT: #13 0x0000000000405173 in MAIN__ ()
MPT: #14 0x00000000004049ee in main ()
MPT: (gdb) A debugging session is active.
MPT:
MPT:    Inferior 1 [process 17524] will be detached.
MPT:
MPT: Quit anyway? (y or n) [answered Y; input not from terminal]
MPT: Detaching from program: /proc/17524/exe, process 17524

MPT: -----stack traceback ends-----
slurmstepd-borgo015: *** STEP 8841377.772 CANCELLED AT 2016-06-20T16:18:33 *** on borgo015
srun.slurm: Job step aborted: Waiting up to 2 seconds for job step to finish.
srun.slurm: error: borgo015: task 0: Exited with exit code 255
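
A note on the traceback: the MPT error line names MPI_TYPE_CREATE_HVECTOR, but frame #7 shows PMPI_Type_create_hindexed being called with count=0 and blocklens=0x0, reached from ncmpi_enddef through fillerup_aggregate. I do not know whether the zero count is what MPT rejects. The short Python sketch below is only a convenience for condensing such a traceback to its call chain and pulling out those arguments. It is not part of PnetCDF, and the names RAW, FRAME and parse_frames are made up for illustration.

```python
import re

# Frames #6-#14 copied from the traceback above (illustrative excerpt).
RAW = """\
MPT: #6  0x00002aaaab36f2f1 in MPI_SGI_error (comm=1, code=13) at errhandler.c:56
MPT: #7  0x00002aaaab3edd1d in PMPI_Type_create_hindexed (count=0, blocklens=0x0,
MPT:     indices=0x2872760, oldtype=27, newtype=0x7fffffff8a98)
MPT:     at type_create_hindexed.c:25
MPT: #8  0x00000000006f40fc in fillerup_aggregate..0 ()
MPT: #9  0x00000000006ea9fb in ncmpii_NC_enddef ()
MPT: #10 0x00000000006eb349 in ncmpii_enddef ()
MPT: #11 0x00000000006dedef in ncmpi_enddef ()
MPT: #12 0x0000000000406877 in pnetcdf_mp_nf90mpi_enddef_ ()
MPT: #13 0x0000000000405173 in MAIN__ ()
MPT: #14 0x00000000004049ee in main ()
"""

# One gdb frame: "#N  0xADDR in FUNC (ARGS) [at FILE:LINE]"
FRAME = re.compile(r"#(\d+)\s+0x[0-9a-f]+ in (\S+) \((.*)\)(?: at (\S+))?$")


def parse_frames(raw):
    """Return a list of (number, function, args dict, location or None)."""
    joined = []
    for line in raw.splitlines():
        # Drop the "MPT:" prefix that MPT puts in front of every gdb line.
        text = line[len("MPT:"):].strip() if line.startswith("MPT:") else line.strip()
        if text.startswith("#"):
            joined.append(text)           # start of a new frame
        elif joined and text:
            joined[-1] += " " + text      # wrapped continuation of the previous frame
    frames = []
    for text in joined:
        m = FRAME.match(text)
        if m:
            number, func, args, where = m.groups()
            arg_map = dict(a.split("=", 1) for a in args.split(", ") if "=" in a)
            frames.append((int(number), func, arg_map, where))
    return frames


frames = parse_frames(RAW)

# Call chain from the outermost caller down to the MPT error handler.
chain = " -> ".join(func for _, func, _, _ in reversed(frames))
print(chain)

# Arguments of the MPI call that raised the error.
hindexed = next(f for f in frames if f[1] == "PMPI_Type_create_hindexed")
print("count =", hindexed[2]["count"], " blocklens =", hindexed[2]["blocklens"])
```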

I’ve traced the trigger to the setting of the “_FillValue” attribute for the 3D pressure variable P in tst_f90.f90.  If I change the attribute name to “FillValue”, the test program runs without apparent error.

I have not found anything in the underlying PnetCDF Fortran or C code that would explain this.  Do you have any suggestions?

Thanks,

-Eric

Eric M. Kemp (SSAI)
NASA/GSFC
Mail Code: 606
Greenbelt, MD 20771
301.286.9768
eric.kemp at nasa.gov
eric.kemp at ssaihq.com
