With intel 2015.1.133 and gcc 5.1.0 in the path, error on make check with cxx interface
Nick Papior Andersen
nickpapior at gmail.com
Mon Jun 1 18:48:26 CDT 2015
I did this (adding traceback to the debugging compilation)
../configure CXXFLAGS='-g -O0 -traceback' CFLAGS='-g -O0 -traceback'
CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpif90
--prefix=/zdata/groups/common/nicpa/2015-test/XeonX5550/pnetcdf/1.6.0/intel-15.0.1
--with-mpi=/zdata/groups/common/nicpa/2015-test/XeonX5550/openmpi/1.8.5/intel-15.0.1
--enable-debug --disable-fortran
make
cd test/CXX
make check
And got (well the same thing :( ):
./nctst ./testfile.nc
ncmpi_inq_typeids not implemented
[n-62-12-2:31968] *** Process received signal ***
[n-62-12-2:31968] Signal: Segmentation fault (11)
[n-62-12-2:31968] Signal code: Address not mapped (1)
[n-62-12-2:31968] Failing at address: 0xffffffffffffffe8
[n-62-12-2:31968] [ 0] /lib64/libpthread.so.0(+0xf710)[0x2b13425f0710]
[n-62-12-2:31968] [ 1]
/zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZNSo6sentryC2ERSo+0x19)[0x2b1342157e79]
[n-62-12-2:31968] [ 2]
/zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l+0x29)[0x2b1342158589]
[n-62-12-2:31968] [ 3]
/zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc+0x27)[0x2b13421589e7]
[n-62-12-2:31968] [ 4] ./nctst[0x40ad62]
[n-62-12-2:31968] [ 5] ./nctst[0x40b93c]
[n-62-12-2:31968] [ 6]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x2b1342a21d5d]
[n-62-12-2:31968] [ 7] ./nctst[0x405899]
[n-62-12-2:31968] *** End of error message ***
make: *** [check] Segmentation fault
Note that the above compilation already includes my "corrected" return of an
empty string.
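As a minimal sketch of that correction (the function names, type ids and type names below are hypothetical and not taken from the PnetCDF sources): a function declared to return std::string that falls through to 'return NULL' ends up constructing the string from a null pointer, which libstdc++ reports as the std::logic_error quoted further down in this thread. Returning an empty string avoids that.

```cpp
#include <string>

// Hypothetical helper (not part of PnetCDF): turn a possibly-null C string
// into a std::string without ever handing a null pointer to the constructor.
std::string safe_string(const char *s) {
    return s ? std::string(s) : std::string();
}

// Hypothetical lookup with a fall-through case, mirroring the shape of the fix.
// The ids and names are illustrative only.
std::string type_name(int type_id) {
    switch (type_id) {
        case 1:  return "byte";
        case 2:  return "char";
        default: return "";  // was: return NULL; (builds a string from a null pointer)
    }
}
```

With this shape, an unknown id yields an empty string that the caller can test with empty() instead of aborting the program.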
However, it is clear from the code that the value that is supposed to be
updated (the number of types) is never set. Hence the error is in the logic
behind calling the inq_typ, not in the preceding code, which should work once
the number-of-types code has been implemented and sets the number of types
correctly.
(code snippet:
int ntypesp;
ncmpiCheck(ncmpi_inq_typeids(getId(),
&ntypesp,typeidsp),__FILE__,__LINE__);
)
with no initialization.
Maybe the "not implemented" routine should set ntypesp to 0?
Note that this may also be a compiler bug. :(
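To illustrate the suggestion above, here is a rough sketch, not the actual PnetCDF code: stub_inq_typeids and get_type_ids are hypothetical names, and the return value 0 is an assumed "no error" code. The idea is that the unimplemented routine still writes a defined count, and the caller initializes its own variable as well, so nothing downstream reads garbage.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical stand-in for an unimplemented inquiry routine; NOT the real
// PnetCDF function. It reports the problem but still defines its output.
int stub_inq_typeids(int /*ncid*/, int *ntypesp, int * /*typeidsp*/) {
    std::fprintf(stderr, "stub_inq_typeids not implemented\n");
    if (ntypesp != nullptr) *ntypesp = 0;  // never leave the count undefined
    return 0;                              // assumed "no error" code for this sketch
}

// Caller side: initialize the count, then size the result from it.
std::vector<int> get_type_ids(int ncid) {
    int ntypes = 0;  // defined even if the callee writes nothing
    stub_inq_typeids(ncid, &ntypes, nullptr);
    std::vector<int> ids(ntypes > 0 ? ntypes : 0);
    if (!ids.empty()) stub_inq_typeids(ncid, nullptr, ids.data());
    return ids;
}
```

Either safeguard alone would keep the count defined; having both means the caller does not depend on how the stub behaves.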
2015-06-02 1:24 GMT+02:00 Wei-keng Liao <wkliao at eecs.northwestern.edu>:
> Hi, Nick
>
> I tried to see if any place that can indirectly invoke that message, but
> could not find one.
>
> I wonder if you can kindly help me find more information about this error,
> by rebuilding your PnetCDF with the following configure command (with debug
> option enabled):
>
> ./configure --enable-debug --disable-fortran
> --with-mpi=/zdata/groups/common/nicpa/2015-test/XeonX5550/openmpi/1.8.5/intel-15.0.1
>
> Disabling Fortran gives you a shorter build time.
>
> Once you build it, please cd directly to test/CXX and run "make check"
> there.
> This can skip all other tests.
>
> thanks
>
> Wei-keng
>
> On Jun 1, 2015, at 5:21 PM, Nick Papior Andersen wrote:
>
> > Oh, yeah, sorry. It is the latest 1.6.0 version.
> > Here is a tar with the config.log and the tmp.test (it was 1 MB, so
> > sorry for tarring).
> >
> >
> >
> > 2015-06-02 0:16 GMT+02:00 Wei-keng Liao <wkliao at eecs.northwestern.edu>:
> > That message "ncmpi_inq_typeids not implemented" should not appear.
> > It is fishy.
> >
> > I forgot to ask the PnetCDF version you are using.
> > Please let me know. Also, please send me the file config.log. Thanks.
> >
> > Wei-keng
> >
> > On Jun 1, 2015, at 5:02 PM, Nick Papior Andersen wrote:
> >
> > > Oh, and I do not have these problems using pure gcc 5.1.0 on my local
> machine.
> > >
> > > 2015-06-02 0:00 GMT+02:00 Nick Papior Andersen <nickpapior at gmail.com>:
> > > Dear Wei-keng and Rob,
> > >
> > > My default options did not include the -g flag, so the core dump was
> > > not very useful ;(
> > >
> > > I did the catchsegv thing... Here is the output:
> > >
> > > $> catchsegv ./nctst ./testfile.nc
> > > ncmpi_inq_typeids not implemented
> > > *** Segmentation fault
> > > Register dump:
> > >
> > > RAX: 00002b61bec50000 RBX: 00002b61bec50000 RCX: 000000000000000c
> > > RDX: 0000000000000000 RSI: 00002b61bec50000 RDI: 00007fff5e677fa0
> > > RBP: 00007fff5e677fa0 R8 : 0000000000e79b60 R9 : 00000000000000f0
> > > R10: 00007fff5e677d70 R11: 00002b61be9e1560 R12: 00007fff5e6780e0
> > > R13: 00007fff5e678920 R14: 000000000000004c R15: 00007fff5e677fa0
> > > RSP: 00007fff5e677f60
> > >
> > > RIP: 00002b61be9e0e79 EFLAGS: 00010206
> > >
> > > CS: 0033 FS: 0000 GS: 0000
> > >
> > > Trap: 0000000e Error: 00000005 OldMask: 00000000 CR2: ffffffe8
> > >
> > > FPUCW: 0000037f FPUSW: 00000000 TAG: 00002b61
> > > RIP: bf8f7fff RDP: 5e677ff0
> > >
> > > ST(0) 0000 0000000000000033 ST(1) 0000 000000000000000d
> > > ST(2) 0000 0000000000c80000 ST(3) 0000 0000000000000640
> > > ST(4) 0000 0000000000000000 ST(5) 0000 0000000000000000
> > > ST(6) 0000 0000000000000000 ST(7) 8000 8000000000000000
> > > mxcsr: 9fe0
> > > XMM0: 00000000000000000000000000000000 XMM1:
> 00000000000000000000000000000000
> > > XMM2: 00000000000000000000000000000000 XMM3:
> 00000000000000000000000000000000
> > > XMM4: 00000000000000000000000000000000 XMM5:
> 00000000000000000000000000000000
> > > XMM6: 00000000000000000000000000000000 XMM7:
> 00000000000000000000000000000000
> > > XMM8: 00000000000000000000000000000000 XMM9:
> 00000000000000000000000000000000
> > > XMM10: 00000000000000000000000000000000 XMM11:
> 00000000000000000000000000000000
> > > XMM12: 00000000000000000000000000000000 XMM13:
> 00000000000000000000000000000000
> > > XMM14: 00000000000000000000000000000000 XMM15:
> 00000000000000000000000000000000
> > >
> > > Backtrace:
> > >
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZNSo6sentryC2ERSo+0x19)[0x2b61be9e0e79]
> > >
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l+0x29)[0x2b61be9e1589]
> > >
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc+0x27)[0x2b61be9e19e7]
> > >
> ??:0(_Z3genRKP19ompi_communicator_tPKcN7PnetCDF9NcmpiFile10FileFormatE)[0x40d4a5]
> > > ??:0(main)[0x409f60]
> > > /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b61bf0a5d5d]
> > > ??:0(_start)[0x409cb9]
> > >
> > >
> > > I think the top line is pretty self-explanatory ;)
> > > And looking at the code, it makes sense. Whether it should be called
> > > at all is another matter...
> > >
> > > 2015-06-01 18:25 GMT+02:00 Wei-keng Liao <wkliao at eecs.northwestern.edu
> >:
> > > Hi, Nick,
> > >
> > > To print the trace of a segmentation fault is easy. You can run command
> > > "gdb corefile" and when at the gdb prompt, type command "where".
> > > If you can send me the printout, it will be helpful.
> > >
> > > Just to clarify, if gcc 5.1.0 is used, are you saying there is no
> > > problem building PnetCDF?
> > > Can you tell me the configure command line you used?
> > >
> > > Wei-keng
> > >
> > > On Jun 1, 2015, at 11:11 AM, Nick Papior Andersen wrote:
> > >
> > > >
> > > >
> > > > 2015-06-01 17:09 GMT+02:00 Wei-keng Liao <
> wkliao at eecs.northwestern.edu>:
> > > > Hi, Nick
> > > >
> > > > Your fix for the first bug makes all sense. I will add that to
> PnetCDF. Thanks.
> > > >
> > > > As for the second error, can you use gdb to print the location of
> the segmentation fault?
> > > > I haven't done that. I would rather not go that path? I ain't an
> avid user of gdb (yet).
> > > > Also, do both errors happen to Intel C compiler?
> > > > It is only the intel cxx compiler. I only show the gcc version as
> intel uses that for compatibility issues. (see -gcc-name)
> > > > I thought that this could be problem... Maybe it isn't.
> > > >
> > > > The two C compilers you used are the latest ones. I have not tried
> them.
> > > > Could you compile/run a simple C++ program to see if gcc works in
> your environment?
> > > > Or, if you installed gcc from source, have you tried run "make -k
> check"?
> > > > See https://gcc.gnu.org/install/test.html
> > > >
> > > > Gcc/g++ runs fine. I can compile 20 other different libraries with
> full support. If anything, it is related to the intel compiler. :(
> > > > If you do not need C++ component, you can build a PnetCDF without
> it, by adding option
> > > > "--disable-cxx" to the configure command.
> > > > This is already my diverting methodology :) Thanks.
> > > >
> > > > By the way, I plan to release 1.6.1 today, but it will not fix the
> second error you are seeing.
> > > > I will try if I can find those new versions of C compiler and fix
> the problem.
> > > > The fix will have to wait for the next release, though.
> > > > Ok. :)
> > > >
> > > > Thanks again for reporting the problem.
> > > > You are welcome.
> > > > Thanks for the software.
> > > >
> > > > Wei-keng
> > > >
> > > > On Jun 1, 2015, at 1:00 AM, Nick Papior Andersen wrote:
> > > >
> > > > > I am trying to compile and make check with these compilers:
> > > > > intel 2015.1.13
> > > > > and
> > > > > gcc 5.1.0 in the path.
> > > > >
> > > > > Compiling goes fine and everything seems to link correctly.
> > > > > However make check errors out in the CXX test.
> > > > >
> > > > > First I get this error message:
> > > > > ./nctst ./testfile.nc
> > > > > terminate called after throwing an instance of 'std::logic_error'
> > > > > what(): basic_string::_M_construct null not valid
> > > > > [n-62-12-2:09803] *** Process received signal ***
> > > > > [n-62-12-2:09803] Signal: Aborted (6)
> > > > > [n-62-12-2:09803] Signal code: (-6)
> > > > > [n-62-12-2:09803] [ 0]
> /lib64/libpthread.so.0(+0xf710)[0x2aae4ac54710]
> > > > > [n-62-12-2:09803] [ 1]
> /lib64/libc.so.6(gsignal+0x35)[0x2aae4ae94625]
> > > > > [n-62-12-2:09803] [ 2]
> /lib64/libc.so.6(abort+0x175)[0x2aae4ae95e05]
> > > > > [n-62-12-2:09803] [ 3]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x15d)[0x2aae4a7428cd]
> > > > > [n-62-12-2:09803] [ 4]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(+0x8c936)[0x2aae4a740936]
> > > > > [n-62-12-2:09803] [ 5]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(+0x8c981)[0x2aae4a740981]
> > > > > [n-62-12-2:09803] [ 6]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(+0x8cb98)[0x2aae4a740b98]
> > > > > [n-62-12-2:09803] [ 7]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZSt19__throw_logic_errorPKc+0x3f)[0x2aae4a767faf]
> > > > > [n-62-12-2:09803] [ 8] ./nctst[0x461b42]
> > > > > [n-62-12-2:09803] [ 9] ./nctst[0x4644b6]
> > > > > [n-62-12-2:09803] [10] ./nctst[0x40c313]
> > > > > [n-62-12-2:09803] [11] ./nctst[0x409f60]
> > > > > [n-62-12-2:09803] [12]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2aae4ae80d5d]
> > > > > [n-62-12-2:09803] [13] ./nctst[0x409cb9]
> > > > > [n-62-12-2:09803] *** End of error message ***
> > > > >
> > > > >
> > > > > Secondly, I changed the file src/libcxx/ncmpiType.cpp:
> > > > > the function inq_type has 'return NULL', which is not valid for a
> > > > > function returning a string (unless it were a pointer, which it isn't).
> > > > > So I changed it to return an empty string:
> > > > > 'return ""'
> > > > > (I am not sure when this is reached, but the error message changes,
> > > > > as can be seen below, hence my suspicion falls on that code segment.)
> > > > >
> > > > > Now I recompile and get this alternate error message:
> > > > > ./nctst ./testfile.nc
> > > > > [n-62-12-2:23419] *** Process received signal ***
> > > > > [n-62-12-2:23419] Signal: Segmentation fault (11)
> > > > > [n-62-12-2:23419] Signal code: Address not mapped (1)
> > > > > [n-62-12-2:23419] Failing at address: 0xffffffffffffffe8
> > > > > [n-62-12-2:23419] [ 0]
> /lib64/libpthread.so.0(+0xf710)[0x2ab6b02fa710]
> > > > > [n-62-12-2:23419] [ 1]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZNSo6sentryC2ERSo+0x19)[0x2ab6afe61e79]
> > > > > [n-62-12-2:23419] [ 2]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l+0x29)[0x2ab6afe62589]
> > > > > [n-62-12-2:23419] [ 3]
> /zdata/groups/common/nicpa/2015-test/generic/gcc/5.1.0/lib64/libstdc++.so.6(_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc+0x27)[0x2ab6afe629e7]
> > > > > [n-62-12-2:23419] [ 4] ./nctst[0x40d4a5]
> > > > > [n-62-12-2:23419] [ 5] ./nctst[0x409f60]
> > > > > [n-62-12-2:23419] [ 6]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x2ab6b0526d5d]
> > > > > [n-62-12-2:23419] [ 7] ./nctst[0x409cb9]
> > > > > [n-62-12-2:23419] *** End of error message ***
> > > > > make[2]: *** [testing] Segmentation fault
> > > > >
> > > > > There seems to be something fishy with the cxx interface?
> > > > > I am no expert in cxx... :( So I had trouble debugging further...
> > > > >
> > > > > --
> > > > > Kind regards Nick
> > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Kind regards Nick
> > >
> > >
> > >
> > >
> > > --
> > > Kind regards Nick
> > >
> > >
> > >
> > > --
> > > Kind regards Nick
> >
> >
> >
> >
> > --
> > Kind regards Nick
> > <log-test.tar.gz>
>
>
--
Kind regards Nick