[MOAB-dev] Simple code to reproduce ICC segmentation fault

Vijay S. Mahadevan vijay.m at gmail.com
Tue Sep 10 14:03:37 CDT 2013


You need to compile with -C to catch static allocation errors. That
will specifically turn on range checks.

Good to know that Memcheck doesn't report invalid reads or writes on
statically allocated (global or stack) arrays. See the FAQ:
http://valgrind.org/docs/manual/faq.html

> Why doesn't Memcheck find the array overruns in this program?
> Unfortunately, Memcheck doesn't do bounds checking on global or stack arrays. We'd like to, but it's just not possible to do in a reasonable way that fits with how Memcheck works. Sorry.

> However, the experimental tool SGcheck can detect errors like this. Run Valgrind with the --tool=exp-sgcheck option to try it, but be aware that it is not as robust as Memcheck.

Vijay

On Tue, Sep 10, 2013 at 1:51 PM, Tim Tautges <tautges at mcs.anl.gov> wrote:
> Good catch Danqing, I didn't know that (that valgrind wouldn't catch out of
> bounds errors on statically-allocated arrays).
>
> The preferred way to do this, then, will be to use std::vector, with a
> static size set at instantiation.  That makes it dynamically allocated but
> still static size.  I'll remember that one.
>
> - tim
>
> On 09/10/2013 01:22 PM, Danqing Wu wrote:
>>
>> Here is what I found online:
>>
>> What Won't Valgrind Find?
>> Valgrind doesn't perform bounds checking on static arrays (allocated on
>> the stack). So if you declare an array inside your function:
>>
>> int main()
>> {
>>      char x[10];
>>      x[11] = 'a';
>> }
>>
>> then Valgrind won't alert you! One possible solution for testing purposes
>> is simply to change your static arrays into dynamically allocated memory
>> taken from the heap, where you will get bounds-checking, though this could
>> be a mess of unfreed memory.
>>
>> ----- Original Message -----
>> From: "Iulian Grindeanu" <iulian at mcs.anl.gov>
>> To: "Danqing Wu" <wuda at mcs.anl.gov>
>> Cc: "Tim Tautges" <tautges at mcs.anl.gov>
>> Sent: Tuesday, September 10, 2013 1:02:45 PM
>> Subject: Re: Simple code to reproduce ICC segmentation fault
>>
>>
>>
>>
>> ----- Original Message -----
>>
>>
>>
>> After correcting that, moab-intel test works fine!
>> Good job again, Danqing!
>>
>> Thanks,
>> Iulian
>>
>> now the question is why valgrind did not find this ...
>>
>>
>>
>>
>> ----- Original Message -----
>>
>>
>> I think I found one possible reason.
>>
>> ErrorCode ScdInterface::get_neighbor_alljkbal(int np, int pfrom,
>>     const int * const gdims, const int * const gperiodic,
>>     const int * const dijk,
>>     int &pto, int *rdims, int *facedims, int *across_bdy)
>> {
>>   ...
>>   int ldims[6], pijk[3], lperiodic[2];
>>   ErrorCode rval = compute_partition_alljkbal(np, pfrom, gdims, gperiodic,
>>                                               ldims, lperiodic, pijk);
>>   ...
>> }
>>
>> Here lperiodic[2] should be lperiodic[3], as the third element will be
>> accessed inside compute_partition_alljkbal().
>>
>> The behaviour could be compiler-dependent. Maybe only with ICC 12 at
>> -O2, and with asserts disabled, does this out-of-bounds access cause a
>> segmentation fault.
>>
>> I will retest after this fix.
>>
>> ----- Original Message -----
>> From: "Iulian Grindeanu" <iulian at mcs.anl.gov>
>> To: "Danqing Wu" <wuda at mcs.anl.gov>
>> Cc: "Tim Tautges" <tautges at mcs.anl.gov>
>> Sent: Tuesday, September 10, 2013 10:17:28 AM
>> Subject: Re: Simple code to reproduce ICC segmentation fault
>>
>>
>> If it works on icc 13 / ubuntu 12, I suggest moving moab-intel build to
>> jenkins; we may have to rebuild netcdf with icc if there are issues with
>> libcurl.
>>
>> Any suggestions?
>>
>> Iulian
>> ----- Original Message -----
>>
>>
>> On gnep, icc 12.
>>
>> Configure option
>> ./configure --prefix=/homes/fathom/libs/current/moabintel
>> --with-netcdf=/homes/fathom/3rdparty/netcdf-4.1.3-intel
>> --with-hdf5=/homes/fathom/3rdparty/hdf5-1.8.8-ser-intel
>> --with-zlib=/homes/fathom/3rdparty/zlib/zlib-1.2.4/gcc --enable-igeom
>> --enable-imesh CC=icc CXX=icpc F77=ifort FC=ifort F90=ifort
>>
>> So the flags will include both -O2 and -DNDEBUG
>>
>> Since NDEBUG is defined, every assert(...) compiles to nothing, and
>> this could make a difference.
>>
>> On gnep with icc 12, with only -O2 and without NDEBUG, the original test
>> passes. I guess ICC 12 is affected by whether the asserts are compiled in.
>>
>> ----- Original Message -----
>> From: "Iulian Grindeanu" <iulian at mcs.anl.gov>
>> To: "Danqing Wu" <wuda at mcs.anl.gov>
>> Cc: "Tim Tautges" <tautges at mcs.anl.gov>
>> Sent: Tuesday, September 10, 2013 10:04:39 AM
>> Subject: Re: Simple code to reproduce ICC segmentation fault
>>
>>
>> So is this with icc -O2, or what are the compile options?
>> Is this on gnep? icc 12? icc 13?
>>
>> Should we try to use ubuntu 12 for intel builds?
>>
>> (we can do that on jenkins auto build platform)
>>
>> Iulian
>>
>>
>> ----- Original Message -----
>>
>>
>> I am still debugging, but it seems that the two calls to
>> ScdInterface::get_neighbor() cause the crash. If I comment out the second
>> call, there is no segmentation fault.
>>
>>
>> #include "moab/ScdInterface.hpp"
>> #include "moab/Core.hpp"
>>
>> #include <iostream>
>>
>> using namespace moab;
>>
>> int main()
>> {
>>   Core moab;
>>   ScdInterface* scdi;
>>   ErrorCode rval = moab.Interface::query_interface(scdi);
>>
>>   int gdims[] = {0, 0, 0, 48, 40, 18};
>>   int nprocs = 4;
>>   int pto = 0;
>>   int across_bdy_a[3] = {0};
>>   int rdims_a[6] = {0};
>>   int facedims_a[6] = {0};
>>
>>   ScdParData spd;
>>   int n;
>>   for (n = 0; n < 6; n++)
>>     spd.gDims[n] = gdims[n];
>>   for (n = 0; n < 3; n++)
>>     spd.gPeriodic[n] = 0;
>>
>>   spd.partMethod = ScdParData::ALLJKBAL;
>>
>>   int dijka[3] = {0};
>>
>>   dijka[0] = -1;
>>   dijka[1] = -1;
>>   dijka[2] = -1;
>>   rval = ScdInterface::get_neighbor(nprocs, 0, spd, dijka, pto, rdims_a,
>>                                     facedims_a, across_bdy_a);
>>
>>   dijka[0] = 0;
>>   dijka[1] = -1;
>>   dijka[2] = -1;
>>   rval = ScdInterface::get_neighbor(nprocs, 0, spd, dijka, pto, rdims_a,
>>                                     facedims_a, across_bdy_a);
>>
>>   std::cout << "Return from main()" << std::endl;
>>
>>   return 0;
>> }
>>
>>
>>
>>
>>
>
> --
> ================================================================
> "You will keep in perfect peace him whose mind is
>   steadfast, because he trusts in you."               Isaiah 26:3
>
>              Tim Tautges            Argonne National Laboratory
>          (tautges at mcs.anl.gov)      (telecommuting from UW-Madison)
>  phone (gvoice): (608) 354-1459      1500 Engineering Dr.
>             fax: (608) 263-4499      Madison, WI 53706
>

