[petsc-dev] Did someone fucking break bfort?

Matthew Knepley knepley at gmail.com
Fri Dec 25 13:18:28 CST 2009


Here is the valgrind output for your +100 fix:

knepley at khan:/PETSc3/petsc/petsc-dev/src/mat/utils$ valgrind
/PETSc3/petsc/petsc-dev/linux-gnu-cxx-debug/bin/bfort -dir
/PETSc3/petsc/petsc-dev/src/mat/utils/ftn-auto -mnative -ansi -nomsgs
-noprofile -anyname -mapptr -mpi -mpi2 -ferr -ptrprefix Petsc -ptr64
PETSC_USE_POINTER_CONVERSION -fcaps PETSC_HAVE_FORTRAN_CAPS -fuscore
PETSC_HAVE_FORTRAN_UNDERSCORE -f90mod_skip_header matio.c convert.c
gcreate.c freespace.c getcolv.c ptap.c compressedrow.c matstash.c
multequal.c axpy.c freespace.h zerodiag.c matstashspace.c
==20868== Memcheck, a memory error detector.
==20868== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==20868== Using LibVEX rev 1804, a library for dynamic binary translation.
==20868== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==20868== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
==20868== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==20868== For more details, rerun with: -v
==20868==
==20868== Conditional jump or move depends on uninitialised value(s)
==20868==    at 0x804C2BB: PrintBody (bfort.c:1362)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Conditional jump or move depends on uninitialised value(s)
==20868==    at 0x804C293: PrintBody (bfort.c:1363)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Conditional jump or move depends on uninitialised value(s)
==20868==    at 0x804C5E7: PrintBody (bfort.c:1384)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Conditional jump or move depends on uninitialised value(s)
==20868==    at 0x804C396: PrintBody (bfort.c:1385)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Conditional jump or move depends on uninitialised value(s)
==20868==    at 0x804C3CE: PrintBody (bfort.c:1387)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Conditional jump or move depends on uninitialised value(s)
==20868==    at 0x804C3E9: PrintBody (bfort.c:1387)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Conditional jump or move depends on uninitialised value(s)
==20868==    at 0x804C589: PrintBody (bfort.c:1406)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Use of uninitialised value of size 4
==20868==    at 0x40239D8: strlen (mc_replace_strmem.c:242)
==20868==    by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)
==20868==    by 0x804C5BE: PrintBody (bfort.c:1408)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== Invalid read of size 1
==20868==    at 0x40239D8: strlen (mc_replace_strmem.c:242)
==20868==    by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)
==20868==    by 0x804C5BE: PrintBody (bfort.c:1408)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==20868==
==20868== Process terminating with default action of signal 11 (SIGSEGV)
==20868==  Access not within mapped region at address 0x0
==20868==    at 0x40239D8: strlen (mc_replace_strmem.c:242)
==20868==    by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)
==20868==    by 0x804C5BE: PrintBody (bfort.c:1408)
==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
==20868==    by 0x804A0B2: main (bfort.c:475)
==20868==
==20868== ERROR SUMMARY: 72 errors from 9 contexts (suppressed: 17 from 1)
==20868== malloc/free: in use at exit: 1,056 bytes in 3 blocks.
==20868== malloc/free: 6 allocs, 3 frees, 2,112 bytes allocated.
==20868== For counts of detected errors, rerun with: -v
==20868== searching for pointers to 3 not-freed blocks.
==20868== checked 243,864 bytes.
==20868==
==20868== LEAK SUMMARY:
==20868==    definitely lost: 0 bytes in 0 blocks.
==20868==      possibly lost: 0 bytes in 0 blocks.
==20868==    still reachable: 1,056 bytes in 3 blocks.
==20868==         suppressed: 0 bytes in 0 blocks.
==20868== Rerun with --leak-check=full to see details of leaked memory.
Segmentation fault

The problem is that argument lists are just not parsed correctly for
gcreate.c. You can send that to Bill.
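
A minimal sketch of what the trace suggests, under stated assumptions: the
fputs() at bfort.c:1408 receives a pointer that is first reported as
uninitialised and then turns out to be 0x0, so padding the malloc() for
arg->name cannot help, since strlen(p) + 1 is already the right size for the
strcpy(). The record type and function names below are hypothetical and are
not taken from bfort.c; only the name and has_star fields appear in the
quoted diff.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for bfort's argument record. Only `name` and
   `has_star` appear in the quoted diff; everything else is assumed. */
typedef struct {
    char *name;
    int   has_star;
} arg_sketch;

/* Allocate a zeroed record, so a field the parser never fills in is
   NULL/0 instead of an indeterminate value. */
arg_sketch *arg_sketch_new(void)
{
    return calloc(1, sizeof(arg_sketch));
}

/* Copy the current token into the record. strlen(p) + 1 leaves room for
   the terminating NUL, so no extra padding is needed. */
int arg_sketch_set_name(arg_sketch *arg, const char *p)
{
    char *copy = malloc(strlen(p) + 1);
    if (!copy) return -1;
    strcpy(copy, p);
    free(arg->name);          /* free(NULL) is a no-op */
    arg->name = copy;
    return 0;
}

/* Write the name, but report an error instead of handing NULL to fputs().
   Returns 0 on success, -1 if the name was never set or the write failed. */
int arg_sketch_print(const arg_sketch *arg, FILE *fout)
{
    if (!arg || !arg->name) return -1;
    return fputs(arg->name, fout) == EOF ? -1 : 0;
}
```

With a guard like this, a misparsed argument list would surface as an error
return at the point where the name is missing, rather than as a SEGV inside
strlen().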

  Matt

On Fri, Dec 25, 2009 at 10:55 AM, Satish Balay <balay at mcs.anl.gov> wrote:

> One more thing. If I remove this patch from sowing - the valgrind log
> is clean.
>
>
> http://petsc.cs.iit.edu/petsc/externalpackages/sowing-1.1.11/rev/e591c037e500
>
> Perhaps you can find the bug in this change. If not - I'll send a bug
> report to Bill.
>
> Satish
>
> On Fri, 25 Dec 2009, Satish Balay wrote:
>
> > Can you send me the valgrind.log - with the patch applied to the
> > unmodified sowing-1.1.11-a.tar.gz?
> >
> > Also the command you are using to generate this log?
> >
> >
> > I've used the following:
> > valgrind --tool=memcheck -q --log-file=valgrind.log bfort -dir
> > `pwd`/ftn-auto -ansi -nomsgs -noprofile -anyname -mapptr -mpi -mpi2 -ferr
> > -ptrprefix Petsc -ptr64 PETSC_USE_POINTER_CONVERSION -fcaps
> > PETSC_HAVE_FORTRAN_CAPS -fuscore PETSC_HAVE_FORTRAN_UNDERSCORE matrix.c
> >
> > Satish
> >
> > On Fri, 25 Dec 2009, Matthew Knepley wrote:
> >
> > > Valgrind is not clean for me with the change.
> > >
> > >   Matt
> > >
> > > > On Fri, Dec 25, 2009 at 10:28 AM, Satish Balay <balay at mcs.anl.gov> wrote:
> > >
> > > > Well - normally the first step with detecting the bugs is to report
> > > > them to the author - and ask for a fix..
> > > >
> > > > Satish
> > > >
> > > > On Fri, 25 Dec 2009, Matthew Knepley wrote:
> > > >
> > > > > > I can try, but I still think replacement is the only real alternative.
> > > > > > This is not able to be debugged, or you would not recommend sticking in
> > > > > > random numbers in malloc() and I would be able to see where an SEGV
> > > > > > occurs with gdb.
> > > > >
> > > > >   Matt
> > > > >
> > > > > > On Fri, Dec 25, 2009 at 10:16 AM, Satish Balay <balay at mcs.anl.gov> wrote:
> > > > >
> > > > > > > BTW: What linux are you using? ubuntu version? i686 or x86_64? etc...
> > > > > >
> > > > > > also try:
> > > > > >
> > > > > > arg->name     = (char *)MALLOC( strlen(p) + 100 );
> > > > > >
> > > > > > satish
> > > > > >
> > > > > >
> > > > > > On Fri, 25 Dec 2009, Satish Balay wrote:
> > > > > >
> > > > > > > Did my suggested change not work for you?
> > > > > > >
> > > > > > > Satish
> > > > > > >
> > > > > > > On Thu, 24 Dec 2009, Matthew Knepley wrote:
> > > > > > >
> > > > > > > > > I spent a bunch of time on this today. This shit is hopelessly broken.
> > > > > > > > > It sucks completely.
> > > > > > > > > I cannot get it to run, nor see why it is causing stack overruns and SEGVs.
> > > > > > > > > If anyone does not think it is hopeless, speak up now. This is a complete
> > > > > > > > > fucking embarrassment.
> > > > > > > >
> > > > > > > >    Matt
> > > > > > > >
> > > > > > > > > On Mon, Dec 21, 2009 at 4:42 PM, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > > > >
> > > > > > > > > > This does not make any sense to me because it would be a heap
> > > > > > > > > > violation, not a stack smash.
> > > > > > > > >
> > > > > > > > >   Matt
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Mon, Dec 21, 2009 at 4:30 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> > > > > > > > >
> > > > > > > > > >> [I don't know the correct fix for this - but ] The following change is
> > > > > > > > > >> getting rid of valgrind messages for me. Maybe you can use this, build
> > > > > > > > > >> sowing separately - and continue..
> > > > > > > > >>
> > > > > > > > >> Satish
> > > > > > > > >>
> > > > > > > > >> ----------
> > > > > > > > >>
> > > > > > > > >> diff -r dbe25084c0e4 src/bfort/bfort.c
> > > > > > > > >> --- a/src/bfort/bfort.c Mon Dec 15 22:20:58 2008 -0600
> > > > > > > > >> +++ b/src/bfort/bfort.c Mon Dec 21 16:29:09 2009 -0600
> > > > > > > > >> @@ -2157,7 +2157,7 @@
> > > > > > > > >>
> > > > > > > > >>     /* Current token is name */
> > > > > > > > >>     arg->has_star = (nstar > 0);
> > > > > > > > >> -    arg->name     = (char *)MALLOC( strlen(p) + 1 );
> > > > > > > > >> +    arg->name     = (char *)MALLOC( strlen(p) + 10 );
> > > > > > > > >>     strcpy( arg->name, p );
> > > > > > > > >>
> > > > > > > > >>     /* We can't output the name just yet, because if it is
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> On Mon, 21 Dec 2009, Matthew Knepley wrote:
> > > > > > > > >>
> > > > > > > > > >> > The problem appears to be in OutputRoutine() in bfort.c, but that code is
> > > > > > > > > >> > impossible to debug. I can't see where something is getting overwritten,
> > > > > > > > > >> > and it looks like the check only happens when the routine returns.
> > > > > > > > > >> > bfort is such crap.
> > > > > > > > >> >
> > > > > > > > >> >   Matt
> > > > > > > > >> >
> > > > > > > > > >> > On Mon, Dec 21, 2009 at 3:25 PM, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > > > > >> >
> > > > > > > > > >> > > On Mon, Dec 21, 2009 at 3:21 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> > > > > > > > >> > >
> > > > > > > > >> > >> On Mon, 21 Dec 2009, Lisandro Dalcín wrote:
> > > > > > > > >> > >>
> > > > > > > > > >> > >> > On Mon, Dec 21, 2009 at 5:37 PM, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > It says there is a stack smash and no other info. This is completely
> > > > > > > > > >> > >> > > fucking my development right now.
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> >
> > > > > > > > > >> > >> > Any chance bfort was built with the -fstack-protector flag? This failure
> > > > > > > > > >> > >> > could be signaling an actual old bug in bfort... I would
> > > > > > > > > >> > >> > re-build bfort with debug and re-run under valgrind...
> > > > > > > > >> > >>
> > > > > > > > >> > >> That must be it.
> > > > > > > > >> > >>
> > > > > > > > > >> > >> I just ran my build [which is without -fstack-protector] - and
> > > > > > > > > >> > >> valgrind does flag a bunch of things with bfort.
> > > > > > > > >> > >>
> > > > > > > > >> > >
> > > > > > > > >> > > 1) That flag is nowhere in my build.
> > > > > > > > >> > >
> > > > > > > > >> > > 2) Something changed
> > > > > > > > >> > >
> > > > > > > > >> > >   Matt
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > > >> > >> I normally install sowing separately and have it in my PATH - so that
> > > > > > > > > >> > >> it doesn't have to be rebuilt each time I build petsc.
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> I guess we should sync up [our patches] with latest sowing and make
> > > > > > > > > >> > >> sure it's valgrind clean as well.
> > > > > > > > >> > >>
> > > > > > > > >> > >> Satish
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > --
> > > > > > > > > >> > > What most experimenters take for granted before they begin their
> > > > > > > > > >> > > experiments is infinitely more interesting than any results to which
> > > > > > > > > >> > > their experiments lead.
> > > > > > > > >> > > -- Norbert Wiener
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > > What most experimenters take for granted before they begin their
> > > > > > > > > > experiments is infinitely more interesting than any results to which
> > > > > > > > > > their experiments lead.
> > > > > > > > > -- Norbert Wiener
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > >
> >
>



-- 
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener