Here is the valgrind output with your +100 fix applied:<br><br>knepley@khan:/PETSc3/petsc/petsc-dev/src/mat/utils$ valgrind /PETSc3/petsc/petsc-dev/linux-gnu-cxx-debug/bin/bfort -dir /PETSc3/petsc/petsc-dev/src/mat/utils/ftn-auto -mnative -ansi -nomsgs -noprofile -anyname -mapptr -mpi -mpi2 -ferr -ptrprefix Petsc -ptr64 PETSC_USE_POINTER_CONVERSION -fcaps PETSC_HAVE_FORTRAN_CAPS -fuscore PETSC_HAVE_FORTRAN_UNDERSCORE -f90mod_skip_header matio.c convert.c gcreate.c freespace.c getcolv.c ptap.c compressedrow.c matstash.c multequal.c axpy.c freespace.h zerodiag.c matstashspace.c<br>
==20868== Memcheck, a memory error detector.<br>==20868== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.<br>==20868== Using LibVEX rev 1804, a library for dynamic binary translation.<br>==20868== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.<br>
==20868== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.<br>==20868== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.<br>==20868== For more details, rerun with: -v<br>==20868== <br>
==20868== Conditional jump or move depends on uninitialised value(s)<br>==20868== at 0x804C2BB: PrintBody (bfort.c:1362)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== <br>==20868== Conditional jump or move depends on uninitialised value(s)<br>==20868== at 0x804C293: PrintBody (bfort.c:1363)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== <br>==20868== Conditional jump or move depends on uninitialised value(s)<br>==20868== at 0x804C5E7: PrintBody (bfort.c:1384)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== <br>==20868== Conditional jump or move depends on uninitialised value(s)<br>==20868== at 0x804C396: PrintBody (bfort.c:1385)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== <br>==20868== Conditional jump or move depends on uninitialised value(s)<br>==20868== at 0x804C3CE: PrintBody (bfort.c:1387)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== <br>==20868== Conditional jump or move depends on uninitialised value(s)<br>==20868== at 0x804C3E9: PrintBody (bfort.c:1387)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== <br>==20868== Conditional jump or move depends on uninitialised value(s)<br>==20868== at 0x804C589: PrintBody (bfort.c:1406)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== <br>==20868== Use of uninitialised value of size 4<br>==20868== at 0x40239D8: strlen (mc_replace_strmem.c:242)<br>==20868== by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)<br>
==20868== by 0x804C5BE: PrintBody (bfort.c:1408)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>==20868== <br>==20868== Invalid read of size 1<br>==20868== at 0x40239D8: strlen (mc_replace_strmem.c:242)<br>
==20868== by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)<br>==20868== by 0x804C5BE: PrintBody (bfort.c:1408)<br>==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>
==20868== Address 0x0 is not stack'd, malloc'd or (recently) free'd<br>==20868== <br>==20868== Process terminating with default action of signal 11 (SIGSEGV)<br>==20868== Access not within mapped region at address 0x0<br>
==20868== at 0x40239D8: strlen (mc_replace_strmem.c:242)<br>==20868== by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)<br>==20868== by 0x804C5BE: PrintBody (bfort.c:1408)<br>
==20868== by 0x804A622: OutputRoutine (bfort.c:575)<br>==20868== by 0x804A0B2: main (bfort.c:475)<br>==20868== <br>==20868== ERROR SUMMARY: 72 errors from 9 contexts (suppressed: 17 from 1)<br>==20868== malloc/free: in use at exit: 1,056 bytes in 3 blocks.<br>
==20868== malloc/free: 6 allocs, 3 frees, 2,112 bytes allocated.<br>==20868== For counts of detected errors, rerun with: -v<br>==20868== searching for pointers to 3 not-freed blocks.<br>==20868== checked 243,864 bytes.<br>
==20868== <br>==20868== LEAK SUMMARY:<br>==20868== definitely lost: 0 bytes in 0 blocks.<br>==20868== possibly lost: 0 bytes in 0 blocks.<br>==20868== still reachable: 1,056 bytes in 3 blocks.<br>==20868== suppressed: 0 bytes in 0 blocks.<br>
==20868== Rerun with --leak-check=full to see details of leaked memory.<br>Segmentation fault<br><br>The problem is that the argument lists in gcreate.c are not parsed correctly: PrintBody branches on uninitialised values and then hands fputs a NULL string pointer at bfort.c:1408. You can send that to Bill.<br><br> Matt<br>
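For reference, a minimal sketch in C of the failure mode the log points at (an uninitialised name pointer reaching fputs), and one way to make it fail loudly instead of crashing. The arg_info struct and the arg_init, arg_set_name, and arg_print_name helpers are hypothetical names for illustration, not bfort's actual code; the assumption is that an argument record the parser never filled in reaches the output step.<br>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed-argument record; NOT bfort's real struct. */
typedef struct {
    char *name;     /* heap copy of the argument name, or NULL if never parsed */
    int   has_star; /* nonzero when the declaration contained a '*' */
} arg_info;

/* Give every field a known value so an unparsed argument is detectable. */
void arg_init(arg_info *arg)
{
    assert(arg != NULL);
    arg->name = NULL;
    arg->has_star = 0;
}

/* Store a copy of token p. strlen(p) + 1 is exactly enough: the +1 holds the
 * terminating NUL that strcpy writes. Returns 0 on success, -1 on failure. */
int arg_set_name(arg_info *arg, const char *p, int nstar)
{
    char *copy = malloc(strlen(p) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, p);
    free(arg->name); /* safe: arg_init set it to NULL or a prior call set it */
    arg->name = copy;
    arg->has_star = (nstar > 0);
    return 0;
}

/* Write the name, refusing to pass a NULL pointer to fputs().
 * Returns 0 on success, -1 if the name is missing or the write fails. */
int arg_print_name(const arg_info *arg, FILE *out)
{
    if (arg->name == NULL) {
        return -1; /* parse failure: report it instead of dereferencing NULL */
    }
    return (fputs(arg->name, out) == EOF) ? -1 : 0;
}
```

With records initialised this way, a mis-parsed argument list shows up as an error return from the print step rather than as a SIGSEGV inside strlen, and padding the allocation would not be expected to change the outcome.<br>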
<br><div class="gmail_quote">On Fri, Dec 25, 2009 at 10:55 AM, Satish Balay <span dir="ltr"><<a href="mailto:balay@mcs.anl.gov">balay@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
One more thing. If I remove this patch from sowing - the valgrind log<br>
is clean.<br>
<br>
<a href="http://petsc.cs.iit.edu/petsc/externalpackages/sowing-1.1.11/rev/e591c037e500" target="_blank">http://petsc.cs.iit.edu/petsc/externalpackages/sowing-1.1.11/rev/e591c037e500</a><br>
<br>
Perhaps you can find the bug in this change. If not - I'll send a bug<br>
report to Bill.<br>
<font color="#888888"><br>
Satish<br>
</font><div><div></div><div class="h5"><br>
On Fri, 25 Dec 2009, Satish Balay wrote:<br>
<br>
> Can you send me the valgrind.log - with the patch applied to the<br>
> unmodified sowing-1.1.11-a.tar.gz?<br>
><br>
> Also the command you are using to generate this log?<br>
><br>
><br>
> I've used the following:<br>
> valgrind --tool=memcheck -q --log-file=valgrind.log bfort -dir `pwd`/ftn-auto -ansi -nomsgs -noprofile -anyname -mapptr -mpi -mpi2 -ferr -ptrprefix Petsc -ptr64 PETSC_USE_POINTER_CONVERSION -fcaps PETSC_HAVE_FORTRAN_CAPS -fuscore PETSC_HAVE_FORTRAN_UNDERSCORE matrix.c<br>
><br>
> Satish<br>
><br>
> On Fri, 25 Dec 2009, Matthew Knepley wrote:<br>
><br>
> > Valgrind is not clean for me with the change.<br>
> ><br>
> > Matt<br>
> ><br>
> > On Fri, Dec 25, 2009 at 10:28 AM, Satish Balay <<a href="mailto:balay@mcs.anl.gov">balay@mcs.anl.gov</a>> wrote:<br>
> ><br>
> > > Well - normally the first step after finding bugs is to report<br>
> > > them to the author - and ask for a fix.<br>
> > ><br>
> > > Satish<br>
> > ><br>
> > > On Fri, 25 Dec 2009, Matthew Knepley wrote:<br>
> > ><br>
> > > > I can try, but I still think replacement is the only real alternative. This is not<br>
> > > > able to be debugged, or you would not recommend sticking in random numbers<br>
> > > > in malloc() and I would be able to see where an SEGV occurs with gdb.<br>
> > > ><br>
> > > > Matt<br>
> > > ><br>
> > > > On Fri, Dec 25, 2009 at 10:16 AM, Satish Balay <<a href="mailto:balay@mcs.anl.gov">balay@mcs.anl.gov</a>> wrote:<br>
> > > ><br>
> > > > > BTW: What linux are you using? ubuntu version? i686 or x86_64? etc...<br>
> > > > ><br>
> > > > > also try:<br>
> > > > ><br>
> > > > > arg->name = (char *)MALLOC( strlen(p) + 100 );<br>
> > > > ><br>
> > > > > satish<br>
> > > > ><br>
> > > > ><br>
> > > > > On Fri, 25 Dec 2009, Satish Balay wrote:<br>
> > > > ><br>
> > > > > > Did my suggested change not work for you?<br>
> > > > > ><br>
> > > > > > Satish<br>
> > > > > ><br>
> > > > > > On Thu, 24 Dec 2009, Matthew Knepley wrote:<br>
> > > > > ><br>
> > > > > > > I spent a bunch of time on this today. This shit is hopelessly broken. It sucks completely.<br>
> > > > > > > I cannot get it to run, nor see why it is causing stack overruns and SEGVs.<br>
> > > > > > > If anyone does not think it is hopeless, speak up now. This is a complete fucking<br>
> > > > > > > embarrassment.<br>
> > > > > > ><br>
> > > > > > > Matt<br>
> > > > > > ><br>
> > > > > > > On Mon, Dec 21, 2009 at 4:42 PM, Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>> wrote:<br>
> > > > > > ><br>
> > > > > > > > This does not make any sense to me because it would be a heap violation,<br>
> > > > > > > > not a stack smash.<br>
> > > > > > > ><br>
> > > > > > > > Matt<br>
> > > > > > > ><br>
> > > > > > > ><br>
> > > > > > > > On Mon, Dec 21, 2009 at 4:30 PM, Satish Balay <<a href="mailto:balay@mcs.anl.gov">balay@mcs.anl.gov</a>> wrote:<br>
> > > > > > > ><br>
> > > > > > > > >> [I don't know the correct fix for this, but] the following change<br>
> > > > > > > > >> gets rid of the valgrind messages for me. Maybe you can use this,<br>
> > > > > > > > >> build sowing separately, and continue.<br>
> > > > > > > >><br>
> > > > > > > >> Satish<br>
> > > > > > > >><br>
> > > > > > > >> ----------<br>
> > > > > > > >><br>
> > > > > > > >> diff -r dbe25084c0e4 src/bfort/bfort.c<br>
> > > > > > > >> --- a/src/bfort/bfort.c Mon Dec 15 22:20:58 2008 -0600<br>
> > > > > > > >> +++ b/src/bfort/bfort.c Mon Dec 21 16:29:09 2009 -0600<br>
> > > > > > > >> @@ -2157,7 +2157,7 @@<br>
> > > > > > > >><br>
> > > > > > > >> /* Current token is name */<br>
> > > > > > > >> arg->has_star = (nstar > 0);<br>
> > > > > > > >> - arg->name = (char *)MALLOC( strlen(p) + 1 );<br>
> > > > > > > >> + arg->name = (char *)MALLOC( strlen(p) + 10 );<br>
> > > > > > > >> strcpy( arg->name, p );<br>
> > > > > > > >><br>
> > > > > > > >> /* We can't output the name just yet, because if it is<br>
> > > > > > > >><br>
> > > > > > > >><br>
> > > > > > > >><br>
> > > > > > > >><br>
> > > > > > > >> On Mon, 21 Dec 2009, Matthew Knepley wrote:<br>
> > > > > > > >><br>
> > > > > > > > >> > The problem appears to be in OutputRoutine() in bfort.c, but that code is<br>
> > > > > > > > >> > impossible to debug. I can't see where something is getting overwritten, and it looks<br>
> > > > > > > > >> > like the check only happens when the routine returns. bfort is such crap.<br>
> > > > > > > >> ><br>
> > > > > > > >> > Matt<br>
> > > > > > > >> ><br>
> > > > > > > > >> > On Mon, Dec 21, 2009 at 3:25 PM, Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>> wrote:<br>
> > > > > > > >> ><br>
> > > > > > > > >> > > On Mon, Dec 21, 2009 at 3:21 PM, Satish Balay <<a href="mailto:balay@mcs.anl.gov">balay@mcs.anl.gov</a>> wrote:<br>
> > > > > > > >> > ><br>
> > > > > > > >> > >> On Mon, 21 Dec 2009, Lisandro Dalcín wrote:<br>
> > > > > > > >> > >><br>
> > > > > > > > >> > >> > On Mon, Dec 21, 2009 at 5:37 PM, Matthew Knepley <<a href="mailto:knepley@gmail.com">knepley@gmail.com</a>> wrote:<br>
> > > > > > > >> > >> > ><br>
> > > > > > > > >> > >> > > It says there is a stack smash and no other info. This is completely fucking<br>
> > > > > > > > >> > >> > > my development right now.<br>
> > > > > > > >> > >> > ><br>
> > > > > > > >> > >> ><br>
> > > > > > > > >> > >> > Any chance bfort was built with the -fstack-protector flag? This failure<br>
> > > > > > > > >> > >> > could be signaling an actual old bug in bfort... I would<br>
> > > > > > > > >> > >> > re-build bfort with debug and re-run under valgrind...<br>
> > > > > > > >> > >><br>
> > > > > > > >> > >> That must be it.<br>
> > > > > > > >> > >><br>
> > > > > > > > >> > >> I just ran my build [which is without -fstack-protector] - and<br>
> > > > > > > > >> > >> valgrind does flag a bunch of things with bfort.<br>
> > > > > > > >> > >><br>
> > > > > > > >> > ><br>
> > > > > > > >> > > 1) That flag is nowhere in my build.<br>
> > > > > > > >> > ><br>
> > > > > > > >> > > 2) Something changed<br>
> > > > > > > >> > ><br>
> > > > > > > >> > > Matt<br>
> > > > > > > >> > ><br>
> > > > > > > >> > ><br>
> > > > > > > > >> > >> I normally install sowing separately and have it in my PATH - so that<br>
> > > > > > > > >> > >> it doesn't have to be rebuilt each time I build petsc.<br>
> > > > > > > > >> > >><br>
> > > > > > > > >> > >> I guess we should sync up [our patches] with the latest sowing and make<br>
> > > > > > > > >> > >> sure it's valgrind clean as well.<br>
> > > > > > > >> > >><br>
> > > > > > > >> > >> Satish<br>
> > > > > > > >> > ><br>
> > > > > > > >> > ><br>
> > > > > > > >> > ><br>
> > > > > > > >> > ><br>
> > > > > > > >> > > --<br>
> > > > > > > > >> > > What most experimenters take for granted before they begin their<br>
> > > > > > > > >> > > experiments is infinitely more interesting than any results to which their<br>
> > > > > > > > >> > > experiments lead.<br>
> > > > > > > >> > > -- Norbert Wiener<br>
> > > > > > > >> > ><br>
> > > > > > > >> ><br>
> > > > > > > >> ><br>
> > > > > > > >> ><br>
> > > > > > > >> ><br>
> > > > > > > >><br>
> > > > > > > ><br>
> > > > > > > ><br>
> > > > > > > ><br>
> > > > > > > > --<br>
> > > > > > > > What most experimenters take for granted before they begin their<br>
> > > > > > > > experiments is infinitely more interesting than any results to<br>
> > > which<br>
> > > > > their<br>
> > > > > > > > experiments lead.<br>
> > > > > > > > -- Norbert Wiener<br>
> > > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > ><br>
> > > > ><br>
> > > ><br>
> > > ><br>
> > > ><br>
> > > ><br>
> > ><br>
> ><br>
> ><br>
> ><br>
> ><br>
><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener<br>