<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Jan 22, 2014 at 7:22 PM, David Liu <span dir="ltr"><<a href="mailto:daveliu@mit.edu" target="_blank">daveliu@mit.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
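<div dir="ltr">Okay, I reinstalled PETSc with MPICH, and the list of errors is a lot shorter. It looks like I already have OpenBLAS here too. Is this the best I can get?</div></blockquote><div><br></div><div>The common solution is to put these in a Valgrind suppressions file, so that reports originating in third-party libraries (here, the OpenBLAS initializer) rather than in your own code are filtered out of later runs.</div>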
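<div><br></div><div>As a rough sketch (not part of the original exchange), a suppressions file covering the three OpenBLAS reports quoted below might look like the following. The file name <code>openblas.supp</code> and the name on the first line of each entry are made up for illustration. The <code>fun:</code> frames are copied from the stack traces in the Valgrind output, and <code>Memcheck:Cond</code> and <code>Memcheck:Value8</code> are the suppression kinds for "Conditional jump or move depends on uninitialised value(s)" and "Use of uninitialised value of size 8" respectively.</div>
<pre>
{
   openblas_affinity_init_cond
   Memcheck:Cond
   fun:____strtoul_l_internal
   fun:gotoblas_affinity_init
   fun:gotoblas_init
}
{
   openblas_affinity_init_value8
   Memcheck:Value8
   fun:____strtoul_l_internal
   fun:gotoblas_affinity_init
   fun:gotoblas_init
}
</pre>
<div>The file would then be passed as <code>valgrind --suppressions=openblas.supp ./run</code>. Rerunning once with <code>--gen-suppressions=all</code> makes Valgrind print a ready-made entry for each report, which is usually safer than writing entries by hand.</div>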
<div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>
<p>==22666== Memcheck, a memory error detector</p>
<p>==22666== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.</p>
<p>==22666== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info</p>
<p>==22666== Command: ./run</p>
<p>==22666== </p>
<p>==22666== Conditional jump or move depends on uninitialised value(s)</p>
<p>==22666== at 0xAE774AF: ____strtoul_l_internal (strtol_l.c:438)</p>
<p>==22666== by 0x7442EE2: gotoblas_affinity_init (in /usr/lib/openblas-base/libopenblas.so.0)</p>
<p>==22666== by 0x711811A: gotoblas_init (in /usr/lib/openblas-base/libopenblas.so.0)</p>
<p>==22666== by 0x400DF7F: call_init (dl-init.c:85)</p>
<p>==22666== by 0x400E076: _dl_init (dl-init.c:134)</p>
<p>==22666== by 0x4000B29: ??? (in /lib/x86_64-linux-gnu/<a href="http://ld-2.13.so" target="_blank">ld-2.13.so</a>)</p>
<p>==22666== </p>
<p>==22666== Conditional jump or move depends on uninitialised value(s)</p>
<p>==22666== at 0xAE77427: ____strtoul_l_internal (strtol_l.c:442)</p>
<p>==22666== by 0x7442EE2: gotoblas_affinity_init (in /usr/lib/openblas-base/libopenblas.so.0)</p>
<p>==22666== by 0x711811A: gotoblas_init (in /usr/lib/openblas-base/libopenblas.so.0)</p>
<p>==22666== by 0x400DF7F: call_init (dl-init.c:85)</p>
<p>==22666== by 0x400E076: _dl_init (dl-init.c:134)</p>
<p>==22666== by 0x4000B29: ??? (in /lib/x86_64-linux-gnu/<a href="http://ld-2.13.so" target="_blank">ld-2.13.so</a>)</p>
<p>==22666== </p>
<p>==22666== Use of uninitialised value of size 8</p>
<p>==22666== at 0xAE77465: ____strtoul_l_internal (strtol_l.c:466)</p>
<p>==22666== by 0x7442EE2: gotoblas_affinity_init (in /usr/lib/openblas-base/libopenblas.so.0)</p>
<p>==22666== by 0x711811A: gotoblas_init (in /usr/lib/openblas-base/libopenblas.so.0)</p>
<p>==22666== by 0x400DF7F: call_init (dl-init.c:85)</p>
<p>==22666== by 0x400E076: _dl_init (dl-init.c:134)</p>
<p>==22666== by 0x4000B29: ??? (in /lib/x86_64-linux-gnu/<a href="http://ld-2.13.so" target="_blank">ld-2.13.so</a>)</p>
<p>==22666== </p>
<p>==22666== </p>
<p>==22666== HEAP SUMMARY:</p>
<p>==22666== in use at exit: 0 bytes in 0 blocks</p>
<p>==22666== total heap usage: 222 allocs, 222 frees, 117,046 bytes allocated</p>
<p>==22666== </p>
<p>==22666== All heap blocks were freed -- no leaks are possible</p>
<p>==22666== </p>
<p>==22666== For counts of detected and suppressed errors, rerun with: -v</p>
<p>==22666== Use --track-origins=yes to see where uninitialised values come from</p>
<p>==22666== ERROR SUMMARY: 6 errors from 3 contexts (suppressed: 4 from 4)</p></div></div><div class="gmail_extra"><br><br><div class="gmail_quote"><div class="im">On Wed, Jan 22, 2014 at 6:57 PM, Jed Brown <span dir="ltr"><<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>></span> wrote:<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><div>David Liu <<a href="mailto:daveliu@mit.edu" target="_blank">daveliu@mit.edu</a>> writes:<br>
<br>
> Sure thing. Here's what I get when I directly run the executable (no MPI, if<br>
> I understand correctly).<br>
<br>
</div></div>No, you are linked to Open MPI, and it is very noisy under Valgrind. Use<br>
MPICH if you want something Valgrind-clean. I don't know if the gotoblas noise<br>
has been fixed, but that project has evolved into OpenBLAS, which you<br>
may as well use since it is the maintained code base.<br>
<br>
<a href="http://www.openblas.net/" target="_blank">http://www.openblas.net/</a><br>
<div><div><div class="im"><br>
> If I do "mpirun -n 1 valgrind ./run", I get the exact same thing.<br>
><br></div><div><div class="h5">
> ==29900== Memcheck, a memory error detector<br>
><br>
> ==29900== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.<br>
><br>
> ==29900== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info<br>
><br>
> ==29900== Command: ./run<br>
><br>
> ==29900==<br>
><br>
> ==29900== Conditional jump or move depends on uninitialised value(s)<br>
><br>
> ==29900== at 0xACE64AF: ____strtoul_l_internal (strtol_l.c:438)<br>
><br>
> ==29900== by 0x71C2EE2: gotoblas_affinity_init (in<br>
> /usr/lib/openblas-base/libopenblas.so.0)<br>
><br>
> ==29900== by 0x6E9811A: gotoblas_init (in<br>
> /usr/lib/openblas-base/libopenblas.so.0)<br>
><br>
> ==29900== by 0x400DF7F: call_init (dl-init.c:85)<br>
><br>
> ==29900== by 0x400E076: _dl_init (dl-init.c:134)<br>
><br>
> ==29900== by 0x4000B29: ??? (in /lib/x86_64-linux-gnu/<a href="http://ld-2.13.so" target="_blank">ld-2.13.so</a>)<br>
><br>
> ==29900==<br>
><br>
> ==29900== Conditional jump or move depends on uninitialised value(s)<br>
><br>
> ==29900== at 0xACE6427: ____strtoul_l_internal (strtol_l.c:442)<br>
><br>
> ==29900== by 0x71C2EE2: gotoblas_affinity_init (in<br>
> /usr/lib/openblas-base/libopenblas.so.0)<br>
><br>
> ==29900== by 0x6E9811A: gotoblas_init (in<br>
> /usr/lib/openblas-base/libopenblas.so.0)<br>
><br>
> ==29900== by 0x400DF7F: call_init (dl-init.c:85)<br>
><br>
> ==29900== by 0x400E076: _dl_init (dl-init.c:134)<br>
><br>
> ==29900== by 0x4000B29: ??? (in /lib/x86_64-linux-gnu/<a href="http://ld-2.13.so" target="_blank">ld-2.13.so</a>)<br>
><br>
> ==29900==<br>
><br>
> ==29900== Use of uninitialised value of size 8<br>
><br>
> ==29900== at 0xACE6465: ____strtoul_l_internal (strtol_l.c:466)<br>
><br>
> ==29900== by 0x71C2EE2: gotoblas_affinity_init (in<br>
> /usr/lib/openblas-base/libopenblas.so.0)<br>
><br>
> ==29900== by 0x6E9811A: gotoblas_init (in<br>
> /usr/lib/openblas-base/libopenblas.so.0)<br>
><br>
> ==29900== by 0x400DF7F: call_init (dl-init.c:85)<br>
><br>
> ==29900== by 0x400E076: _dl_init (dl-init.c:134)<br>
><br>
> ==29900== by 0x4000B29: ??? (in /lib/x86_64-linux-gnu/<a href="http://ld-2.13.so" target="_blank">ld-2.13.so</a>)<br>
><br>
> ==29900==<br>
><br>
> ==29900== Invalid read of size 8<br>
><br>
> ==29900== at 0xAD378CD: _wordcopy_fwd_dest_aligned (wordcopy.c:205)<br>
><br>
> ==29900== by 0xAD3156E: __GI_memmove (memmove.c:76)<br>
><br>
> ==29900== by 0xAD38B7B: argz_insert (argz-insert.c:55)<br>
><br>
> ==29900== by 0xA4405E5: ??? (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA4407FF: ??? (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA43FFC8: ??? (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA440F57: lt_dlforeachfile (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA447FCE: mca_base_component_find (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA448AC1: mca_base_components_open (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA463C44: opal_paffinity_base_open (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA439AD2: opal_init (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA1E6B3E: orte_init (in<br>
> /usr/lib/openmpi/lib/libopen-rte.so.0.0.0)<br>
><br>
> ==29900== Address 0xc15a998 is 40 bytes inside a block of size 47 alloc'd<br>
><br>
> ==29900== at 0x4C28BED: malloc (vg_replace_malloc.c:263)<br>
><br>
> ==29900== by 0xA43F658: lt__malloc (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA44078E: ??? (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA43FFC8: ??? (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA440F57: lt_dlforeachfile (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA447FCE: mca_base_component_find (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA448AC1: mca_base_components_open (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA463C44: opal_paffinity_base_open (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA439AD2: opal_init (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA1E6B3E: orte_init (in<br>
> /usr/lib/openmpi/lib/libopen-rte.so.0.0.0)<br>
><br>
> ==29900== by 0x9F5E373: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x9F7F20D: PMPI_Init_thread (in<br>
> /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900==<br>
><br>
> ==29900== Syscall param sched_setaffinity(mask) points to unaddressable<br>
> byte(s)<br>
><br>
> ==29900== at 0xAD852F9: syscall (syscall.S:39)<br>
><br>
> ==29900== by 0xFD75621: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_paffinity_linux.so)<br>
><br>
> ==29900== by 0xFD75A3C: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_paffinity_linux.so)<br>
><br>
> ==29900== by 0xFD76599: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_paffinity_linux.so)<br>
><br>
> ==29900== by 0xFD754AC: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_paffinity_linux.so)<br>
><br>
> ==29900== by 0xA463AEA: opal_paffinity_base_select (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA439B0D: opal_init (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA1E6B3E: orte_init (in<br>
> /usr/lib/openmpi/lib/libopen-rte.so.0.0.0)<br>
><br>
> ==29900== by 0x9F5E373: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x9F7F20D: PMPI_Init_thread (in<br>
> /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x4F95F49: PetscInitialize (pinit.c:675)<br>
><br>
> ==29900== by 0x400CC6: main (prog.c:5)<br>
><br>
> ==29900== Address 0x0 is not stack'd, malloc'd or (recently) free'd<br>
><br>
> ==29900==<br>
><br>
> ==29900== Conditional jump or move depends on uninitialised value(s)<br>
><br>
> ==29900== at 0x9F5E578: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x9F7F20D: PMPI_Init_thread (in<br>
> /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x4F95F49: PetscInitialize (pinit.c:675)<br>
><br>
> ==29900== by 0x400CC6: main (prog.c:5)<br>
><br>
> ==29900==<br>
><br>
> ==29900== Conditional jump or move depends on uninitialised value(s)<br>
><br>
> ==29900== at 0x9F5E57C: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x9F7F20D: PMPI_Init_thread (in<br>
> /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x4F95F49: PetscInitialize (pinit.c:675)<br>
><br>
> ==29900== by 0x400CC6: main (prog.c:5)<br>
><br>
> ==29900==<br>
><br>
> ==29900== Syscall param writev(vector[...]) points to uninitialised byte(s)<br>
><br>
> ==29900== at 0xAD81BE7: writev (writev.c:56)<br>
><br>
> ==29900== by 0x11397E22: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)<br>
><br>
> ==29900== by 0x11398C5C: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)<br>
><br>
> ==29900== by 0x1139C2EB: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)<br>
><br>
> ==29900== by 0x1118E7BD: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_rml_oob.so)<br>
><br>
> ==29900== by 0x1118EDB8: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_rml_oob.so)<br>
><br>
> ==29900== by 0x10D85AD8: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)<br>
><br>
> ==29900== by 0x10D854DE: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)<br>
><br>
> ==29900== by 0x9F5EBAE: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x9F7F20D: PMPI_Init_thread (in<br>
> /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x4F95F49: PetscInitialize (pinit.c:675)<br>
><br>
> ==29900== by 0x400CC6: main (prog.c:5)<br>
><br>
> ==29900== Address 0x1815faf7 is 87 bytes inside a block of size 256 alloc'd<br>
><br>
> ==29900== at 0x4C28CCE: realloc (vg_replace_malloc.c:632)<br>
><br>
> ==29900== by 0xA43ADB7: ??? (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0xA43B8CD: ??? (in<br>
> /usr/lib/openmpi/lib/libopen-pal.so.0.0.0)<br>
><br>
> ==29900== by 0x10D85AAC: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)<br>
><br>
> ==29900== by 0x10D854DE: ??? (in<br>
> /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)<br>
><br>
> ==29900== by 0x9F5EBAE: ??? (in /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x9F7F20D: PMPI_Init_thread (in<br>
> /usr/lib/openmpi/lib/libmpi.so.0.0.4)<br>
><br>
> ==29900== by 0x4F95F49: PetscInitialize (pinit.c:675)<br>
><br>
> ==29900== by 0x400CC6: main (prog.c:5)<br>
><br>
> ==29900==<br>
><br>
> ==29900==<br>
><br>
> ==29900== HEAP SUMMARY:<br>
><br>
> ==29900== in use at exit: 258,198 bytes in 2,787 blocks<br>
><br>
> ==29900== total heap usage: 11,718 allocs, 8,931 frees, 17,060,981 bytes<br>
> allocated<br>
><br>
> ==29900==<br>
><br>
> ==29900== LEAK SUMMARY:<br>
><br>
> ==29900== definitely lost: 5,956 bytes in 55 blocks<br>
><br>
> ==29900== indirectly lost: 3,722 bytes in 22 blocks<br>
><br>
> ==29900== possibly lost: 0 bytes in 0 blocks<br>
><br>
> ==29900== still reachable: 248,520 bytes in 2,710 blocks<br>
><br>
> ==29900== suppressed: 0 bytes in 0 blocks<br>
><br>
> ==29900== Rerun with --leak-check=full to see details of leaked memory<br>
><br>
> ==29900==<br>
><br>
> ==29900== For counts of detected and suppressed errors, rerun with: -v<br>
><br>
> ==29900== Use --track-origins=yes to see where uninitialised values come<br>
> from<br>
><br>
> ==29900== ERROR SUMMARY: 619 errors from 8 contexts (suppressed: 4 from 4)<br>
><br>
><br>
> On Wed, Jan 22, 2014 at 6:40 PM, Jed Brown <<a href="mailto:jed@jedbrown.org" target="_blank">jed@jedbrown.org</a>> wrote:<br>
><br>
>> David Liu <<a href="mailto:daveliu@mit.edu" target="_blank">daveliu@mit.edu</a>> writes:<br>
>><br>
>> > Hi, I'm running a very simple code, consisting of just PetscInitialize and<br>
>> > PetscFinalize and nothing else.<br>
>> ><br>
>> > I'm running it with valgrind using the command<br>
>> > "valgrind ./run"<br>
>> ><br>
>> > I also tried (as the PETSc homepage suggests)<br>
>> > ${PETSC_DIR}/bin/petscmpiexec -valgrind -n 1 ./run -malloc off<br>
>> ><br>
>> > For both cases, I get tons of error messages like<br>
>> > "Conditional jump or move depends on uninitialised value(s)"<br>
>> > "Use of uninitialised value of size 8"<br>
>> > "Invalid read of size 8"<br>
>> > "Address 0xc15a998 is 40 bytes inside a block of size 47 alloc'd"<br>
>><br>
>> You have to send the *exact and complete* output, not snippets. Are you<br>
>> using Open MPI? Chances are the errors come from elsewhere in the software<br>
>> stack, because we regularly test PETSc itself using Valgrind.<br>
>><br>
</div></div></div></div></blockquote></div><br></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener
</div></div>