[petsc-dev] Error during KSPDestroy

Matthew Knepley knepley at gmail.com
Sun May 6 08:34:15 CDT 2012


On Sun, May 6, 2012 at 9:24 AM, Alexander Grayver
<agrayver at gfz-potsdam.de> wrote:

> Hm, valgrind gives a lot of output like this (see the full log in my
> previous message):
>

Can you run this with --download-f-blas-lapack? This sounds much more like
an MKL bug.

   Matt


> ==20287== Invalid read of size 8
> ==20287==    at 0x1AE79DA1: mkl_lapack_dlasq3 (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CF7AE5: mkl_lapack_dlasq3 (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AE79617: mkl_lapack_dlasq2 (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CF7A15: mkl_lapack_dlasq2 (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AA3E72A: mkl_lapack_dlasq1 (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CF79C7: mkl_lapack_dlasq1 (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AC44D6C: mkl_lapack_zbdsqr (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CFFEF8: mkl_lapack_zbdsqr (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AC7D989: mkl_lapack_zgesvd (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5D021C0: mkl_lapack_zgesvd (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x5899E43: ZGESVD (in
> /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.so)
> ==20287==    by 0x697017: KSPComputeExtremeSingularValues_GMRES
> (gmreig.c:46)
> ==20287==    by 0x69EFBA: KSPComputeExtremeSingularValues (itfunc.c:47)
> ==20287==    by 0x4509BC: main (solveTest.c:62)
> ==20287==  Address 0x11363d48 is not stack'd, malloc'd or (recently) free'd
>
>
> On 06.05.2012 15:21, Alexander Grayver wrote:
>
> On 06.05.2012 15:07, Matthew Knepley wrote:
>
>    Hello,
>>>
>>> I use KSP and random rhs to compute largest singular value:
>>>
>>
>>  1) Is this the whole program? If not, this can be caused by memory
>> corruption somewhere else. This is what I suspect.
>>
>>
>> Matt,
>>
>> I can reproduce the error using the attached test program and this matrix (7 MB):
>> http://dl.dropbox.com/u/60982984/A.dat
>>
>
>  It runs fine for me with the latest petsc-dev:
>
>    1.405802e+00
>
>  Can you valgrind it on your machine?
>
>
> I did:
> valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p
> /solveTest -ksp_monitor_true_residual -log_summary -mat_type aij -ksp_rtol
> 1.0e-10 -malloc off
>
> The error is now narrowed down further:
>
> ==20287== Invalid read of size 8
> ==20287==    at 0x7874B4C: opal_os_path (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libopen-pal.so.0.0.0)
> ==20287==    by 0x75F2E27: orte_session_dir_finalize (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x76012E8: orte_errmgr_base_error_abort (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x73396E9: ompi_mpi_abort (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x734F36E: PMPI_Abort (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x7499AB: PetscDefaultSignalHandler (signal.c:169)
> ==20287==    by 0x749267: PetscSignalHandler_Private (signal.c:53)
> ==20287==    by 0x924B9DF: ??? (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x535D9E: VecDestroyVecs (vector.c:653)
> ==20287==    by 0x68B61D: KSPReset_GMRES (gmres.c:258)
> ==20287==    by 0x6A9D39: KSPReset (itfunc.c:733)
> ==20287==    by 0x6AA839: KSPDestroy (itfunc.c:780)
> ==20287==    by 0x4509F8: main (solveTest.c:66)
> ==20287==  Address 0xbde4860 is 0 bytes inside a block of size 2 alloc'd
> ==20287==    at 0x4C26B9B: malloc (vg_replace_malloc.c:263)
> ==20287==    by 0x92876DF: vasprintf (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x9266C67: asprintf (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x75F1701: orte_util_convert_vpid_to_string (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x75F2D4A: orte_session_dir_finalize (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x76012E8: orte_errmgr_base_error_abort (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x73396E9: ompi_mpi_abort (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x734F36E: PMPI_Abort (in
> /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x7499AB: PetscDefaultSignalHandler (signal.c:169)
> ==20287==    by 0x749267: PetscSignalHandler_Private (signal.c:53)
> ==20287==    by 0x924B9DF: ??? (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x535D9E: VecDestroyVecs (vector.c:653)
> ==20287==    by 0x68B61D: KSPReset_GMRES (gmres.c:258)
> ==20287==    by 0x6A9D39: KSPReset (itfunc.c:733)
> ==20287==    by 0x6AA839: KSPDestroy (itfunc.c:780)
> ==20287==    by 0x4509F8: main (solveTest.c:66)
>
> Full log is attached.
>
> Important: if I comment out this line:
> KSPComputeExtremeSingularValues(ksp, &maxx, &minx);
>
> It works.
>
> --
>
> Regards,
> Alexander
>
>
>
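
Since the valgrind trace above points into MKL's ZGESVD (called from
KSPComputeExtremeSingularValues_GMRES), one way to sanity-check a reported
largest singular value without going through LAPACK at all is plain power
iteration on A^T A from a random start vector. The sketch below is only an
illustration under stated assumptions: it is not what PETSc does internally,
it works on a small dense row-major matrix rather than on the A.dat file from
this thread, and the names sigma_max_squared, mat_vec, mat_t_vec and dot are
hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* y = A x for a dense row-major m-by-n matrix A (hypothetical helper). */
static void mat_vec(const double *A, int m, int n, const double *x, double *y)
{
    for (int i = 0; i < m; i++) {
        double s = 0.0;
        for (int j = 0; j < n; j++) s += A[i * n + j] * x[j];
        y[i] = s;
    }
}

/* x = A^T y for the same storage layout (hypothetical helper). */
static void mat_t_vec(const double *A, int m, int n, const double *y, double *x)
{
    for (int j = 0; j < n; j++) x[j] = 0.0;
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) x[j] += A[i * n + j] * y[i];
}

/* Euclidean inner product of two length-n vectors. */
static double dot(const double *u, const double *v, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += u[i] * v[i];
    return s;
}

/*
 * Estimate sigma_max(A)^2, i.e. the largest eigenvalue of A^T A, by power
 * iteration from a pseudo-random start vector. The square is returned so the
 * example needs no libm; take the square root for the singular value itself.
 * Returns -1.0 if memory allocation fails.
 */
double sigma_max_squared(const double *A, int m, int n, int iters, unsigned seed)
{
    assert(A != NULL && m > 0 && n > 0 && iters > 0);

    double *x = malloc((size_t)n * sizeof *x);
    double *y = malloc((size_t)m * sizeof *y);
    double lambda = 0.0;
    if (x == NULL || y == NULL) { free(x); free(y); return -1.0; }

    /* Random start vector with entries in [0.5, 1.5], so it is never zero. */
    srand(seed);
    for (int j = 0; j < n; j++) x[j] = 0.5 + (double)rand() / RAND_MAX;

    for (int k = 0; k < iters; k++) {
        double xx = dot(x, x, n);
        mat_vec(A, m, n, x, y);        /* y = A x                          */
        lambda = dot(y, y, m) / xx;    /* Rayleigh quotient of A^T A at x  */
        mat_t_vec(A, m, n, y, x);      /* x <- A^T A x                     */

        /* Rescale by the largest magnitude so x neither overflows nor underflows. */
        double big = 0.0;
        for (int j = 0; j < n; j++) {
            double a = x[j] < 0.0 ? -x[j] : x[j];
            if (a > big) big = a;
        }
        if (big == 0.0) break;         /* A^T A x vanished; keep last estimate */
        for (int j = 0; j < n; j++) x[j] /= big;
    }

    free(x);
    free(y);
    return lambda;
}
```

Convergence depends on the gap between the two largest singular values, so
for a matrix with a tight cluster at the top the iteration count would need
to be much larger than for a well-separated one. If an independent estimate
like this agrees with the PETSc result while valgrind still reports invalid
reads inside MKL, that would be consistent with the suspicion that the
problem lies in the LAPACK library rather than in the test program.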


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener