[petsc-users] MUMPS solver crash with petsc-3.2

Barry Smith bsmith at mcs.anl.gov
Sat Dec 3 13:36:21 CST 2011


   This is likely a bug in that older version of Parmetis or MUMPS.   Since Parmetis has had major fixes since that release, it would be crazy to debug the old version.

   You should switch to petsc-dev http://www.mcs.anl.gov/petsc/developers/index.html and use the options --download-mumps --download-metis --download-parmetis. If the problem still persists we will help you debug it, but my guess is it will start working.
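
   A minimal sketch of that configure step, assuming it is run from the top of a petsc-dev checkout. Only the three --download flags come from this message; the `opts` variable is just an illustrative name, and any compiler, MPI, or BLAS/LAPACK options your site needs are left out.

```shell
# Assumption: the current directory is the top of a petsc-dev source tree.
# Only these three flags are taken from the message; add site-specific
# options (compilers, MPI, BLAS/LAPACK) as your installation requires.
opts="--download-mumps --download-metis --download-parmetis"

# Print the full command for review before running it.
echo "./configure $opts"

# Uncomment to actually configure PETSc with these options:
# ./configure $opts
```

   Printing the command first makes it easy to check that no separately installed Metis/Parmetis is being passed alongside the --download flags.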

    Barry


   
On Dec 3, 2011, at 6:21 AM, Gong Ding wrote:

> I did indeed use --download-parmetis and --download-mumps.
> The parmetis version is 3.2.0-p1.
> Maybe mumps is not compatible with this version?
> 
> 
> 
>   This is a problem in Metis/Parmetis.   PETSc 3.2 MUST be used with parmetis 3.2.0.
> 
>    Are you perhaps using Metis/Parmetis metis-5.0.2 and parmetis-4.0.2? Did you use --download-parmetis or install it yourself? You should use the --download option to get the right one.
> 
>   Barry
> 
> 
> On Dec 2, 2011, at 3:47 AM, Gong Ding wrote:
> 
>> petsc-3.1-p8 & mumps 4.9 work well.
>> However, petsc-3.2 p3-p5 & mumps 4.10 seem to have a memory problem.
>> 
>> The code occasionally crashes on Linux/X64 and always crashes on AIX/PPC.
>> This problem may be caused by mumps 4.10 or parmetis, but I haven't tested that.
>> 
>> The valgrind report:
>> 
>> ==5139== Invalid read of size 8
>> ==5139==    at 0x57BE749: __intel_new_memcpy (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
>> ==5139==    by 0x57A0AF5: _intel_fast_memcpy.J (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
>> ==5139==    by 0x2269296: __GrowBisection (initpart.c:200)
>> ==5139==    by 0x2269C5F: __Init2WayPartition (initpart.c:36)
>> ==5139==    by 0x223DDFA: __MlevelNodeBisection (ometis.c:485)
>> ==5139==    by 0x223DB9B: __MlevelNodeBisectionMultiple (ometis.c:405)
>> ==5139==    by 0x223D957: __MlevelNestedDissection (ometis.c:289)
>> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
>> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
>> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
>> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
>> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
>> ==5139==  Address 0x8718408 is 440 bytes inside a block of size 444 alloc'd
>> ==5139==    at 0x4A0776F: malloc (vg_replace_malloc.c:263)
>> ==5139==    by 0x226E6A2: __GKmalloc (util.c:111)
>> ==5139==    by 0x226E6E4: __idxmalloc (util.c:60)
>> ==5139==    by 0x2269014: __GrowBisection (initpart.c:101)
>> ==5139==    by 0x2269C5F: __Init2WayPartition (initpart.c:36)
>> ==5139==    by 0x223DDFA: __MlevelNodeBisection (ometis.c:485)
>> ==5139==    by 0x223DB9B: __MlevelNodeBisectionMultiple (ometis.c:405)
>> ==5139==    by 0x223D957: __MlevelNestedDissection (ometis.c:289)
>> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
>> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
>> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
>> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
>> ==5139==
>> ==5139== Invalid read of size 8
>> ==5139==    at 0x57BE609: __intel_new_memcpy (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
>> ==5139==    by 0x73F8CCCCC: ???
>> ==5139==    by 0x300000001F: ???
>> ==5139==    by 0x7FEFF8B4F: ???
>> ==5139==    by 0x7FEFF8C37: ???
>> ==5139==    by 0xE326EEF: ???
>> ==5139==    by 0x216F: ???
>> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
>> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
>> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
>> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
>> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
>> ==5139==  Address 0x9288b38 is 0 bytes after a block of size 15,224 alloc'd
>> ==5139==    at 0x4A0776F: malloc (vg_replace_malloc.c:263)
>> ==5139==    by 0x226E6A2: __GKmalloc (util.c:111)
>> ==5139==    by 0x226E6E4: __idxmalloc (util.c:60)
>> ==5139==    by 0x223DB60: __MlevelNodeBisectionMultiple (ometis.c:402)
>> ==5139==    by 0x223D957: __MlevelNestedDissection (ometis.c:289)
>> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
>> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
>> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
>> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
>> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
>> ==5139==    by 0x216A8DC: dmumps_ (dmumps_part1.F:409)
>> ==5139==    by 0x209407F: dmumps_f77_ (dmumps_part3.F:6651)
>> ==5139==
>> ==5139== Invalid read of size 8
>> ==5139==    at 0x57BE609: __intel_new_memcpy (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
>> ==5139==    by 0x57A0AF5: _intel_fast_memcpy.J (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
>> ==5139==    by 0x21913DD: dmumps_557_ (dmumps_part2.F:2147)
>> ==5139==    by 0x218E243: dmumps_195_ (dmumps_part2.F:1535)
>> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
>> ==5139==    by 0x216A8DC: dmumps_ (dmumps_part1.F:409)
>> ==5139==    by 0x209407F: dmumps_f77_ (dmumps_part3.F:6651)
>> ==5139==    by 0x2073047: dmumps_c (mumps_c.c:422)
>> ==5139==    by 0x19755BF: MatLUFactorSymbolic_AIJMUMPS (mumps.c:893)
>> ==5139==    by 0x1831D89: MatLUFactorSymbolic (matrix.c:2823)
>> ==5139==    by 0x1C98F64: PCSetUp_LU (lu.c:135)
>> ==5139==    by 0x1F5B26B: PCSetUp (precon.c:819)
>> ==5139==  Address 0xf24f6e8 is 3,163,816 bytes inside a block of size 3,163,820 alloc'd
>> ==5139==    at 0x4A0776F: malloc (vg_replace_malloc.c:263)
>> ==5139==    by 0x5CB3C43: for_allocate (in /opt/intel/Compiler/11.1/038/lib/intel64/libifcore.so.5)
>> ==5139==    by 0x5CB3B50: for_alloc_allocatable (in /opt/intel/Compiler/11.1/038/lib/intel64/libifcore.so.5)
>> ==5139==    by 0x218CD64: dmumps_195_ (dmumps_part2.F:1072)
>> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
>> ==5139==    by 0x216A8DC: dmumps_ (dmumps_part1.F:409)
>> ==5139==    by 0x209407F: dmumps_f77_ (dmumps_part3.F:6651)
>> ==5139==    by 0x2073047: dmumps_c (mumps_c.c:422)
>> ==5139==    by 0x19755BF: MatLUFactorSymbolic_AIJMUMPS (mumps.c:893)
>> ==5139==    by 0x1831D89: MatLUFactorSymbolic (matrix.c:2823)
>> ==5139==    by 0x1C98F64: PCSetUp_LU (lu.c:135)
>> ==5139==    by 0x1F5B26B: PCSetUp (precon.c:819)
>> 


