[petsc-users] MUMPS solver crash with petsc-3.2

Gong Ding gdiso at ustc.edu
Mon Dec 5 05:50:05 CST 2011


Yes, petsc-dev works. Please merge the update to petsc-3.2.



>    This is likely a bug in that older version of Parmetis or MUMPS. Since Parmetis has had major fixes since that release, it would be crazy to debug the old version.
> 
> 
> 
>    You should switch to petsc-dev http://www.mcs.anl.gov/petsc/developers/index.html and use the options --download-mumps --download-metis --download-parmetis. If the problem still persists we will help you debug it, but my guess is it will start working.
> 
> 
> 
>     Barry
> 
> 
> 
> 
> 
>    
> 
> On Dec 3, 2011, at 6:21 AM, Gong Ding wrote:
> 
> 
> 
> > I did indeed use --download-parmetis and --download-mumps.
> 
> > The parmetis version is 3.2.0-p1.
> 
> > Maybe MUMPS is not compatible with this version?
> 
> > 
> 
> > 
> 
> > 
> 
> >   This is a problem in Metis/Parmetis.   PETSc 3.2 MUST be used with parmetis 3.2.0 
> 
> > 
> 
> >    Are you perhaps using metis-5.0.2 and parmetis-4.0.2? Did you use --download-parmetis or install it yourself? You should use --download-parmetis to get the right one.
> 
> > 
> 
> >   Barry
> 
> > 
> 
> > 
> 
> > On Dec 2, 2011, at 3:47 AM, Gong Ding wrote:
> 
> > 
> 
> >> petsc-3.1-p8 & mumps 4.9 work well.
> 
> >> However, petsc-3.2 p3-p5 & mumps 4.10 seem to have a memory problem.
> 
> >> 
> 
> >> The code occasionally crashes on Linux/X64 and always crashes on AIX/PPC.
> 
> >> This problem may be caused by mumps 4.10 or parmetis, but I haven't tested that.
> 
> >> 
> 
> >> The valgrind report:
> 
> >> 
> 
> >> ==5139== Invalid read of size 8
> 
> >> ==5139==    at 0x57BE749: __intel_new_memcpy (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
> 
> >> ==5139==    by 0x57A0AF5: _intel_fast_memcpy.J (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
> 
> >> ==5139==    by 0x2269296: __GrowBisection (initpart.c:200)
> 
> >> ==5139==    by 0x2269C5F: __Init2WayPartition (initpart.c:36)
> 
> >> ==5139==    by 0x223DDFA: __MlevelNodeBisection (ometis.c:485)
> 
> >> ==5139==    by 0x223DB9B: __MlevelNodeBisectionMultiple (ometis.c:405)
> 
> >> ==5139==    by 0x223D957: __MlevelNestedDissection (ometis.c:289)
> 
> >> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
> 
> >> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
> 
> >> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
> 
> >> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
> 
> >> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
> 
> >> ==5139==  Address 0x8718408 is 440 bytes inside a block of size 444 alloc'd
> 
> >> ==5139==    at 0x4A0776F: malloc (vg_replace_malloc.c:263)
> 
> >> ==5139==    by 0x226E6A2: __GKmalloc (util.c:111)
> 
> >> ==5139==    by 0x226E6E4: __idxmalloc (util.c:60)
> 
> >> ==5139==    by 0x2269014: __GrowBisection (initpart.c:101)
> 
> >> ==5139==    by 0x2269C5F: __Init2WayPartition (initpart.c:36)
> 
> >> ==5139==    by 0x223DDFA: __MlevelNodeBisection (ometis.c:485)
> 
> >> ==5139==    by 0x223DB9B: __MlevelNodeBisectionMultiple (ometis.c:405)
> 
> >> ==5139==    by 0x223D957: __MlevelNestedDissection (ometis.c:289)
> 
> >> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
> 
> >> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
> 
> >> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
> 
> >> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
> 
> >> ==5139==
> 
> >> ==5139== Invalid read of size 8
> 
> >> ==5139==    at 0x57BE609: __intel_new_memcpy (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
> 
> >> ==5139==    by 0x73F8CCCCC: ???
> 
> >> ==5139==    by 0x300000001F: ???
> 
> >> ==5139==    by 0x7FEFF8B4F: ???
> 
> >> ==5139==    by 0x7FEFF8C37: ???
> 
> >> ==5139==    by 0xE326EEF: ???
> 
> >> ==5139==    by 0x216F: ???
> 
> >> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
> 
> >> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
> 
> >> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
> 
> >> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
> 
> >> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
> 
> >> ==5139==  Address 0x9288b38 is 0 bytes after a block of size 15,224 alloc'd
> 
> >> ==5139==    at 0x4A0776F: malloc (vg_replace_malloc.c:263)
> 
> >> ==5139==    by 0x226E6A2: __GKmalloc (util.c:111)
> 
> >> ==5139==    by 0x226E6E4: __idxmalloc (util.c:60)
> 
> >> ==5139==    by 0x223DB60: __MlevelNodeBisectionMultiple (ometis.c:402)
> 
> >> ==5139==    by 0x223D957: __MlevelNestedDissection (ometis.c:289)
> 
> >> ==5139==    by 0x223DA09: __MlevelNestedDissection (ometis.c:309)
> 
> >> ==5139==    by 0x223E837: METIS_NodeND (ometis.c:157)
> 
> >> ==5139==    by 0x2245E31: metis_nodend_ (frename.c:122)
> 
> >> ==5139==    by 0x2190666: dmumps_195_ (dmumps_part2.F:1435)
> 
> >> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
> 
> >> ==5139==    by 0x216A8DC: dmumps_ (dmumps_part1.F:409)
> 
> >> ==5139==    by 0x209407F: dmumps_f77_ (dmumps_part3.F:6651)
> 
> >> ==5139==
> 
> >> ==5139== Invalid read of size 8
> 
> >> ==5139==    at 0x57BE609: __intel_new_memcpy (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
> 
> >> ==5139==    by 0x57A0AF5: _intel_fast_memcpy.J (in /opt/intel/Compiler/11.1/038/lib/intel64/libirc.so)
> 
> >> ==5139==    by 0x21913DD: dmumps_557_ (dmumps_part2.F:2147)
> 
> >> ==5139==    by 0x218E243: dmumps_195_ (dmumps_part2.F:1535)
> 
> >> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
> 
> >> ==5139==    by 0x216A8DC: dmumps_ (dmumps_part1.F:409)
> 
> >> ==5139==    by 0x209407F: dmumps_f77_ (dmumps_part3.F:6651)
> 
> >> ==5139==    by 0x2073047: dmumps_c (mumps_c.c:422)
> 
> >> ==5139==    by 0x19755BF: MatLUFactorSymbolic_AIJMUMPS (mumps.c:893)
> 
> >> ==5139==    by 0x1831D89: MatLUFactorSymbolic (matrix.c:2823)
> 
> >> ==5139==    by 0x1C98F64: PCSetUp_LU (lu.c:135)
> 
> >> ==5139==    by 0x1F5B26B: PCSetUp (precon.c:819)
> 
> >> ==5139==  Address 0xf24f6e8 is 3,163,816 bytes inside a block of size 3,163,820 alloc'd
> 
> >> ==5139==    at 0x4A0776F: malloc (vg_replace_malloc.c:263)
> 
> >> ==5139==    by 0x5CB3C43: for_allocate (in /opt/intel/Compiler/11.1/038/lib/intel64/libifcore.so.5)
> 
> >> ==5139==    by 0x5CB3B50: for_alloc_allocatable (in /opt/intel/Compiler/11.1/038/lib/intel64/libifcore.so.5)
> 
> >> ==5139==    by 0x218CD64: dmumps_195_ (dmumps_part2.F:1072)
> 
> >> ==5139==    by 0x20C79D2: dmumps_26_ (dmumps_part5.F:313)
> 
> >> ==5139==    by 0x216A8DC: dmumps_ (dmumps_part1.F:409)
> 
> >> ==5139==    by 0x209407F: dmumps_f77_ (dmumps_part3.F:6651)
> 
> >> ==5139==    by 0x2073047: dmumps_c (mumps_c.c:422)
> 
> >> ==5139==    by 0x19755BF: MatLUFactorSymbolic_AIJMUMPS (mumps.c:893)
> 
> >> ==5139==    by 0x1831D89: MatLUFactorSymbolic (matrix.c:2823)
> 
> >> ==5139==    by 0x1C98F64: PCSetUp_LU (lu.c:135)
> 
> >> ==5139==    by 0x1F5B26B: PCSetUp (precon.c:819)
> 
> >> 
> 
> 
> 
> 

