[petsc-dev] uni

Barry Smith bsmith at mcs.anl.gov
Sun Nov 27 17:23:18 CST 2011


   Mark,

     The mpiuni MPI_Comm_create() didn't increase the reference count for the comm, hence it was freed too soon.

     Also, I removed your use of PetscCommDuplicate(); there is no reason to call it before passing the communicator in to create a PETSc object, you can just pass any old MPI_Comm and PETSc does the right thing. The only reason to call PetscCommDuplicate() is if you are going to do MPI message passing on the communicator yourself and it may not be a PETSc communicator.
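
     A minimal sketch of the distinction (the function names here are just for illustration, not from your code):

     #include <petscmat.h>

     /* Just hand whatever communicator you have to the PETSc constructor;
        PETSc attaches (or reuses) its own inner communicator behind the scenes. */
     PetscErrorCode CreateMatOnComm(MPI_Comm user_comm, PetscInt n, Mat *A)
     {
       PetscErrorCode ierr;
       ierr = MatCreate(user_comm, A); CHKERRQ(ierr);
       ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, n, n); CHKERRQ(ierr);
       ierr = MatSetFromOptions(*A); CHKERRQ(ierr);
       return 0;
     }

     /* Only when you want to do raw MPI message passing on a communicator that
        may not already be a PETSc communicator do you duplicate it first. */
     PetscErrorCode RawMessagePassing(MPI_Comm user_comm)
     {
       PetscErrorCode ierr;
       MPI_Comm       pcomm;
       PetscMPIInt    tag;
       ierr = PetscCommDuplicate(user_comm, &pcomm, &tag); CHKERRQ(ierr);
       /* ... MPI_Send()/MPI_Recv() on pcomm using tag ... */
       ierr = PetscCommDestroy(&pcomm); CHKERRQ(ierr);
       return 0;
     }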

    Let me know if there are additional problems,


   Barry

On Nov 23, 2011, at 5:47 PM, Mark F. Adams wrote:

> 
> On Nov 23, 2011, at 6:10 PM, Barry Smith wrote:
> 
>> 
>> On Nov 23, 2011, at 3:14 PM, Mark F. Adams wrote:
>> 
>>> I fixed my code error with valgrind and now I see the same thing (appended) as before (i.e., valgrind does not seem to help).
>>> 
>>> I suspect that MatConvert is messed up, but I will try not deleting this matrix and see what happens.
>>> 
>>> 
>>> ==69836== Conditional jump or move depends on uninitialised value(s)
>>> ==69836==    at 0x1E8A5DE: ATL_dgemvT_a1_x1_b0_y1 (in /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib)
>>> ==69836== 
>>> ==69836== Conditional jump or move depends on uninitialised value(s)
>>> ==69836==    at 0x1EBAC7A: ATL_dger1_a1_x1_yX (in /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib)
>>> ==69836== 
>>> [0]
>> 
>>  Yes it pisses me off to no end that Apple got suckered into using that ATLAS crap and it is buggy with basic memory errors and Jack doesn't give a shit.
>> 
> 
> Wow ....
> 
> So I fixed the problem.  The errors happened when calling the repartitioning code w/o MPI.  The code works fine on one processor with the MPI version.  So I think there is a bug in PETSc having to do with MatConvert or the partitioning code w/o MPI.
> 
> I have pushed a fix today, but if you really want to debug it ... I'm sure that any no-MPI test using GAMG would trip this.  All that needs to change is line 191 of gamg.c:
> 
> if( s_avoid_repart || npe==1 ) { 
> 
> change it to 
> 
> if( s_avoid_repart ) { 
> 
> This will run the partitioning code even on one processor.
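> 
> As an illustration (not a specific test I have set up), on a --with-mpi=0 build something like
> 
>   ./ex2 -pc_type gamg -ksp_monitor
> 
> from src/ksp/ksp/examples/tutorials should then exercise the partitioning path on one process; any KSP example that lets you pick the PC from the command line would do.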
> 
> Let me know if I can help.
> 
> Thanks,
> Mark
> 
> 
>> 
>>   If you can tell us how to run a code (or push an example and tell us how to run it) that reproduces this, we can also debug it to see what the problem is.
>> 
>>  Barry
>> 
>>> 
>>> On Nov 23, 2011, at 2:01 PM, Barry Smith wrote:
>>> 
>>>> 
>>>> On Nov 23, 2011, at 10:31 AM, Mark F. Adams wrote:
>>>> 
>>>>> Yep, need to keep the extra include with uni, but not the lib anymore.
>>>>> 
>>>>> Now I get this error; any ideas?  Apparently I need a uni regression test.
>>>> 
>>>> After you do the "make all test" on PETSc, do "make alltests" and it should run all the tests that run on one process.
>>>> 
>>>> Something is likely wrong with your reference counting here. Does this work with --with-debugging=0? I would run the debug version under valgrind first, then the debug version in a debugger.
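>>>> 
>>>> For example (adjust for your own executable and options), something like
>>>> 
>>>>   valgrind -q --tool=memcheck ./viscousTensorSolve2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.ex <your options>
>>>> 
>>>> on a --with-debugging=1 build will usually point at the first bad access.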
>>>> 
>>>> Barry
>>>> 
>>>> 
>>>>> 
>>>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>>>>> [0]PETSC ERROR: Corrupt argument:
>>>>> see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind!
>>>>> [0]PETSC ERROR: Inner MPI_Comm does not have expected reference to outer comm!
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: Petsc Development HG revision: 3dda30872eeec1a2ea5c130f267935c5e0b2534a  HG Date: Mon Nov 21 11:19:39 2011 -0800
>>>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>>>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>>>>> [0]PETSC ERROR: See docs/index.html for manual pages.
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: ./viscousTensorSolve2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.ex on a arch-maco named madams-macbk-3.local by markadams Wed Nov 23 11:25:30 2011
>>>>> [0]PETSC ERROR: Libraries linked from /Users/markadams/Codes/petsc-dev/arch-macosx-gnu-seq/lib
>>>>> [0]PETSC ERROR: Configure run at Tue Nov 22 12:41:46 2011
>>>>> [0]PETSC ERROR: Configure options CXX=/sw/lib/gcc4.4/bin/g++-4 CC=/sw/lib/gcc4.4/bin/gcc-4 FC=/sw/lib/gcc4.4/bin/gfortran --with-x=0 --with-clanguage=c++ --with-debugging=0 --with-mpi=0 PETSC_ARCH=arch-macosx-gnu-seq
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: Petsc_DelComm() line 448 in /Users/markadams/Codes/petsc-dev/src/sys/objects/pinit.c
>>>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>>>>> [0]PETSC ERROR: Corrupt argument:
>>>>> see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind!
>>>>> [0]PETSC ERROR: MPI_Comm does not have tag/name counter nor does it have inner MPI_Comm!
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: Petsc Development HG revision: 3dda30872eeec1a2ea5c130f267935c5e0b2534a  HG Date: Mon Nov 21 11:19:39 2011 -0800
>>>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>>>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>>>>> [0]PETSC ERROR: See docs/index.html for manual pages.
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: ./viscousTensorSolve2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.ex on a arch-maco named madams-macbk-3.local by markadams Wed Nov 23 11:25:30 2011
>>>>> [0]PETSC ERROR: Libraries linked from /Users/markadams/Codes/petsc-dev/arch-macosx-gnu-seq/lib
>>>>> [0]PETSC ERROR: Configure run at Tue Nov 22 12:41:46 2011
>>>>> [0]PETSC ERROR: Configure options CXX=/sw/lib/gcc4.4/bin/g++-4 CC=/sw/lib/gcc4.4/bin/gcc-4 FC=/sw/lib/gcc4.4/bin/gfortran --with-x=0 --with-clanguage=c++ --with-debugging=0 --with-mpi=0 PETSC_ARCH=arch-macosx-gnu-seq
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: PetscCommDestroy() line 229 in /Users/markadams/Codes/petsc-dev/src/sys/objects/tagm.c
>>>>> [0]PETSC ERROR: PetscHeaderDestroy_Private() line 110 in /Users/markadams/Codes/petsc-dev/src/sys/objects/inherit.c
>>>>> [0]PETSC ERROR: MatDestroy() line 1045 in /Users/markadams/Codes/petsc-dev/src/mat/interface/matrix.c
>>>>> [0]PETSC ERROR: partitionLevel() line 347 in /Users/markadams/Codes/petsc-dev/src/ksp/pc/impls/gamg/gamg.c
>>>>> [0]PETSC ERROR: PCSetUp_GAMG() line 599 in /Users/markadams/Codes/petsc-dev/src/ksp/pc/impls/gamg/gamg.c
>>>>> [0]PETSC ERROR: PCSetUp() line 819 in /Users/markadams/Codes/petsc-dev/src/ksp/pc/interface/precon.c
>>>>> [0]PETSC ERROR: KSPSetUp() line 261 in /Users/markadams/Codes/petsc-dev/src/ksp/ksp/interface/itfunc.c
>>>>> [0]PETSC ERROR: solveprivate() line 191 in "unknowndirectory/"PetscLinearSolverI.H
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Nov 22, 2011, at 3:19 PM, Barry Smith wrote:
>>>>> 
>>>>>> 
>>>>>> You have some other MPI include file hanging around that is getting picked up before the MPI Uni one. 
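>>>>>> 
>>>>>> One quick way to check (just a suggestion): add -H to the failing compile line and gcc will print every header it pulls in, so you can see which mpi.h is being found first.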
>>>>>> 
>>>>>> Barry
>>>>>> 
>>>>>> On Nov 22, 2011, at 12:00 PM, Mark F. Adams wrote:
>>>>>> 
>>>>>>> Scorched earth did not work:
>>>>>>> 
>>>>>>> "VecCreate(ompi_communicator_t*, _p_Vec**)", referenced from:
>>>>>>> PetscLinearSolver<LevelData<FArrayBox> >::solveprivate(LevelData<FArrayBox>&, LevelData<FArrayBox> const&) in libamrelliptic2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.a(PetscLinearSolver.o)
>>>>>>> "_ompi_mpi_comm_self", referenced from:
>>>>>>> _ompi_mpi_comm_self$non_lazy_ptr in libamrelliptic2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.a(PetscLinearSolver.o)
>>>>>>> 
>>>>>>> <configure.log.gz>
>>>>>>> 
>>>>>>> On Nov 22, 2011, at 11:47 AM, Satish Balay wrote:
>>>>>>> 
>>>>>>>> Something is strange.
>>>>>>>> 
>>>>>>>> I'd suggest
>>>>>>>> 
>>>>>>>> rm -rf externalpackages arch-macosx-gnu-seq
>>>>>>>> 
>>>>>>>> and then doing a fresh build.
>>>>>>>> 
>>>>>>>> And then verifying the build with 'make test'
>>>>>>>> 
>>>>>>>> If 'make test' works - and the application build is not working - I'd
>>>>>>>> like to see the complete compile log that's producing this error.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> wrt libmpiuni.a - all the relevant stuff is added to libpetsc.a and a separate
>>>>>>>> libmpiuni.a is not created.
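>>>>>>>> 
>>>>>>>> So with a --with-mpi=0 build, linking the petsc library itself should be enough for the MPI stubs (assuming your link line already pulls in -lpetsc); there is no separate -lmpiuni to add.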
>>>>>>>> 
>>>>>>>> Satish
>>>>>>>> 
>>>>>>>> On Tue, 22 Nov 2011, Mark F. Adams wrote:
>>>>>>>> 
>>>>>>>>> I'm trying to build a non-MPI code and get this error:
>>>>>>>>> 
>>>>>>>>> Undefined symbols:
>>>>>>>>> "_ompi_mpi_comm_self", referenced from:
>>>>>>>>> _ompi_mpi_comm_self$non_lazy_ptr in libamrelliptic2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.a(PetscLinearSolver.o)
>>>>>>>>> (maybe you meant: _ompi_mpi_comm_self$non_lazy_ptr)
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> In the past I have linked libmpiuni.a, but it did not seem to get created this time.  Did libmpiuni.a go away or did my make fail ...
>>>>>>>>> 
>>>>>>>>> Mark
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
> 



