[petsc-dev] uni
Barry Smith
bsmith at mcs.anl.gov
Sun Nov 27 17:23:18 CST 2011
Mark,
The mpiuni MPI_Comm_create() didn't increase the reference count for the comm, hence it was freed too soon.
Also, I removed your use of PetscCommDuplicate(); there is no reason to call it before passing a communicator in to create a PETSc object. You can just pass any old MPI_Comm and PETSc does the right thing. The only reason to call PetscCommDuplicate() is if you are going to do MPI message passing on the communicator and it may not be a PETSc communicator.
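For example, here is a minimal sketch of the two cases (illustrative code only, not from your application; it assumes current petsc-dev calling conventions):

  #include <petscmat.h>

  int main(int argc,char **argv)
  {
    Mat            A;
    MPI_Comm       pcomm;
    PetscMPIInt    tag;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);

    /* Case 1: creating a PETSc object. Pass any MPI_Comm directly;
       PETSc attaches/duplicates its inner communicator as needed. */
    ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
    ierr = MatDestroy(&A);CHKERRQ(ierr);

    /* Case 2: doing your own MPI message passing on a communicator that
       may not be a PETSc communicator. Duplicate first so your message
       tags cannot collide with PETSc's. */
    ierr = PetscCommDuplicate(PETSC_COMM_WORLD,&pcomm,&tag);CHKERRQ(ierr);
    /* ... your MPI_Send()/MPI_Recv() on pcomm using tag ... */
    ierr = PetscCommDestroy(&pcomm);CHKERRQ(ierr);

    ierr = PetscFinalize();
    return 0;
  }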
Let me know if there are additional problems,
Barry
On Nov 23, 2011, at 5:47 PM, Mark F. Adams wrote:
>
> On Nov 23, 2011, at 6:10 PM, Barry Smith wrote:
>
>>
>> On Nov 23, 2011, at 3:14 PM, Mark F. Adams wrote:
>>
>>> I fixed my code error with valgrind and now I see the same thing (appended) as before (i.e., valgrind does not seem to help).
>>>
>>> I suspect that MatConvert is messed up, but I will try not deleting this matrix and see what happens.
>>>
>>>
>>> ==69836== Conditional jump or move depends on uninitialised value(s)
>>> ==69836== at 0x1E8A5DE: ATL_dgemvT_a1_x1_b0_y1 (in /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib)
>>> ==69836==
>>> ==69836== Conditional jump or move depends on uninitialised value(s)
>>> ==69836== at 0x1EBAC7A: ATL_dger1_a1_x1_yX (in /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib)
>>> ==69836==
>>> [0]
>>
>> Yes, it pisses me off to no end that Apple got suckered into using that ATLAS crap; it is buggy with basic memory errors and Jack doesn't give a shit.
>>
>
> Wow ....
>
> So I fixed the problem. The errors happened when calling the repartitioning code w/o MPI. The code works fine on one processor with the MPI version. So I think there is a bug in PETSc having to do with MatConvert or the partitioning code w/o MPI.
>
> I have pushed a fix today, but if you really want to debug it ... I'm sure that any non-MPI test using GAMG would trip this. All that needs to change is on line 191 of gamg.c:
>
> if( s_avoid_repart || npe==1 ) {
>
> change it to
>
> if( s_avoid_repart ) {
>
> This will run the partitioning code even on one processor.
>
> Let me know if I can help.
>
> Thanks,
> Mark
>
>
>>
>> If you can tell us how to run a code (or push an example and tell us about it) that reproduces this, we can also debug it to see what the problem is.
>>
>> Barry
>>
>>>
>>> On Nov 23, 2011, at 2:01 PM, Barry Smith wrote:
>>>
>>>>
>>>> On Nov 23, 2011, at 10:31 AM, Mark F. Adams wrote:
>>>>
>>>>> Yep, need to keep the extra include with uni, but not the lib anymore.
>>>>>
>>>>> Now I get this error; any ideas? Apparently I need a uni regression test.
>>>>
>>>> After you do the "make all test" on PETSc, do "make alltests" and it should run all the tests that run on one process.
>>>>
>>>> Something is likely wrong with your reference counting here. Does this work with --with-debugging=0? I would run the debug version with valgrind first, then the debug version in the debugger.
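>>>>
>>>> For example, a typical valgrind invocation looks like this (the
>>>> executable name is just a placeholder):
>>>>
>>>> valgrind -q --tool=memcheck ./myapp.ex -malloc off
>>>>
>>>> (-malloc off turns off PETSc's own malloc wrapper so valgrind sees
>>>> the raw allocations.)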
>>>>
>>>> Barry
>>>>
>>>>
>>>>>
>>>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>>>>> [0]PETSC ERROR: Corrupt argument:
>>>>> see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind!
>>>>> [0]PETSC ERROR: Inner MPI_Comm does not have expected reference to outer comm!
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: Petsc Development HG revision: 3dda30872eeec1a2ea5c130f267935c5e0b2534a HG Date: Mon Nov 21 11:19:39 2011 -0800
>>>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>>>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>>>>> [0]PETSC ERROR: See docs/index.html for manual pages.
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: ./viscousTensorSolve2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.ex on a arch-maco named madams-macbk-3.local by markadams Wed Nov 23 11:25:30 2011
>>>>> [0]PETSC ERROR: Libraries linked from /Users/markadams/Codes/petsc-dev/arch-macosx-gnu-seq/lib
>>>>> [0]PETSC ERROR: Configure run at Tue Nov 22 12:41:46 2011
>>>>> [0]PETSC ERROR: Configure options CXX=/sw/lib/gcc4.4/bin/g++-4 CC=/sw/lib/gcc4.4/bin/gcc-4 FC=/sw/lib/gcc4.4/bin/gfortran --with-x=0 --with-clanguage=c++ --with-debugging=0 --with-mpi=0 PETSC_ARCH=arch-macosx-gnu-seq
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: Petsc_DelComm() line 448 in /Users/markadams/Codes/petsc-dev/src/sys/objects/pinit.c
>>>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>>>>> [0]PETSC ERROR: Corrupt argument:
>>>>> see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind!
>>>>> [0]PETSC ERROR: MPI_Comm does not have tag/name counter nor does it have inner MPI_Comm!
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: Petsc Development HG revision: 3dda30872eeec1a2ea5c130f267935c5e0b2534a HG Date: Mon Nov 21 11:19:39 2011 -0800
>>>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>>>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>>>>> [0]PETSC ERROR: See docs/index.html for manual pages.
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: ./viscousTensorSolve2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.ex on a arch-maco named madams-macbk-3.local by markadams Wed Nov 23 11:25:30 2011
>>>>> [0]PETSC ERROR: Libraries linked from /Users/markadams/Codes/petsc-dev/arch-macosx-gnu-seq/lib
>>>>> [0]PETSC ERROR: Configure run at Tue Nov 22 12:41:46 2011
>>>>> [0]PETSC ERROR: Configure options CXX=/sw/lib/gcc4.4/bin/g++-4 CC=/sw/lib/gcc4.4/bin/gcc-4 FC=/sw/lib/gcc4.4/bin/gfortran --with-x=0 --with-clanguage=c++ --with-debugging=0 --with-mpi=0 PETSC_ARCH=arch-macosx-gnu-seq
>>>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>>>> [0]PETSC ERROR: PetscCommDestroy() line 229 in /Users/markadams/Codes/petsc-dev/src/sys/objects/tagm.c
>>>>> [0]PETSC ERROR: PetscHeaderDestroy_Private() line 110 in /Users/markadams/Codes/petsc-dev/src/sys/objects/inherit.c
>>>>> [0]PETSC ERROR: MatDestroy() line 1045 in /Users/markadams/Codes/petsc-dev/src/mat/interface/matrix.c
>>>>> [0]PETSC ERROR: partitionLevel() line 347 in /Users/markadams/Codes/petsc-dev/src/ksp/pc/impls/gamg/gamg.c
>>>>> [0]PETSC ERROR: PCSetUp_GAMG() line 599 in /Users/markadams/Codes/petsc-dev/src/ksp/pc/impls/gamg/gamg.c
>>>>> [0]PETSC ERROR: PCSetUp() line 819 in /Users/markadams/Codes/petsc-dev/src/ksp/pc/interface/precon.c
>>>>> [0]PETSC ERROR: KSPSetUp() line 261 in /Users/markadams/Codes/petsc-dev/src/ksp/ksp/interface/itfunc.c
>>>>> [0]PETSC ERROR: solveprivate() line 191 in "unknowndirectory/"PetscLinearSolverI.H
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Nov 22, 2011, at 3:19 PM, Barry Smith wrote:
>>>>>
>>>>>>
>>>>>> You have some other MPI include file hanging around that is getting picked up before the MPIUni one.
>>>>>>
>>>>>> Barry
>>>>>>
>>>>>> On Nov 22, 2011, at 12:00 PM, Mark F. Adams wrote:
>>>>>>
>>>>>>> Scorched earth did not work:
>>>>>>>
>>>>>>> "VecCreate(ompi_communicator_t*, _p_Vec**)", referenced from:
>>>>>>> PetscLinearSolver<LevelData<FArrayBox> >::solveprivate(LevelData<FArrayBox>&, LevelData<FArrayBox> const&) in libamrelliptic2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.a(PetscLinearSolver.o)
>>>>>>> "_ompi_mpi_comm_self", referenced from:
>>>>>>> _ompi_mpi_comm_self$non_lazy_ptr in libamrelliptic2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.a(PetscLinearSolver.o)
>>>>>>>
>>>>>>> <configure.log.gz>
>>>>>>>
>>>>>>> On Nov 22, 2011, at 11:47 AM, Satish Balay wrote:
>>>>>>>
>>>>>>>> Something is strange.
>>>>>>>>
>>>>>>>> I'd suggest
>>>>>>>>
>>>>>>>> rm -rf externalpackages arch-macosx-gnu-seq
>>>>>>>>
>>>>>>>> and then doing a fresh build.
>>>>>>>>
>>>>>>>> And then verifying the build with 'make test'
>>>>>>>>
>>>>>>>> If 'make test' works and the application build is not working,
>>>>>>>> I'd like to see the complete compile log that's producing this error.
>>>>>>>>
>>>>>>>>
>>>>>>>> wrt libmpiuni.a: all the relevant stuff is added to libpetsc.a, and a
>>>>>>>> separate libmpiuni.a is not created.
>>>>>>>>
>>>>>>>> Satish
>>>>>>>>
>>>>>>>> On Tue, 22 Nov 2011, Mark F. Adams wrote:
>>>>>>>>
>>>>>>>>> I'm trying to build a non-MPI code and get this error:
>>>>>>>>>
>>>>>>>>> Undefined symbols:
>>>>>>>>> "_ompi_mpi_comm_self", referenced from:
>>>>>>>>> _ompi_mpi_comm_self$non_lazy_ptr in libamrelliptic2d.Darwin.g++-4.gfortran.DEBUG.OPT.PETSC.a(PetscLinearSolver.o)
>>>>>>>>> (maybe you meant: _ompi_mpi_comm_self$non_lazy_ptr)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In the past I have linked against libmpiuni.a, but that did not seem to get created. Did libmpiuni.a go away, or did my make fail ...
>>>>>>>>>
>>>>>>>>> Mark
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>