[petsc-users] Petsc has generated inconsistent data!
Dominik Szczerba
dominik at itis.ethz.ch
Fri Aug 26 16:51:38 CDT 2011
Valgrind excerpts below, none seems related, and both are there also with gmres.
I just cant believe bicgstab would quit on such a trivial problem...
Regards,
Dominik
==10423== Syscall param writev(vector[...]) points to uninitialised byte(s)
==10423== at 0x6B5E789: writev (writev.c:56)
==10423== by 0x515658: MPIDU_Sock_writev (sock_immed.i:610)
==10423== by 0x517943: MPIDI_CH3_iSendv (ch3_isendv.c:84)
==10423== by 0x4FB756: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:509)
==10423== by 0x4FD6E4: MPID_Isend (mpid_isend.c:118)
==10423== by 0x4E502A: MPIC_Isend (helper_fns.c:210)
==10423== by 0xED701D: MPIR_Alltoall (alltoall.c:420)
==10423== by 0xED7880: PMPI_Alltoall (alltoall.c:685)
==10423== by 0xE4BC95: SetUp__ (setup.c:122)
==10423== by 0xE4C4E4: PartitionSmallGraph__ (weird.c:39)
==10423== by 0xE497D0: ParMETIS_V3_PartKway (kmetis.c:131)
==10423== by 0x71B774: MatPartitioningApply_Parmetis (pmetis.c:97)
==10423== by 0x717D70: MatPartitioningApply (partition.c:236)
==10423== by 0x5287D7: PetscSolver::LoadMesh(std::string const&)
(PetscSolver.cxx:676)
==10423== by 0x4C6157: CD3T10_USER::ProcessInputFile()
(cd3t10mpi_main.cxx:321)
==10423== by 0x4C3B57: main (cd3t10mpi_main.cxx:568)
==10423== Address 0x71ce9b4 is 4 bytes inside a block of size 72 alloc'd
==10423== at 0x4C28FAC: malloc (vg_replace_malloc.c:236)
==10423== by 0xE5C792: GKmalloc__ (util.c:151)
==10423== by 0xE57C09: PreAllocateMemory__ (memory.c:38)
==10423== by 0xE496B3: ParMETIS_V3_PartKway (kmetis.c:116)
==10423== by 0x71B774: MatPartitioningApply_Parmetis (pmetis.c:97)
==10423== by 0x717D70: MatPartitioningApply (partition.c:236)
==10423== by 0x5287D7: PetscSolver::LoadMesh(std::string const&)
(PetscSolver.cxx:676)
==10423== by 0x4C6157: CD3T10_USER::ProcessInputFile()
(cd3t10mpi_main.cxx:321)
==10423== by 0x4C3B57: main (cd3t10mpi_main.cxx:568)
==10423==
==10423== Conditional jump or move depends on uninitialised value(s)
==10423== at 0x55B6510: inflateReset2 (in
/lib/x86_64-linux-gnu/libz.so.1.2.3.4)
==10423== by 0x55B6605: inflateInit2_ (in
/lib/x86_64-linux-gnu/libz.so.1.2.3.4)
==10423== by 0x5308C13: H5Z_filter_deflate (H5Zdeflate.c:110)
==10423== by 0x5308170: H5Z_pipeline (H5Z.c:1103)
==10423== by 0x518BB69: H5D_chunk_lock (H5Dchunk.c:2758)
==10423== by 0x518CB20: H5D_chunk_read (H5Dchunk.c:1728)
==10423== by 0x519BDDA: H5D_read (H5Dio.c:447)
==10423== by 0x519C248: H5Dread (H5Dio.c:173)
==10423== by 0x4CEB99: HDF5::HDF5Reader::readData(std::string
const&) (HDF5Reader.cxx:634)
==10423== by 0x4CE305: HDF5::HDF5Reader::read(std::vector<int,
std::allocator<int> >&, std::string const&) (HDF5Reader.cxx:527)
==10423== by 0x4C71A3: CD3T10_USER::SetupConstraints()
(cd3t10mpi_main.cxx:404)
==10423== by 0x4BA02A: CD3T10::Solve() (CD3T10mpi.cxx:658)
==10423== by 0x4C3BE1: main (cd3t10mpi_main.cxx:590)
==10423==
On Fri, Aug 26, 2011 at 11:31 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> First run with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind to make sure it does not have some "code bug cause".
>
> Do you get the same message on one process.
>
> My previous message still holds, if it is not a "code bug" then bi-CG-stab is just breaking down on that matrix and preconditioner combination.
>
> Barry
>
> On Aug 26, 2011, at 4:27 PM, Dominik Szczerba wrote:
>
>> Later in the message he only requested that I use "-ksp_norm_type
>> unpreconditioned". So I did, and the error comes back, now fully
>> documented below. As I wrote, it works fine with gmres, and the
>> problem is very simple, diagonal dominant steady state diffusion.
>>
>> Any hints are highly appreciated.
>>
>> Dominik
>>
>> #PETSc Option Table entries:
>> -ksp_converged_reason
>> -ksp_converged_use_initial_residual_norm
>> -ksp_monitor_true_residual
>> -ksp_norm_type unpreconditioned
>> -ksp_rtol 1e-3
>> -ksp_type bcgs
>> -ksp_view
>> -log_summary
>> -pc_type jacobi
>> #End of PETSc Option Table entries
>> 0 KSP preconditioned resid norm 1.166190378969e+01 true resid norm
>> 1.166190378969e+01 ||Ae||/||Ax|| 1.000000000000e+00
>> 1 KSP preconditioned resid norm 5.658835826231e-01 true resid norm
>> 5.658835826231e-01 ||Ae||/||Ax|| 4.852411688762e-02
>> [0]PETSC ERROR: --------------------- Error Message
>> ------------------------------------
>> [0]PETSC ERROR: Petsc has generated inconsistent data!
>> [0]PETSC ERROR: Divide by zero!
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [2]PETSC ERROR: --------------------- Error Message
>> ------------------------------------
>> [2]PETSC ERROR: Petsc has generated inconsistent data!
>> [2]PETSC ERROR: Divide by zero!
>> [2]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [2]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
>> 13:37:48 CDT 2011
>> [2]PETSC ERROR: See docs/changes/index.html for recent updates.
>> [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> [2]PETSC ERROR: See docs/index.html for manual pages.
>> [2]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [2]PETSC ERROR: /home/dsz/build/framework-debug/trunk/bin/cd3t10mpi on
>> a linux-gnu named tharsis by dsz Fri Aug 26 23:23:47 2011
>> [2]PETSC ERROR: Libraries linked from
>> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
>> [2]PETSC ERROR: Configure run at Mon Jul 25 14:20:10 2011
>> [2]PETSC ERROR: Configure options
>> PETSC_DIR=/home/dsz/pack/petsc-3.1-p8 PETSC_ARCH=linux-gnu-c-debug
>> --download-f-blas-lapack=CD3T10::SaveSolution()
>> [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
>> 13:37:48 CDT 2011
>> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> [0]PETSC ERROR: See docs/index.html for manual pages.
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: /home/dsz/build/framework-debug/trunk/bin/cd3t10mpi on
>> a linux-gnu named tharsis by dsz Fri Aug 26 23:23:47 2011
>> [0]PETSC ERROR: Libraries linked from
>> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
>> [0]PETSC ERROR: Configure run at Mon Jul 25 14:20:10 2011
>> [0]PETSC ERROR: Configure options
>> PETSC_DIR=/home/dsz/pack/petsc-3.1-p8 PETSC_ARCH=linux-gnu-c-debug
>> --download-f-blas-lapack=1 --download-mpich=1 --download-hypre=1
>> --with-parmetis=1 --download-parmetis=1 --with-x=0 --with-debugging=1
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: KSPSolve_BCGS() line 75 in
>> src/ksp/ksp/impls/bcgs/[1]PETSC ERROR: --------------------- Error
>> Message ------------------------------------
>> [1]PETSC ERROR: Petsc has generated inconsistent data!
>> [1]PETSC ERROR: Divide by zero!
>> [1]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
>> 13:37:48 CDT 2011
>> [1]PETSC ERROR: See docs/changes/index.html for recent updates.
>> [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> [1]PETSC ERROR: See docs/index.html for manual pages.
>> [1]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [1]PETSC ERROR: /home/dsz/build/framework-debug/trunk/bin/cd3t10mpi on
>> a linux-gnu named tharsis by dsz Fri Aug 26 23:23:47 2011
>> [1]PETSC ERROR: Libraries linked from
>> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
>> [1]PETSC ERROR: Configure run at Mon Jul 25 14:20:10 2011
>> [1]PETSC ERROR: Configure options
>> PETSC_DIR=/home/dsz/pack/petsc-3.1-p8 PETSC_ARCH=linux-gnu-c-debug
>> --download-f-blas-lapack=1 --download-mpich=1 --download-hypre=1
>> --with-parmetis=1 --download-parmetis=1 --with-x=0 --with-debugging=1
>> [2]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [2]PETSC ERROR: KSPSolve_BCGS() line 75 in src/ksp/ksp/impls/bcgs/bcgs.c
>> [2]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c
>> [2]PETSC ERROR: User provided function() line 1215 in
>> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/PetscSolver.cxx
>> [3]PETSC ERROR: --------------------- Error Message
>> ------------------------------------
>> [3]PETSC ERROR: Petsc has generated inconsistent data!
>> [3]PETSC ERROR: Divide by zero!
>> [3]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [3]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
>> 13:37:48 CDT 2011
>> [3]PETSC ERROR: See docs/changes/index.html for recent updates.
>> [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>> [3]PETSC ERROR: See docs/index.html for manual pages.
>> [3]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [3]PETSC ERROR: /home/dsz/build/framework-debug/trunk/bin/cd3t10mpi on
>> a linux-gnu named tharsis by dsz Fri Aug 26 23:23:47 2011
>> [3]PETSC ERROR: Libraries linked from
>> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
>> [3]PETSC ERROR: Configure run at Mon Jul 25 14:20:10 2011
>> [3]PETSC ERROR: Configure options
>> PETSC_DIR=/home/dsz/pack/petsc-3.1-p8 PETSC_ARCH=linux-gnu-c-debug
>> --download-f-blas-lapack=bcgs.c
>> [0]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c
>> [0]PETSC ERROR: User provided function() line 1215 in
>> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/PetscSolver.cxx
>> 1 --download-mpich=1 --download-hypre=1 --with-parmetis=1
>> --download-parmetis=1 --with-x=0 --with-debugging=1
>> [1]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [1]PETSC ERROR: KSPSolve_BCGS() line 75 in src/ksp/ksp/impls/bcgs/bcgs.c
>> [1]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c
>> [1]PETSC ERROR: User provided function() line 1215 in
>> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/PetscSolver.cxx
>> 1 --download-mpich=1 --download-hypre=1 --with-parmetis=1
>> --download-parmetis=1 --with-x=0 --with-debugging=1
>> [3]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [3]PETSC ERROR: KSPSolve_BCGS() line 75 in src/ksp/ksp/impls/bcgs/bcgs.c
>> [3]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c
>> [3]PETSC ERROR: User provided function() line 1215 in
>> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/PetscSolver.cxx
>> PetscSolver::Finalize()
>> PetscSolver::FinalizePetsc()
>>
>>
>> On Fri, Aug 26, 2011 at 11:17 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>
>>> Are you sure that is the entire error message. It should print the routine and the line number where this happens.
>>>
>>> Likely it is at
>>>
>>> do {
>>> ierr = VecDot(R,RP,&rho);CHKERRQ(ierr); /* rho <- (r,rp) */
>>> beta = (rho/rhoold) * (alpha/omegaold);
>>> ierr = VecAXPBYPCZ(P,1.0,-omegaold*beta,beta,R,V);CHKERRQ(ierr); /* p <- r - omega * beta* v + beta * p */
>>> ierr = KSP_PCApplyBAorAB(ksp,P,V,T);CHKERRQ(ierr); /* v <- K p */
>>> ierr = VecDot(V,RP,&d1);CHKERRQ(ierr);
>>> if (d1 == 0.0) SETERRQ(((PetscObject)ksp)->comm,PETSC_ERR_PLIB,"Divide by zero");
>>> alpha = rho / d1; /* a <- rho / (v,rp) */
>>>
>>> Which means bi-cg-stab has broken down. You'll need to consult references to Bi-CG-stab to see why this might happen (it can happen while GMRES is happy). It may be KSPBCGSL can proceed past this point with a problem values for
>>> Options Database Keys:
>>> + -ksp_bcgsl_ell <ell> Number of Krylov search directions
>>> - -ksp_bcgsl_cxpol Use a convex function of the MR and OR polynomials after the BiCG step
>>> - -ksp_bcgsl_xres <res> Threshold used to decide when to refresh computed residuals
>>>
>>> but most likely the preconditioner or matrix is bogus in some way since I think Bi-CG-stab rarely breaks down in practice.
>>>
>>>
>>> Barry
>>>
>>>
>>>
>>> On Aug 26, 2011, at 4:05 PM, Dominik Szczerba wrote:
>>>
>>>> When solving my linear system with -ksp_type bcgs I get:
>>>>
>>>> 0 KSP preconditioned resid norm 1.166190378969e+01 true resid norm
>>>> 1.166190378969e+01 ||Ae||/||Ax|| 1.000000000000e+00
>>>> 1 KSP preconditioned resid norm 5.658835826231e-01 true resid norm
>>>> 5.658835826231e-01 ||Ae||/||Ax|| 4.852411688762e-02
>>>> [1]PETSC ERROR: --------------------- Error Message
>>>> ------------------------------------
>>>> [1]PETSC ERROR: Petsc has generated inconsistent data!
>>>> [1]PETSC ERROR: Divide by zero!
>>>> [1]PETSC ERROR:
>>>> ------------------------------------------------------------------------
>>>> [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
>>>> 13:37:48 CDT 2011
>>>> [1]PETSC ERROR: See docs/changes/index.html for recent updates.
>>>> [1]PETSC ERROR: [3]PETSC ERROR: --------------------- Error Message
>>>> ------------------------------------
>>>> [3]PETSC ERROR: Petsc has generated inconsistent data!
>>>> [3]PETSC ERROR: Divide by zero!
>>>>
>>>> PETSc Option Table entries:
>>>> -ksp_converged_reason
>>>> -ksp_monitor
>>>> -ksp_monitor_true_residual
>>>> -ksp_norm_type unpreconditioned
>>>> -ksp_right_pc
>>>> -ksp_rtol 1e-3
>>>> -ksp_type bcgs
>>>> -ksp_view
>>>> -log_summary
>>>> -pc_type jacobi
>>>>
>>>> When solving the same system with GMRES all works fine. This is a
>>>> simple test diffusion problem. How can I find out what the problem is?
>>>>
>>>> Dominik
>>>
>>>
>
>
More information about the petsc-users
mailing list