[petsc-users] problem with initial value
Barry Smith
bsmith at mcs.anl.gov
Sat Oct 22 12:38:42 CDT 2011
On Oct 22, 2011, at 11:05 AM, Dominik Szczerba wrote:
> After upgrade to 3.2 the mentioned valgrind issues were still there
> (except for the first one, related to partitioning). However, I seem
> to be able to find the cause for them, which is NOT updating the ghost
> values in x BEFORE kspsolve, only AFTER. That way my coefficient
> matrix, depending on 'x', and obviously assembled before kspsolve,
> contained uninitialized values. After fixing the issue, the bcgs solver
> behaves as expected, like the other solvers. I am relieved the issue
> was on my side.
>
> However, bcgs and only bcgs will occasionally break down (inconsistent
> state, division by zero) if the initial solution is exactly zero.
> Pre-filling it with something small (compared to the expected
> solution) fixes the breakdown. This small issue, however, still
> bothers me a bit: what's so special about zero?
>
We'd need an example code that reproduces this. You could use -ksp_view_binary to generate a binaryoutput file and send it to petsc-maint at mcs.anl.gov.
Barry
> Many thanks for your valuable support,
> Dominik
>
> On Fri, Oct 21, 2011 at 7:57 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>
>> On Oct 21, 2011, at 11:57 AM, Dominik Szczerba wrote:
>>
>>> On Fri, Oct 21, 2011 at 6:29 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>
>>>> On Oct 21, 2011, at 9:29 AM, Dominik Szczerba wrote:
>>>>
>>>>> I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero).
>>>>> For the longest time, however, I was getting wrong results, unless I was resetting 'x' each time step (to some constant value, pure zero caused bcgs to break down).
>>>>
>>>> What happened if you did not set it to some constant (that is, kept the old solution)? Did you get KSP_DIVERGED_BREAKDOWN? It would be very odd for a good initial guess to lead to breakdown, but that cannot be completely ruled out.
>>>>
>>>
>>> There was no error, the iterations reportedly converged. Only the
>>> results were wrong, sort of strong random spikes.
>>>
>>>>
>>>> I would also check with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind
>>>>
>>>
>>> There are 3 issues with valgrind:
>>>
>>> 1) Syscall param writev(vector[...]) points to uninitialised byte(s)
>>> -> triggered by MatPartitioningApply, then leading deep into ParMetis
>>>
>>> 2) Conditional jump or move depends on uninitialised value(s) ->
>>> many times, in VecMin and VecMax and KSPSolve_BCGSL
>>> and
>>>
>>> 3) Syscall param writev(vector[...]) points to uninitialised byte(s)
>>> -> just once, in VecScatterBegin triggered by VecCreateGhost on the
>>> 'x' vector, which is ghosted.
>>
>> These are very bad things and should not happen at all. They must be tracked down before anything can be trusted. Start by sending the full valgrind output from a PETSc 3.2 run to petsc-maint at mcs.anl.gov
>>
>>
>> Barry
>>
>>>
>>> Do they pose any serious threats?
>>>
>>>>
>>>> Have you tried KSPBCGSL? This is an "enhanced Bi-CG-stab" algorithm that, I guess, is designed to handle certain situations that may cause grief for regular Bi-CG-stab.
>>>>
>>>
>>> Thanks for the hint on bcgsl - it works as expected.
>>>
>>> So, do I have a problem in the code or bcgs is unreliable? If the
>>> latter: as a method or as this specific implementation?
>>>
>>> Thanks for any comments,
>>> Dominik
>>>
>>>
>>>>
>>>> Barry
>>>>
>>>>
>>>>
>>>>> After hours of debugging I was unable to find any errors in my coefficients. I found out experimentally, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector.
>>>>> Now I am a bit worried, if this is still some time bomb in my code or is a known phenomenon. Thanks for any hints.
>>>>>
>>>>> Regards, Dominik
>>>>
>>
>>
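
On the question of what is special about an exactly zero initial guess for bcgs: with x0 = 0 the initial residual is simply r0 = b, and textbook BiCGStab divides by inner products that can vanish for particular vectors. The sketch below is a toy, unpreconditioned BiCGStab in plain Python, not PETSc's KSPBCGS implementation, and the 2x2 matrix and right-hand side are contrived so that (b, A b) = 0. It only illustrates one way a zero guess can hit a division by zero while a nonzero guess does not; it is not a diagnosis of the problem reported in this thread.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def axpy(alpha, x, y):
    """Return alpha*x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def bicgstab(A, b, x0, tol=1e-10, maxit=50):
    """Textbook unpreconditioned BiCGStab; returns (x, status string).

    Exact comparisons with 0.0 are used only to make the breakdown points
    visible in this toy; real solvers use thresholds instead.
    """
    x = list(x0)
    r = axpy(-1.0, matvec(A, x), b)      # r0 = b - A x0
    r_hat = list(r)                      # shadow residual, usual choice r_hat = r0
    rho = alpha = omega = 1.0
    p = [0.0] * len(b)
    v = [0.0] * len(b)
    for _ in range(maxit):
        if dot(r, r) ** 0.5 <= tol:
            return x, "converged"
        rho_new = dot(r_hat, r)
        if rho_new == 0.0:
            return x, "breakdown: (r_hat, r) = 0"
        beta = (rho_new / rho) * (alpha / omega)
        p = axpy(beta, axpy(-omega, v, p), r)   # p = r + beta*(p - omega*v)
        v = matvec(A, p)
        denom = dot(r_hat, v)
        if denom == 0.0:
            return x, "breakdown: (r_hat, A p) = 0"
        alpha = rho_new / denom
        s = axpy(-alpha, v, r)                  # s = r - alpha*v
        x = axpy(alpha, p, x)
        if dot(s, s) ** 0.5 <= tol:
            return x, "converged"
        t = matvec(A, s)
        tt, ts = dot(t, t), dot(t, s)
        if tt == 0.0 or ts == 0.0:
            return x, "breakdown: omega undefined or zero"
        omega = ts / tt
        x = axpy(omega, s, x)
        r = axpy(-omega, t, s)                  # r = s - omega*t
        rho = rho_new
    return x, "max iterations reached"

# Contrived system: the exact solution is x = [0, 1], and (b, A b) = 0.
A = [[0.0, 1.0], [1.0, 0.0]]
b = [1.0, 0.0]

x_zero, status_zero = bicgstab(A, b, [0.0, 0.0])    # zero initial guess
x_small, status_small = bicgstab(A, b, [0.1, 0.1])  # nonzero initial guess

print(status_zero)
print(status_small, x_small)
```

With the zero guess, r_hat = p = b in the first iteration, so the denominator (r_hat, A p) equals (b, A b), which is zero by construction. Any nonzero guess changes r0 and, for this system, moves the iteration away from that degenerate direction.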