[petsc-users] problem with initial value

Sat Oct 22 11:05:07 CDT 2011

After upgrading to 3.2, the valgrind issues mentioned below were still
there (except for the first one, related to partitioning). However, I
seem to have found their cause: I was NOT updating the ghost values in
x BEFORE KSPSolve, only AFTER. As a result my coefficient matrix, which
depends on 'x' and is obviously assembled before KSPSolve, contained
uninitialized values. With that fixed, the bcgs solver behaves as
expected, and like the other solvers. I am relieved the issue was on
my side.
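
The ordering problem described above can be sketched with a toy,
PETSc-free model. Everything in it (the GhostedVec class,
update_ghosts, assemble_coefficient) is a hypothetical stand-in
invented for illustration, not PETSc API, and NaN is used only to
mimic uninitialized ghost memory. In PETSc itself the ghost refresh
would presumably be a VecGhostUpdateBegin/VecGhostUpdateEnd pair
issued before the matrix is assembled.

```python
import math


class GhostedVec:
    """Toy model of one rank's part of a ghosted vector (not PETSc API)."""

    def __init__(self, owned, n_ghosts):
        self.owned = list(owned)
        # NaN stands in for uninitialized ghost memory.
        self.ghosts = [math.nan] * n_ghosts

    def update_ghosts(self, neighbour_values):
        """Copy the neighbour's current owned values into the ghost slots."""
        self.ghosts = list(neighbour_values)


def assemble_coefficient(x):
    """Hypothetical solution-dependent coefficient that reads a ghost entry."""
    return 1.0 + 0.5 * (x.owned[-1] + x.ghosts[0])


x = GhostedVec(owned=[1.0, 2.0], n_ghosts=1)

# Wrong order: assemble first, refresh the ghosts only after the solve.
stale = assemble_coefficient(x)

# Right order: refresh the ghosts, then assemble, then solve.
x.update_ghosts([4.0])
fresh = assemble_coefficient(x)

print(stale, fresh)
```

The stale coefficient is polluted by whatever happened to sit in the
ghost slot, which is consistent with valgrind flagging uninitialized
values inside the solver rather than in the assembly code.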

However, bcgs and only bcgs will occasionally break down (inconsistent
state, division by zero) if the initial solution is exactly zero.
Pre-filling it with something small (compared to the expected
solution) avoids the breakdown. This small issue still bothers me a
bit, though: what's so special about zero?
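
One way to see how a zero initial guess can matter is a textbook,
unpreconditioned BiCGStab written out in plain Python. This is a
minimal sketch under stated assumptions, not PETSc's KSPBCGS
implementation and not a diagnosis of the runs above: the helper
names, the 2x2 matrix, and the right-hand side are made up for
illustration, and the shadow residual is fixed to r0, which is a
common choice. With x0 = 0 the initial residual is simply b, so
whether the first denominator (r_hat, A p) vanishes depends on b and
A alone. A small perturbation of x0 changes r0 and can move the
iteration off that exact zero.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def axpy(a, x, y):
    """Return a*x + y."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def norm(u):
    return dot(u, u) ** 0.5


def bicgstab(A, b, x0, tol=1e-10, maxit=50):
    """Textbook unpreconditioned BiCGStab; returns (x, status)."""
    x = list(x0)
    r = axpy(-1.0, matvec(A, x), b)      # r0 = b - A x0
    r_hat = list(r)                      # shadow residual, fixed to r0
    bnorm = norm(b)
    if norm(r) <= tol * bnorm:
        return x, "converged"
    rho = dot(r_hat, r)
    p = list(r)
    for _ in range(maxit):
        v = matvec(A, p)
        denom = dot(r_hat, v)
        if denom == 0.0:                 # alpha = rho / 0
            return x, "breakdown: (r_hat, A p) = 0"
        alpha = rho / denom
        s = axpy(-alpha, v, r)           # s = r - alpha v
        if norm(s) <= tol * bnorm:       # converged at the half step
            return axpy(alpha, p, x), "converged"
        t = matvec(A, s)
        tt = dot(t, t)
        if tt == 0.0:                    # omega = (t, s) / 0
            return x, "breakdown: (t, t) = 0"
        omega = dot(t, s) / tt
        x = axpy(omega, s, axpy(alpha, p, x))
        r = axpy(-omega, t, s)           # r = s - omega t
        if norm(r) <= tol * bnorm:
            return x, "converged"
        rho_new = dot(r_hat, r)
        if rho_new == 0.0 or omega == 0.0:
            return x, "breakdown: rho or omega = 0"
        beta = (rho_new / rho) * (alpha / omega)
        p = axpy(beta, axpy(-omega, v, p), r)   # r + beta (p - omega v)
        rho = rho_new
    return x, "maxit reached"


# Made-up 2x2 example: A swaps the two entries, exact solution is [0, 1].
A = [[0.0, 1.0], [1.0, 0.0]]
b = [1.0, 0.0]

x_zero, status_zero = bicgstab(A, b, [0.0, 0.0])
x_pert, status_pert = bicgstab(A, b, [0.1, 0.0])
print(status_zero)
print(status_pert, x_pert)
```

In this toy case the zero guess gives r0 = b = [1, 0] and A p = [0, 1],
so the very first denominator is exactly zero. Starting from [0.1, 0]
instead, the same code reaches the solution in two iterations.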

Many thanks for your valuable support,
Dominik

On Fri, Oct 21, 2011 at 7:57 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> On Oct 21, 2011, at 11:57 AM, Dominik Szczerba wrote:
>
>> On Fri, Oct 21, 2011 at 6:29 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>
>>> On Oct 21, 2011, at 9:29 AM, Dominik Szczerba wrote:
>>>
>>>> I am doing a transient computation, solving one linear problem per timestep, so naturally I want to exploit 'x' from the previous time step to be the initial value for the next solve (KSPSetInitialGuessNonzero).
>>>> For the longest time, however, I was getting wrong results unless I reset 'x' each time step (to some constant value; pure zero caused bcgs to break down).
>>>
>>>   What happened if you did not set it to some constant (that is, kept the old solution)? Did you get KSP_DIVERGED_BREAKDOWN? It would be very odd for a good initial guess to lead to breakdown, but that cannot be completely ruled out.
>>>
>>
>> There was no error, the iterations reportedly converged. Only the
>> results were wrong, sort of strong random spikes.
>>
>>>
>>>   I would also check with valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind
>>>
>>
>> There are 3 issues with valgrind:
>>
>> 1) Syscall param writev(vector[...]) points to uninitialised byte(s)
>> -> triggered by MatPartitioningApply, then leading deep into ParMetis
>>
>> 2) Conditional jump or move depends on uninitialised value(s)
>> -> many times, in VecMin and VecMax and KSPSolve_BCGSL
>> and
>>
>> 3) Syscall param writev(vector[...]) points to uninitialised byte(s)
>> -> just once, in VecScatterBegin triggered by VecCreateGhost on the
>> 'x' vector, which is ghosted.
>
>   These are very bad things and should not happen at all.   They must be tracked down before anything can be trusted. Start by sending the full valgrind output from a PETSc 3.2 run to petsc-maint at mcs.anl.gov
>
>
>   Barry
>
>>
>> Do they pose any serious threats?
>>
>>>
>>>    Have you tried KSPBCGSL? This is an "enhanced Bi-CG-stab" algorithm that is designed to handle certain situations that may cause grief for regular Bi-CG-stab, I guess.
>>>
>>
>> Thanks for the hint on bcgsl - it works as expected.
>>
>> So, do I have a problem in the code, or is bcgs unreliable? If the
>> latter: as a method or as this specific implementation?
>>
>> Thanks for any comments,
>> Dominik
>>
>>
>>>
>>>   Barry
>>>
>>>
>>>
>>>> After hours of debugging I was unable to find any errors in my coefficients. I found out experimentally, however, that changing the solver from bcgs to gmres or fgmres removes the problem: I no longer need to clear the solution vector.
>>>> Now I am a bit worried whether this is still some time bomb in my code or a known phenomenon. Thanks for any hints.
>>>>
>>>> Regards, Dominik
>>>
>
>