<!DOCTYPE html><html><head><title></title><style type="text/css">p.MsoNormal,p.MsoNoSpacing{margin:0}</style></head><body><div>There are lots of places where we could use Valgrind client checks to invalidate memory or assert that it's valid, in order to catch bugs sooner and more reliably.</div><div><br></div><div>On Tue, Aug 4, 2020, at 1:05 PM, Barry Smith wrote:<br></div><blockquote type="cite" id="qt" style="overflow-wrap:break-word;"><div class="qt-"><br></div><div> So long as the error checking is not much more expensive than the computation, then I see no harm in including it with the if (debug) model. For example, checking that a sorted array is sorted is less expensive than the sort, so it can be included at the end of the sort algorithm. <br></div><div class="qt-"><br></div><div class="qt-"> For the numerical solvers things are much trickier because it is difficult to check if the result is "correct", especially given the different criteria possible for "convergence" and the norms that might be used. Thus checking the "little" things throughout the code is a good idea, because we can't catch the "big things".<br></div><div class="qt-"><br></div><div class="qt-"> One check we don't do consistently, and which requires elbow grease, is to use VecGetArrayWrite() instead of VecGetArray() when the routine has to fill ALL the values in the array. Then VecGetArrayWrite() can fill the array with NaN, and VecRestoreArrayWrite() verifies that there are no NaN in the result, as a basic test that the routine did not miss setting some values. <br></div><div class="qt-"><br></div><div class="qt-">We could extend this concept to other places where a routine is supposed to "fill up" memory obtained with PetscMalloc(): have something like PetscMallocVerifyFilled() that checks malloced space after it has been filled by the code to make sure no places were missed. 
Note that Valgrind does some of this checking, but not all; Valgrind only generates messages when a resulting unset value is used in an if (something) or in something else that controls program flow. It will not detect when a numerical location is not filled in but is used later in a numerical computation that never controls program flow. Hence in debug mode I would like PetscMalloc() to fill all numerical arrays with NaN.<br></div><div class="qt-"><br></div><div class="qt-"> This is the simplest error checking, not even checking that correct values are used, just checking that something is used, and we don't even do this everywhere yet.<br></div><div class="qt-"><br></div><div class="qt-"> Barry<br></div><div class="qt-"><br></div><div class="qt-"><br></div><div class="qt-"><br></div><div class="qt-"><br></div><div class="qt-"><br></div><div class="qt-"><div><div><br></div><blockquote type="cite" class="qt-"><div class="qt-">On Aug 4, 2020, at 12:24 PM, Jacob Faibussowitsch &lt;<a href="mailto:jacob.fai@gmail.com" class="qt-">jacob.fai@gmail.com</a>&gt; wrote:<br></div><div><br></div><div class="qt-"><div style="overflow-wrap:break-word;" class="qt-"><div>Hello All,<br></div><div class="qt-"><br></div><div class="qt-">How far should one go in error checking when using #if defined(PETSC_USE_DEBUG)? So far I have gone with the mantra that internal PETSc routines (including ones authored by myself) should be considered infallible and that I should only be checking for garbage *input* from the user. But there is not a person in existence who doesn’t write buggy code, as evidenced by the somewhat routine “fixing bug in XYZ” merge requests one sees. <br></div><div class="qt-"><br></div><div class="qt-">On the other hand, it is also not reasonable to check every single output with a fine-tooth comb, because in the majority of cases code written by PETSc developers is working as intended. Take for example writing an array sorting algorithm. 
Since every operation is “performance critical”, these are often written in less logical or less readable forms, leading to subtle bugs that the writer doesn’t immediately catch. If these remain uncaught through CI/CD and then bleed into user code, I see absolutely no chance of the user (or even other devs) ever being able to identify, without significant effort, that the sorting algorithm deep in some function stack is the one producing the bug. One could include a “dumb” version of the same algorithm that checks a copy of the initial array for missing/misplaced elements, but as mentioned above this is a pointless slowdown 99% of the time. <br></div><div class="qt-"><br></div><div class="qt-">CI/CD -- while excellent at catching a lot of machine-specific bugs -- isn’t bulletproof either. It relies on the assumption that the writer knows all possible sources of bugs in their code and provides a test case for each, but to quote Isaac Newton, “what we know is a drop, what we don’t know is an ocean”. I have been mulling over this problem for a while now, and have looked through the user manual/developers manual but have not found a definitive answer. <br></div><div class="qt-"><div><br></div><div class="qt-"><div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;-webkit-text-stroke-width:0px;text-decoration-line:none;text-decoration-style:initial;text-decoration-color:initial;overflow-wrap:break-word;" class="qt-"><div class="qt-"><div>Best regards,<br></div><div><br></div><div>Jacob Faibussowitsch<br></div><div>(Jacob Fai - booss - oh - vitch)<br></div><div>Cell: (312) 694-3391<br></div></div></div></div></div></div></div></blockquote></div></div></blockquote></body></html>