Thanks,

   So this means that all the double precision array pointers that PETSc is passing into these BLAS calls are addressable, which means nothing has corrupted any of these pointers before the calls.

   What my patch did: before each BLAS call, for each double array argument, it set a special exception handler and then accessed the first entry in the array. Since the exception handler was never called, the first entry of each array was accessible and did not produce a SEGV or SIGBUS.

   What else could be corrupted?

1) The size arguments passed to the BLAS calls; if they were too large they could result in accessing incorrect memory, but IMHO that would usually produce a SEGV, not a SIGBUS. It is hard to put a check in the code because these sizes are problem dependent and there is no way to know whether they are wrong.

2) Corruption of the stack?

3) A hardware issue due to overheating, bad memory, etc. I assume the MPI rank that crashes changes for each crashing run. I am adding code to our patch branch to print the node name, which hopefully is the same for all the crashing runs; then one can see whether the problem is always on the same node. Patch attached.
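   The node-name printing is just along these lines (a minimal standalone sketch, not the attached patch itself; it uses MPI_Get_processor_name()):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
      char name[MPI_MAX_PROCESSOR_NAME];
      int  len, rank;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Get_processor_name(name, &len);
      /* print the node (host) name so a crashing rank can be tied to a physical node */
      printf("[%d] running on node %s\n", rank, name);
      MPI_Finalize();
      return 0;
    }

If every crashing run reports the same node name, that points strongly at hardware on that node.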
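   For reference, the addressability check done before each BLAS call is roughly the following (a sketch only, not the actual patch; check_first_entry() is a made-up name and it assumes POSIX sigaction/sigsetjmp are available):

    #include <setjmp.h>
    #include <signal.h>

    static sigjmp_buf probe_env;

    /* jump back to the probe point when the test access faults */
    static void probe_handler(int sig)
    {
      siglongjmp(probe_env, sig);
    }

    /* return 0 if the first entry of ptr is readable, otherwise the faulting signal number */
    static int check_first_entry(const double *ptr)
    {
      struct sigaction sa, old_segv, old_bus;
      volatile double  dummy;
      int              sig;

      sa.sa_handler = probe_handler;
      sigemptyset(&sa.sa_mask);
      sa.sa_flags = 0;
      sigaction(SIGSEGV, &sa, &old_segv);
      sigaction(SIGBUS,  &sa, &old_bus);

      if ((sig = sigsetjmp(probe_env, 1)) == 0) {
        dummy = ptr[0];        /* the test access; a SEGV/SIGBUS lands in probe_handler */
        (void)dummy;
      }

      /* restore the previous handlers before returning */
      sigaction(SIGSEGV, &old_segv, NULL);
      sigaction(SIGBUS,  &old_bus,  NULL);
      return sig;
    }

One would call this on each double array argument immediately before the BLAS call and error out if it returns nonzero; since it never fired in your runs, the first entry of every array was readable.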