<div dir="ltr"><div dir="ltr">On Wed, Feb 22, 2023 at 6:18 PM Sajid Ali Syed via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov">petsc-users@mcs.anl.gov</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg-8224865940736422939">




<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
One thing to note in relation to the trace attached in the previous email is that there are no warnings until the 36<span><sup>th</sup> call to KSP_Solve. The first error (as indicated by ASAN) occurs somewhere before the 40<span><sup>th</sup> call to KSP_Solve
 (part of what the application marks as turn 10 of the propagator). The crash finally occurs on the 43<span><sup>rd</sup> call to KSP_Solve.</span></span></span></div></div></div></blockquote><div><br></div><div>Looking at the trace, it appears that the stack handling is corrupted, which eventually causes the crash. This can happen when</div><div>a PetscFunctionBegin is not matched with a PetscFunctionReturn. Can you try running this with</div><div><br></div><div>&nbsp;&nbsp;-checkstack</div><div><br></div><div>&nbsp;&nbsp;Thanks,</div><div><br></div><div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Matt</div><div>&nbsp;</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="msg-8224865940736422939"><div dir="ltr"><div><div id="m_-8224865940736422939Signature"><div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Thank You,<br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div style="font-size:12.8px">Sajid Ali (he/him) | Research Associate<br>
Data Science, Simulation, and Learning Division<br>
</div>
<div style="font-size:12.8px">Fermi National Accelerator Laboratory<br>
</div>
<span style="font-size:12.8px"><a href="http://s-sajid-ali.github.io" target="_blank">s-sajid-ali.github.io</a></span></div>
</div>
</div>
</div>
</div>
</div>
<br>
</div>
</div>
</div>
</div>
<div id="m_-8224865940736422939appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="m_-8224865940736422939divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Sajid Ali Syed <<a href="mailto:sasyed@fnal.gov" target="_blank">sasyed@fnal.gov</a>><br>
<b>Sent:</b> Wednesday, February 22, 2023 5:11 PM<br>
<b>To:</b> Barry Smith <<a href="mailto:bsmith@petsc.dev" target="_blank">bsmith@petsc.dev</a>><br>
<b>Cc:</b> <a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a> <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>><br>
<b>Subject:</b> Re: [petsc-users] KSP_Solve crashes in debug mode</font>
<div> </div>
</div>

<div dir="ltr">
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
Hi Barry, <br>
<br>
Thanks a lot for fixing this issue. I ran the same problem on a Linux machine and have the following trace for the same crash (with ASAN turned on for both PETSc, on the latest commit of the branch, and the application):
<a href="https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940" id="m_-8224865940736422939LPlnkOWALinkPreview" target="_blank">
https://gist.github.com/s-sajid-ali/85bdf689eb8452ef8702c214c4df6940</a><br>
<br>
The trace seems to indicate a couple of buffer overflows, one of which causes the crash. I'm not sure what causes them.
<br>
</div>
<div>
<div id="m_-8224865940736422939x_Signature">
<div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0);background-color:rgb(255,255,255)">
<br>
</div>
<div style="font-family:Calibri,Arial,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">
Thank You,<br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div style="font-size:12.8px">Sajid Ali (he/him) | Research Associate<br>
Data Science, Simulation, and Learning Division<br>
</div>
<div style="font-size:12.8px">Fermi National Accelerator Laboratory<br>
</div>
<span style="font-size:12.8px"><a href="http://s-sajid-ali.github.io" target="_blank">s-sajid-ali.github.io</a></span></div>
</div>
</div>
</div>
</div>
</div>
<br>
</div>
</div>
</div>
</div>
<div id="m_-8224865940736422939x_appendonsend"></div>
<hr style="display:inline-block;width:98%">
<div id="m_-8224865940736422939x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Barry Smith <<a href="mailto:bsmith@petsc.dev" target="_blank">bsmith@petsc.dev</a>><br>
<b>Sent:</b> Wednesday, February 15, 2023 2:01 PM<br>
<b>To:</b> Sajid Ali Syed <<a href="mailto:sasyed@fnal.gov" target="_blank">sasyed@fnal.gov</a>><br>
<b>Cc:</b> <a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a> <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>><br>
<b>Subject:</b> Re: [petsc-users] KSP_Solve crashes in debug mode</font>
<div> </div>
</div>
<div>
<div><br>
</div>
<a href="https://gitlab.com/petsc/petsc/-/merge_requests/6075" target="_blank">https://gitlab.com/petsc/petsc/-/merge_requests/6075</a> should
 fix the possible recursive error condition Matt pointed out.
<div><br>
<div><br>
<blockquote type="cite">
<div>On Feb 9, 2023, at 6:24 PM, Matthew Knepley <<a href="mailto:knepley@gmail.com" target="_blank">knepley@gmail.com</a>> wrote:</div>
<br>
<div>
<div dir="ltr">
<div dir="ltr">On Thu, Feb 9, 2023 at 6:05 PM Sajid Ali Syed via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br>
</div>
<div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div>
<p style="margin:0px 0px 1.2em">I added “-malloc_debug” in a .petscrc file and ran it again. The backtrace from lldb is in the attached file. The crash now seems to be at:</p>
<pre style="font-family:Consolas,Inconsolata,Courier,monospace;font-size:1em;line-height:1.2em;margin:1.2em 0px"><code style="font-size:0.85em;font-family:Consolas,Inconsolata,Courier,monospace;margin:0px 0.15em;background-color:rgb(248,248,248);white-space:pre-wrap;overflow:auto;border-radius:3px;border:1px solid rgb(204,204,204);padding:0.5em 0.7em;display:block">Process 32660 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x16f603fb8)
    frame #0: 0x0000000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x0000000000000000, format=0x0000000000000000) at mprint.c:601
   598               `PetscViewerASCIISynchronizedPrintf()`, `PetscSynchronizedFlush()`
   599      @*/
   600      PetscErrorCode PetscFPrintf(MPI_Comm comm, FILE *fd, const char format[], ...)
-> 601      {
   602       PetscMPIInt rank;
   603      
   604       PetscFunctionBegin;
(lldb) frame info
frame #0: 0x0000000112ecc8f8 libpetsc.3.018.dylib`PetscFPrintf(comm=0, fd=0x0000000000000000, format=0x0000000000000000) at mprint.c:601
(lldb)
</code></pre>
<p style="margin:0px 0px 1.2em">The trace seems to indicate some sort of infinite recursion causing a stack overflow.<br>
</p>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Yes, I have also seen this. What happens is that we have a memory error. The error is reported inside PetscMallocValidate()</div>
<div>using PetscErrorPrintf, which eventually calls PetscCallMPI, which calls PetscMallocValidate() again, which fails again, and so on until the stack overflows. We need to</div>
<div>remove all error checking from the prints inside PetscMallocValidate().</div>
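<div>A minimal sketch of that failure mode in plain C, not PETSc code: every name below (validate_heap, checked_print, run_validation, MAX_DEPTH) is a hypothetical stand-in, and the depth cap exists only so the demo terminates. It assumes the validator reports through a print that itself re-runs the validator, and contrasts that with reporting through an unchecked print.</div>

```c
#include <assert.h>
#include <stdio.h>

/* All names here are hypothetical stand-ins, not PETSc functions. */
#define MAX_DEPTH 1000 /* demo-only cap so the recursive variant terminates */

static int heap_corrupted = 1; /* pretend a memory error has already happened */
static int depth          = 0; /* current nesting of validate_heap() */
static int max_depth_seen = 0; /* deepest nesting reached in the last run */

static int validate_heap(int use_raw_print);

/* A "checked" print: it validates the heap before writing anything. */
static int checked_print(const char *msg)
{
  if (validate_heap(0)) return 1; /* re-enters the validator, which fails again */
  fputs(msg, stderr);
  return 0;
}

/* Returns nonzero if the heap is (pretend) corrupted. */
static int validate_heap(int use_raw_print)
{
  int err = 0;

  depth++;
  assert(depth <= MAX_DEPTH);
  if (depth > max_depth_seen) max_depth_seen = depth;
  if (heap_corrupted) {
    err = 1;
    if (use_raw_print) {
      /* Unchecked report: plain stdio, no call back into the validator. */
      fputs("validate_heap: corrupted memory detected\n", stderr);
    } else if (depth < MAX_DEPTH) {
      /* Checked report: recurses; without the cap this would overflow the stack. */
      (void)checked_print("validate_heap: corrupted memory detected\n");
    }
  }
  depth--;
  return err;
}

/* Runs one validation and returns how deeply validate_heap() nested. */
int run_validation(int use_raw_print)
{
  depth          = 0;
  max_depth_seen = 0;
  (void)validate_heap(use_raw_print);
  return max_depth_seen;
}
```

<div>Under these assumptions, run_validation(1) enters the validator exactly once and prints its message, while run_validation(0) nests until the demo cap is reached and never prints anything, which is why the report path inside a validator should avoid checked calls.</div>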
<div><br>
</div>
<div>  Thanks,</div>
<div><br>
</div>
<div>     Matt</div>
<div> </div>
<blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="ltr">
<div>
<p style="margin:0px 0px 1.2em">PS: I'm using an arm64 Mac, so I don't have access to Valgrind.
<br>
<br>
Thank You,<br>
Sajid Ali (he/him) | Research Associate<br>
Scientific Computing Division<br>
Fermi National Accelerator Laboratory<br>
<a href="http://s-sajid-ali.github.io/" target="_blank">s-sajid-ali.github.io</a></p>
</div>
</div>
</div>
</blockquote>
</div>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>
-- Norbert Wiener</div>
<div><br>
</div>
<div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</div>

</div></blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.cse.buffalo.edu/~knepley/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div></div></div></div>