<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr"><div>Hi, Richard,</div><div>  I tested the case you sent over and found that it did fail due to 32-bit overflow in the number of non-zeros, and that it passed with a 64-bit build of petsc. You had a typo when you reported that --with-64-bit-indicies=yes failed. It should be --with-64-bit-indices=yes.</div><div>  You can go with a 64-bit build of petsc, or you can go with parallel computing and run with multiple MPI ranks so that each rank has fewer non-zeros and the run is faster (but you need to make sure that the code is correctly parallelized).</div><div>  Barry's recent fix ierr = PetscIntCast(nz64,&nz);CHKERRQ(ierr); would print more useful error messages in this case. Barry, should we patch it back to 3.6.3?<br></div><div><br></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">--Junchao Zhang</div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Feb 16, 2020 at 11:37 PM Junchao Zhang <<a href="mailto:jczhang@mcs.anl.gov">jczhang@mcs.anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Richard,<div>  I managed to get the Simul@trophy code built.  Could you tell me how to run your test? I want to see if I can reproduce the error. 
Thanks </div><div><br clear="all"><div><div dir="ltr"><div dir="ltr">--Junchao Zhang</div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Feb 14, 2020 at 8:34 PM Richard Beare <<a href="mailto:richard.beare@monash.edu" target="_blank">richard.beare@monash.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>It doesn't compile out of the box with master.</div><div><br></div><div>singularity def file attached.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, 15 Feb 2020 at 08:03, Richard Beare <<a href="mailto:richard.beare@monash.edu" target="_blank">richard.beare@monash.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">I will see if I can build with master. The docs for simulatrophy say 3.6.3.1.</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, 15 Feb 2020 at 02:47, Junchao Zhang <<a href="mailto:jczhang@mcs.anl.gov" target="_blank">jczhang@mcs.anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Which petsc version do you use? In aij.c of the master branch, I saw Barry recently added a useful check to catch number of nonzero overflow, ierr = PetscIntCast(nz64,&nz);CHKERRQ(ierr); But you mentioned using 64-bit indices did not solve the problem, it might not be the reason. You should try the master branch if feasible. Also, vary number of MPI ranks to see if error stack changes. 
<div><div><br><div><div><div dir="ltr"><div dir="ltr">--Junchao Zhang</div></div></div><br></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Feb 14, 2020 at 5:12 AM Richard Beare via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>No luck - exactly the same error after including the --with-64-bit-indicies=yes --download-mpich=yes options</div><div><br></div>==8674== Argument 'size' of function memalign has a fishy (possibly negative) value: -17152036540<br>==8674== at 0x4C320A6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)<br>==8674== by 0x4F0CFF2: PetscMallocAlign(unsigned long, int, char const*, char const*, void**) (mal.c:28)<br>==8674== by 0x4F0F716: PetscTrMallocDefault(unsigned long, int, char const*, char const*, void**) (mtr.c:188)<br>==8674== by 0x569AF3E: MatSeqAIJSetPreallocation_SeqAIJ (aij.c:3595)<br>==8674== by 0x569A531: MatSeqAIJSetPreallocation (aij.c:3539)<br>==8674== by 0x599080A: DMCreateMatrix_DA_3d_MPIAIJ(_p_DM*, _p_Mat*) (fdda.c:1085)<br>==8674== by 0x598B937: DMCreateMatrix_DA(_p_DM*, _p_Mat**) (fdda.c:759)<br>==8674== by 0x58A2BF2: DMCreateMatrix (dm.c:956)<br>==8674== by 0x5E377B3: KSPSetUp (itfunc.c:262)<br>==8674== by 0x409FFC: PetscAdLemTaras3D::solveModel(bool) (PetscAdLemTaras3D.hxx:255)<br>==8674== by 0x4239FB: AdLem3D<3u>::solveModel(bool, bool, bool) (AdLem3D.hxx:551)<br>==8674== by 0x41BD17: main (PetscAdLemMain.cxx:344)<br>==8674== <br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 14 Feb 2020 at 17:07, Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
Richard,<br>
<br>
  It is likely that for these problems some of the integers become too large for the int variable to hold them, so they overflow and become negative.<br>
<br>
   You should make a new PETSC_ARCH configuration of PETSc that uses the configure option --with-64-bit-indices. This will change PETSc to use 64-bit integers, which will not overflow.<br>
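<br>
   As a rough illustration of the overflow described above (a sketch in Python, not PETSc code: the row count and non-zeros per row are made-up values, and wrap_signed is a hypothetical helper), a non-zero count larger than 2^31 - 1 wraps to a negative number in a 32-bit integer but fits in a 64-bit one:<br>

```python
INT32_BITS = 32
INT64_BITS = 64


def wrap_signed(value, bits):
    """Reduce value to a two's-complement signed integer of the given width."""
    half = 2 ** (bits - 1)
    return (value + half) % (2 ** bits) - half


# Hypothetical problem size, chosen only so the product exceeds 2**31 - 1.
rows = 40_000_000          # assumed number of matrix rows
nonzeros_per_row = 60      # assumed non-zeros stored per row

true_nz = rows * nonzeros_per_row            # Python ints do not overflow
nz_32 = wrap_signed(true_nz, INT32_BITS)     # value a 32-bit counter would hold
nz_64 = wrap_signed(true_nz, INT64_BITS)     # value a 64-bit counter would hold

print("true count:", true_nz)
print("32-bit    :", nz_32)
print("64-bit    :", nz_64)
```

   A negative count like the 32-bit one here, once turned into an allocation size, is consistent with the negative 'size' that valgrind reports.<br>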
<br>
   Good luck, and let us know how it works out.<br>
<br>
Barry<br>
<br>
Probably the code is built with an older version of PETSc; the later versions should produce a more useful error message.<br>
<br>
> On Feb 13, 2020, at 11:43 PM, Richard Beare via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br>
> <br>
> Hi Everyone,<br>
> I am experimenting with the Simul@trophy tool (<a href="https://github.com/Inria-Asclepios/simul-atrophy" rel="noreferrer" target="_blank">https://github.com/Inria-Asclepios/simul-atrophy</a>), which uses petsc to simulate brain atrophy based on segmented MRI data. I am not the author. I have this running on most of a dataset of about 50 scans, but I get crashes with several scans that I am trying to track down. However, I am out of ideas. The problem images are slightly bigger than some of the successful ones, but not substantially so, and I have experimented on machines with sufficient RAM. The error happens very quickly, as part of setup - see the valgrind report below. I haven't managed to get the sgcheck tool to work yet. I can only guess that the ksp object is somehow becoming corrupted during the setup process, but the array sizes that I can track (which derive from image sizes) appear correct at every point I can check. Any suggestions as to how I can check what might be going wrong in the setup of the ksp object?<br>
> Thank you.<br>
> <br>
> valgrind tells me:<br>
> <br>
> ==18175== Argument 'size' of function memalign has a fishy (possibly negative) value: -17152038144<br>
> ==18175== at 0x4C320A6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)<br>
> ==18175== by 0x4F0F1F2: PetscMallocAlign(unsigned long, int, char const*, char const*, void**) (mal.c:28)<br>
> ==18175== by 0x56B43CA: MatSeqAIJSetPreallocation_SeqAIJ (aij.c:3595)<br>
> ==18175== by 0x56B39BD: MatSeqAIJSetPreallocation (aij.c:3539)<br>
> ==18175== by 0x59A9B44: DMCreateMatrix_DA_3d_MPIAIJ(_p_DM*, _p_Mat*) (fdda.c:1085)<br>
> ==18175== by 0x59A4C71: DMCreateMatrix_DA(_p_DM*, _p_Mat**) (fdda.c:759)<br>
> ==18175== by 0x58BBD29: DMCreateMatrix (dm.c:956)<br>
> ==18175== by 0x5E509D5: KSPSetUp (itfunc.c:262)<br>
> ==18175== by 0x40A3DE: PetscAdLemTaras3D::solveModel(bool) (PetscAdLemTaras3D.hxx:269)<br>
> ==18175== by 0x42413F: AdLem3D<3u>::solveModel(bool, bool, bool) (AdLem3D.hxx:552)<br>
> ==18175== by 0x41C25C: main (PetscAdLemMain.cxx:349)<br>
> ==18175== <br>
> <br>
> -- <br>
> --<br>
> A/Prof Richard Beare<br>
> Imaging and Bioinformatics, Peninsula Clinical School<br>
> <a href="http://orcid.org/0000-0002-7530-5664" rel="noreferrer" target="_blank">orcid.org/0000-0002-7530-5664</a><br>
> <a href="mailto:Richard.Beare@monash.edu" target="_blank">Richard.Beare@monash.edu</a><br>
> +61 3 9788 1724<br>
> <br>
> <br>
> <br>
> Geospatial Research: <a href="https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis" rel="noreferrer" target="_blank">https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis</a><br>
<br>
</blockquote></div></div><br clear="all"><br>-- <br><div dir="ltr"><div dir="ltr"><div>--<br>A/Prof Richard Beare<br>Imaging and Bioinformatics, Peninsula Clinical School</div><div><span><div><span><a href="http://orcid.org/0000-0002-7530-5664" target="_blank">orcid.org/0000-0002-7530-5664</a></span></div></span></div><div><a href="mailto:Richard.Beare@monash.edu" target="_blank">Richard.Beare@monash.edu</a><br>+61 3 9788 1724<br><span><br></span></div><div><br></div><div><span></span><br>Geospatial Research: <a href="https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis" target="_blank">https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis</a></div></div></div>
</blockquote></div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr"><div>--<br>A/Prof Richard Beare<br>Imaging and Bioinformatics, Peninsula Clinical School</div><div><span><div><span><a href="http://orcid.org/0000-0002-7530-5664" target="_blank">orcid.org/0000-0002-7530-5664</a></span></div></span></div><div><a href="mailto:Richard.Beare@monash.edu" target="_blank">Richard.Beare@monash.edu</a><br>+61 3 9788 1724<br><span><br></span></div><div><br></div><div><span></span><br>Geospatial Research: <a href="https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis" target="_blank">https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis</a></div></div></div>
</blockquote></div><br clear="all"><br>-- <br><div dir="ltr"><div dir="ltr"><div>--<br>A/Prof Richard Beare<br>Imaging and Bioinformatics, Peninsula Clinical School</div><div><span><div><span><a href="http://orcid.org/0000-0002-7530-5664" target="_blank">orcid.org/0000-0002-7530-5664</a></span></div></span></div><div><a href="mailto:Richard.Beare@monash.edu" target="_blank">Richard.Beare@monash.edu</a><br>+61 3 9788 1724<br><span><br></span></div><div><br></div><div><span></span><br>Geospatial Research: <a href="https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis" target="_blank">https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis</a></div></div></div>
</blockquote></div>
</blockquote></div>