[petsc-users] 2^32 integer problems

Matthew Knepley knepley at gmail.com
Sun Jun 2 09:27:17 CDT 2024


On Sat, Jun 1, 2024 at 11:39 PM Carpenter, Mark H. (LARC-D302) via
petsc-users <petsc-users at mcs.anl.gov> wrote:

> Mark Carpenter,  NASA Langley.
>
>
>
> I am a novice PETSc user of about 10 years. I’ve built a DG-FEM code
> with PETSc as one of the solver paths (I have my own as well).
> Furthermore, I use PETSc for MPI communication.
>
>
>
> I’m running the DG-FEM code on our NAS supercomputer.  Everything works
> when my integer sizes are small.  When I exceed the 2^32 limit of integer
> arithmetic the code fails in very strange ways.
>
> The users that originally set up the petsc infrastructure in the code are
> no longer at NASA and I’m “dead in the water”.
>
>
>
> I think I’ve promoted all the integers that are problematic in my code
> (F95). On the PETSc side, I’ve tried:
>
>    1. Reinstall PETSc with --with-64-bit-integers (no luck)
>
>
That option does not exist, so this will not work.


>
>    2. Reinstall PETSc with --with-64-bit-integers and
>    --with-64-bit-indices (the code will not compile with these options:
>    additional variables on the F90 side require promotion, and then the
>    errors cascade through the code when making PETSc calls).
>
>
We should fix this. I feel confident we can get the code to compile.


>
>    3. It’s possible that I’ve missed offending integers, but the PETSc
>    error messages are so cryptic that I can’t even tell where it is failing.
>
>
>
> Further complicating matters:
>
> The problem by definition needs to be HUGE.  Problem sizes requiring 1000
> cores (10^6 elements at P5) are needed to experience the errors, which
> involves waiting in queues for ½ day at least.
>
>
>
> Attached are the
>
>    1. The install script used to install PETSc on our machine
>    2. The Makefile used on the Fortran side
>    3. A data dump from an offending simulation (which is huge, and I can’t
>    see any useful information in it)
>
>
>
> How do I attack this problem?
>
> (I’ve never gotten debugging working properly).
>

Let's get the install for 64-bit indices to work. The steps are:

1) Configure PETSc adding --with-64bit-indices to the configure line. Does
this work? If not, send configure.log

2) Compile PETSc. Does this work? If not, send make.log

3) Compile your code. Does this work? If not, send all output.

4) Do one of the 1/2 day runs and let us know what happens. An alternative
is to run a small number of processes on a large memory workstation. We do
this to test at the lab.
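
To judge whether a given problem size needs the 64-bit-index build at all,
a rough check like the following may help. It is only a sketch: the function
names are hypothetical, and the example sizes (a hexahedral element with
(P+1)^3 = 216 nodes at P5, and 5 unknowns per node) are assumptions, not
taken from Mark's code. It relies on the default PetscInt being a signed
32-bit integer, whose largest value is 2^31 - 1.

```python
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def fits_in_int32(count: int) -> bool:
    """Return True if a non-negative count fits in a signed 32-bit integer."""
    return 0 <= count <= INT32_MAX


def wrap_to_int32(count: int) -> int:
    """Value a two's-complement 32-bit integer would typically hold
    if `count` were stored in it without any overflow check."""
    return (count + 2**31) % 2**32 - 2**31


def global_unknowns(num_elements: int, nodes_per_element: int,
                    vars_per_node: int) -> int:
    """Total number of unknowns (Python integers do not overflow)."""
    return num_elements * nodes_per_element * vars_per_node


if __name__ == "__main__":
    # Hypothetical sizes: P5 hexahedra, 5 unknowns per node.
    for num_elements in (10**6, 2 * 10**6):
        n = global_unknowns(num_elements, 6**3, 5)
        print(num_elements, n, fits_in_int32(n), wrap_to_int32(n))
```

Any count for which fits_in_int32 returns False (global sizes, nonzero
counts, offsets) has to be held in a 64-bit integer on both the PETSc and
the Fortran side. A wrapped negative value is one way such failures can
look "strange" rather than producing a clear error.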

  Thanks,

     Matt


> Mark
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>