[petsc-dev] Error on large problems.

Barry Smith bsmith at mcs.anl.gov
Sat Mar 7 16:49:42 CST 2015


> On Mar 7, 2015, at 4:27 PM, Mark Adams <mfadams at lbl.gov> wrote:
> 
> 
> 
> On Sat, Mar 7, 2015 at 3:11 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>   Hmm, my first guess is a mix-up between PetscMPIInt and PetscInt arrays or MPIU_INT somewhere (but compilers catch most of these).
> 
> I have run with ~9K eq/core on 128K cores many times but have never gotten the 32K/core (4B eq.) run to work.  So it looks like an int overflow issue.

  Well, in theory, if we have done everything right with 64 bit indices there should never be an integer overflow (though of course there could be a mistake somewhere, but generally the compiler will detect if we are trying to stick a 64 bit int into a 32 bit slot).

   Can you try the same example without GAMG (say, with hypre instead)? If it goes through OK, that might indicate an int issue either in GAMG or in the code that GAMG calls.

  Barry

>  
> 
>   Another possibility is a bug in the MPI with many large messages; any chance you can run the same thing on a very different system? Mira?
> 
> Not soon, but I have Chombo and PETSc built on Mira and it would not be hard to get this code set up and try it.
> 
> This is for SCE15 so I will turn the PETSc test off unless someone has any ideas on something to try.  I am using maint; perhaps I should use master.
> 
> Mark
>  
> 
>   Barry
> 
> > On Mar 7, 2015, at 1:21 PM, Mark Adams <mfadams at lbl.gov> wrote:
> >
> > I seem to be getting this error on Edison with 128K cores and ~4 billion equations.  I've seen this error several times.  I've attached a recent output from this.  I wonder if it is an integer overflow.  This was built with 64 bit integers, but I notice that GAMG prints out N and I see N=0 for the finest level.
> >
> > Mark
> > <out.131072.uniform.txt>
> 
> 



