[petsc-users] Application Error

Barry Smith bsmith at mcs.anl.gov
Tue Apr 28 16:16:30 CDT 2015


 From the manual page:

PetscMemoryGetMaximumUsage - Returns the maximum resident set size (memory used)
   for the program.

   Not Collective

This means each process reports only its own memory usage. If you want the total across all processes, you need to add the values up yourself, for example with an MPI reduction, as in the sketch below.
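For example, a minimal sketch of summing the per-process maximums onto rank 0 (this assumes a plain PETSc C program, not any particular PETSc example; depending on your PETSc version you may also need the PetscMemorySetGetMaximumUsage() call so the maximum is actually tracked):

    #include <petscsys.h>

    int main(int argc, char **argv)
    {
      PetscLogDouble mymax, totalmax;

      PetscInitialize(&argc, &argv, NULL, NULL);
      /* ask PETSc to track the maximum usage; on some versions
         PetscMemoryGetMaximumUsage() is not meaningful without this */
      PetscMemorySetGetMaximumUsage();

      /* ... your computation ... */

      /* each process reports only its own maximum resident set size */
      PetscMemoryGetMaximumUsage(&mymax);
      /* PetscLogDouble is a double, so sum with MPI_DOUBLE onto rank 0 */
      MPI_Reduce(&mymax, &totalmax, 1, MPI_DOUBLE, MPI_SUM, 0, PETSC_COMM_WORLD);
      PetscPrintf(PETSC_COMM_WORLD, "Total maximum memory across all processes: %g bytes\n", totalmax);

      PetscFinalize();
      return 0;
    }

If every process needs the total, MPI_Allreduce can be used the same way.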

> On Apr 28, 2015, at 4:12 PM, Sharp Stone <thronesf at gmail.com> wrote:
> 
> Hi Barry,
> 
> Thank you for your helpful reply.
> 
> I also have a question: the routine PetscMemoryGetMaximumUsage returns the memory used by the program, but is the returned value the memory used by the root process only, or by all of the processes involved in the computation?
> 
> Thanks!
> 
> On Tue, Apr 28, 2015 at 3:04 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>   Killed (signal 9)
> 
>   means that something (generally external to the running process) has told the process to end. In HPC this is often because
> 
> 1) the OS has started running low on memory and so killed the process (the one taking much of the memory), or
> 
> 2) the batch system has killed the process because it hit some limit that has been set by the batch system (such as running too long).
> 
>    My guess is that it is an "out of memory" issue and you are simply using more memory than is available. So to run a problem of the size you want, you need to use more nodes on your system. It is not likely a "bug" in MPI or elsewhere.
> 
>   Barry
> 
> > On Apr 28, 2015, at 9:49 AM, Sharp Stone <thronesf at gmail.com> wrote:
> >
> > Dear All,
> >
> > I'm using PETSc for parallel computation, but an error has been confusing me recently. When the computation is small, the code runs smoothly. However, when I use a much larger parallel computation domain/task, I always get the following error (as in the attachment):
> >
> > [proxy:0:0 at node01] HYD_pmcd_pmip_control_cmd_cb (../../../../source/mpich-3.1.1/src/pm/hydra/pm/pmiserv/pmip_cb.c:885): assert (!closed) failed
> > [proxy:0:0 at node01] HYDT_dmxu_poll_wait_for_event (../../../../source/mpich-3.1.1/src/pm/hydra/tools/demux/demux_poll.c:76): callback returned error status
> > [proxy:0:0 at node01] main (../../../../source/mpich-3.1.1/src/pm/hydra/pm/pmiserv/pmip.c:206): demux engine error waiting for event
> >
> >
> > I don't know what has gone wrong. Is this because of a bug in MPI?
> > Thank you in advance for any ideas and suggestions!
> >
> > --
> > Best regards,
> >
> > Feng
> > <out.o367446><out.e367446>
> 
> 
> 
> 
> -- 
> Best regards,
> 
> Feng
