[petsc-users] KSPSolve errors on blues

Xujun Zhao xzhao99 at gmail.com
Wed Jun 8 16:29:56 CDT 2016


OK, this makes sense.
My libMesh code currently uses SerialMesh, which keeps a full copy of the mesh
on each processor even though the operations on it are parallelized, so the
total memory grows with the number of CPUs. That may be the culprit. Still, I
wouldn't expect the 60x60x60 mesh data (all second order) to be that large...
there may be some other bug.
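
If the replicated mesh does turn out to be the problem, one thing I could try
is libMesh's distributed mesh. A minimal sketch (assuming a libMesh build with
ParallelMesh, now called DistributedMesh, enabled; the cube dimensions just
mirror the failing case):

  #include "libmesh/libmesh.h"
  #include "libmesh/parallel_mesh.h"
  #include "libmesh/mesh_generation.h"
  #include "libmesh/enum_elem_type.h"

  using namespace libMesh;

  int main(int argc, char** argv)
  {
    LibMeshInit init(argc, argv);

    // ParallelMesh distributes element/node storage across processors
    // instead of replicating the whole mesh on every rank like SerialMesh.
    ParallelMesh mesh(init.comm());

    // 60x60x60 cube of second-order HEX27 elements, as in the failing run.
    MeshTools::Generation::build_cube(mesh, 60, 60, 60,
                                      0., 1., 0., 1., 0., 1., HEX27);

    mesh.print_info();
    return 0;
  }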

On Wed, Jun 8, 2016 at 4:18 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
> > On Jun 8, 2016, at 4:08 PM, Xujun Zhao <xzhao99 at gmail.com> wrote:
> >
> > Barry,
> >
> > Thank you. I am testing on the blues.
> > btw, what do the different kinds of memory usage mean? For example, below
> > is the summary of memory usage for the 60x60x60 mesh (2.9M dofs). The
> > maximum process memory is about 3 times the maximum space PetscMalloc()ed.
>
>    PetscMalloc()ed space is basically the PETSc data structures; process
> memory is the size of the program plus the PETSc-malloced space plus space
> allocated by any other library, in this case libMesh. It looks like libMesh
> is requiring a lot of space?
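>
>    Both numbers can also be checked from inside the code if you want to see
> where the growth happens. A minimal sketch (the step in the middle is only a
> placeholder for your actual setup):
>
>   #include <petscsys.h>
>
>   int main(int argc, char **argv)
>   {
>     PetscLogDouble rss, mal;
>     PetscInitialize(&argc, &argv, NULL, NULL);
>
>     /* ... build the mesh, assemble matrices, etc. ... */
>
>     PetscMemoryGetCurrentUsage(&rss);  /* resident memory of the whole process */
>     PetscMallocGetCurrentUsage(&mal);  /* bytes currently obtained via PetscMalloc() */
>     PetscPrintf(PETSC_COMM_WORLD, "process %g bytes, PetscMalloc()ed %g bytes\n",
>                 rss, mal);
>
>     PetscFinalize();
>     return 0;
>   }
>
>    (The "Maximum (over computational time)" lines correspond to
> PetscMemoryGetMaximumUsage() and PetscMallocGetMaximumUsage(); the former
> requires PetscMemorySetGetMaximumUsage() to have been called.)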
>
>   Barry
>
> >
> > Summary of Memory Usage in PETSc
> > Maximum (over computational time) process memory:        total 1.0930e+11 max 5.6928e+10 min 5.2376e+10
> > Current process memory:                                  total 3.1762e+09 max 2.8804e+09 min 2.9583e+08
> > Maximum (over computational time) space PetscMalloc()ed: total 3.0071e+10 max 1.5286e+10 min 1.4785e+10
> > Current space PetscMalloc()ed:                           total 1.5453e+05 max 7.7264e+04 min 7.7264e+04
> >
> >
> >
> > On Wed, Jun 8, 2016 at 1:40 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> > > On Jun 8, 2016, at 1:30 PM, Xujun Zhao <xzhao99 at gmail.com> wrote:
> > >
> > > A quick test of a smaller system on my laptop with a 25x25x25 mesh gives
> > > the following info. The memory used keeps increasing from 1 to 3 CPUs,
> > > but slightly decreases with 4 CPUs.
> >
> >    yes this does not look problematic
> >
> > > On the other hand, the 60x60x60 mesh (2.9M dofs) is also not a big
> > > system...
> >
> >   True.
> >
> >    I think you need to run the 60x60x60 system also on 1, 2, 4, and 8
> > processes to see how the memory usage trends. I don't think we should
> > eliminate memory as the culprit yet.
> >
> >
> >   Barry
> >
> > >
> > >
> > > ------------------------------------------------- 1 CPU -------------------------------------------------
> > > Summary of Memory Usage in PETSc
> > > Maximum (over computational time) process memory:        total 4.7054e+09 max 4.7054e+09 min 4.7054e+09
> > > Current process memory:                                  total 4.7054e+09 max 4.7054e+09 min 4.7054e+09
> > > Maximum (over computational time) space PetscMalloc()ed: total 1.6151e+09 max 1.6151e+09 min 1.6151e+09
> > > Current space PetscMalloc()ed:                           total 7.7232e+04 max 7.7232e+04 min 7.7232e+04
> > >
> > > ------------------------------------------------- 2 CPU -------------------------------------------------
> > > Summary of Memory Usage in PETSc
> > > Maximum (over computational time) process memory:        total 6.2389e+09 max 3.1275e+09 min 3.1113e+09
> > > Current process memory:                                  total 6.2389e+09 max 3.1275e+09 min 3.1113e+09
> > > Maximum (over computational time) space PetscMalloc()ed: total 2.1589e+09 max 1.1193e+09 min 1.0397e+09
> > > Current space PetscMalloc()ed:                           total 1.5446e+05 max 7.7232e+04 min 7.7232e+04
> > >
> > > ------------------------------------------------- 3 CPU -------------------------------------------------
> > > Summary of Memory Usage in PETSc
> > > Maximum (over computational time) process memory:        total 7.7116e+09 max 1.9572e+09 min 1.8715e+09
> > > Current process memory:                                  total 7.7116e+09 max 1.9572e+09 min 1.8715e+09
> > > Maximum (over computational time) space PetscMalloc()ed: total 2.1754e+09 max 5.8450e+08 min 5.0516e+08
> > > Current space PetscMalloc()ed:                           total 3.0893e+05 max 7.7232e+04 min 7.7232e+04
> > >
> > > ------------------------------------------------- 4 CPU -------------------------------------------------
> > > Summary of Memory Usage in PETSc
> > > Maximum (over computational time) process memory:        total 7.1188e+09 max 2.4651e+09 min 2.2909e+09
> > > Current process memory:                                  total 7.1188e+09 max 2.4651e+09 min 2.2909e+09
> > > Maximum (over computational time) space PetscMalloc()ed: total 2.1750e+09 max 7.6982e+08 min 6.5289e+08
> > > Current space PetscMalloc()ed:                           total 2.3170e+05 max 7.7232e+04 min 7.7232e+04
> > >
> > >
> > > On Wed, Jun 8, 2016 at 11:51 AM, Barry Smith <bsmith at mcs.anl.gov>
> wrote:
> > >
> > >    Signal 9 SIGKILL on batch systems usually means the process was
> > > killed because it ran out of time or ran out of memory.
> > >
> > >     Perhaps there is something in the code that is unscalable and requires
> > > more memory with more processes. You can run on 1, 2, and 3 processes and
> > > measure the memory usage, for example with -memory_view, to see if it goes
> > > up with the number of processes.
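> > >
> > >     For example (the executable name below is just taken from the attached
> > > log file names, and the launcher on your system may be mpirun or srun
> > > instead of mpiexec):
> > >
> > >       mpiexec -n 1 ./ex01_validation_test -memory_view
> > >       mpiexec -n 2 ./ex01_validation_test -memory_view
> > >       mpiexec -n 3 ./ex01_validation_test -memory_view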
> > >
> > >   Barry
> > >
> > >
> > >
> > > > On Jun 8, 2016, at 11:41 AM, Xujun Zhao <xzhao99 at gmail.com> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I am running a FE Stokes solver with a Schur complement type PC on
> > > > blues. The program runs well when the mesh is 40x40x40 (0.88M dofs), but
> > > > with a 60x60x60 mesh it crashes with errors that look like a
> > > > segmentation fault. The "strange" thing is that it runs well with 1 or 2
> > > > CPUs but fails on 4 or 8 CPUs. The log files are attached. It seems the
> > > > global matrix and vector are assembled fine, and the errors come out
> > > > before KSPSolve() is called.
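> > > >
> > > > (By "Schur complement type PC" I mean a PETSc fieldsplit preconditioner
> > > > along the following lines; this is only a representative sketch of that
> > > > kind of setup, not necessarily the exact options used here:
> > > >
> > > >   -ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur \
> > > >     -pc_fieldsplit_schur_fact_type upper \
> > > >     -pc_fieldsplit_schur_precondition selfp \
> > > >     -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type gamg \
> > > >     -fieldsplit_1_ksp_type gmres -fieldsplit_1_pc_type jacobi )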
> > > >
> > > > btw, I use the recent PETSc 3.7 dbg version. For libMesh I have tried
> > > > both the dbg and opt versions, but neither gives useful information. Has
> > > > anyone met such a situation before? Many thanks.
> > > > <ex01_validation_test.o1386940><ex01_validation_test.o1387007>
> > >
> > >
> >
> >
>
>