[petsc-users] Memory optimization
Matthew Knepley
knepley at gmail.com
Mon Nov 25 11:25:07 CST 2019
On Mon, Nov 25, 2019 at 11:20 AM Perceval Desforges <
perceval.desforges at polytechnique.edu> wrote:
> Hi,
>
> So I'm loading two matrices from files, both 1000000 by 1000000. I ran
> the program with -mat_view ::ascii_info and I got:
>
> Mat Object: 1 MPI processes
> type: seqaij
> rows=1000000, cols=1000000
> total: nonzeros=7000000, allocated nonzeros=7000000
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times, and then
>
> Mat Object: 1 MPI processes
> type: seqaij
> rows=1000000, cols=1000000
> total: nonzeros=1000000, allocated nonzeros=1000000
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times as well, and then
>
> Mat Object: 1 MPI processes
> type: seqaij
> rows=1000000, cols=1000000
> total: nonzeros=7000000, allocated nonzeros=7000000
> total number of mallocs used during MatSetValues calls =0
> not using I-node routines
>
> 20 times as well before crashing.
>
> I realized it might be because I am setting up 20 krylov schur partitions
> which may be too much. I tried running the code again with only 2
> partitions and now the code runs but I have speed issues.
>
> I have one version of the code where my first matrix has 5 non-zero
> diagonals (so 5000000 non-zero entries), and the set up time is quite fast
> (8 seconds) and solving is also quite fast. The second version is the same
> but I have two extra non-zero diagonals (7000000 non-zero entries) and the
> set up time is a lot slower (2900 seconds ~ 50 minutes) and solving is also
> a lot slower. Is it normal that adding two extra diagonals increases solve
> and set up time so much?
>
I can't see the rest of your code, but I am guessing your preallocation
statement has "5", so it does no mallocs when you create
your first matrix, but mallocs for every row when you create your second
matrix. When you load them from disk, we do all the
preallocation correctly.
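
As a rough illustration of the guess above, here is a toy model in Python. It is not PETSc code: the function name count_growths and the grow-by-one-slot policy are assumptions made for the sketch (a real library grows storage in larger chunks), but it shows why a per-row estimate of 5 costs nothing for 5 diagonals and triggers extra allocations on every row for 7.

```python
# Toy model only, not PETSc. Each row reserves `prealloc` slots up front;
# every insertion that does not fit counts as one growth (allocate + copy).

def count_growths(n_rows, entries_per_row, prealloc):
    """Return the total number of growth events while filling the matrix."""
    growths = 0
    for _ in range(n_rows):
        capacity = prealloc
        used = 0
        for _ in range(entries_per_row):
            if used == capacity:
                # Row is full: storage would have to be enlarged here.
                capacity += 1
                growths += 1
            used += 1
    return growths


if __name__ == "__main__":
    # Small row count so the sketch runs instantly.
    print(count_growths(1000, 5, 5))  # 5 diagonals, estimate of 5
    print(count_growths(1000, 7, 5))  # 7 diagonals, estimate still 5
    print(count_growths(1000, 7, 7))  # 7 diagonals, estimate raised to 7
```

In PETSc itself the per-row estimate is the nz argument of MatSeqAIJSetPreallocation(); if the matrix is indeed created that way, raising it from 5 to 7 would be the corresponding change.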
Thanks,
Matt
> Thanks again,
>
> Best regards,
>
> Perceval,
>
>
>
> Then I guess it is the factorization that is failing. How many nonzero
> entries do you have? Run with
> -mat_view ::ascii_info
>
> Jose
>
>
> On 22 Nov 2019, at 19:56, Perceval Desforges <
> perceval.desforges at polytechnique.edu> wrote:
>
> Hi,
>
> Thanks for your answer. I tried looking at the inertias before solving,
> but the problem is that the program crashes when I call EPSSetUp with this
> error:
>
> slurmstepd: error: Step 2140.0 exceeded virtual memory limit (313526508 >
> 107317760), being killed
>
> I get this error even when there are no eigenvalues in the interval.
>
> I've started using BVMAT instead of BVVECS by the way.
>
> Thanks,
>
> Perceval,
>
>
>
>
>
> Don't use -mat_mumps_icntl_14 to reduce the memory used by MUMPS.
>
> Most likely the problem is that the interval you gave is too large and
> contains too many eigenvalues (SLEPc needs to allocate at least one vector
> per eigenvalue). You can count the eigenvalues in the interval with
> the inertias, which are available at EPSSetUp (no need to call EPSSolve).
> See this example:
>
> http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html
> You can comment out the call to EPSSolve() and run with the option
> -show_inertias
> For example, the output
> Shift 0.1 Inertia 3
> Shift 0.35 Inertia 11
> means that the interval [0.1,0.35] contains 8 eigenvalues (=11-3).
>
> By the way, I would suggest using BVMAT instead of BVVECS (the latter is
> slower).
>
> Jose
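
The inertia arithmetic in the message above can be sketched in a few lines of Python. This is only an illustration: the function name eigenvalue_counts is hypothetical, and the (shift, inertia) pairs are assumed to have been copied by hand from the -show_inertias output.

```python
def eigenvalue_counts(shifts_and_inertias):
    """Given (shift, inertia) pairs, return (left, right, count) for each
    pair of consecutive shifts, where count = inertia(right) - inertia(left)."""
    pairs = sorted(shifts_and_inertias)
    counts = []
    for (a, inertia_a), (b, inertia_b) in zip(pairs, pairs[1:]):
        counts.append((a, b, inertia_b - inertia_a))
    return counts


if __name__ == "__main__":
    # Values taken from the example output quoted above.
    print(eigenvalue_counts([(0.1, 3), (0.35, 11)]))  # [(0.1, 0.35, 8)]
```

With more than two shifts, the same differences give the number of eigenvalues in each subinterval, which helps judge whether the requested interval is too wide.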
>
>
> On 21 Nov 2019, at 18:13, Perceval Desforges via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> Hello all,
>
> I am trying to obtain all the eigenvalues in a certain interval for a
> fairly large matrix (1000000 * 1000000). I therefore use the spectrum
> slicing method detailed in section 3.4.5 of the manual. The calculations
> are run on a processor with 20 cores and 96 GB of RAM.
>
> The options I use are :
>
> -bv_type vecs -eps_krylovschur_detect_zeros 1 -mat_mumps_icntl_13 1
> -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12
>
>
>
> However the program quickly crashes with this error:
>
> slurmstepd: error: Step 2115.0 exceeded virtual memory limit (312121084 >
> 107317760), being killed
>
> I've tried reducing the amount of memory used by slepc with the
> -mat_mumps_icntl_14 option by setting it at -70 for example but then I get
> this error:
>
> [1]PETSC ERROR: Error in external library
> [1]PETSC ERROR: Error reported by MUMPS in numerical factorization phase:
> INFOG(1)=-9, INFO(2)=82733614
>
> which is an error due to setting the mumps icntl option so low from what
> I've gathered.
>
> Is there any other way I can reduce memory usage?
>
>
>
> Thanks,
>
> Regards,
>
> Perceval,
>
>
>
> P.S. I sent the same email a few minutes ago but I think I made a mistake
> in the address, I'm sorry if I've sent it twice.
>
>
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>