[petsc-users] MUMPS error and superLU error
venkatesh g
venkateshgk.j at gmail.com
Mon Jun 22 10:57:39 CDT 2015
Hi,
As you suggested, I have restructured my eigenvalue problem by recasting the
governing equations in a different form to address the singularity of B.
Now B is no longer singular: both A and B are invertible in Ax = lambda Bx.
However, I still get an error from MUMPS because it uses a large amount of
memory (the error log is attached).
I gave the command: aprun -n 240 -N 24 ./ex7 -f1 A100t -f2 B100t -st_type
sinvert -eps_target 0.01 -st_ksp_type preonly -st_pc_type lu
-st_pc_factor_mat_solver_package mumps -mat_mumps_cntl_1 1e-5
-mat_mumps_icntl_4 2 -evecs v100t
The matrix A is about 60% zeros.
Kindly help me.
Venkatesh
On Sun, May 31, 2015 at 8:04 PM, Hong <hzhang at mcs.anl.gov> wrote:
> venkatesh,
>
> As we discussed previously, both MUMPS and SuperLU_DIST failed even on
> smaller problems, with MUMPS giving an "OOM" (out-of-memory) error during
> numerical factorization.
>
> You acknowledged that B is singular, which means your eigenvalue problem may
> need additional reformulation. The option '-st_type sinvert' likely uses
> B^{-1} (have you read the SLEPc manual?), which could be the source of the
> trouble.
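>
> As a quick sketch of why this matters (from my memory of the SLEPc users
> manual, so please double-check there): for Ax = lambda Bx, the spectral
> transforms work with operators such as B^{-1}(A - sigma*B) for shift, or
> (A - sigma*B)^{-1} B with theta = 1/(lambda - sigma) for shift-and-invert.
> A singular B gives the pencil (A,B) infinite eigenvalues, so the
> factorization and/or the transformed problem can become ill-posed.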
>
> Please investigate your model and understand why B is singular, and whether
> there is a way to deal with its null space, before submitting a large
> simulation.
>
> Hong
>
>
> On Sun, May 31, 2015 at 8:36 AM, Dave May <dave.mayhem23 at gmail.com> wrote:
>
>> It failed due to a lack of memory. "OOM" stands for "out of memory"; "OOM
>> killer terminated this process" means your job ran out of memory.
>>
>> On Sunday, 31 May 2015, venkatesh g <venkateshgk.j at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I tried to run my generalized eigenproblem on 120 x 24 = 2880 cores.
>>> The binary files are 20 GB for A and 5 GB for B.
>>>
>>> It was killed after 7 hours of run time. Please see the MUMPS error log.
>>> Why does it fail?
>>> I gave the command:
>>>
>>> aprun -n 240 -N 24 ./ex7 -f1 a110t -f2 b110t -st_type sinvert -eps_nev 1
>>> -log_summary -st_ksp_type preonly -st_pc_type lu
>>> -st_pc_factor_mat_solver_package mumps -mat_mumps_cntl_1 1e-2
>>>
>>> Kindly let me know.
>>>
>>> cheers,
>>> Venkatesh
>>>
>>> On Fri, May 29, 2015 at 10:46 PM, venkatesh g <venkateshgk.j at gmail.com>
>>> wrote:
>>>
>>>> Hi Matt, users,
>>>>
>>>> Thanks for the info. Do you also use PETSc and SLEPc with MUMPS? I get a
>>>> segmentation error if I increase my matrix size.
>>>>
>>>> Can you suggest other software for a parallel direct QR solver, since LU
>>>> may not be suitable for a singular B matrix in Ax = lambda Bx? I am
>>>> attaching the MUMPS log from the working run.
>>>>
>>>> My matrix size here is around 47000x47000. If I am not wrong, the
>>>> memory usage per core is 272MB.
>>>>
>>>> Can you tell me if I am wrong, or whether it really is that light on
>>>> memory for this matrix?
>>>>
>>>> Thanks
>>>> cheers,
>>>> Venkatesh
>>>>
>>>> On Fri, May 29, 2015 at 4:00 PM, Matt Landreman <
>>>> matt.landreman at gmail.com> wrote:
>>>>
>>>>> Dear Venkatesh,
>>>>>
>>>>> As you can see in the error log, you are now getting a segmentation
>>>>> fault, which is almost certainly a separate issue from the INFO(1)=-9
>>>>> memory problem you had previously. Here is one idea which may or may not
>>>>> help. I've used MUMPS on the NERSC Edison system, and I found that I
>>>>> sometimes got segmentation faults when using the default Intel compiler.
>>>>> When I switched to the Cray compiler the problem disappeared. So you could
>>>>> perhaps try a different compiler if one is available on your system.
>>>>>
>>>>> Matt
>>>>> On May 29, 2015 4:04 AM, "venkatesh g" <venkateshgk.j at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Matt,
>>>>>>
>>>>>> I did what you suggested and read the manual section on the CNTL
>>>>>> parameters. With CNTL(1)=1e-4 it is working.
>>>>>>
>>>>>> But that was a test matrix of size 46000x46000. The actual matrix size
>>>>>> is 108900x108900 and will increase in the future.
>>>>>>
>>>>>> For the larger problem I get a memory allocation failure. The binary
>>>>>> files are 20 GB for A and 5 GB for B.
>>>>>>
>>>>>> I submitted this on 240 processors with 4 GB RAM each, and also on 128
>>>>>> processors with 512 GB RAM in total.
>>>>>>
>>>>>> In both cases it fails with the error below, saying that memory is not
>>>>>> sufficient. Yet a 90000x90000 case ran serially in MATLAB with < 256 GB
>>>>>> RAM.
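>>>>>>
>>>>>> (For reference, that is 240 x 4 GB = 960 GB of total memory in the first
>>>>>> case and 512 GB of total memory in the second.)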
>>>>>>
>>>>>> Kindly let me know.
>>>>>>
>>>>>> Venkatesh
>>>>>>
>>>>>> On Tue, May 26, 2015 at 8:02 PM, Matt Landreman <
>>>>>> matt.landreman at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Venkatesh,
>>>>>>>
>>>>>>> I've struggled a bit with MUMPS memory allocation too. I think the
>>>>>>> behavior of MUMPS is roughly the following. First, in the "analysis
>>>>>>> step", MUMPS computes the minimum memory required based on the structure
>>>>>>> of nonzeros in the matrix. Then, when it actually goes to factorize the
>>>>>>> matrix, if it ever encounters an element smaller than CNTL(1)
>>>>>>> (default 0.01) on the diagonal of a sub-matrix it is trying to
>>>>>>> factorize, it modifies the ordering to avoid the small pivot, which
>>>>>>> increases the fill-in (hence the memory needed). ICNTL(14) sets the
>>>>>>> margin allowed for this unanticipated fill-in. Setting ICNTL(14)=200000
>>>>>>> as in your email is not the solution, since this means MUMPS asks for a
>>>>>>> huge amount of memory at the start. Better would be to lower CNTL(1) or
>>>>>>> (I think) use static pivoting (CNTL(4)). Read the section in the MUMPS
>>>>>>> manual about these CNTL parameters. I typically set CNTL(1)=1e-6, which
>>>>>>> eliminated all the INFO(1)=-9 errors for my problem, without having to
>>>>>>> modify ICNTL(14).
>>>>>>>
>>>>>>> Also, I recommend running with ICNTL(4)=3 to display diagnostics.
>>>>>>> Look for the line in standard output that says "TOTAL space in MBYTES
>>>>>>> for IC factorization". This is the amount of memory that MUMPS is trying
>>>>>>> to allocate, and for the default ICNTL(14) it should be similar to what
>>>>>>> MATLAB needs.
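>>>>>>>
>>>>>>> For example (just a sketch based on the command in your earlier mail, so
>>>>>>> adjust the paths, matrix files, and process counts to your actual run),
>>>>>>> these settings can be passed through PETSc's runtime options like:
>>>>>>>
>>>>>>> mpiexec -np 64 ./ex7 -f1 a72t -f2 b72t -st_type sinvert -eps_nev 3
>>>>>>> -eps_target 0.5 -st_ksp_type preonly -st_pc_type lu
>>>>>>> -st_pc_factor_mat_solver_package mumps -mat_mumps_cntl_1 1e-6
>>>>>>> -mat_mumps_icntl_4 3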
>>>>>>>
>>>>>>> Hope this helps,
>>>>>>> -Matt Landreman
>>>>>>> University of Maryland
>>>>>>>
>>>>>>> On Tue, May 26, 2015 at 10:03 AM, venkatesh g <
>>>>>>> venkateshgk.j at gmail.com> wrote:
>>>>>>>
>>>>>>>> I posted a while ago on the MUMPS forum, but no one seems to reply.
>>>>>>>>
>>>>>>>> I am solving a large generalized eigenvalue problem.
>>>>>>>>
>>>>>>>> I get the error attached below after giving the command:
>>>>>>>>
>>>>>>>> /cluster/share/venkatesh/petsc-3.5.3/linux-gnu/bin/mpiexec -np 64
>>>>>>>> -hosts compute-0-4,compute-0-6,compute-0-7,compute-0-8 ./ex7 -f1 a72t -f2
>>>>>>>> b72t -st_type sinvert -eps_nev 3 -eps_target 0.5 -st_ksp_type preonly
>>>>>>>> -st_pc_type lu -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_14
>>>>>>>> 200000
>>>>>>>>
>>>>>>>> It is impossible to allocate so much memory per processor; it is
>>>>>>>> asking for around 70 GB per processor.
>>>>>>>>
>>>>>>>> A serial job in MATLAB for the same matrices takes < 60GB.
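>>>>>>>>
>>>>>>>> (For a rough comparison: 70 GB per processor across 64 processors is
>>>>>>>> on the order of 4.5 TB in total, versus less than 60 GB for the serial
>>>>>>>> MATLAB run on the same matrices.)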
>>>>>>>>
>>>>>>>> I also tried SuperLU_DIST and have attached its error as well (a
>>>>>>>> segmentation fault).
>>>>>>>>
>>>>>>>> Kindly help me.
>>>>>>>>
>>>>>>>> Venkatesh
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>
-------------- next part --------------
Generalized eigenproblem stored in file.
Reading COMPLEX matrices from binary files...
Entering ZMUMPS driver with JOB, N, NZ = 1 50000 0
ZMUMPS 4.10.0
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
** Max-trans not allowed because matrix is distributed
... Structural symmetry (in percent)= 86
Density: NBdense, Average, Median = 0 16351 20109
Ordering based on METIS
A root of estimated size 21001 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 1314700738
-- (3) Storage of factors (REAL, estimated) = 1416564824
-- (4) Storage of factors (INT , estimated) = 10444785
-- (5) Maximum frontal size (estimated) = 21500
-- (6) Number of nodes in the tree = 240
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL(6) Maximum transversal option = 0
ICNTL(7) Pivot order option = 7
Percentage of memory relaxation (effective) = 35
Number of level 2 nodes = 139
Number of split nodes = 2
RINFOG(1) Operations during elimination (estim)= 2.341D+13
Distributed matrix entry format (ICNTL(18)) = 3
** Rank of proc needing largest memory in IC facto : 30
** Estimated corresponding MBYTES for IC facto : 21593
** Estimated avg. MBYTES per work. proc at facto (IC) : 7708
** TOTAL space in MBYTES for IC factorization : 1850075
** Rank of proc needing largest memory for OOC facto : 30
** Estimated corresponding MBYTES for OOC facto : 21681
** Estimated avg. MBYTES per work. proc at facto (OOC) : 7782
** TOTAL space in MBYTES for OOC factorization : 1867805
Entering ZMUMPS driver with JOB, N, NZ = 2 50000 716459748
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
NUMBER OF WORKING PROCESSES = 240
OUT-OF-CORE OPTION (ICNTL(22)) = 0
REAL SPACE FOR FACTORS = 1416564824
INTEGER SPACE FOR FACTORS = 10444785
MAXIMUM FRONTAL SIZE (ESTIMATED) = 21500
NUMBER OF NODES IN THE TREE = 240
Convergence error after scaling for ONE-NORM (option 7/8) = 0.59D+01
Maximum effective relaxed size of S = 1181811925
Average effective relaxed size of S = 228990839
REDISTRIB: TOTAL DATA LOCAL/SENT = 328575589 1437471711
GLOBAL TIME FOR MATRIX DISTRIBUTION = 206.6792
** Memory relaxation parameter ( ICNTL(14) ) : 35
** Rank of processor needing largest memory in facto : 30
** Space in MBYTES used by this processor for facto : 21593
** Avg. Space in MBYTES per working proc during facto : 7708
[NID 01360] 2015-06-22 20:00:41 Apid 432433: initiated application termination
[NID 01360] 2015-06-22 19:59:18 Apid 432433: OOM killer terminated this process.
Application 432433 exit signals: Killed
Application 432433 resources: utime ~0s, stime ~20s, Rss ~7716, inblocks ~16040, outblocks ~2380