[petsc-users] slepc eating all my ram

Barry Smith bsmith at mcs.anl.gov
Sat Jul 16 10:00:58 CDT 2016


  Send configure.log to petsc-maint at mcs.anl.gov


> On Jul 16, 2016, at 8:40 AM, Simon Burton <simon at arrowtheory.com> wrote:
> 
> 
> Hi again,
> 
> I found another machine with enough RAM to run this (I think).
> 
> Running into another problem now, with dgemv:
> 
> [0] EPSSetUp_Power(): Warning: parameter mpd ignored
> [0] STSetUp(): Setting up new ST
> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV .
> [0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used
> 
> 
> I dug into this in gdb a bit:
> 
> 
> Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ ()
>   from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so
> (gdb) bt
> #0  0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so
> #1  0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, 
>    y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274
> #2  0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0)
>    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150
> #3  0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0)
>    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191
> #4  0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, 
>    norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81
> #5  0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac)
>    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214
> #6  0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac)
>    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371
> #7  0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0)
>    at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758
> #8  0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103
> #9  0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101
> #10 0x0000000000401430 in main ()
> (gdb) up
> #1  0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, 
>    y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274
> 274	    if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one));
> (gdb) print n
> $1 = 4294967296
> (gdb) print sizeof(n)
> $2 = 8
> (gdb) step
> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV .
> 
> 
> It looks to me like SLEPc is doing it right, but with error messages
> like this, who knows. Debugging at the assembly level is a bit beyond me.
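
An aside on the error above, offered as a sketch rather than a confirmed diagnosis. The library in the backtrace is libmkl_intel_lp64.so, and MKL's LP64 interface takes 32-bit integers, while gdb shows n as an 8-byte value equal to 2^32. If the callee keeps only the low 32 bits, both the row count and the leading dimension (parameter 6 of DGEMV) arrive as 0, and the reference BLAS argument check LDA >= max(1, M) then fails. The helper names below are hypothetical and only model that check; nothing here calls MKL.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the value a routine expecting a 32-bit integer
 * would see if it kept only the low 32 bits of a 64-bit argument
 * (an assumption about how the mismatch plays out). */
static int32_t low32_as_blas_int(int64_t v)
{
    uint32_t low = (uint32_t)((uint64_t)v & 0xFFFFFFFFu);
    int32_t out;
    memcpy(&out, &low, sizeof out);   /* reinterpret the bits, no value conversion */
    return out;
}

/* Models the reference BLAS DGEMV check on parameter 6:
 * LDA must be at least max(1, M). Returns 1 if valid, 0 otherwise. */
static int dgemv_lda_is_valid(int32_t m, int32_t lda)
{
    int32_t min_lda = (m > 1) ? m : 1;
    return lda >= min_lda;
}
```

With the value printed by gdb, low32_as_blas_int(4294967296) is 0, and dgemv_lda_is_valid(0, 0) is false, which would be consistent with "Parameter 6 was incorrect". A 64-bit-integer (ILP64) BLAS interface would not truncate the value this way.
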
> 
> Originally I built PETSc with --download-fblaslapack, but I don't think
> it was working with 64-bit indices (?)
> 
> Maybe I should try another BLAS.
> 
> Simon.
> 
> 
> On Sat, 16 Jul 2016 07:17:44 +1000
> Simon Burton <simon at arrowtheory.com> wrote:
> 
>> On Fri, 15 Jul 2016 19:53:31 +0200
>> "Jose E. Roman" <jroman at dsic.upv.es> wrote:
>> 
>>> 
>>> The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors?
>> 
>> Yes I think you are right.
>> I can get beyond STSetUp with the right settings.
>> Now the solver runs out of memory inside EPSGetStartVector.
>> 
>>> 
>>> Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ?
>> 
>> Good question. I didn't change much; let me try the original again.
>> 
>>> Do you get the same behaviour with the original ex3 with the same problem size?
>> 
>> Yes
>> 
>>> 
>>> Do you have the same problem with a smaller problem? (half size, say)
>> 
>> Halving n gives a quarter of the dimension, which means 8 GB vectors.
>> It works fine and uses a total of 48 GB of RAM. Oh, I see that at one point
>> during initialization it hits a maximum of 56 GB.
>> 
>> So I guess it needs to keep 6 vectors in total.
>> With the original problem size this becomes 192 GB, which is
>> just a few GB too much to crunch. I guess I can still try it,
>> but it doesn't feel good hitting the hard drive that much.
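
The arithmetic in the paragraph above can be checked with a short sketch. It assumes double-precision entries (8 bytes each), takes the count of 6 vectors as the estimate given in the message rather than a documented figure, and assumes the vector length is the 4294967296 (2^32) that appears as n_ in the backtrace. The function names are hypothetical.

```c
#include <stdint.h>

/* Size in GiB of one vector holding `dim` double-precision entries. */
static uint64_t vector_gib(uint64_t dim)
{
    return dim * sizeof(double) / (UINT64_C(1) << 30);
}

/* Total GiB needed to hold `count` such vectors at once. */
static uint64_t total_gib(uint64_t dim, unsigned count)
{
    return (uint64_t)count * vector_gib(dim);
}
```

Under those assumptions, the quarter-size problem (2^30 entries) gives 8 per vector and 48 for 6 vectors, and the full-size problem (2^32 entries) gives 32 per vector and 192 in total, matching the figures quoted.
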
>> 
>> Thanks for the suggestions.
>> 
>> Simon.


