[petsc-users] slepc eating all my ram

Simon Burton simon at arrowtheory.com
Sat Jul 16 08:40:24 CDT 2016


Hi again,

I found another machine with enough RAM to run this (I think).

Running into another problem now, with dgemv:

[0] EPSSetUp_Power(): Warning: parameter mpd ignored
[0] STSetUp(): Setting up new ST
Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV .
[0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used


I dug into this in gdb a bit:


Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ ()
   from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so
(gdb) bt
#0  0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so
#1  0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, 
    y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274
#2  0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0)
    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150
#3  0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0)
    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191
#4  0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, 
    norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81
#5  0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac)
    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214
#6  0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac)
    at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371
#7  0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0)
    at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758
#8  0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103
#9  0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101
#10 0x0000000000401430 in main ()
(gdb) up
#1  0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, 
    y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274
274	    if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one));
(gdb) print n
$1 = 4294967296
(gdb) print sizeof(n)
$2 = 8
(gdb) step
Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV .


It looks to me like SLEPc is doing it right, but with error messages
like this, who knows. Debugging assembly is a bit beyond me.

Originally I built PETSc with --download-fblaslapack, but I don't think
it was working with 64-bit indices (?)

Maybe I should try another BLAS.
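
One possible reading of the trace above, offered as a sketch rather than
a diagnosis: the library in frame #0 is libmkl_intel_lp64.so, and MKL's
LP64 interface takes 32-bit integer arguments, whereas n here is 8 bytes
wide and holds 4294967296 = 2^32. Parameter 6 of DGEMV is LDA, which is
&n in the call on line 274 of bvblas.c. If the callee reads only the low
32 bits it sees 0, and the reference BLAS requires lda >= max(1, m). The
helper names below are hypothetical, a little-endian machine is assumed,
and nothing here calls MKL itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: the value a routine expecting a 32-bit integer
   would read from the address of a 64-bit integer on a little-endian
   machine, i.e. the low 32 bits. */
static int32_t seen_as_int32(int64_t v)
{
    uint32_t low = (uint32_t)((uint64_t)v & 0xFFFFFFFFu);
    int32_t out;
    memcpy(&out, &low, sizeof out);
    return out;
}

/* Hypothetical helper mirroring the reference DGEMV argument check for
   parameter 6 (LDA): lda must be at least max(1, m). */
static int lda_passes_check(int32_t m, int32_t lda)
{
    int32_t min_lda = (m > 1) ? m : 1;
    return lda >= min_lda;
}

#ifdef LP64_DEMO_MAIN
int main(void)
{
    int64_t n = 4294967296LL;            /* value printed by gdb */
    int32_t seen = seen_as_int32(n);     /* what a 32-bit reader gets */
    printf("n = %lld, low 32 bits = %d, lda check passes: %d\n",
           (long long)n, (int)seen, lda_passes_check(seen, seen));
    return 0;
}
#endif
```

Compile with -DLP64_DEMO_MAIN to get the small driver. If this reading
is right, the choice of BLAS matters less than whether its integer width
matches the width PETSc was configured to pass.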

Simon.


On Sat, 16 Jul 2016 07:17:44 +1000
Simon Burton <simon at arrowtheory.com> wrote:

> On Fri, 15 Jul 2016 19:53:31 +0200
> "Jose E. Roman" <jroman at dsic.upv.es> wrote:
> 
> > 
> > The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors?
> 
> Yes I think you are right.
> I can get beyond STSetUp with the right settings.
> Now the solver runs out of memory inside EPSGetStartVector.
> 
> > 
> > Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ?
> 
> Good question. I didn't change much; let me try the original again.
> 
> > Do you get the same behaviour with the original ex3 with the same problem size?
> 
> Yes
> 
> > 
> > Do you have the same problem with a smaller problem? (half size, say)
> 
> Halving n gives a quarter of the dimension, which means 8 GB vectors.
> It works fine and uses a total of 48 GB of RAM. Oh, I see that at one point
> during initialization it hits a maximum of 56 GB.
> 
> So I guess it needs to keep 6 vectors in total.
> With the original problem size this becomes 192 GB, which is
> just a few GB too much to crunch. I guess I can still try it,
> but it doesn't feel good hitting the hard drive that much.
> 
> Thanks for the suggestions.
> 
> Simon.
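
The memory figures in the quoted message can be checked with a little
arithmetic. This is a sketch under stated assumptions: the full
dimension is taken to be 2^32 (the n_ value gdb shows above), scalars
are real 8-byte doubles, "gb" is read as GiB, and the count of six
vectors is the estimate from the quoted message, not a number reported
by SLEPc. The function names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes for one vector of `dim` entries, assuming 8-byte real scalars. */
static uint64_t vec_bytes(uint64_t dim)
{
    return dim * 8u;
}

/* Whole GiB needed to hold `nvecs` vectors of dimension `dim`. */
static uint64_t total_gib(uint64_t dim, unsigned nvecs)
{
    return (vec_bytes(dim) * nvecs) >> 30;
}

#ifdef MEM_DEMO_MAIN
int main(void)
{
    uint64_t full = 1ULL << 32;      /* assumed full problem dimension */
    uint64_t quarter = full / 4;     /* halving n quarters the dimension */
    printf("quarter size: %llu GiB per vector, %llu GiB for 6\n",
           (unsigned long long)total_gib(quarter, 1),
           (unsigned long long)total_gib(quarter, 6));
    printf("full size:    %llu GiB per vector, %llu GiB for 6\n",
           (unsigned long long)total_gib(full, 1),
           (unsigned long long)total_gib(full, 6));
    return 0;
}
#endif
```

Under the same assumptions, the 56 GB peak seen during initialization
corresponds to seven vectors' worth of storage at the quarter size.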

