[petsc-users] slepc eating all my ram
Jose E. Roman
jroman at dsic.upv.es
Sun Jul 17 12:40:52 CDT 2016
Simon:
I have made a few optimizations to memory management in EPS. In your case, these changes will allocate one fewer vector (maybe two). If you are using the repository version, just pull and try again. Otherwise, wait until slepc-3.7.2 is released (in a few days).
Jose
> On 16 Jul 2016, at 17:00, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
> Send configure.log to petsc-maint at mcs.anl.gov
>
>
>> On Jul 16, 2016, at 8:40 AM, Simon Burton <simon at arrowtheory.com> wrote:
>>
>>
>> Hi again,
>>
>> I found another machine with enough RAM to run this (I think).
>>
>> Running into another problem now, with dgemv:
>>
>> [0] EPSSetUp_Power(): Warning: parameter mpd ignored
>> [0] STSetUp(): Setting up new ST
>> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV .
>> [0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used
>>
>>
>> I dug into this in gdb a bit:
>>
>>
>> Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ ()
>> from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so
>> (gdb) bt
>> #0 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so
>> #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650,
>> y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274
>> #2 0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0)
>> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150
>> #3 0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0)
>> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191
>> #4 0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28,
>> norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81
>> #5 0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac)
>> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214
>> #6 0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac)
>> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371
>> #7 0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0)
>> at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758
>> #8 0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103
>> #9 0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101
>> #10 0x0000000000401430 in main ()
>> (gdb) up
>> #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650,
>> y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274
>> 274 if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one));
>> (gdb) print n
>> $1 = 4294967296
>> (gdb) print sizeof(n)
>> $2 = 8
>> (gdb) step
>> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV .
>>
>>
>> It looks to me like SLEPc is doing the right thing, but with error messages
>> like this, who knows. Debugging assembly is a bit beyond me.
>>
>> Originally I built PETSc with --download-fblaslapack, but I don't think
>> it was working with 64-bit indices (?)
>>
>> Maybe I should try another BLAS.
>>
>> Simon.
>>
>>
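Note on the DGEMV error above: one plausible reading, not confirmed anywhere in this thread, is an integer-width mismatch. gdb shows n as an 8-byte integer holding 4294967296 (2^32), while the backtrace shows dgemv_ coming from libmkl_intel_lp64.so, and MKL's LP64 interface takes 32-bit integers. A 32-bit read of 2^32 on a little-endian machine yields 0, so the leading dimension (argument 6 of DGEMV) would arrive as 0, which reference DGEMV rejects. The C sketch below models this. Both function names are hypothetical, and the check reproduces only the m, n, and lda tests of reference DGEMV, not MKL's actual code.

```c
#include <stdint.h>

/* Hypothetical model: a BLAS built with 32-bit integers that is handed a
   pointer to a 64-bit integer sees only the low 32 bits (little-endian). */
static int32_t seen_by_lp64_blas(int64_t value)
{
  return (int32_t)(uint32_t)((uint64_t)value & 0xFFFFFFFFu);
}

/* Partial argument check patterned on reference DGEMV.
   Returns the 1-based position of the first invalid argument among
   m (2), n (3), and lda (6), or 0 if all three are acceptable.
   The trans, incx, and incy checks are omitted from this sketch. */
static int dgemv_first_bad_param(int32_t m, int32_t n, int32_t lda)
{
  if (m < 0) return 2;
  if (n < 0) return 3;
  if (lda < (m > 1 ? m : 1)) return 6; /* lda must be >= max(1, m) */
  return 0;
}
```

The call at bvblas.c:274 passes n as both the row count and the leading dimension, and k as the column count. Under this model seen_by_lp64_blas(4294967296) is 0 and seen_by_lp64_blas(1) is 1, so the library would see m = 0, n = 1, lda = 0, and dgemv_first_bad_param(0, 1, 0) returns 6, matching the "Parameter 6" message. If this reading is right, the BLAS integer width (LP64 versus ILP64) is the first thing to check in configure.log.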
>> On Sat, 16 Jul 2016 07:17:44 +1000
>> Simon Burton <simon at arrowtheory.com> wrote:
>>
>>> On Fri, 15 Jul 2016 19:53:31 +0200
>>> "Jose E. Roman" <jroman at dsic.upv.es> wrote:
>>>
>>>>
>>>> The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors?
>>>
>>> Yes I think you are right.
>>> I can get beyond STSetUp with the right settings.
>>> Now the solver runs out of memory inside EPSGetStartVector.
>>>
>>>>
>>>> Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ?
>>>
>>> Good question. I didn't change much; let me try the original again.
>>>
>>>> Do you get the same behaviour with the original ex3 with the same problem size?
>>>
>>> Yes
>>>
>>>>
>>>> Do you have the same problem with a smaller problem? (half size, say)
>>>
>>> Halving n gives a quarter of the dimension, so each vector is 8 GB.
>>> It works fine and uses a total of 48 GB of RAM. Oh, I see that at one point during
>>> initialization it hits a maximum of 56 GB.
>>>
>>> So I guess it needs to keep 6 vectors in total.
>>> With the original problem size this becomes 192 GB, which is
>>> just a few GB too much to crunch. I guess I can still try it,
>>> but it doesn't feel good hitting the hard drive that much.
>>>
>>> Thanks for the suggestions.
>>>
>>> Simon.
>
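The memory estimate in the oldest quoted message can be checked with a little arithmetic. The C sketch below uses hypothetical helper names and assumes, as that message does, that memory use is dominated by full-length vectors and that the full problem has 4 times the dimension of the reduced one.

```c
#include <stdint.h>

/* Hypothetical helper: how many whole vectors of vector_gb gigabytes
   account for a total footprint of total_gb gigabytes. */
static uint64_t vectors_held(uint64_t total_gb, uint64_t vector_gb)
{
  return total_gb / vector_gb;
}

/* Hypothetical helper: footprint in GB of `count` vectors after the
   problem dimension is multiplied by dim_factor. */
static uint64_t scaled_total_gb(uint64_t vector_gb, uint64_t dim_factor,
                                uint64_t count)
{
  return vector_gb * dim_factor * count;
}
```

Here vectors_held(48, 8) gives 6, and scaled_total_gb(8, 4, 6) gives 192, the figure quoted above. If the memory optimization mentioned at the top of the thread removes one or two vectors, the same formula gives 160 GB or 128 GB. That extrapolation is an estimate, not a measurement.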