[petsc-users] Error with SuperLU_DIST (mkl related?)
Eric Chamberland
Eric.Chamberland at giref.ulaval.ca
Sat Dec 31 12:10:34 CST 2016
Hi,
OK, I will test with 5.1.3 using the option you gave me
(--download-superlu_dist-commit=v5.1.3).
But from what you and Matthew said, I should get 5.1.3 with
petsc-master, yet last night's log shows a library file name with 5.1.0:
http://www.giref.ulaval.ca/~cmpgiref/petsc-master-debug/2016.12.31.02h00m01s_configure.log
So I am a bit confused: why did I get 5.1.0 last night? (I use the
petsc-master tarball; is that the reason?)
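
One rough way to check which SuperLU_DIST versions a configure log mentions is to scan it for version strings. The sketch below is only an illustration: the function name superlu_dist_versions is hypothetical, and the pattern assumes the version follows "superlu_dist" in file or tag names (for example "superlu_dist_5.1.0.tar.gz" or "superlu_dist-v5.1.3"), which may not match how configure.log actually records it.

```python
import re

# Assumption: the version number directly follows "superlu_dist" in archive,
# directory, or tag names. The real naming in configure.log may differ.
VERSION_RE = re.compile(r"superlu_dist[-_]?v?(\d+\.\d+\.\d+)", re.IGNORECASE)


def superlu_dist_versions(log_text):
    """Return the sorted, de-duplicated version strings found in log_text."""
    return sorted(set(VERSION_RE.findall(log_text)))
```

Calling superlu_dist_versions() on the text of a saved configure.log would list every distinct version it mentions; more than one entry would suggest that an older download is still being picked up.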
Thanks,
Eric
On 2016-12-31 at 11:52, Satish Balay wrote:
> On Sat, 31 Dec 2016, Eric Chamberland wrote:
>
>> Hi,
>>
>> I am just starting to debug a bug encountered with, and only with, SuperLU_DIST
>> combined with MKL on a 2-process validation test.
>>
>> (the same test works fine with MUMPS on 2 processes).
>>
>> I just noticed that the SuperLU_Dist version installed by PETSc configure
>> script is 5.1.0 and the latest SuperLU_DIST is 5.1.3.
> If you use petsc-master - it will install 5.1.3 by default.
>> Before going further, I just want to ask:
>>
>> Is there any specific reason to stick to 5.1.0?
> We don't usually upgrade external package versions in PETSc releases
> [unless it's tested to work and fixes known bugs]. There could be API
> changes - or build changes that can potentially conflict.
>
> From what I know - 5.1.3 should work with petsc-3.7 [it fixes a couple of bugs].
>
> You might be able to do the following with petsc-3.7 [with git externalpackage repos]
>
> --download-superlu_dist --download-superlu_dist-commit=v5.1.3
>
> Satish
>
>> Here is some more information:
>>
>> On process 2 I have this printed in stdout:
>>
>> Intel MKL ERROR: Parameter 6 was incorrect on entry to DTRSM .
>>
>> and in stderr:
>>
>> Test.ProblemeEFGen.opt: malloc.c:2369: sysmalloc: Assertion `(old_top ==
>> (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
>> (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size)
>> >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk,
>> fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) &&
>> ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 0)' failed.
>> [saruman:15771] *** Process received signal ***
>>
>> This is the 7th call to KSPSolve in the same execution. Here is the last
>> KSPView:
>>
>> KSP Object:(o_slin) 2 MPI processes
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>> left preconditioning
>> using NONE norm type for convergence test
>> PC Object:(o_slin) 2 MPI processes
>> type: lu
>> LU: out-of-place factorization
>> tolerance for zero pivot 2.22045e-14
>> matrix ordering: natural
>> factor fill ratio given 0., needed 0.
>> Factored matrix follows:
>> Mat Object: 2 MPI processes
>> type: mpiaij
>> rows=382, cols=382
>> package used to perform factorization: superlu_dist
>> total: nonzeros=0, allocated nonzeros=0
>> total number of mallocs used during MatSetValues calls =0
>> SuperLU_DIST run parameters:
>> Process grid nprow 2 x npcol 1
>> Equilibrate matrix TRUE
>> Matrix input mode 1
>> Replace tiny pivots FALSE
>> Use iterative refinement FALSE
>> Processors in row 2 col partition 1
>> Row permutation LargeDiag
>> Column permutation METIS_AT_PLUS_A
>> Parallel symbolic factorization FALSE
>> Repeated factorization SamePattern
>> linear system matrix = precond matrix:
>> Mat Object: (o_slin) 2 MPI processes
>> type: mpiaij
>> rows=382, cols=382
>> total: nonzeros=4458, allocated nonzeros=4458
>> total number of mallocs used during MatSetValues calls =0
>> using I-node (on process 0) routines: found 109 nodes, limit used is 5
>>
>> I know this information is not enough to help debug, but I would like to know
>> if PETSc guys will upgrade to 5.1.3 before trying to debug anything.
>>
>> Thanks,
>> Eric
>>
>>