[petsc-users] Error with SuperLU_DIST (mkl related?)

Matthew Knepley knepley at gmail.com
Sat Dec 31 12:14:35 CST 2016


On Sat, Dec 31, 2016 at 12:10 PM, Eric Chamberland <
Eric.Chamberland at giref.ulaval.ca> wrote:

> Hi,
>
> ok I will test with 5.1.3 with the option you gave me
> (--download-superlu_dist-commit=v5.1.3).
>
> But from what you and Matthew said, I should have 5.1.3 with petsc-master,
> but last night's log shows me a library file name with 5.1.0:
>
> http://www.giref.ulaval.ca/~cmpgiref/petsc-master-debug/2016.12.31.02h00m01s_configure.log
>
> So I am a bit confused: Why did I get 5.1.0 last night? (I use the
> petsc-master tarball, is that the reason?)
>

We do not automatically upgrade the versions of dependent packages that
have already been downloaded. You have to delete them and reconfigure
if you want us to download the new version.
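
As a rough sketch of "delete them and reconfigure", assuming the downloaded
sources are cached under $PETSC_DIR/$PETSC_ARCH/externalpackages (check your
own tree, the layout and directory names may differ). The two helper
functions are hypothetical names used for illustration, not PETSc tools:

```shell
#!/bin/sh
# Sketch only: find (and optionally delete) cached SuperLU_DIST sources so
# that the next configure run has to download them again.
# Assumption: downloaded packages sit directly under
#   $PETSC_DIR/$PETSC_ARCH/externalpackages

# Print every cached SuperLU_DIST directory found directly under "$1".
list_cached_superlu() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -iname '*superlu_dist*' | sort
}

# Delete the cached copies under "$1" (hypothetical helper).
purge_cached_superlu() {
    list_cached_superlu "$1" | while IFS= read -r dir; do
        rm -rf -- "$dir"
    done
}

# Typical use, left commented out so nothing is deleted by accident:
#   list_cached_superlu  "$PETSC_DIR/$PETSC_ARCH/externalpackages"
#   purge_cached_superlu "$PETSC_DIR/$PETSC_ARCH/externalpackages"
#   cd "$PETSC_DIR" && ./configure <your usual options> --download-superlu_dist
```

Running the list function first shows what would be removed. After the purge,
configure with --download-superlu_dist fetches whatever version that PETSc
tree asks for by default.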

  Matt


> Thanks,
>
> Eric
>
>
> Le 2016-12-31 à 11:52, Satish Balay a écrit :
>
>> On Sat, 31 Dec 2016, Eric Chamberland wrote:
>>
>>> Hi,
>>>
>>> I am just starting to debug a bug that occurs only with SuperLU_DIST
>>> combined with MKL, on a 2-process validation test.
>>>
>>> (the same test works fine with MUMPS on 2 processes).
>>>
>>> I just noticed that the SuperLU_Dist version installed by PETSc configure
>>> script is 5.1.0 and the latest SuperLU_DIST is 5.1.3.
>>>
>> If you use petsc-master - it will install 5.1.3 by default.
>>
>>> Before going further, I just want to ask:
>>>
>>> Is there any specific reason to stick to 5.1.0?
>>>
>> We don't usually upgrade externalpackage versions in PETSc releases
>> [unless it's tested to work and fixes known bugs]. There could be API
>> changes - or build changes that can potentially conflict.
>>
>> From what I know - 5.1.3 should work with petsc-3.7 [it fixes a couple
>> of bugs].
>>
>> You might be able to do the following with petsc-3.7 [with git
>> externalpackage repos]
>>
>> --download-superlu_dist --download-superlu_dist-commit=v5.1.3
>>
>> Satish
>>
>>> Here is some more information:
>>>
>>> On process 2 I have this printed in stdout:
>>>
>>> Intel MKL ERROR: Parameter 6 was incorrect on entry to DTRSM .
>>>
>>> and in stderr:
>>>
>>> Test.ProblemeEFGen.opt: malloc.c:2369: sysmalloc: Assertion `(old_top ==
>>> (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
>>> (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long)
>>> (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk,
>>> fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) -
>>> 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask)
>>> == 0)' failed.
>>> [saruman:15771] *** Process received signal ***
>>>
>>> This is the 7th call to KSPSolve in the same execution. Here is the last
>>> KSPView:
>>>
>>> KSP Object:(o_slin) 2 MPI processes
>>>    type: preonly
>>>    maximum iterations=10000, initial guess is zero
>>>    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>    left preconditioning
>>>    using NONE norm type for convergence test
>>> PC Object:(o_slin) 2 MPI processes
>>>    type: lu
>>>      LU: out-of-place factorization
>>>      tolerance for zero pivot 2.22045e-14
>>>      matrix ordering: natural
>>>      factor fill ratio given 0., needed 0.
>>>        Factored matrix follows:
>>>          Mat Object:         2 MPI processes
>>>            type: mpiaij
>>>            rows=382, cols=382
>>>            package used to perform factorization: superlu_dist
>>>            total: nonzeros=0, allocated nonzeros=0
>>>            total number of mallocs used during MatSetValues calls =0
>>>              SuperLU_DIST run parameters:
>>>                Process grid nprow 2 x npcol 1
>>>                Equilibrate matrix TRUE
>>>                Matrix input mode 1
>>>                Replace tiny pivots FALSE
>>>                Use iterative refinement FALSE
>>>                Processors in row 2 col partition 1
>>>                Row permutation LargeDiag
>>>                Column permutation METIS_AT_PLUS_A
>>>                Parallel symbolic factorization FALSE
>>>                Repeated factorization SamePattern
>>>    linear system matrix = precond matrix:
>>>    Mat Object:  (o_slin)   2 MPI processes
>>>      type: mpiaij
>>>      rows=382, cols=382
>>>      total: nonzeros=4458, allocated nonzeros=4458
>>>      total number of mallocs used during MatSetValues calls =0
>>>        using I-node (on process 0) routines: found 109 nodes, limit used is 5
>>>
>>> I know this information is not enough to help debug, but before trying
>>> to debug anything I would like to know whether the PETSc developers
>>> will upgrade to 5.1.3.
>>>
>>> Thanks,
>>> Eric
>>>
>>>
>>>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener