[petsc-users] Error with SuperLU_DIST (mkl related?)

Eric Chamberland Eric.Chamberland at giref.ulaval.ca
Sat Dec 31 12:35:51 CST 2016


I think there is definitely a problem.

After looking at the files installed either from the petsc-master tarball or 
from the manual configure I just did with 
--download-superlu_dist-commit=v5.1.3, the file include/superlu_defs.h 
has these values:

#define SUPERLU_DIST_MAJOR_VERSION     5
#define SUPERLU_DIST_MINOR_VERSION     1
#define SUPERLU_DIST_PATCH_VERSION     0

What's wrong?
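
One way to make the mismatch explicit is to compare the version macros in the
installed header against the tag that was requested. The following is only a
minimal Python sketch: header_version and tag_version are hypothetical helper
names, not part of PETSc or SuperLU_DIST, and SAMPLE just repeats the three
#define lines quoted above instead of reading the real include/superlu_defs.h.

```python
import re

# Assumption: these are the three lines quoted above from include/superlu_defs.h.
SAMPLE = """\
#define SUPERLU_DIST_MAJOR_VERSION     5
#define SUPERLU_DIST_MINOR_VERSION     1
#define SUPERLU_DIST_PATCH_VERSION     0
"""


def header_version(header_text):
    """Return (major, minor, patch) parsed from the SUPERLU_DIST_*_VERSION macros."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCH"):
        pattern = r"^\s*#\s*define\s+SUPERLU_DIST_" + name + r"_VERSION\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError("macro SUPERLU_DIST_%s_VERSION not found" % name)
        parts.append(int(match.group(1)))
    return tuple(parts)


def tag_version(tag):
    """Turn a git tag such as 'v5.1.3' into a tuple of integers."""
    return tuple(int(piece) for piece in tag.lstrip("v").split("."))


if __name__ == "__main__":
    installed = header_version(SAMPLE)
    requested = tag_version("v5.1.3")
    print("header says:", installed)
    print("tag says:   ", requested)
    print("match:      ", installed == requested)
```

To run it against an actual install, the header text could be read from the
file instead of SAMPLE; the path depends on the configure prefix.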

Eric


On 2016-12-31 at 13:26, Eric Chamberland wrote:
> Ah ok, I see!  Look at the file name in the configure.log:
>
> Install the project...
> /usr/bin/cmake -P cmake_install.cmake
> -- Install configuration: "DEBUG"
> -- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5.1.0
> -- Installing: /opt/petsc-master_debug/lib/libsuperlu_dist.so.5
>
> It is saying 5.1.0, but in fact you are right: it is 5.1.3 that is 
> downloaded!!! :)
>
> And FWIW, the nightly automatic compilation of PETSc starts within a 
> brand new and empty directory each night...
>
> Thanks to both of you again! :)
>
> Eric
>
>
> On 2016-12-31 at 13:17, Satish Balay wrote:
>> ===============================================================================
>>                            Trying to download 
>> git://https://github.com/xiaoyeli/superlu_dist for SUPERLU_DIST
>> ===============================================================================
>>                      Executing: git clone 
>> https://github.com/xiaoyeli/superlu_dist 
>> /pmi/cmpbib/compilation_BIB_gcc_redhat_petsc-master_debug/COMPILE_AUTO/petsc-master-debug/arch-linux2-c-debug/externalpackages/git.superlu_dist
>> stdout: Cloning into 
>> '/pmi/cmpbib/compilation_BIB_gcc_redhat_petsc-master_debug/COMPILE_AUTO/petsc-master-debug/arch-linux2-c-debug/externalpackages/git.superlu_dist'...
>>                      Looking for SUPERLU_DIST at git.superlu_dist, 
>> hg.superlu_dist or a directory starting with ['superlu_dist']
>>                      Found a copy of SUPERLU_DIST in git.superlu_dist
>> Executing: ['git', 'rev-parse', '--git-dir']
>> stdout: .git
>> Executing: ['git', 'cat-file', '-e', 'v5.1.3^{commit}']
>> Executing: ['git', 'rev-parse', 'v5.1.3']
>> stdout: 7306f704c6c8d5113def649b76def3c8eb607690
>> Executing: ['git', 'stash']
>> stdout: No local changes to save
>> Executing: ['git', 'clean', '-f', '-d', '-x']
>> Executing: ['git', 'checkout', '-f', 
>> '7306f704c6c8d5113def649b76def3c8eb607690']
>> <<<<<<<<
>>
>> Per log below - it's using 5.1.3. Why did you think you got 5.1.0?
>>
>> Satish
>>
>> On Sat, 31 Dec 2016, Eric Chamberland wrote:
>>
>>> Hi,
>>>
>>> ok I will test with 5.1.3 with the option you gave me
>>> (--download-superlu_dist-commit=v5.1.3).
>>>
>>> But from what you and Matthew said, I should have 5.1.3 with 
>>> petsc-master, but
>>> last night's log shows a library file name with 5.1.0:
>>>
>>> http://www.giref.ulaval.ca/~cmpgiref/petsc-master-debug/2016.12.31.02h00m01s_configure.log 
>>>
>>>
>>> So I am a bit confused: Why did I get 5.1.0 last night? (I use the
>>> petsc-master tarball; is that the reason?)
>>>
>>> Thanks,
>>>
>>> Eric
>>>
>>>
>>> On 2016-12-31 at 11:52, Satish Balay wrote:
>>>> On Sat, 31 Dec 2016, Eric Chamberland wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am just starting to debug a bug encountered with, and only with,
>>>>> SuperLU_DIST combined with MKL on a 2-process validation test.
>>>>>
>>>>> (the same test works fine with MUMPS on 2 processes).
>>>>>
>>>>> I just noticed that the SuperLU_Dist version installed by PETSc 
>>>>> configure
>>>>> script is 5.1.0 and the latest SuperLU_DIST is 5.1.3.
>>>> If you use petsc-master - it will install 5.1.3 by default.
>>>>> Before going further, I just want to ask:
>>>>>
>>>>> Is there any specific reason to stick to 5.1.0?
>>>> We don't usually upgrade externalpackage versions in PETSc releases
>>>> [unless it's tested to work and fixes known bugs]. There could be API
>>>> changes - or build changes that can potentially conflict.
>>>>
>>>> From what I know - 5.1.3 should work with petsc-3.7 [it fixes a couple of
>>>> bugs].
>>>>
>>>> You might be able to do the following with petsc-3.7 [with git
>>>> externalpackage repos]
>>>>
>>>> --download-superlu_dist --download-superlu_dist-commit=v5.1.3
>>>>
>>>> Satish
>>>>
>>>>> Here is some more information:
>>>>>
>>>>> On process 2 I have this printed in stdout:
>>>>>
>>>>> Intel MKL ERROR: Parameter 6 was incorrect on entry to DTRSM .
>>>>>
>>>>> and in stderr:
>>>>>
>>>>> Test.ProblemeEFGen.opt: malloc.c:2369: sysmalloc: Assertion 
>>>>> `(old_top ==
>>>>> (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - 
>>>>> __builtin_offsetof
>>>>> (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long)
>>>>> (old_size)
>>>>> >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk,
>>>>> fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 
>>>>> *(sizeof(size_t))) -
>>>>> 1))) &&
>>>>> ((old_top)->size & 0x1) && ((unsigned long) old_end & pagemask) == 
>>>>> 0)'
>>>>> failed.
>>>>> [saruman:15771] *** Process received signal ***
>>>>>
>>>>> This is the 7th call to KSPSolve in the same execution. Here is 
>>>>> the last
>>>>> KSPView:
>>>>>
>>>>> KSP Object:(o_slin) 2 MPI processes
>>>>>     type: preonly
>>>>>     maximum iterations=10000, initial guess is zero
>>>>>     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>     left preconditioning
>>>>>     using NONE norm type for convergence test
>>>>> PC Object:(o_slin) 2 MPI processes
>>>>>     type: lu
>>>>>       LU: out-of-place factorization
>>>>>       tolerance for zero pivot 2.22045e-14
>>>>>       matrix ordering: natural
>>>>>       factor fill ratio given 0., needed 0.
>>>>>         Factored matrix follows:
>>>>>           Mat Object:         2 MPI processes
>>>>>             type: mpiaij
>>>>>             rows=382, cols=382
>>>>>             package used to perform factorization: superlu_dist
>>>>>             total: nonzeros=0, allocated nonzeros=0
>>>>>             total number of mallocs used during MatSetValues calls =0
>>>>>               SuperLU_DIST run parameters:
>>>>>                 Process grid nprow 2 x npcol 1
>>>>>                 Equilibrate matrix TRUE
>>>>>                 Matrix input mode 1
>>>>>                 Replace tiny pivots FALSE
>>>>>                 Use iterative refinement FALSE
>>>>>                 Processors in row 2 col partition 1
>>>>>                 Row permutation LargeDiag
>>>>>                 Column permutation METIS_AT_PLUS_A
>>>>>                 Parallel symbolic factorization FALSE
>>>>>                 Repeated factorization SamePattern
>>>>>     linear system matrix = precond matrix:
>>>>>     Mat Object:  (o_slin)   2 MPI processes
>>>>>       type: mpiaij
>>>>>       rows=382, cols=382
>>>>>       total: nonzeros=4458, allocated nonzeros=4458
>>>>>       total number of mallocs used during MatSetValues calls =0
>>>>>         using I-node (on process 0) routines: found 109 nodes, 
>>>>> limit used
>>>>>         is 5
>>>>>
>>>>> I know this information is not enough to help debug, but I would like to
>>>>> know whether the PETSc guys plan to upgrade to 5.1.3 before I try to
>>>>> debug anything.
>>>>>
>>>>> Thanks,
>>>>> Eric
>>>>>
>>>>>
>>>


