[petsc-dev] Slowness of PetscSortIntWithArrayPair in MatAssembly

John Peterson jwpeterson at gmail.com
Tue Jul 2 13:28:31 CDT 2019


Hi Junchao,

That's what I think is happening. We managed to get a stack trace from one
of the slow running cases, and it was over 100k frames deep in recursive
PetscSortIntWithArrayPair_Private calls (see below). So perhaps we are
seeing worse case O(n) stack frame depth for quicksort on an already-sorted
array, but I also am not sure where the big number, n=35241426, comes from,
as the problem did not have nearly that many DOFs.

#104609 PetscSortIntWithArrayPair_Private (L=0x7ff6a413e650,
J=0x7ff69bace650, K=0x7ff68277e650, right=33123131)
#104610 PetscSortIntWithArrayPair_Private (L=0x7ff6a413e650,
J=0x7ff69bace650, K=0x7ff68277e650, right=35241425)
#104611 PetscSortIntWithArrayPair (n=35241426, L=0x7ff6a413e650,
J=0x7ff69bace650, K=0x7ff68277e650)
#104612 MatStashSortCompress_Private (stash=0x555f7603b628,
insertmode=ADD_VALUES)
#104613 MatStashScatterBegin_BTS (mat=0x555f7603aee0, stash=0x555f7603b628,
owners=0x555f7623c9c0)
#104614 MatStashScatterBegin_Private (mat=0x555f7603aee0,
stash=0x555f7603b628, owners=0x555f7623c9c0)
#104615 MatAssemblyBegin_MPIAIJ (mat=0x555f7603aee0,
mode=MAT_FINAL_ASSEMBLY)

--
John


On Tue, Jul 2, 2019 at 1:23 PM Zhang, Junchao <jczhang at mcs.anl.gov> wrote:

> Is it because the array is already sorted?
> --Junchao Zhang
>
>
> On Tue, Jul 2, 2019 at 12:13 PM Fande Kong via petsc-dev <
> petsc-dev at mcs.anl.gov> wrote:
>
>> Hi Developers,
>>
>> John just noticed that the matrix assembly was slow when having
>> sufficient amount of off-diagonal entries. It was not a MPI issue since I
>> was  able to reproduce the issue using two cores on my desktop, that is,
>> "mpirun -n 2".
>>
>> I turned  on a profiling, and 99.99% of the time was spent
>> on PetscSortIntWithArrayPair (recursively calling).   It took THREE MINUTES
>>  to get the assembly done. And then changed to use the option
>> "-matstash_legacy" to restore
>> the code to the old assembly routine, and the same code took ONE SECOND
>> to get the matrix assembly done.
>>
>> Should write any better sorting algorithms?
>>
>>
>> Fande,
>>
>

-- 
John
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20190702/bbdb29bf/attachment-0001.html>


More information about the petsc-dev mailing list