[petsc-dev] Slowness of PetscSortIntWithArrayPair in MatAssembly

Fande Kong fdkong.jd at gmail.com
Wed Jul 3 18:47:19 CDT 2019


Process 3915 resuming
Process 3915 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x7ffee9b91fc8)
    frame #0: 0x000000010cbaa031 libpetsc.3.011.dylib`PetscSortIntWithArrayPair_Private(L=0x0000000119fc5480, J=0x000000011bfaa480, K=0x000000011ff74480, right=13291) at sorti.c:298
   295      }
   296      PetscFunctionReturn(0);
   297    }
-> 298    i    = MEDIAN(L,right);
   299    SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
   300    vl   = L[0];
   301    last = 0;
(lldb)
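
One plausible reading of the trace above (an assumption, not something the debugger output proves) is that the recursion got deep enough to run off the end of the stack. The sketch below is not the PETSc source: it is a small stand-alone quicksort with the same overall shape as the lines visible in the diff, using hypothetical names (median3, swap3, sort_pair, recursion_depth), an assumed loop body `if (L[i] < vl)`, and assumed sample positions for the pivot. It records the maximum recursion depth for consecutive integers and for an array of identical keys.

```c
#include <assert.h>
#include <stdlib.h>

/* Index of the median of v[a], v[b], v[c]. */
static long median3(const int *v, long a, long b, long c)
{
  if (v[a] < v[b]) {
    if (v[b] < v[c]) return b;
    return (v[a] < v[c]) ? c : a;
  }
  if (v[a] < v[c]) return a;
  return (v[b] < v[c]) ? c : b;
}

/* Swap position a with position b in the key array and both companions. */
static void swap3(int *L, int *J, int *K, long a, long b)
{
  int t;
  t = L[a]; L[a] = L[b]; L[b] = t;
  t = J[a]; J[a] = J[b]; J[b] = t;
  t = K[a]; K[a] = K[b]; K[b] = t;
}

/* Sort L[0..right] (inclusive) and permute J, K alongside it.
   depth/max_depth only instrument the recursion. */
static void sort_pair(int *L, int *J, int *K, long right, long depth, long *max_depth)
{
  long i, last;
  int  vl;

  if (depth > *max_depth) *max_depth = depth;
  if (right <= 0) return;

  /* Assumed sample positions; the real MEDIAN macro may differ. */
  i = median3(L, right / 4, right / 2, right / 4 * 3);
  swap3(L, J, K, 0, i);
  vl   = L[0];
  last = 0;
  for (i = 1; i <= right; i++) {
    /* Assumed loop body: only keys strictly below the pivot move left. */
    if (L[i] < vl) { last++; swap3(L, J, K, last, i); }
  }
  swap3(L, J, K, 0, last);
  sort_pair(L, J, K, last - 1, depth + 1, max_depth);
  sort_pair(L + last + 1, J + last + 1, K + last + 1, right - (last + 1), depth + 1, max_depth);
}

/* Sort n >= 1 keys (all equal, or 0..n-1) and return the deepest recursion level. */
long recursion_depth(long n, int all_equal)
{
  int *L = malloc((size_t)n * sizeof *L);
  int *J = malloc((size_t)n * sizeof *J);
  int *K = malloc((size_t)n * sizeof *K);
  long i, max_depth = 0;

  assert(L && J && K);
  for (i = 0; i < n; i++) {
    L[i] = all_equal ? 7 : (int)i;
    J[i] = (int)i;
    K[i] = (int)(2 * i);
  }
  sort_pair(L, J, K, n - 1, 0, &max_depth);
  for (i = 1; i < n; i++) assert(L[i - 1] <= L[i]);   /* keys are sorted */
  for (i = 0; i < n; i++) assert(K[i] == 2 * J[i]);   /* companions moved together */
  free(L); free(J); free(K);
  return max_depth;
}
```

Calling recursion_depth(n, 0) exercises the consecutive-integer case and recursion_depth(n, 1) the all-equal case. With this partition, a better pivot does not help when every key is equal: no element is strictly below the pivot, each call peels off a single entry, and the depth for n equal keys reaches n - 1, whereas consecutive integers stay at logarithmic depth. Whether the stash data in the failing run actually contains long runs of equal keys is an assumption that would need checking.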


On Wed, Jul 3, 2019 at 4:32 PM Zhang, Junchao <jczhang at mcs.anl.gov> wrote:

> Could you debug it or paste the stack trace? Since it is a segfault, it
> should be easy.
> --Junchao Zhang
>
>
> On Wed, Jul 3, 2019 at 5:16 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>
>> Thanks Junchao,
>>
>> But there is still a segmentation fault. I guess you could test your
>> changes on an array of consecutive integers.
>>
>>
>> Fande
>>
>> On Wed, Jul 3, 2019 at 12:57 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>> wrote:
>>
>>> Fande and John,
>>>   Could you try jczhang/feature-better-quicksort-pivot? It passed the
>>> Jenkins tests, and I cannot see why it failed on your side.
>>>   A hash table has its own cost. We had better get quicksort right and
>>> see how it performs before rewriting the code.
>>> --Junchao Zhang
>>>
>>>
>>> On Tue, Jul 2, 2019 at 2:37 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>>
>>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault:
>>>> 11 (signal 11)
>>>>
>>>> Segmentation fault :-)
>>>>
>>>>
>>>> As Jed said, it might be a good idea to rewrite the code using a
>>>> hash table.
>>>>
>>>>
>>>> Fande,
>>>>
>>>>
>>>> On Tue, Jul 2, 2019 at 1:27 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>> wrote:
>>>>
>>>>> Try this to see if it helps:
>>>>>
>>>>> diff --git a/src/sys/utils/sorti.c b/src/sys/utils/sorti.c
>>>>> index 1b07205a..90779891 100644
>>>>> --- a/src/sys/utils/sorti.c
>>>>> +++ b/src/sys/utils/sorti.c
>>>>> @@ -294,7 +294,8 @@ static PetscErrorCode PetscSortIntWithArrayPair_Private(PetscInt *L,PetscInt *J,
>>>>>      }
>>>>>      PetscFunctionReturn(0);
>>>>>    }
>>>>> -  SWAP3(L[0],L[right/2],J[0],J[right/2],K[0],K[right/2],tmp);
>>>>> +  i = MEDIAN(L,right);
>>>>> +  SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
>>>>>    vl   = L[0];
>>>>>    last = 0;
>>>>>    for (i=1; i<=right; i++) {
>>>>>
>>>>>
>>>>> On Tue, Jul 2, 2019 at 12:14 PM Fande Kong via petsc-dev <
>>>>> petsc-dev at mcs.anl.gov> wrote:
>>>>>
>>>>>> BTW,
>>>>>>
>>>>>> PetscSortIntWithArrayPair is used in MatStashSortCompress_Private.
>>>>>>
>>>>>> Is there any way to avoid using PetscSortIntWithArrayPair in
>>>>>> MatStashSortCompress_Private?
>>>>>>
>>>>>> Fande,
>>>>>>
>>>>>> On Tue, Jul 2, 2019 at 11:09 AM Fande Kong <fdkong.jd at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Developers,
>>>>>>>
>>>>>>> John just noticed that matrix assembly is slow when there is a
>>>>>>> sufficiently large number of off-diagonal entries. It is not an MPI
>>>>>>> issue, since I was able to reproduce it using two cores on my desktop,
>>>>>>> that is, "mpirun -n 2".
>>>>>>>
>>>>>>> I turned on profiling, and 99.99% of the time was spent in
>>>>>>> PetscSortIntWithArrayPair (which calls itself recursively). It took
>>>>>>> THREE MINUTES to get the assembly done. I then switched to the option
>>>>>>> "-matstash_legacy" to restore the old assembly routine, and the same
>>>>>>> code took ONE SECOND to get the matrix assembly done.
>>>>>>>
>>>>>>> Should we write a better sorting algorithm?
>>>>>>>
>>>>>>>
>>>>>>> Fande,
>>>>>>>
>>>>>>
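
For comparison, the hash-table idea raised in the thread could look roughly like the sketch below. It assumes the purpose of the sort is to bring duplicate (row, column) entries together so their values can be added, which the name MatStashSortCompress_Private suggests but this thread does not confirm. The names hash_pair and merge_duplicates are hypothetical, and nothing here uses the PETSc API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Mix a (row, col) pair into a 64-bit hash value. */
static uint64_t hash_pair(int row, int col)
{
  uint64_t k = ((uint64_t)(uint32_t)row << 32) | (uint32_t)col;
  k ^= k >> 33;
  k *= 0xff51afd7ed558ccdULL;
  k ^= k >> 33;
  return k;
}

/* Combine entries with identical (row, col) by summing their values.
   The output arrays must have room for n entries. Returns the number of
   distinct entries, or (size_t)-1 if the table cannot be allocated.
   Output is in first-occurrence order, not sorted order. */
size_t merge_duplicates(const int *row, const int *col, const double *val, size_t n,
                        int *orow, int *ocol, double *oval)
{
  size_t cap = 16, nout = 0, i;
  long  *slot;

  while (cap < 2 * n) cap *= 2;            /* power of two, load factor <= 1/2 */
  slot = malloc(cap * sizeof *slot);
  if (!slot) return (size_t)-1;
  for (i = 0; i < cap; i++) slot[i] = -1;  /* -1 marks an empty slot */

  for (i = 0; i < n; i++) {
    size_t h = (size_t)(hash_pair(row[i], col[i]) & (uint64_t)(cap - 1));
    /* Linear probing: stop at an empty slot or at the matching key. */
    while (slot[h] >= 0 && (orow[slot[h]] != row[i] || ocol[slot[h]] != col[i]))
      h = (h + 1) & (cap - 1);
    if (slot[h] < 0) {
      slot[h]    = (long)nout;
      orow[nout] = row[i];
      ocol[nout] = col[i];
      oval[nout] = val[i];
      nout++;
    } else {
      oval[slot[h]] += val[i];
    }
  }
  free(slot);
  return nout;
}
```

This does one pass over the entries with no recursion, at the price of the extra table and of losing the sorted order, so a caller that needs the entries ordered by row would still have to sort the (shorter) merged list afterwards.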