[petsc-dev] Slowness of PetscSortIntWithArrayPair in MatAssembly

Fande Kong fdkong.jd at gmail.com
Mon Jul 8 15:05:37 CDT 2019


Thanks Junchao,

I tried your code. I did not hit the segfault this time, but the assembly
is still slow:


time mpirun -n 2 ./matrix_sparsity-opt -matstash_legacy
Close matrix for np = 2 ...
Matrix successfully closed

real 0m2.009s
user 0m3.324s
sys 0m0.575s


time mpirun -n 2 ./matrix_sparsity-opt
Close matrix for np = 2 ...
Matrix successfully closed

real 3m39.235s
user 6m42.184s
sys 0m35.084s




Fande,




On Mon, Jul 8, 2019 at 8:47 AM Fande Kong <fdkong.jd at gmail.com> wrote:

> Will let you know soon.
>
> Thanks,
>
> Fande,
>
> On Mon, Jul 8, 2019 at 8:41 AM Zhang, Junchao <jczhang at mcs.anl.gov> wrote:
>
>> Fande or John,
>>   Could either of you give it a try? Thanks.
>> --Junchao Zhang
>>
>>
>> ---------- Forwarded message ---------
>> From: Junchao Zhang <jczhang at mcs.anl.gov>
>> Date: Thu, Jul 4, 2019 at 8:21 AM
>> Subject: Re: [petsc-dev] Slowness of PetscSortIntWithArrayPair in
>> MatAssembly
>> To: Fande Kong <fdkong.jd at gmail.com>
>>
>>
>> Fande,
>>   I wrote tests but could not reproduce the error. I pushed a commit that
>> changed the MEDIAN macro to a function to make it easier to debug.  Could
>> you run and debug it again? It should be easy to see what is wrong in gdb.
>>   Thanks.
>> --Junchao Zhang
>>
>>
>> On Wed, Jul 3, 2019 at 6:48 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>
>>> Process 3915 resuming
>>> Process 3915 stopped
>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>> EXC_BAD_ACCESS (code=2, address=0x7ffee9b91fc8)
>>>     frame #0: 0x000000010cbaa031
>>> libpetsc.3.011.dylib`PetscSortIntWithArrayPair_Private(L=0x0000000119fc5480,
>>> J=0x000000011bfaa480, K=0x000000011ff74480, right=13291) at sorti.c:298
>>>    295      }
>>>    296      PetscFunctionReturn(0);
>>>    297    }
>>> -> 298    i    = MEDIAN(L,right);
>>>    299    SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
>>>    300    vl   = L[0];
>>>    301    last = 0;
>>> (lldb)
>>>
>>>
>>> On Wed, Jul 3, 2019 at 4:32 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>> wrote:
>>>
>>>> Could you debug it or paste the stack trace? Since it is a segfault, it
>>>> should be easy.
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Wed, Jul 3, 2019 at 5:16 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>>>
>>>>> Thanks Junchao,
>>>>>
>>>>> But there is still a segmentation fault. I guess you could use
>>>>> some consecutive integers to test your changes.
>>>>>
>>>>>
>>>>> Fande
>>>>>
>>>>> On Wed, Jul 3, 2019 at 12:57 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>>> wrote:
>>>>>
>>>>>> Fande and John,
>>>>>>   Could you try jczhang/feature-better-quicksort-pivot? It passed
>>>>>> Jenkins tests and I could not imagine why it failed on yours.
>>>>>>   Hash table has its own cost. We'd better get quicksort right and
>>>>>> see how it performs before rewriting code.
>>>>>> --Junchao Zhang
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 2, 2019 at 2:37 PM Fande Kong <fdkong.jd at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation
>>>>>>> fault: 11 (signal 11)
>>>>>>>
>>>>>>> Segmentation fault :-)
>>>>>>>
>>>>>>>
>>>>>>> As Jed said, it might be a good idea to rewrite the code using a
>>>>>>> hash table.
>>>>>>>
>>>>>>>
>>>>>>> Fande,
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 2, 2019 at 1:27 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Try this to see if it helps:
>>>>>>>>
>>>>>>>> diff --git a/src/sys/utils/sorti.c b/src/sys/utils/sorti.c
>>>>>>>> index 1b07205a..90779891 100644
>>>>>>>> --- a/src/sys/utils/sorti.c
>>>>>>>> +++ b/src/sys/utils/sorti.c
>>>>>>>> @@ -294,7 +294,8 @@ static PetscErrorCode
>>>>>>>> PetscSortIntWithArrayPair_Private(PetscInt *L,PetscInt *J,
>>>>>>>>      }
>>>>>>>>      PetscFunctionReturn(0);
>>>>>>>>    }
>>>>>>>> -  SWAP3(L[0],L[right/2],J[0],J[right/2],K[0],K[right/2],tmp);
>>>>>>>> +  i = MEDIAN(L,right);
>>>>>>>> +  SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
>>>>>>>>    vl   = L[0];
>>>>>>>>    last = 0;
>>>>>>>>    for (i=1; i<=right; i++) {
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 2, 2019 at 12:14 PM Fande Kong via petsc-dev <
>>>>>>>> petsc-dev at mcs.anl.gov> wrote:
>>>>>>>>
>>>>>>>>> BTW,
>>>>>>>>>
>>>>>>>>> PetscSortIntWithArrayPair is used in MatStashSortCompress_Private.
>>>>>>>>>
>>>>>>>>> Is there any way to avoid using PetscSortIntWithArrayPair in
>>>>>>>>> MatStashSortCompress_Private?
>>>>>>>>>
>>>>>>>>> Fande,
>>>>>>>>>
>>>>>>>>> On Tue, Jul 2, 2019 at 11:09 AM Fande Kong <fdkong.jd at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Developers,
>>>>>>>>>>
>>>>>>>>>> John just noticed that matrix assembly was slow when there are
>>>>>>>>>> enough off-diagonal entries. It was not an MPI issue, since I
>>>>>>>>>> was able to reproduce it using two cores on my desktop, that is,
>>>>>>>>>> "mpirun -n 2".
>>>>>>>>>>
>>>>>>>>>> I turned on profiling, and 99.99% of the time was spent
>>>>>>>>>> in PetscSortIntWithArrayPair (calling itself recursively). It took THREE MINUTES
>>>>>>>>>> to get the assembly done. I then used the option
>>>>>>>>>> "-matstash_legacy" to restore
>>>>>>>>>> the old assembly routine, and the same code took ONE
>>>>>>>>>> SECOND to get the matrix assembly done.
>>>>>>>>>>
>>>>>>>>>> Should we write a better sorting algorithm?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Fande,
>>>>>>>>>>
>>>>>>>>>
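
For reference, the pivot change discussed in this thread (replacing the
fixed L[right/2] pivot with a median) can be sketched outside PETSc as
below. This is a minimal, standalone sketch under stated assumptions, not
the code in sorti.c or on the jczhang/feature-better-quicksort-pivot
branch: the names swap3, median3, sort3_range and sort_int_with_array_pair
are hypothetical, plain int stands in for PetscInt, and the three-way
partition and the loop-on-the-larger-half step are additions of this
sketch (they are not mentioned in the thread), intended to keep runs of
equal keys and already-sorted input from causing deep recursion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swap entries a and b in all three parallel arrays. */
static void swap3(int *L, int *J, int *K, ptrdiff_t a, ptrdiff_t b)
{
  int t;
  t = L[a]; L[a] = L[b]; L[b] = t;
  t = J[a]; J[a] = J[b]; J[b] = t;
  t = K[a]; K[a] = K[b]; K[b] = t;
}

/* Hypothetical helper: index of the median of L[lo], L[mid], L[hi]. */
static ptrdiff_t median3(const int *L, ptrdiff_t lo, ptrdiff_t mid, ptrdiff_t hi)
{
  if (L[lo] < L[mid]) {
    if (L[mid] < L[hi]) return mid;
    return (L[lo] < L[hi]) ? hi : lo;
  }
  if (L[lo] < L[hi]) return lo;
  return (L[mid] < L[hi]) ? hi : mid;
}

/* Sort L[lo..hi] ascending, applying the same permutation to J and K. */
static void sort3_range(int *L, int *J, int *K, ptrdiff_t lo, ptrdiff_t hi)
{
  while (lo < hi) {
    ptrdiff_t p     = median3(L, lo, lo + (hi - lo) / 2, hi);
    int       pivot = L[p];
    ptrdiff_t lt = lo, i = lo, gt = hi;

    /* Three-way partition. Invariant:
       L[lo..lt-1] < pivot, L[lt..i-1] == pivot, L[gt+1..hi] > pivot. */
    while (i <= gt) {
      if (L[i] < pivot) {
        swap3(L, J, K, lt, i);
        lt++; i++;
      } else if (L[i] > pivot) {
        swap3(L, J, K, i, gt);
        gt--;
      } else {
        i++;
      }
    }

    /* Recurse into the smaller side and loop on the larger one, so the
       recursion depth stays logarithmic in the range length. */
    if (lt - lo < hi - gt) {
      sort3_range(L, J, K, lo, lt - 1);
      lo = gt + 1;
    } else {
      sort3_range(L, J, K, gt + 1, hi);
      hi = lt - 1;
    }
  }
}

/* Hypothetical entry point; the name and argument order are assumptions. */
void sort_int_with_array_pair(ptrdiff_t n, int *L, int *J, int *K)
{
  assert(n >= 0);
  if (n > 1) sort3_range(L, J, K, 0, n - 1);
}
```

Calling sort_int_with_array_pair(n, L, J, K) sorts L in ascending order
and applies the same permutation to J and K. The sort is not stable, so
the relative order of entries with equal keys in L is not preserved.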