[petsc-dev] Slowness of PetscSortIntWithArrayPair in MatAssembly

Fande Kong fdkong.jd at gmail.com
Mon Jul 8 15:58:59 CDT 2019


Yes, here it is https://github.com/fdkong/matrixsparsity


You need to follow the instructions here to install MOOSE:
https://www.mooseframework.org/getting_started/installation/mac_os.html


Thanks for your help.


Fande



On Mon, Jul 8, 2019 at 2:28 PM Zhang, Junchao <jczhang at mcs.anl.gov> wrote:

> Is the code public for me to test?
> --Junchao Zhang
>
>
> On Mon, Jul 8, 2019 at 3:06 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>
>> Thanks Junchao,
>>
>> I tried your code. I did not hit the segfault this time, but the assembly
>> was still slow:
>>
>>
>> time mpirun -n 2 ./matrix_sparsity-opt   -matstash_legacy
>> Close matrix for np = 2 ...
>> Matrix successfully closed
>>
>> real 0m2.009s
>> user 0m3.324s
>> sys 0m0.575s
>>
>>
>>
>>
>> time mpirun -n 2 ./matrix_sparsity-opt
>> Close matrix for np = 2 ...
>> Matrix successfully closed
>>
>> real 3m39.235s
>> user 6m42.184s
>> sys 0m35.084s
>>
>>
>>
>>
>> Fande,
>>
>>
>>
>>
>> On Mon, Jul 8, 2019 at 8:47 AM Fande Kong <fdkong.jd at gmail.com> wrote:
>>
>>> Will let you know soon.
>>>
>>> Thanks,
>>>
>>> Fande,
>>>
>>> On Mon, Jul 8, 2019 at 8:41 AM Zhang, Junchao <jczhang at mcs.anl.gov>
>>> wrote:
>>>
>>>> Fande or John,
>>>>   Could either of you give it a try? Thanks
>>>> --Junchao Zhang
>>>>
>>>>
>>>> ---------- Forwarded message ---------
>>>> From: Junchao Zhang <jczhang at mcs.anl.gov>
>>>> Date: Thu, Jul 4, 2019 at 8:21 AM
>>>> Subject: Re: [petsc-dev] Slowness of PetscSortIntWithArrayPair in
>>>> MatAssembly
>>>> To: Fande Kong <fdkong.jd at gmail.com>
>>>>
>>>>
>>>> Fande,
>>>>   I wrote tests but could not reproduce the error. I pushed a commit
>>>> that changed the MEDIAN macro to a function to make it easier to debug.
>>>> Could you run and debug it again? It should be easy to see what is wrong in
>>>> gdb.
>>>>   Thanks.
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Wed, Jul 3, 2019 at 6:48 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>>>
>>>>> Process 3915 resuming
>>>>> Process 3915 stopped
>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>>>> EXC_BAD_ACCESS (code=2, address=0x7ffee9b91fc8)
>>>>>     frame #0: 0x000000010cbaa031
>>>>> libpetsc.3.011.dylib`PetscSortIntWithArrayPair_Private(L=0x0000000119fc5480,
>>>>> J=0x000000011bfaa480, K=0x000000011ff74480, right=13291) at sorti.c:298
>>>>>    295      }
>>>>>    296      PetscFunctionReturn(0);
>>>>>    297    }
>>>>> -> 298    i    = MEDIAN(L,right);
>>>>>    299    SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
>>>>>    300    vl   = L[0];
>>>>>    301    last = 0;
>>>>> (lldb)
>>>>>
>>>>>
>>>>> On Wed, Jul 3, 2019 at 4:32 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>>> wrote:
>>>>>
>>>>>> Could you debug it or paste the stack trace? Since it is a segfault,
>>>>>> it should be easy.
>>>>>> --Junchao Zhang
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 3, 2019 at 5:16 PM Fande Kong <fdkong.jd at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Junchao,
>>>>>>>
>>>>>>> But there is still a segmentation fault. I guess you could test
>>>>>>> your changes with an array of consecutive integers.
>>>>>>>
>>>>>>>
>>>>>>> Fande
>>>>>>>
>>>>>>> On Wed, Jul 3, 2019 at 12:57 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Fande and John,
>>>>>>>>   Could you try jczhang/feature-better-quicksort-pivot? It passed
>>>>>>>> Jenkins tests and I could not imagine why it failed on yours.
>>>>>>>>   A hash table has its own cost. We'd better get quicksort right and
>>>>>>>> see how it performs before rewriting the code.
>>>>>>>> --Junchao Zhang
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 2, 2019 at 2:37 PM Fande Kong <fdkong.jd at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation
>>>>>>>>> fault: 11 (signal 11)
>>>>>>>>>
>>>>>>>>> Segmentation fault :-)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> As Jed said, it might be a good idea to rewrite the code using a
>>>>>>>>> hash table.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Fande,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 2, 2019 at 1:27 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Try this to see if it helps:
>>>>>>>>>>
>>>>>>>>>> diff --git a/src/sys/utils/sorti.c b/src/sys/utils/sorti.c
>>>>>>>>>> index 1b07205a..90779891 100644
>>>>>>>>>> --- a/src/sys/utils/sorti.c
>>>>>>>>>> +++ b/src/sys/utils/sorti.c
>>>>>>>>>> @@ -294,7 +294,8 @@ static PetscErrorCode
>>>>>>>>>> PetscSortIntWithArrayPair_Private(PetscInt *L,PetscInt *J,
>>>>>>>>>>      }
>>>>>>>>>>      PetscFunctionReturn(0);
>>>>>>>>>>    }
>>>>>>>>>> -  SWAP3(L[0],L[right/2],J[0],J[right/2],K[0],K[right/2],tmp);
>>>>>>>>>> +  i = MEDIAN(L,right);
>>>>>>>>>> +  SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
>>>>>>>>>>    vl   = L[0];
>>>>>>>>>>    last = 0;
>>>>>>>>>>    for (i=1; i<=right; i++) {
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 2, 2019 at 12:14 PM Fande Kong via petsc-dev <
>>>>>>>>>> petsc-dev at mcs.anl.gov> wrote:
>>>>>>>>>>
>>>>>>>>>>> BTW,
>>>>>>>>>>>
>>>>>>>>>>> PetscSortIntWithArrayPair is used
>>>>>>>>>>> in MatStashSortCompress_Private.
>>>>>>>>>>>
>>>>>>>>>>> Is there any way to avoid using PetscSortIntWithArrayPair in
>>>>>>>>>>> MatStashSortCompress_Private?
>>>>>>>>>>>
>>>>>>>>>>> Fande,
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 2, 2019 at 11:09 AM Fande Kong <fdkong.jd at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Developers,
>>>>>>>>>>>>
>>>>>>>>>>>> John just noticed that matrix assembly was slow when there was a
>>>>>>>>>>>> sufficient number of off-diagonal entries. It is not an MPI issue, since I
>>>>>>>>>>>> was able to reproduce it using two cores on my desktop, that is,
>>>>>>>>>>>> "mpirun -n 2".
>>>>>>>>>>>>
>>>>>>>>>>>> I turned on profiling, and 99.99% of the time was spent
>>>>>>>>>>>> in PetscSortIntWithArrayPair (which calls itself recursively). It took THREE MINUTES
>>>>>>>>>>>> to get the assembly done. I then used the option
>>>>>>>>>>>> "-matstash_legacy" to restore
>>>>>>>>>>>> the code to the old assembly routine, and the same code took
>>>>>>>>>>>> ONE SECOND to get the matrix assembly done.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we write a better sorting algorithm?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Fande,
>>>>>>>>>>>>
>>>>>>>>>>>
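
For readers following the pivot discussion, here is a minimal sketch of the
idea in the diff above: pick the pivot as a median of three keys instead of
always using L[right/2], and carry the two companion arrays along with every
swap. This is not the PETSc implementation. The names swap3, median3 and
sort_with_array_pair are hypothetical, plain int stands in for PetscInt, and
two techniques that the thread does not mention are swapped in for
illustration: a three-way partition (so runs of equal keys are not
re-partitioned) and recursion only into the smaller side (so the stack depth
stays logarithmic in the array length).

```c
/* Swap entries a and b in all three parallel arrays (hypothetical helper). */
static void swap3(int *L, int *J, int *K, int a, int b)
{
  int t;
  t = L[a]; L[a] = L[b]; L[b] = t;
  t = J[a]; J[a] = J[b]; J[b] = t;
  t = K[a]; K[a] = K[b]; K[b] = t;
}

/* Return the index holding the median of L[lo], L[mid], L[hi]. */
static int median3(const int *L, int lo, int hi)
{
  int mid = lo + (hi - lo) / 2;
  int a = L[lo], b = L[mid], c = L[hi];
  if (a < b) {
    if (b < c) return mid;
    return (a < c) ? hi : lo;
  } else {
    if (a < c) return lo;
    return (b < c) ? hi : mid;
  }
}

/* Sort L[lo..hi] in increasing order, permuting J and K the same way. */
static void sort_with_array_pair(int *L, int *J, int *K, int lo, int hi)
{
  while (lo < hi) {
    int pivot, lt, gt, i;

    /* Move the median-of-three key to the front and use it as the pivot. */
    swap3(L, J, K, lo, median3(L, lo, hi));
    pivot = L[lo];

    /* Three-way partition: L[lo..lt-1] < pivot, L[lt..gt] == pivot,
       L[gt+1..hi] > pivot once the loop finishes. */
    lt = lo; gt = hi; i = lo + 1;
    while (i <= gt) {
      if (L[i] < pivot)      { swap3(L, J, K, lt, i); lt++; i++; }
      else if (L[i] > pivot) { swap3(L, J, K, i, gt); gt--; }
      else                   { i++; }
    }

    /* Recurse into the smaller side, iterate on the larger one. */
    if (lt - lo < hi - gt) {
      sort_with_array_pair(L, J, K, lo, lt - 1);
      lo = gt + 1;
    } else {
      sort_with_array_pair(L, J, K, gt + 1, hi);
      hi = lt - 1;
    }
  }
}
```

Calling sort_with_array_pair(L, J, K, 0, n-1) sorts n keys in place. Whether
this would help the assembly case reported above is untested here; it only
illustrates why the choice of pivot and the recursion pattern matter for
already-ordered or heavily repeated keys.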