[petsc-dev] Slowness of PetscSortIntWithArrayPair in MatAssembly

Fande Kong fdkong.jd at gmail.com
Mon Jul 8 16:00:07 CDT 2019


I guess John has a pure PETSc example.

John, could you share the pure PETSc example with Junchao?


Fande,

On Mon, Jul 8, 2019 at 2:58 PM Fande Kong <fdkong.jd at gmail.com> wrote:

> Yes, here it is https://github.com/fdkong/matrixsparsity
>
>
> You need to follow the instructions here to install MOOSE:
> https://www.mooseframework.org/getting_started/installation/mac_os.html
>
>
> Thanks for your help.
>
>
> Fande
>
>
>
> On Mon, Jul 8, 2019 at 2:28 PM Zhang, Junchao <jczhang at mcs.anl.gov> wrote:
>
>> Is the code public for me to test?
>> --Junchao Zhang
>>
>>
>> On Mon, Jul 8, 2019 at 3:06 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>
>>> Thanks Junchao,
>>>
>>> I tried your code. I did not hit the segfault this time, but the assembly
>>> was still slow:
>>>
>>>
>>> time mpirun -n 2 ./matrix_sparsity-opt -matstash_legacy
>>> Close matrix for np = 2 ...
>>> Matrix successfully closed
>>>
>>> real 0m2.009s
>>> user 0m3.324s
>>> sys 0m0.575s
>>>
>>>
>>> time mpirun -n 2 ./matrix_sparsity-opt
>>> Close matrix for np = 2 ...
>>> Matrix successfully closed
>>>
>>> real 3m39.235s
>>> user 6m42.184s
>>> sys 0m35.084s
>>>
>>>
>>>
>>>
>>> Fande,
>>>
>>>
>>>
>>>
>>> On Mon, Jul 8, 2019 at 8:47 AM Fande Kong <fdkong.jd at gmail.com> wrote:
>>>
>>>> Will let you know soon.
>>>>
>>>> Thanks,
>>>>
>>>> Fande,
>>>>
>>>> On Mon, Jul 8, 2019 at 8:41 AM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>> wrote:
>>>>
>>>>> Fande or John,
>>>>>   Could either of you give it a try? Thanks.
>>>>> --Junchao Zhang
>>>>>
>>>>>
>>>>> ---------- Forwarded message ---------
>>>>> From: Junchao Zhang <jczhang at mcs.anl.gov>
>>>>> Date: Thu, Jul 4, 2019 at 8:21 AM
>>>>> Subject: Re: [petsc-dev] Slowness of PetscSortIntWithArrayPair in
>>>>> MatAssembly
>>>>> To: Fande Kong <fdkong.jd at gmail.com>
>>>>>
>>>>>
>>>>> Fande,
>>>>>   I wrote tests but could not reproduce the error. I pushed a commit
>>>>> that changed the MEDIAN macro to a function to make it easier to debug.
>>>>> Could you run and debug it again? It should be easy to see what is wrong in
>>>>> gdb.
>>>>>   Thanks.
>>>>> --Junchao Zhang
>>>>>
>>>>>
>>>>> On Wed, Jul 3, 2019 at 6:48 PM Fande Kong <fdkong.jd at gmail.com> wrote:
>>>>>
>>>>>> Process 3915 resuming
>>>>>> Process 3915 stopped
>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason =
>>>>>> EXC_BAD_ACCESS (code=2, address=0x7ffee9b91fc8)
>>>>>>     frame #0: 0x000000010cbaa031
>>>>>> libpetsc.3.011.dylib`PetscSortIntWithArrayPair_Private(L=0x0000000119fc5480,
>>>>>> J=0x000000011bfaa480, K=0x000000011ff74480, right=13291) at sorti.c:298
>>>>>>    295      }
>>>>>>    296      PetscFunctionReturn(0);
>>>>>>    297    }
>>>>>> -> 298    i    = MEDIAN(L,right);
>>>>>>    299    SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
>>>>>>    300    vl   = L[0];
>>>>>>    301    last = 0;
>>>>>> (lldb)
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 3, 2019 at 4:32 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>>>> wrote:
>>>>>>
>>>>>>> Could you debug it or paste the stack trace? Since it is a segfault,
>>>>>>> it should be easy.
>>>>>>> --Junchao Zhang
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jul 3, 2019 at 5:16 PM Fande Kong <fdkong.jd at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Junchao,
>>>>>>>>
>>>>>>>> But there is still a segmentation fault. I guess you could test
>>>>>>>> your changes on an array of consecutive integers.
>>>>>>>>
>>>>>>>>
>>>>>>>> Fande
>>>>>>>>
>>>>>>>> On Wed, Jul 3, 2019 at 12:57 PM Zhang, Junchao <jczhang at mcs.anl.gov>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Fande and John,
>>>>>>>>>   Could you try the branch jczhang/feature-better-quicksort-pivot? It
>>>>>>>>> passed the Jenkins tests, and I cannot see why it failed on your side.
>>>>>>>>>   A hash table has its own cost. We had better get quicksort right and
>>>>>>>>> see how it performs before rewriting the code.
>>>>>>>>> --Junchao Zhang
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 2, 2019 at 2:37 PM Fande Kong <fdkong.jd at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation
>>>>>>>>>> fault: 11 (signal 11)
>>>>>>>>>>
>>>>>>>>>> Segmentation fault :-)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> As Jed said, it might be a good idea to rewrite the code using
>>>>>>>>>> a hash table.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Fande,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 2, 2019 at 1:27 PM Zhang, Junchao <
>>>>>>>>>> jczhang at mcs.anl.gov> wrote:
>>>>>>>>>>
>>>>>>>>>>> Try this to see if it helps:
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/src/sys/utils/sorti.c b/src/sys/utils/sorti.c
>>>>>>>>>>> index 1b07205a..90779891 100644
>>>>>>>>>>> --- a/src/sys/utils/sorti.c
>>>>>>>>>>> +++ b/src/sys/utils/sorti.c
>>>>>>>>>>> @@ -294,7 +294,8 @@ static PetscErrorCode
>>>>>>>>>>> PetscSortIntWithArrayPair_Private(PetscInt *L,PetscInt *J,
>>>>>>>>>>>      }
>>>>>>>>>>>      PetscFunctionReturn(0);
>>>>>>>>>>>    }
>>>>>>>>>>> -  SWAP3(L[0],L[right/2],J[0],J[right/2],K[0],K[right/2],tmp);
>>>>>>>>>>> +  i = MEDIAN(L,right);
>>>>>>>>>>> +  SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
>>>>>>>>>>>    vl   = L[0];
>>>>>>>>>>>    last = 0;
>>>>>>>>>>>    for (i=1; i<=right; i++) {
>>>>>>>>>>>
>>>>>>>>>>>
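The idea behind the diff above can be sketched without PETSc. This is a minimal illustration only: the names swap3, median3_index, and sort_pair_median3 are hypothetical, plain int stands in for PetscInt, and the pivot is assumed to be the median of the first, middle, and last keys. The actual MEDIAN and SWAP3 macros in sorti.c may be defined differently. The partition loop follows the shape visible in the diff (vl = L[0], last = 0, scan i = 1..right).

```c
#include <assert.h>
#include <stddef.h>

/* Swap entries i and j in all three arrays so J and K stay aligned with L. */
static void swap3(int *L, int *J, int *K, size_t i, size_t j)
{
  int t;
  t = L[i]; L[i] = L[j]; L[j] = t;
  t = J[i]; J[i] = J[j]; J[j] = t;
  t = K[i]; K[i] = K[j]; K[j] = t;
}

/* Hypothetical stand-in for MEDIAN(L,right): index of the median of
   L[0], L[right/2], and L[right]. */
static size_t median3_index(const int *L, size_t right)
{
  size_t a = 0, b = right / 2, c = right, t;
  if (L[a] > L[b]) { t = a; a = b; b = t; } /* now L[a] <= L[b] */
  if (L[b] > L[c]) { t = b; b = c; c = t; } /* now L[c] is the largest */
  if (L[a] > L[b]) { t = a; a = b; b = t; } /* now L[b] is the median */
  return b;
}

/* Sort L ascending and apply the same permutation to J and K. */
static void sort_pair_median3(int *L, int *J, int *K, size_t n)
{
  while (n > 1) {
    size_t right = n - 1, last = 0, i;
    int    vl;

    assert(L && J && K);
    swap3(L, J, K, 0, median3_index(L, right)); /* move pivot to front */
    vl = L[0];
    for (i = 1; i <= right; i++) {
      if (L[i] < vl) swap3(L, J, K, ++last, i); /* keys < pivot go left */
    }
    swap3(L, J, K, 0, last); /* pivot lands at its final position */

    /* Recurse into the smaller side and loop on the larger one, which
       keeps the recursion depth at O(log n). */
    if (last < right - last) {
      sort_pair_median3(L, J, K, last);
      L += last + 1; J += last + 1; K += last + 1;
      n  = right - last;
    } else {
      sort_pair_median3(L + last + 1, J + last + 1, K + last + 1, right - last);
      n = last;
    }
  }
}
```

In this sketch, keys equal to the pivot always end up in the right-hand partition, so an input with many repeated keys can still take quadratic time whichever pivot is chosen.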
>>>>>>>>>>> On Tue, Jul 2, 2019 at 12:14 PM Fande Kong via petsc-dev <
>>>>>>>>>>> petsc-dev at mcs.anl.gov> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> BTW,
>>>>>>>>>>>>
>>>>>>>>>>>> PetscSortIntWithArrayPair is used
>>>>>>>>>>>> in MatStashSortCompress_Private.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there any way to avoid using PetscSortIntWithArrayPair in
>>>>>>>>>>>> MatStashSortCompress_Private?
>>>>>>>>>>>>
>>>>>>>>>>>> Fande,
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jul 2, 2019 at 11:09 AM Fande Kong <fdkong.jd at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Developers,
>>>>>>>>>>>>>
>>>>>>>>>>>>> John just noticed that matrix assembly is slow when there is a
>>>>>>>>>>>>> sufficiently large number of off-diagonal entries. It is not an MPI
>>>>>>>>>>>>> issue, since I was able to reproduce it using two cores on my desktop,
>>>>>>>>>>>>> that is, "mpirun -n 2".
>>>>>>>>>>>>>
>>>>>>>>>>>>> I turned on profiling, and 99.99% of the time was spent in
>>>>>>>>>>>>> PetscSortIntWithArrayPair (called recursively). It took THREE MINUTES
>>>>>>>>>>>>> to get the assembly done. I then used the option "-matstash_legacy"
>>>>>>>>>>>>> to restore the old assembly routine, and the same code took
>>>>>>>>>>>>> ONE SECOND to get the matrix assembly done.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Should we write a better sorting algorithm?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fande,
>>>>>>>>>>>>>
>>>>>>>>>>>>
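As one possible answer to the question of a better sorting algorithm, here is a minimal sketch of a quicksort with a three-way partition over three parallel arrays. It is not PETSc code and not what the thread settled on: swap_triplet and sort3way are hypothetical names, plain int stands in for PetscInt, and the thread does not establish that repeated keys are the cause of the slowdown. The sketch only shows a partition scheme that groups keys equal to the pivot in the middle, so that repeated keys are never sorted again.

```c
#include <assert.h>
#include <stddef.h>

/* Swap entries i and j in all three arrays so J and K stay aligned with L. */
static void swap_triplet(int *L, int *J, int *K, size_t i, size_t j)
{
  int t;
  t = L[i]; L[i] = L[j]; L[j] = t;
  t = J[i]; J[i] = J[j]; J[j] = t;
  t = K[i]; K[i] = K[j]; K[j] = t;
}

/* Hypothetical routine: sort L ascending with a three-way partition,
   applying the same permutation to J and K. */
static void sort3way(int *L, int *J, int *K, size_t n)
{
  while (n > 1) {
    int    pivot = L[n / 2];
    /* Invariant: L[0..lt) < pivot, L[lt..i) == pivot, L[gt..n) > pivot. */
    size_t lt = 0, i = 0, gt = n;

    assert(L && J && K);
    while (i < gt) {
      if (L[i] < pivot)      swap_triplet(L, J, K, lt++, i++);
      else if (L[i] > pivot) swap_triplet(L, J, K, i, --gt);
      else                   i++;
    }

    /* Keys equal to the pivot are already in place. Recurse into the
       smaller remaining side and loop on the larger one, which keeps
       the recursion depth at O(log n). */
    size_t nleft = lt, nright = n - gt;
    if (nleft < nright) {
      sort3way(L, J, K, nleft);
      L += gt; J += gt; K += gt;
      n  = nright;
    } else {
      sort3way(L + gt, J + gt, K + gt, nright);
      n = nleft;
    }
  }
}
```

When every key is identical, a single pass places all entries in the middle group and the routine returns without recursing.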