[petsc-users] matsetvalueslocal for aijcusparse matrix
Smith, Barry F.
bsmith at mcs.anl.gov
Tue Oct 22 19:26:09 CDT 2019
What a crap design PETSc has for GPUs. Sorry about this.
Move
> 7 MatSetOptionsPrefix(A,"test_")
> 8 MatSetFromOptions
to immediately after MatSetType(A,MATAIJ), so that both come before the preallocation calls.
What is happening is that the original preallocation information you provided is lost when the matrix type is changed in MatSetFromOptions(). Without it, inserting values has to allocate memory as it goes, which is why step 10 is so slow. This comes from the ancient decision to have GPU vectors and matrices be entirely new subclasses instead of just providing GPU backends to the standard classes. Hopefully we can eventually fix this.
Barry
> On Oct 22, 2019, at 3:32 PM, Xiangdong <epscodes at gmail.com> wrote:
>
> My Matrix setup workflow is like this:
>
> 1 MatCreate
> 2 MatSetSizes
> 3 MatSetType(A,MATAIJ)
> 4 MatMPIAIJSetPreallocation
> 5 MatSeqAIJSetPreallocation
> 6 MatSetLocalToGlobalMapping
> 7 MatSetOptionsPrefix(A,"test_")
> 8 MatSetFromOptions
> 9 MatSetUp
>
> 10 loop all the nonzero entries by calling MatSetValuesLocal(A,1,&i,1,&j, &val, ADD_VALUES);
>
> 11 MatAssemblyBegin
> 12 MatAssemblyEnd
>
> For the AIJ format, it works fine. -info gives "Number of mallocs during MatSetValues() is 0" and "Stash has 0 entries, uses 0 mallocs."
>
> If I run the same code with -test_mat_type aijcusparse, it takes forever to finish step 10. Does this step really involve moving data from host to device? Do I need to make more changes to use aijcusparse other than just changing mat_type from the command line?
>
> Thank you.
>
> Best,
> Xiangdong
>
>
> On Tue, Oct 22, 2019 at 1:53 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
> The aijcusparse format actually uses the same data structures and code for setting values as aij does, so the slowdown is not directly related to that format.
>
> Barry
>
>
> > On Oct 21, 2019, at 6:26 PM, Xiangdong via petsc-users <petsc-users at mcs.anl.gov> wrote:
> >
> > Hello everyone,
> >
> > When I use matsetvalueslocal to form the matrix in aijcusparse format, I found that it takes forever to finish the loops of matsetvalueslocal. I am setting one entry at a time, and there are about 28 million nonzeros. (It is fast if the matrix is aij instead of aijcusparse.)
> >
> > However, if I have the matrix ready in binary format and use matload to get it into aijcusparse format, it is fast.
> >
> > Is this an issue with matsetvalueslocal, or am I using matsetvalueslocal the wrong way? Any suggestions to speed up the process?
> >
> > Thank you.
> >
> > Xiangdong
>