[petsc-users] The multiplication of the transpose of a dense matrix(A^T) and a large sparse matrix(X1)

Jed Brown jedbrown at mcs.anl.gov
Tue Jun 4 06:55:31 CDT 2013


Joon hee Choi <choi240 at purdue.edu> writes:

> Hello,
>
> I am trying to multiply the transpose of a dense matrix(A) and a large sparse matrix(X1). That is, A^T x X1.
>
> X1 is a 26 million x 1200 trillion sparse matrix with 144 million non-zeros, and A is a 26 million x 10 dense matrix.
>
> I know that sparse x dense is faster than dense x sparse when using MatMatMult.  Thus, I tried to implement the following code:
>
>         ierr = MatTranspose(X1, MAT_INITIAL_MATRIX, &tempX1); CHKERRQ(ierr);
>         ierr = MatMatMult(tempX1, A, MAT_INITIAL_MATRIX, 1.0, &MT); CHKERRQ(ierr);
>         ierr = MatDestroy(&tempX1); CHKERRQ(ierr);
>         ierr = MatTranspose(MT, MAT_INITIAL_MATRIX, &M); CHKERRQ(ierr);
>         ierr = MatDestroy(&MT); CHKERRQ(ierr);
>
> However, I got an "out-of-memory" error when calling
> MatTranspose().

Which MatTranspose?

> I think this is because the number of columns of X1 is much larger
> than that of rows of X1.  If there is a fast way to calculate M = A^T
> x X1,

Hong, do you have time to implement MatTransposeMatMult_MPIAIJ_MPIDense?

Can you create X1 as X1^T instead?

If you want to keep storing X1 as you do now, you can either store A as
ten vectors and use MatMultTranspose, or you can pack A into one vector and use

  MatCreateMAIJ(X1,10,&X1m);
  MatMultTranspose(X1m,Apacked,Bpacked);

This is actually a better ordering for memory bandwidth.  The MAIJ
matrix does not need extra storage.
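
To make the packed layout concrete, here is a minimal plain-C sketch of
the operation the MAIJ transpose multiply corresponds to.  It does not
call PETSc; the function name and the CSR arrays are hypothetical and
only for illustration.  It assumes the interleaved ordering
Apacked[i*dof + k] = A(i,k), so the result is
Bpacked[j*dof + k] = (A^T X1)(k,j).

```c
#include <assert.h>
#include <stddef.h>

/* Illustration only (no PETSc calls): apply X^T to dof interleaved vectors.
 * X is nrows x ncols in CSR form (rowptr, colidx, val).
 * Input:  xpacked[i*dof + k] = A(i,k),           length nrows*dof
 * Output: ypacked[j*dof + k] = (A^T X)(k,j),     length ncols*dof */
void csr_mult_transpose_interleaved(size_t nrows, size_t ncols, size_t dof,
                                    const size_t *rowptr, const size_t *colidx,
                                    const double *val,
                                    const double *xpacked, double *ypacked)
{
    assert(dof > 0);

    /* Start from a zero result. */
    for (size_t n = 0; n < ncols * dof; n++) ypacked[n] = 0.0;

    for (size_t i = 0; i < nrows; i++) {
        /* The dof entries of row i of A are contiguous in memory. */
        const double *xi = xpacked + i * dof;
        for (size_t p = rowptr[i]; p < rowptr[i + 1]; p++) {
            /* Each stored nonzero X(i,j) is read once and reused for all dof. */
            double *yj = ypacked + colidx[p] * dof;
            const double v = val[p];
            for (size_t k = 0; k < dof; k++) yj[k] += v * xi[k];
        }
    }
}
```

With ten separate vectors the sparse matrix entries would be streamed
from memory ten times; in the interleaved form each nonzero is loaded
once and applied to all ten components, which is the likely reason the
packed ordering is kinder to memory bandwidth.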

