[petsc-users] The multiplication of the transpose of a dense matrix(A^T) and a large sparse matrix(X1)

Joon hee Choi choi240 at purdue.edu
Tue Jun 4 13:15:04 CDT 2013


Thank you for your fast reply.
I got the out-of-memory error from MatTranspose(X1, MAT_INITIAL_MATRIX, &tempX1).
Also, creating X1 as X1^T may take a long time, because the nnz array cannot hold 1200Tril elements in memory.
Is there a fast way to create X1^T?

Thank you,
Joon


----- Original Message -----
From: "Jed Brown" <jedbrown at mcs.anl.gov>
To: "Joon hee Choi" <choi240 at purdue.edu>, petsc-users at mcs.anl.gov
Sent: Tuesday, June 4, 2013 7:55:31 AM
Subject: Re: [petsc-users] The multiplication of the transpose of a dense matrix(A^T) and a large sparse matrix(X1)

Joon hee Choi <choi240 at purdue.edu> writes:

> Hello,
>
> I am trying to multiply the transpose of a dense matrix(A) and a large sparse matrix(X1). That is, A^T x X1.
>
> X1 is a 26Mil x 1200Tril, 144Mil non-zeros sparse matrix and A is a 26Mil x 10 dense matrix.
>
> I know that sparse x dense is faster than dense x sparse when using MatMatMult.  Thus, I tried to implement the following code:
>
>         ierr = MatTranspose(X1, MAT_INITIAL_MATRIX, &tempX1); CHKERRQ(ierr);
>         ierr = MatMatMult(tempX1, A, MAT_INITIAL_MATRIX, 1.0, &MT);
>         ierr = MatDestroy(&tempX1); CHKERRQ(ierr);
>         ierr = MatTranspose(MT, MAT_INITIAL_MATRIX, &M); CHKERRQ(ierr);
>         ierr = MatDestroy(&MT); CHKERRQ(ierr);
>
> However, I got the "out-of-memory" error when implementing
> MatTranspose(). 

Which MatTranspose?

> I think this is because the number of columns of X1 is much larger
> than that of rows of X1.  If there is a fast way to calculate M = A^T
> x X1,

Hong, do you have time to implement MatTransposeMatMult_MPIAIJ_MPIDense?

Can you create X1 as X1^T instead?

If you want to keep storing X1 as you do now, you can either store A as
ten vectors and use MatMultTranspose or you can pack it into one vector and use

  MatCreateMAIJ(X1,10,&X1m);
  MatMultTranspose(X1m,Apacked,Bpacked);

This is actually a better ordering for memory bandwidth.  The MAIJ
matrix does not need extra storage.
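
To make the packed layout concrete, here is a minimal sketch in plain C (no
PETSc calls) of what the transposed multiply computes. It assumes the packed
vector is interlaced, so that entry (i, j) of the dense matrix sits at index
i*k + j, which is my understanding of the layout MatCreateMAIJ(X1,10,...)
works with. The function name transpose_mult_packed and the CSR arrays
row_ptr, col_idx and val are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PETSc routine): computes B = X^T * A, where
 * X is an m-by-n sparse matrix in CSR form and A is an m-by-k dense matrix.
 * A and B are packed (interlaced): entry (i, j) is stored at index i*k + j.
 * This layout is an assumption about what the MAIJ approach operates on. */
void transpose_mult_packed(size_t m, size_t n, size_t k,
                           const size_t *row_ptr, const size_t *col_idx,
                           const double *val,
                           const double *a_packed, double *b_packed)
{
    assert(k > 0);

    /* B has n rows of k entries each; start from zero. */
    for (size_t p = 0; p < n * k; ++p) b_packed[p] = 0.0;

    /* Walk X row by row; each nonzero X(i, c) adds X(i, c) * A(i, :)
     * into B(c, :). The k values of a row are contiguous in memory. */
    for (size_t i = 0; i < m; ++i) {
        const double *a_row = a_packed + i * k;
        for (size_t p = row_ptr[i]; p < row_ptr[i + 1]; ++p) {
            double *b_row = b_packed + col_idx[p] * k;
            for (size_t j = 0; j < k; ++j) b_row[j] += val[p] * a_row[j];
        }
    }
}
```

Each nonzero of X is read once and applied to all k components at the same
time, which is one way to see why the packed ordering helps memory bandwidth
compared with k separate MatMultTranspose calls. The result B is X^T A, so
the desired M = A^T X is its transpose.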

