[petsc-dev] MatMatMult gives different results

Alexander Grayver agrayver at gfz-potsdam.de
Wed Feb 8 14:28:59 CST 2012


Hong,

Everything seems to be working well so far.
I would say there is nothing to acknowledge, but I don't mind...

Please let me know if there will be support for mpiaij_mpidense.

Thanks for this great support.

On 08.02.2012 18:43, Hong Zhang wrote:
> Alexander:
> Recently, we add error flags in petsc-dev 
> requiring MatXXXSetPreallocation() for many matrix creation routines.
> I've updated relevant routines for your code, and add your contributed 
> testing code as a petsc example with acknowledge:
> petsc-dev/src/mat/examples/tests/ex165.c
> Let me know if you do not want your name to be acknowledge in the example.
>
> I tested ex165 with your A and B in sequential (np=1) and parallel 
> (np=2, 6):
> mpiexec -n <np> ./ex165 -fA A.dat -fB B.dat -view_C
>
> Checking the size of output file C.dat, I see the sequential run and 
> parallel run give identical bit size.
> Please get the updated petsc-dev and test it. Let me know if you still 
> have problem.
>
> It is inefficient to compute C=A^T*B via MatTranspose() and MatMatMult().
> I'll see if I can write a C=A^T*B  for mpiaij_mpidense.
>
> Hong
>
>
>     In addition, I changed my code and form A^T explicitly (to avoid
>     previous error) and I got another one:
>
>
>     [1]PETSC ERROR: --------------------- Error Message
>     ------------------------------------
>     [1]PETSC ERROR: Object is in wrong state!
>     [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp()
>     on argument 1 "mat" before MatAssemblyBegin()!
>     [1]PETSC ERROR:
>     ------------------------------------------------------------------------
>     [1]PETSC ERROR: Petsc Development HG revision:
>     249597282bcb6a1051042a9fdfa5705679ed4f18  HG Date: Tue Feb 07
>     09:44:23 2012 -0600
>     [1]PETSC ERROR: See docs/changes/index.html for recent updates.
>     [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>     [1]PETSC ERROR: See docs/index.html for manual pages.
>     [1]PETSC ERROR:
>     ------------------------------------------------------------------------
>     [1]PETSC ERROR: solveTest on a openmpi-i named glic1 by agrayver
>     Wed Feb  8 13:06:45 2012
>     [1]PETSC ERROR: Libraries linked from
>     /home/lib/petsc-dev/openmpi-intel-complex-debug-f-mkl/lib
>     [1]PETSC ERROR: Configure run at Tue Feb  7 18:19:58 2012
>     [1]PETSC ERROR: Configure options
>     --with-petsc-arch=openmpi-intel-complex-debug-f-mkl
>     --with-fortran-interfaces=1 --download-superlu
>     --download-superlu_dist --download-mumps --download-parmetis
>     --download-ptscotch --download-metis
>     --with-scalapack-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_scalapack_lp64.a
>     --with-scalapack-include=/opt/intel/Compiler/11.1/072/mkl/include
>     --with-blacs-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_blacs_openmpi_lp64.a
>     --with-blacs-include=/opt/intel/Compiler/11.1/072/mkl/include
>     --with-mpi-dir=/opt/mpi/intel/openmpi-1.4.2
>     --with-scalar-type=complex
>     --with-blas-lapack-lib="[/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_core.a,/opt/intel/Compiler/11.1/072/lib/intel64/libiomp5.a]"
>     --with-precision=double --with-x=0
>     [1]PETSC ERROR:
>     ------------------------------------------------------------------------
>     [1]PETSC ERROR: MatAssemblyBegin() line 4795 in
>     /home/lib/petsc-dev/src/mat/interface/matrix.c
>     [1]PETSC ERROR: MatMatMultSymbolic_MPIAIJ_MPIDense() line 638 in
>     /home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
>     [1]PETSC ERROR: MatMatMult_MPIAIJ_MPIDense() line 594 in
>     /home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
>     [1]PETSC ERROR: MatMatMult() line 8618 in
>     /home/lib/petsc-dev/src/mat/interface/matrix.c
>
>
>     On 08.02.2012 17:45, Hong Zhang wrote:
>>     Alexander :
>>     I can repeat the crash, and am working on it.
>>     I'll let you know after the bug is fixed.
>>
>>     Thanks for your patience,
>>     Hong
>>
>>
>>         It seems now I know why I used MAT_REUSE_MATRIX and
>>         preallocated matrix.
>>         I removed MatCreateMPIAIJ() and use just MatTranspose with
>>         MAT_INITIAL_MATRIX.
>>         As a consequence I get hundreds of errors like:
>>
>>         [30]PETSC ERROR: MatSetValues_MPIAIJ() line 538 in
>>         /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>>         [30]PETSC ERROR: MatAssemblyEnd_MPIAIJ() line 653 in
>>         /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>>         [30]PETSC ERROR: MatAssemblyEnd() line 4978 in
>>         /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
>>         [30]PETSC ERROR: MatTranspose_MPIAIJ() line 2061 in
>>         /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>>         [30]PETSC ERROR: MatTranspose() line 4397 in
>>         /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
>>         [31]PETSC ERROR: --------------------- Error Message
>>         ------------------------------------
>>         [31]PETSC ERROR: Argument out of range!
>>         [31]PETSC ERROR: New nonzero at (1659,53337) caused a malloc!
>>
>>         Ans this is the lastest petsc-dev revision.
>>         I know there were some changes in petsc-dev concerning this
>>         issue.
>>         Can you give me a hint how to avoid this?
>>
>>         As for why I need C=A^T*B. This product is used further as a
>>         system matrix for LSQR solver. The largest dimension is on
>>         the order of 10^6
>>         Since I form A and B myself I can, of course, form A^T
>>         explicitly, but I thought I would first implement everything
>>         as it is written down on paper and then optimize once it
>>         works (I thought it's easier way).
>>
>>
>>         On 07.02.2012 20:44, Hong Zhang wrote:
>>>         Alexander,
>>>         I'm curious about why do you need parallel C=A^T*B?
>>>         How large your matrices are?
>>>
>>>         In petsc-dev, we have MatTransposeMatMult() for mpiaij and
>>>         mpiaij, but not mpiaij and mpidense.
>>>         We may add support of MatTransposeMatMult_MPIAIJ_MPIDense()
>>>         if there is such need.
>>>
>>>         Hong
>>>
>>>
>>>         On Tue, Feb 7, 2012 at 1:18 PM, agrayver at gfz-potsdam.de
>>>         <mailto:agrayver at gfz-potsdam.de> <agrayver at gfz-potsdam.de
>>>         <mailto:agrayver at gfz-potsdam.de>> wrote:
>>>
>>>             Hong,
>>>
>>>             Thanks for explanation. I will try this tomorrow. Good
>>>             to have this stuff in the help now.
>>>
>>>             And sorry for misleading you initially.
>>>
>>>             Regards,
>>>             Alexander
>>>
>>>
>>>             ----- Reply message -----
>>>             From: "Hong Zhang" <hzhang at mcs.anl.gov
>>>             <mailto:hzhang at mcs.anl.gov>>
>>>             To: "For users of the development version of PETSc"
>>>             <petsc-dev at mcs.anl.gov <mailto:petsc-dev at mcs.anl.gov>>
>>>             Subject: [petsc-dev] MatMatMult gives different results
>>>             Date: Tue, Feb 7, 2012 19:09
>>>
>>>
>>>             Alexander :
>>>
>>>
>>>                 There is something I didn't get yet, I hope you
>>>                 could clarify it.
>>>
>>>                 So, when I use flag MAT_INITIAL_MATRIX in test
>>>                 program it works fine.
>>>
>>>             Good to know :-)
>>>
>>>                 If I put this flag in my original program I get
>>>                 dozens of exceptions like:
>>>                 [42]PETSC ERROR: Argument out of range!
>>>                 [42]PETSC ERROR: New nonzero at (1336,153341) caused
>>>                 a malloc!
>>>
>>>             You cannot do
>>>             MatCreateMPIAIJ()
>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>>
>>>             MatCreateMPIAIJ() creates AT and preallocates
>>>             approximate nonzeros, which does not match exactly the
>>>             nonzeros in
>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT) creates matrix AT
>>>             and sets correct nonzero pattern and values in AT.
>>>             MatTranspose() only takes in "MAT_INITIAL_MATRIX" - for
>>>             a new AT,
>>>             and "MAT_REUSE_MATRIX" when AT is created with
>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT)
>>>             and reuse for updating its values (not nonzero patten).
>>>
>>>             I'm updating petsc help menu on MatTranspose(). Thanks
>>>             for the report.
>>>
>>>             Hong
>>>
>>>
>>>                 I changed this flag to MAT_REUSE_MATRIX and
>>>                 exceptions disappeared, but result is incorrect
>>>                 again (same as for MAT_IGNORE_MATRIX)
>>>                 I tried test program with MAT_REUSE_MATRIX and it
>>>                 also gives different matrix product.
>>>
>>>                 Since there is no description of MatReuse structure
>>>                 for MatTranspose it's a bit confusing what to expect
>>>                 from it.
>>>
>>>>>
>>>>>                     Do you mean 'Cm = A'*B;'?
>>>>>                     'Cm = A.'*B;' gives component-wise matrix
>>>>>                     product, not matrix product.
>>>>
>>>>                     .' operator means non-Hermitian transpose. That
>>>>                     is what I get with MatTranspose (in contrast
>>>>                     with MatHermitianTranspose)
>>>>                     component-wise matrix product would be .*
>>>>
>>>>                 You are correct.
>>>>
>>>>                 Hong
>>>>
>>>>
>>>>
>>>>>
>>>>>                     Hong
>>>>>
>>>>>                         C = PetscBinaryRead('C.dat','complex',true);
>>>>>
>>>>>                         Matrix C is different depending on number
>>>>>                         of cores I use.
>>>>>                         My PETSc is:
>>>>>                         Using Petsc Development HG revision:
>>>>>                         876c894d95f4fa6561d0a91310ca914592527960 
>>>>>                         HG Date: Tue Jan 10 19:27:14 2012 +0100
>>>>>
>>>>>
>>>>>                         On 06.02.2012 17:13, Hong Zhang wrote:
>>>>>>                         MatMatMult() in petsc is not well-tested
>>>>>>                         for complex - could be buggy.
>>>>>>                         Can you send us the matrices A and B in
>>>>>>                         petsc binary format for investigation?
>>>>>>
>>>>>>                         Hong
>>>>>>
>>>>>>                         On Mon, Feb 6, 2012 at 5:55 AM, Alexander
>>>>>>                         Grayver <agrayver at gfz-potsdam.de
>>>>>>                         <mailto:agrayver at gfz-potsdam.de>> wrote:
>>>>>>
>>>>>>                             Dear PETSc team,
>>>>>>
>>>>>>                             I try to use:
>>>>>>                             call
>>>>>>                             MatMatMult(A,B,MAT_INITIAL_MATRIX,PETSC_DEFAULT_DOUBLE_PRECISION,C,ierr);CHKERRQ(ierr)
>>>>>>
>>>>>>                             Where both A and B are rectangular,
>>>>>>                             but A is sparse and B is dense. Both
>>>>>>                             are double complex and distributed.
>>>>>>                             The product PETSc gives me contains
>>>>>>                             some errors in some part of the matrix.
>>>>>>                             I output A, B and C then computed
>>>>>>                             product in matlab.
>>>>>>
>>>>>>                             Attached you see figure plotted as:
>>>>>>                             imagesc(log10(abs(C-Cm)))
>>>>>>
>>>>>>                             Where Cm -- product computed in matlab.
>>>>>>
>>>>>>                             The pattern and amplitude vary
>>>>>>                             depending on the number of cores I
>>>>>>                             use. This picture is obtained for 48
>>>>>>                             cores (I've tried 12, 64 cores as well).
>>>>>>
>>>>>>                             Where should I look for possible
>>>>>>                             explanation?
>>>>>>
>>>>>>                             -- 
>>>>>>                             Regards,
>>>>>>                             Alexander
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>                         -- 
>>>>>                         Regards,
>>>>>                         Alexander
>>>>>
>>>>>
>>>>
>>>>
>>>>                     -- 
>>>>                     Regards,
>>>>                     Alexander
>>>>
>>>>
>>>
>>>
>>>                 -- 
>>>                 Regards,
>>>                 Alexander
>>>
>>>
>>>
>>
>>
>>         -- 
>>         Regards,
>>         Alexander
>>
>>
>
>
>     -- 
>     Regards,
>     Alexander
>
>


-- 
Regards,
Alexander

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20120208/cbdcfb6f/attachment.html>


More information about the petsc-dev mailing list