[petsc-dev] MatMatMult gives different results
Alexander Grayver
agrayver at gfz-potsdam.de
Wed Feb 8 10:53:20 CST 2012
Hong,
Thank you.
In addition, I changed my code and form A^T explicitly (to avoid
previous error) and I got another one:
[1]PETSC ERROR: --------------------- Error Message
------------------------------------
[1]PETSC ERROR: Object is in wrong state!
[1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on
argument 1 "mat" before MatAssemblyBegin()!
[1]PETSC ERROR:
------------------------------------------------------------------------
[1]PETSC ERROR: Petsc Development HG revision:
249597282bcb6a1051042a9fdfa5705679ed4f18 HG Date: Tue Feb 07 09:44:23
2012 -0600
[1]PETSC ERROR: See docs/changes/index.html for recent updates.
[1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[1]PETSC ERROR: See docs/index.html for manual pages.
[1]PETSC ERROR:
------------------------------------------------------------------------
[1]PETSC ERROR: solveTest on a openmpi-i named glic1 by agrayver Wed
Feb 8 13:06:45 2012
[1]PETSC ERROR: Libraries linked from
/home/lib/petsc-dev/openmpi-intel-complex-debug-f-mkl/lib
[1]PETSC ERROR: Configure run at Tue Feb 7 18:19:58 2012
[1]PETSC ERROR: Configure options
--with-petsc-arch=openmpi-intel-complex-debug-f-mkl
--with-fortran-interfaces=1 --download-superlu --download-superlu_dist
--download-mumps --download-parmetis --download-ptscotch
--download-metis
--with-scalapack-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_scalapack_lp64.a
--with-scalapack-include=/opt/intel/Compiler/11.1/072/mkl/include
--with-blacs-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_blacs_openmpi_lp64.a
--with-blacs-include=/opt/intel/Compiler/11.1/072/mkl/include
--with-mpi-dir=/opt/mpi/intel/openmpi-1.4.2 --with-scalar-type=complex
--with-blas-lapack-lib="[/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_core.a,/opt/intel/Compiler/11.1/072/lib/intel64/libiomp5.a]"
--with-precision=double --with-x=0
[1]PETSC ERROR:
------------------------------------------------------------------------
[1]PETSC ERROR: MatAssemblyBegin() line 4795 in
/home/lib/petsc-dev/src/mat/interface/matrix.c
[1]PETSC ERROR: MatMatMultSymbolic_MPIAIJ_MPIDense() line 638 in
/home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
[1]PETSC ERROR: MatMatMult_MPIAIJ_MPIDense() line 594 in
/home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
[1]PETSC ERROR: MatMatMult() line 8618 in
/home/lib/petsc-dev/src/mat/interface/matrix.c
On 08.02.2012 17:45, Hong Zhang wrote:
> Alexander :
> I can repeat the crash, and am working on it.
> I'll let you know after the bug is fixed.
>
> Thanks for your patience,
> Hong
>
>
> It seems now I know why I used MAT_REUSE_MATRIX and preallocated
> matrix.
> I removed MatCreateMPIAIJ() and use just MatTranspose with
> MAT_INITIAL_MATRIX.
> As a consequence I get hundreds of errors like:
>
> [30]PETSC ERROR: MatSetValues_MPIAIJ() line 538 in
> /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
> [30]PETSC ERROR: MatAssemblyEnd_MPIAIJ() line 653 in
> /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
> [30]PETSC ERROR: MatAssemblyEnd() line 4978 in
> /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
> [30]PETSC ERROR: MatTranspose_MPIAIJ() line 2061 in
> /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
> [30]PETSC ERROR: MatTranspose() line 4397 in
> /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
> [31]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [31]PETSC ERROR: Argument out of range!
> [31]PETSC ERROR: New nonzero at (1659,53337) caused a malloc!
>
> Ans this is the lastest petsc-dev revision.
> I know there were some changes in petsc-dev concerning this issue.
> Can you give me a hint how to avoid this?
>
> As for why I need C=A^T*B. This product is used further as a
> system matrix for LSQR solver. The largest dimension is on the
> order of 10^6
> Since I form A and B myself I can, of course, form A^T explicitly,
> but I thought I would first implement everything as it is written
> down on paper and then optimize once it works (I thought it's
> easier way).
>
>
> On 07.02.2012 20:44, Hong Zhang wrote:
>> Alexander,
>> I'm curious about why do you need parallel C=A^T*B?
>> How large your matrices are?
>>
>> In petsc-dev, we have MatTransposeMatMult() for mpiaij and
>> mpiaij, but not mpiaij and mpidense.
>> We may add support of MatTransposeMatMult_MPIAIJ_MPIDense() if
>> there is such need.
>>
>> Hong
>>
>>
>> On Tue, Feb 7, 2012 at 1:18 PM, agrayver at gfz-potsdam.de
>> <mailto:agrayver at gfz-potsdam.de> <agrayver at gfz-potsdam.de
>> <mailto:agrayver at gfz-potsdam.de>> wrote:
>>
>> Hong,
>>
>> Thanks for explanation. I will try this tomorrow. Good to
>> have this stuff in the help now.
>>
>> And sorry for misleading you initially.
>>
>> Regards,
>> Alexander
>>
>>
>> ----- Reply message -----
>> From: "Hong Zhang" <hzhang at mcs.anl.gov
>> <mailto:hzhang at mcs.anl.gov>>
>> To: "For users of the development version of PETSc"
>> <petsc-dev at mcs.anl.gov <mailto:petsc-dev at mcs.anl.gov>>
>> Subject: [petsc-dev] MatMatMult gives different results
>> Date: Tue, Feb 7, 2012 19:09
>>
>>
>> Alexander :
>>
>>
>> There is something I didn't get yet, I hope you could
>> clarify it.
>>
>> So, when I use flag MAT_INITIAL_MATRIX in test program it
>> works fine.
>>
>> Good to know :-)
>>
>> If I put this flag in my original program I get dozens of
>> exceptions like:
>> [42]PETSC ERROR: Argument out of range!
>> [42]PETSC ERROR: New nonzero at (1336,153341) caused a
>> malloc!
>>
>> You cannot do
>> MatCreateMPIAIJ()
>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>
>> MatCreateMPIAIJ() creates AT and preallocates approximate
>> nonzeros, which does not match exactly the nonzeros in
>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT) creates matrix AT and
>> sets correct nonzero pattern and values in AT.
>> MatTranspose() only takes in "MAT_INITIAL_MATRIX" - for a new AT,
>> and "MAT_REUSE_MATRIX" when AT is created with
>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT)
>> and reuse for updating its values (not nonzero patten).
>>
>> I'm updating petsc help menu on MatTranspose(). Thanks for
>> the report.
>>
>> Hong
>>
>>
>> I changed this flag to MAT_REUSE_MATRIX and exceptions
>> disappeared, but result is incorrect again (same as for
>> MAT_IGNORE_MATRIX)
>> I tried test program with MAT_REUSE_MATRIX and it also
>> gives different matrix product.
>>
>> Since there is no description of MatReuse structure for
>> MatTranspose it's a bit confusing what to expect from it.
>>
>>>>
>>>> Do you mean 'Cm = A'*B;'?
>>>> 'Cm = A.'*B;' gives component-wise matrix product,
>>>> not matrix product.
>>>
>>> .' operator means non-Hermitian transpose. That is
>>> what I get with MatTranspose (in contrast with
>>> MatHermitianTranspose)
>>> component-wise matrix product would be .*
>>>
>>> You are correct.
>>>
>>> Hong
>>>
>>>
>>>
>>>>
>>>> Hong
>>>>
>>>> C = PetscBinaryRead('C.dat','complex',true);
>>>>
>>>> Matrix C is different depending on number of
>>>> cores I use.
>>>> My PETSc is:
>>>> Using Petsc Development HG revision:
>>>> 876c894d95f4fa6561d0a91310ca914592527960 HG
>>>> Date: Tue Jan 10 19:27:14 2012 +0100
>>>>
>>>>
>>>> On 06.02.2012 17:13, Hong Zhang wrote:
>>>>> MatMatMult() in petsc is not well-tested for
>>>>> complex - could be buggy.
>>>>> Can you send us the matrices A and B in petsc
>>>>> binary format for investigation?
>>>>>
>>>>> Hong
>>>>>
>>>>> On Mon, Feb 6, 2012 at 5:55 AM, Alexander
>>>>> Grayver <agrayver at gfz-potsdam.de
>>>>> <mailto:agrayver at gfz-potsdam.de>> wrote:
>>>>>
>>>>> Dear PETSc team,
>>>>>
>>>>> I try to use:
>>>>> call
>>>>> MatMatMult(A,B,MAT_INITIAL_MATRIX,PETSC_DEFAULT_DOUBLE_PRECISION,C,ierr);CHKERRQ(ierr)
>>>>>
>>>>> Where both A and B are rectangular, but A
>>>>> is sparse and B is dense. Both are double
>>>>> complex and distributed.
>>>>> The product PETSc gives me contains some
>>>>> errors in some part of the matrix.
>>>>> I output A, B and C then computed product
>>>>> in matlab.
>>>>>
>>>>> Attached you see figure plotted as:
>>>>> imagesc(log10(abs(C-Cm)))
>>>>>
>>>>> Where Cm -- product computed in matlab.
>>>>>
>>>>> The pattern and amplitude vary depending
>>>>> on the number of cores I use. This picture
>>>>> is obtained for 48 cores (I've tried 12,
>>>>> 64 cores as well).
>>>>>
>>>>> Where should I look for possible explanation?
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Alexander
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Alexander
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Alexander
>>>
>>>
>>
>>
>> --
>> Regards,
>> Alexander
>>
>>
>>
>
>
> --
> Regards,
> Alexander
>
>
--
Regards,
Alexander
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20120208/134cf8b2/attachment.html>
More information about the petsc-dev
mailing list