[petsc-dev] MatMatMult gives different results

Hong Zhang hzhang at mcs.anl.gov
Wed Feb 8 11:43:03 CST 2012


Alexander:
Recently, we add error flags in petsc-dev
requiring MatXXXSetPreallocation() for many matrix creation routines.
I've updated relevant routines for your code, and add your contributed
testing code as a petsc example with acknowledge:
petsc-dev/src/mat/examples/tests/ex165.c
Let me know if you do not want your name to be acknowledge in the example.

I tested ex165 with your A and B in sequential (np=1) and parallel (np=2,
6):
mpiexec -n <np> ./ex165 -fA A.dat -fB B.dat -view_C

Checking the size of output file C.dat, I see the sequential run and
parallel run give identical bit size.
Please get the updated petsc-dev and test it. Let me know if you still have
problem.

It is inefficient to compute C=A^T*B via MatTranspose() and MatMatMult().
I'll see if I can write a C=A^T*B  for mpiaij_mpidense.

Hong

>
> In addition, I changed my code and form A^T explicitly (to avoid previous
> error) and I got another one:
>
>
> [1]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [1]PETSC ERROR: Object is in wrong state!
> [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on
> argument 1 "mat" before MatAssemblyBegin()!
> [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: Petsc Development HG revision:
> 249597282bcb6a1051042a9fdfa5705679ed4f18  HG Date: Tue Feb 07 09:44:23 2012
> -0600
> [1]PETSC ERROR: See docs/changes/index.html for recent updates.
> [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [1]PETSC ERROR: See docs/index.html for manual pages.
> [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: solveTest on a openmpi-i named glic1 by agrayver Wed Feb
> 8 13:06:45 2012
> [1]PETSC ERROR: Libraries linked from
> /home/lib/petsc-dev/openmpi-intel-complex-debug-f-mkl/lib
> [1]PETSC ERROR: Configure run at Tue Feb  7 18:19:58 2012
> [1]PETSC ERROR: Configure options
> --with-petsc-arch=openmpi-intel-complex-debug-f-mkl
> --with-fortran-interfaces=1 --download-superlu --download-superlu_dist
> --download-mumps --download-parmetis --download-ptscotch --download-metis
> --with-scalapack-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_scalapack_lp64.a
> --with-scalapack-include=/opt/intel/Compiler/11.1/072/mkl/include
> --with-blacs-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_blacs_openmpi_lp64.a
> --with-blacs-include=/opt/intel/Compiler/11.1/072/mkl/include
> --with-mpi-dir=/opt/mpi/intel/openmpi-1.4.2 --with-scalar-type=complex
> --with-blas-lapack-lib="[/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_core.a,/opt/intel/Compiler/11.1/072/lib/intel64/libiomp5.a]"
> --with-precision=double --with-x=0
> [1]PETSC ERROR:
> ------------------------------------------------------------------------
> [1]PETSC ERROR: MatAssemblyBegin() line 4795 in
> /home/lib/petsc-dev/src/mat/interface/matrix.c
> [1]PETSC ERROR: MatMatMultSymbolic_MPIAIJ_MPIDense() line 638 in
> /home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
> [1]PETSC ERROR: MatMatMult_MPIAIJ_MPIDense() line 594 in
> /home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
> [1]PETSC ERROR: MatMatMult() line 8618 in
> /home/lib/petsc-dev/src/mat/interface/matrix.c
>
>
> On 08.02.2012 17:45, Hong Zhang wrote:
>
> Alexander :
> I can repeat the crash, and am working on it.
> I'll let you know after the bug is fixed.
>
>  Thanks for your patience,
> Hong
>
>>
>> It seems now I know why I used MAT_REUSE_MATRIX and preallocated matrix.
>> I removed MatCreateMPIAIJ() and use just MatTranspose with
>> MAT_INITIAL_MATRIX.
>> As a consequence I get hundreds of errors like:
>>
>> [30]PETSC ERROR: MatSetValues_MPIAIJ() line 538 in
>> /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>> [30]PETSC ERROR: MatAssemblyEnd_MPIAIJ() line 653 in
>> /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>> [30]PETSC ERROR: MatAssemblyEnd() line 4978 in
>> /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
>> [30]PETSC ERROR: MatTranspose_MPIAIJ() line 2061 in
>> /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>> [30]PETSC ERROR: MatTranspose() line 4397 in
>> /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
>> [31]PETSC ERROR: --------------------- Error Message
>> ------------------------------------
>> [31]PETSC ERROR: Argument out of range!
>> [31]PETSC ERROR: New nonzero at (1659,53337) caused a malloc!
>>
>> Ans this is the lastest petsc-dev revision.
>> I know there were some changes in petsc-dev concerning this issue.
>> Can you give me a hint how to avoid this?
>>
>> As for why I need C=A^T*B. This product is used further as a system
>> matrix for LSQR solver. The largest dimension is on the order of 10^6
>> Since I form A and B myself I can, of course, form A^T explicitly, but I
>> thought I would first implement everything as it is written down on paper
>> and then optimize once it works (I thought it's easier way).
>>
>>
>> On 07.02.2012 20:44, Hong Zhang wrote:
>>
>> Alexander,
>> I'm curious about why do you need parallel C=A^T*B?
>> How large your matrices are?
>>
>>  In petsc-dev, we have MatTransposeMatMult() for mpiaij and mpiaij, but
>> not mpiaij and mpidense.
>> We may add support of MatTransposeMatMult_MPIAIJ_MPIDense() if there is
>> such need.
>>
>>  Hong
>>
>>
>> On Tue, Feb 7, 2012 at 1:18 PM, agrayver at gfz-potsdam.de <
>> agrayver at gfz-potsdam.de> wrote:
>>
>>> Hong,
>>>
>>> Thanks for explanation. I will try this tomorrow. Good to have this
>>> stuff in the help now.
>>>
>>> And sorry for misleading you initially.
>>>
>>> Regards,
>>> Alexander
>>>
>>>
>>> ----- Reply message -----
>>> From: "Hong Zhang" <hzhang at mcs.anl.gov>
>>> To: "For users of the development version of PETSc" <
>>> petsc-dev at mcs.anl.gov>
>>> Subject: [petsc-dev] MatMatMult gives different results
>>> Date: Tue, Feb 7, 2012 19:09
>>>
>>>
>>> Alexander :
>>>
>>>>
>>>> There is something I didn't get yet, I hope you could clarify it.
>>>>
>>>> So, when I use flag MAT_INITIAL_MATRIX in test program it works fine.
>>>>
>>> Good to know :-)
>>>
>>>
>>>>  If I put this flag in my original program I get dozens of exceptions
>>>> like:
>>>> [42]PETSC ERROR: Argument out of range!
>>>> [42]PETSC ERROR: New nonzero at (1336,153341) caused a malloc!
>>>>
>>> You cannot do
>>>  MatCreateMPIAIJ()
>>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>>
>>>  MatCreateMPIAIJ() creates AT and preallocates approximate nonzeros,
>>> which does not match exactly the nonzeros in
>>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT) creates matrix AT and sets
>>> correct nonzero pattern and values in AT.
>>> MatTranspose() only takes in "MAT_INITIAL_MATRIX" - for a new AT,
>>> and "MAT_REUSE_MATRIX" when AT is created with
>>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT)
>>> and reuse for updating its values (not nonzero patten).
>>>
>>>  I'm updating petsc help menu on MatTranspose(). Thanks for the report.
>>>
>>>  Hong
>>>
>>>
>>>> I changed this flag to MAT_REUSE_MATRIX and exceptions disappeared, but
>>>> result is incorrect again (same as for MAT_IGNORE_MATRIX)
>>>> I tried test program with MAT_REUSE_MATRIX and it also gives different
>>>> matrix product.
>>>>
>>>> Since there is no description of MatReuse structure for MatTranspose
>>>> it's a bit confusing what to expect from it.
>>>>
>>>>
>>>>>  Do you mean 'Cm = A'*B;'?
>>>>> 'Cm = A.'*B;' gives component-wise matrix product, not matrix product.
>>>>>
>>>>>
>>>>>  .' operator means non-Hermitian transpose. That is what I get with
>>>>> MatTranspose (in contrast with MatHermitianTranspose)
>>>>> component-wise matrix product would be .*
>>>>>
>>>> You are correct.
>>>>
>>>>  Hong
>>>>
>>>>>
>>>>>
>>>>>
>>>>>  Hong
>>>>>
>>>>>
>>>>>
>>>>>>  C = PetscBinaryRead('C.dat','complex',true);
>>>>>>
>>>>>> Matrix C is different depending on number of cores I use.
>>>>>> My PETSc is:
>>>>>> Using Petsc Development HG revision:
>>>>>> 876c894d95f4fa6561d0a91310ca914592527960  HG Date: Tue Jan 10 19:27:14 2012
>>>>>> +0100
>>>>>>
>>>>>>
>>>>>> On 06.02.2012 17:13, Hong Zhang wrote:
>>>>>>
>>>>>> MatMatMult() in petsc is not well-tested for complex - could be
>>>>>> buggy.
>>>>>> Can you send us the matrices A and B in petsc binary format for
>>>>>> investigation?
>>>>>>
>>>>>>  Hong
>>>>>>
>>>>>> On Mon, Feb 6, 2012 at 5:55 AM, Alexander Grayver <
>>>>>> agrayver at gfz-potsdam.de> wrote:
>>>>>>
>>>>>>> Dear PETSc team,
>>>>>>>
>>>>>>> I try to use:
>>>>>>> call
>>>>>>> MatMatMult(A,B,MAT_INITIAL_MATRIX,PETSC_DEFAULT_DOUBLE_PRECISION,C,ierr);CHKERRQ(ierr)
>>>>>>>
>>>>>>> Where both A and B are rectangular, but A is sparse and B is dense.
>>>>>>> Both are double complex and distributed.
>>>>>>> The product PETSc gives me contains some errors in some part of the
>>>>>>> matrix.
>>>>>>> I output A, B and C then computed product in matlab.
>>>>>>>
>>>>>>> Attached you see figure plotted as:
>>>>>>> imagesc(log10(abs(C-Cm)))
>>>>>>>
>>>>>>> Where Cm -- product computed in matlab.
>>>>>>>
>>>>>>> The pattern and amplitude vary depending on the number of cores I
>>>>>>> use. This picture is obtained for 48 cores (I've tried 12, 64 cores as
>>>>>>> well).
>>>>>>>
>>>>>>> Where should I look for possible explanation?
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Alexander
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>   --
>>>>>> Regards,
>>>>>> Alexander
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>   --
>>>>> Regards,
>>>>> Alexander
>>>>>
>>>>>
>>>>
>>>>
>>>>   --
>>>> Regards,
>>>> Alexander
>>>>
>>>>
>>>
>>
>>
>>   --
>> Regards,
>> Alexander
>>
>>
>
>
> --
> Regards,
> Alexander
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20120208/3e74965e/attachment.html>


More information about the petsc-dev mailing list