[petsc-dev] MatMatMult gives different results

Alexander Grayver agrayver at gfz-potsdam.de
Wed Feb 8 14:39:01 CST 2012


Here comes the continuation of the story:

[24]PETSC ERROR: --------------------- Error Message 
------------------------------------
[24]PETSC ERROR: Object is in wrong state!
[24]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on 
argument 1 "mat" before MatScaleSystem()!
[24]PETSC ERROR: 
------------------------------------------------------------------------
[24]PETSC ERROR: Petsc Development HG revision: 
a4dae06b4d05ef5b78a9a9aa2cbc31b2b158834f  HG Date: Wed Feb 08 11:29:52 
2012 -0600
[24]PETSC ERROR: See docs/changes/index.html for recent updates.
[24]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[24]PETSC ERROR: See docs/index.html for manual pages.
[24]PETSC ERROR: 
------------------------------------------------------------------------
[24]PETSC ERROR: /home/eprog on a openmpi-i named node226 by agrayver 
Wed Feb  8 21:24:13 2012
[24]PETSC ERROR: Libraries linked from 
/home/lib/petsc-dev/openmpi-intel-complex-debug-f-mkl/lib
[24]PETSC ERROR: Configure run at Wed Feb  8 20:51:18 2012
[24]PETSC ERROR: 
------------------------------------------------------------------------
[24]PETSC ERROR: MatScaleSystem() line 932 in 
/home/lib/petsc-dev/src/mat/interface/matrix.c
[24]PETSC ERROR: PCPreSolve() line 1389 in 
/home/lib/petsc-dev/src/ksp/pc/interface/precon.c
[24]PETSC ERROR: KSPSolve() line 410 in 
/home/lib/petsc-dev/src/ksp/ksp/interface/itfunc.c

It happens within the CG solver, with the system matrix created like this:

call MatCreateShell(MPI_COMM_WORLD,mlocal,nlocal,N,N,PETSC_NULL_INTEGER,H,ierr);CHKERRQ(ierr)
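
A minimal C sketch of the same setup, assuming MatSetUp() is what the new
state check expects for a shell matrix (MyMult is a hypothetical
user-provided routine, not part of the original code):

Mat H;
extern PetscErrorCode MyMult(Mat,Vec,Vec);  /* hypothetical mult callback */

MatCreateShell(PETSC_COMM_WORLD,mlocal,nlocal,N,N,NULL,&H);
MatShellSetOperation(H,MATOP_MULT,(void (*)(void))MyMult);
MatSetUp(H);  /* assumption: may satisfy the MatSetUp()/preallocation state check */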

On 08.02.2012 21:28, Alexander Grayver wrote:
> Hong,
>
> Everything seems to be working well so far.
> I would say there is nothing to acknowledge, but I don't mind...
>
> Please let me know if there will be support for mpiaij_mpidense.
>
> Thanks for this great support.
>
> On 08.02.2012 18:43, Hong Zhang wrote:
>> Alexander:
>> Recently, we added error flags in petsc-dev
>> requiring MatXXXSetPreallocation() for many matrix creation routines.
>> I've updated the relevant routines for your code, and added your
>> contributed test code as a PETSc example with an acknowledgment:
>> petsc-dev/src/mat/examples/tests/ex165.c
>> Let me know if you do not want your name to be acknowledged in the
>> example.
>>
>> I tested ex165 with your A and B in sequential (np=1) and parallel 
>> (np=2, 6):
>> mpiexec -n <np> ./ex165 -fA A.dat -fB B.dat -view_C
>>
>> Checking the size of the output file C.dat, I see that the sequential
>> and parallel runs give files of identical size.
>> Please get the updated petsc-dev and test it. Let me know if you
>> still have problems.
>>
>> It is inefficient to compute C=A^T*B via MatTranspose() and MatMatMult().
>> I'll see if I can write a dedicated C=A^T*B routine for mpiaij_mpidense.
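>>
>> Until then, the two-step computation looks roughly like this (a sketch
>> in C, with the fill parameter left at PETSC_DEFAULT):
>>
>> Mat A, B, AT, C;
>> /* explicit transpose, then the product C = A^T * B */
>> MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>> MatMatMult(AT,B,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&C);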
>>
>> Hong
>>
>>
>>     In addition, I changed my code to form A^T explicitly (to avoid
>>     the previous error), and I got another one:
>>
>>
>>     [1]PETSC ERROR: --------------------- Error Message
>>     ------------------------------------
>>     [1]PETSC ERROR: Object is in wrong state!
>>     [1]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp()
>>     on argument 1 "mat" before MatAssemblyBegin()!
>>     [1]PETSC ERROR:
>>     ------------------------------------------------------------------------
>>     [1]PETSC ERROR: Petsc Development HG revision:
>>     249597282bcb6a1051042a9fdfa5705679ed4f18  HG Date: Tue Feb 07
>>     09:44:23 2012 -0600
>>     [1]PETSC ERROR: See docs/changes/index.html for recent updates.
>>     [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>>     [1]PETSC ERROR: See docs/index.html for manual pages.
>>     [1]PETSC ERROR:
>>     ------------------------------------------------------------------------
>>     [1]PETSC ERROR: solveTest on a openmpi-i named glic1 by agrayver
>>     Wed Feb  8 13:06:45 2012
>>     [1]PETSC ERROR: Libraries linked from
>>     /home/lib/petsc-dev/openmpi-intel-complex-debug-f-mkl/lib
>>     [1]PETSC ERROR: Configure run at Tue Feb  7 18:19:58 2012
>>     [1]PETSC ERROR: Configure options
>>     --with-petsc-arch=openmpi-intel-complex-debug-f-mkl
>>     --with-fortran-interfaces=1 --download-superlu
>>     --download-superlu_dist --download-mumps --download-parmetis
>>     --download-ptscotch --download-metis
>>     --with-scalapack-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_scalapack_lp64.a
>>     --with-scalapack-include=/opt/intel/Compiler/11.1/072/mkl/include
>>     --with-blacs-lib=/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_blacs_openmpi_lp64.a
>>     --with-blacs-include=/opt/intel/Compiler/11.1/072/mkl/include
>>     --with-mpi-dir=/opt/mpi/intel/openmpi-1.4.2
>>     --with-scalar-type=complex
>>     --with-blas-lapack-lib="[/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.a,/opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_core.a,/opt/intel/Compiler/11.1/072/lib/intel64/libiomp5.a]"
>>     --with-precision=double --with-x=0
>>     [1]PETSC ERROR:
>>     ------------------------------------------------------------------------
>>     [1]PETSC ERROR: MatAssemblyBegin() line 4795 in
>>     /home/lib/petsc-dev/src/mat/interface/matrix.c
>>     [1]PETSC ERROR: MatMatMultSymbolic_MPIAIJ_MPIDense() line 638 in
>>     /home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
>>     [1]PETSC ERROR: MatMatMult_MPIAIJ_MPIDense() line 594 in
>>     /home/lib/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c
>>     [1]PETSC ERROR: MatMatMult() line 8618 in
>>     /home/lib/petsc-dev/src/mat/interface/matrix.c
>>
>>
>>     On 08.02.2012 17:45, Hong Zhang wrote:
>>>     Alexander :
>>>     I can repeat the crash, and am working on it.
>>>     I'll let you know after the bug is fixed.
>>>
>>>     Thanks for your patience,
>>>     Hong
>>>
>>>
>>>         It seems I now know why I used MAT_REUSE_MATRIX and a
>>>         preallocated matrix.
>>>         I removed MatCreateMPIAIJ() and now use just MatTranspose()
>>>         with MAT_INITIAL_MATRIX.
>>>         As a consequence I get hundreds of errors like:
>>>
>>>         [30]PETSC ERROR: MatSetValues_MPIAIJ() line 538 in
>>>         /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>>>         [30]PETSC ERROR: MatAssemblyEnd_MPIAIJ() line 653 in
>>>         /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>>>         [30]PETSC ERROR: MatAssemblyEnd() line 4978 in
>>>         /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
>>>         [30]PETSC ERROR: MatTranspose_MPIAIJ() line 2061 in
>>>         /home/mt/agrayver/lib/petsc-dev/src/mat/impls/aij/mpi/mpiaij.c
>>>         [30]PETSC ERROR: MatTranspose() line 4397 in
>>>         /home/mt/agrayver/lib/petsc-dev/src/mat/interface/matrix.c
>>>         [31]PETSC ERROR: --------------------- Error Message
>>>         ------------------------------------
>>>         [31]PETSC ERROR: Argument out of range!
>>>         [31]PETSC ERROR: New nonzero at (1659,53337) caused a malloc!
>>>
>>>         And this is the latest petsc-dev revision.
>>>         I know there were some changes in petsc-dev concerning this
>>>         issue.
>>>         Can you give me a hint on how to avoid this?
>>>
>>>         As for why I need C=A^T*B: this product is used further as
>>>         the system matrix for an LSQR solver. The largest dimension
>>>         is on the order of 10^6.
>>>         Since I form A and B myself I can, of course, form A^T
>>>         explicitly, but I thought I would first implement everything
>>>         as it is written down on paper and then optimize once it
>>>         works (I thought that was the easier way).
>>>
>>>
>>>         On 07.02.2012 20:44, Hong Zhang wrote:
>>>>         Alexander,
>>>>         I'm curious: why do you need a parallel C=A^T*B?
>>>>         How large are your matrices?
>>>>
>>>>         In petsc-dev, we have MatTransposeMatMult() for mpiaij and
>>>>         mpiaij, but not for mpiaij and mpidense.
>>>>         We may add support for MatTransposeMatMult_MPIAIJ_MPIDense()
>>>>         if there is such a need.
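>>>>
>>>>         For two mpiaij matrices, the existing routine is used
>>>>         roughly like this (a sketch, with fill left at
>>>>         PETSC_DEFAULT):
>>>>
>>>>         Mat A, B, C;
>>>>         /* C = A^T * B without forming A^T explicitly */
>>>>         MatTransposeMatMult(A,B,MAT_INITIAL_MATRIX,PETSC_DEFAULT,&C);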
>>>>
>>>>         Hong
>>>>
>>>>
>>>>         On Tue, Feb 7, 2012 at 1:18 PM, agrayver at gfz-potsdam.de
>>>>         <mailto:agrayver at gfz-potsdam.de> <agrayver at gfz-potsdam.de
>>>>         <mailto:agrayver at gfz-potsdam.de>> wrote:
>>>>
>>>>             Hong,
>>>>
>>>>             Thanks for the explanation. I will try this tomorrow.
>>>>             Good to have this stuff in the help now.
>>>>
>>>>             And sorry for misleading you initially.
>>>>
>>>>             Regards,
>>>>             Alexander
>>>>
>>>>
>>>>             ----- Reply message -----
>>>>             From: "Hong Zhang" <hzhang at mcs.anl.gov
>>>>             <mailto:hzhang at mcs.anl.gov>>
>>>>             To: "For users of the development version of PETSc"
>>>>             <petsc-dev at mcs.anl.gov <mailto:petsc-dev at mcs.anl.gov>>
>>>>             Subject: [petsc-dev] MatMatMult gives different results
>>>>             Date: Tue, Feb 7, 2012 19:09
>>>>
>>>>
>>>>             Alexander :
>>>>
>>>>
>>>>                 There is something I don't get yet; I hope you
>>>>                 can clarify it.
>>>>
>>>>                 So, when I use the flag MAT_INITIAL_MATRIX in the
>>>>                 test program, it works fine.
>>>>
>>>>             Good to know :-)
>>>>
>>>>                 If I put this flag in my original program I get
>>>>                 dozens of exceptions like:
>>>>                 [42]PETSC ERROR: Argument out of range!
>>>>                 [42]PETSC ERROR: New nonzero at (1336,153341)
>>>>                 caused a malloc!
>>>>
>>>>             You cannot do
>>>>             MatCreateMPIAIJ()
>>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>>>
>>>>             MatCreateMPIAIJ() creates AT and preallocates an
>>>>             approximate number of nonzeros, which does not exactly
>>>>             match the nonzeros produced by
>>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT).
>>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT) itself creates
>>>>             the matrix AT and sets the correct nonzero pattern and
>>>>             values in AT.
>>>>             MatTranspose() accepts only MAT_INITIAL_MATRIX, for
>>>>             creating a new AT, and MAT_REUSE_MATRIX, for when AT
>>>>             was previously created with
>>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT) and is reused
>>>>             to update its values (not its nonzero pattern).
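>>>>
>>>>             As a minimal sketch of the intended usage (in C, with
>>>>             error checking omitted):
>>>>
>>>>             Mat A, AT;
>>>>             /* first call: creates AT with the correct pattern */
>>>>             MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>>>             /* ... values of A change, pattern stays the same ... */
>>>>             /* later calls: reuse AT, updating values only */
>>>>             MatTranspose(A,MAT_REUSE_MATRIX,&AT);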
>>>>
>>>>             I'm updating the petsc help text for MatTranspose().
>>>>             Thanks for the report.
>>>>
>>>>             Hong
>>>>
>>>>
>>>>                 I changed this flag to MAT_REUSE_MATRIX and the
>>>>                 exceptions disappeared, but the result is incorrect
>>>>                 again (the same as for MAT_IGNORE_MATRIX).
>>>>                 I tried the test program with MAT_REUSE_MATRIX and
>>>>                 it also gives a different matrix product.
>>>>
>>>>                 Since there is no description of the MatReuse
>>>>                 argument for MatTranspose(), it's a bit confusing
>>>>                 what to expect from it.
>>>>
>>>>>>
>>>>>>                     Do you mean 'Cm = A'*B;'?
>>>>>>                     'Cm = A.'*B;' gives component-wise matrix
>>>>>>                     product, not matrix product.
>>>>>
>>>>>                     The .' operator means non-Hermitian transpose.
>>>>>                     That is what I get with MatTranspose() (in
>>>>>                     contrast with MatHermitianTranspose()).
>>>>>                     A component-wise matrix product would be .*
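>>>>>
>>>>>                     In PETSc terms, as a small sketch for complex
>>>>>                     scalars:
>>>>>
>>>>>                     /* A.' in MATLAB: transpose, no conjugation */
>>>>>                     MatTranspose(A,MAT_INITIAL_MATRIX,&AT);
>>>>>                     /* A' in MATLAB: conjugate (Hermitian) transpose */
>>>>>                     MatHermitianTranspose(A,MAT_INITIAL_MATRIX,&AH);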
>>>>>
>>>>>                 You are correct.
>>>>>
>>>>>                 Hong
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>                     Hong
>>>>>>
>>>>>>                         C = PetscBinaryRead('C.dat','complex',true);
>>>>>>
>>>>>>                         Matrix C is different depending on the
>>>>>>                         number of cores I use.
>>>>>>                         My PETSc is:
>>>>>>                         Using Petsc Development HG revision:
>>>>>>                         876c894d95f4fa6561d0a91310ca914592527960 
>>>>>>                         HG Date: Tue Jan 10 19:27:14 2012 +0100
>>>>>>
>>>>>>
>>>>>>                         On 06.02.2012 17:13, Hong Zhang wrote:
>>>>>>>                         MatMatMult() in PETSc is not well tested
>>>>>>>                         for complex scalars and could be buggy.
>>>>>>>                         Can you send us the matrices A and B in
>>>>>>>                         petsc binary format for investigation?
>>>>>>>
>>>>>>>                         Hong
>>>>>>>
>>>>>>>                         On Mon, Feb 6, 2012 at 5:55 AM,
>>>>>>>                         Alexander Grayver
>>>>>>>                         <agrayver at gfz-potsdam.de
>>>>>>>                         <mailto:agrayver at gfz-potsdam.de>> wrote:
>>>>>>>
>>>>>>>                             Dear PETSc team,
>>>>>>>
>>>>>>>                             I am trying to use:
>>>>>>>                             call MatMatMult(A,B,MAT_INITIAL_MATRIX,PETSC_DEFAULT_DOUBLE_PRECISION,C,ierr);CHKERRQ(ierr)
>>>>>>>
>>>>>>>                             where both A and B are rectangular,
>>>>>>>                             A is sparse and B is dense. Both are
>>>>>>>                             double complex and distributed.
>>>>>>>                             The product PETSc gives me contains
>>>>>>>                             errors in some parts of the matrix.
>>>>>>>                             I output A, B and C and then computed
>>>>>>>                             the product in MATLAB.
>>>>>>>
>>>>>>>                             Attached is a figure plotted as:
>>>>>>>                             imagesc(log10(abs(C-Cm)))
>>>>>>>
>>>>>>>                             where Cm is the product computed in
>>>>>>>                             MATLAB.
>>>>>>>
>>>>>>>                             The pattern and amplitude vary
>>>>>>>                             depending on the number of cores I
>>>>>>>                             use. This picture was obtained with
>>>>>>>                             48 cores (I've tried 12 and 64 cores
>>>>>>>                             as well).
>>>>>>>
>>>>>>>                             Where should I look for a possible
>>>>>>>                             explanation?
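>>>>>>>
>>>>>>>                             (For reference, one way to write a
>>>>>>>                             matrix in PETSc binary format for
>>>>>>>                             such a comparison is sketched below
>>>>>>>                             in C; the file name is an example.)
>>>>>>>
>>>>>>>                             PetscViewer viewer;
>>>>>>>                             PetscViewerBinaryOpen(PETSC_COMM_WORLD,"C.dat",FILE_MODE_WRITE,&viewer);
>>>>>>>                             MatView(C,viewer);
>>>>>>>                             PetscViewerDestroy(&viewer);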
>>>>>>>
>>>>>>>                             -- 
>>>>>>>                             Regards,
>>>>>>>                             Alexander
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>                         -- 
>>>>>>                         Regards,
>>>>>>                         Alexander
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>                     -- 
>>>>>                     Regards,
>>>>>                     Alexander
>>>>>
>>>>>
>>>>
>>>>
>>>>                 -- 
>>>>                 Regards,
>>>>                 Alexander
>>>>
>>>>
>>>>
>>>
>>>
>>>         -- 
>>>         Regards,
>>>         Alexander
>>>
>>>
>>
>>
>>     -- 
>>     Regards,
>>     Alexander
>>
>>
>
>
> -- 
> Regards,
> Alexander


-- 
Regards,
Alexander
