[petsc-dev] MATOP_MAT_MULT
Stefano Zampini
stefano.zampini at gmail.com
Thu May 7 13:16:36 CDT 2020
Jose
I have just pushed some code to support MPI DENSE CUDA matrices and
MatMatMult operations (basic loop over columns, without copy into vectors).
I have rebased against the latest master
Let me know if it works for you. I will strip out the relevant commits and
make a new MR
Pierre, I have added a test for sbaij in parallel and it works nicely
(automatically doing the loop over dense columns). Let me know if it works
for you now
Thanks
Il giorno gio 7 mag 2020 alle ore 00:17 Stefano Zampini <
stefano.zampini at gmail.com> ha scritto:
>
> >
> >
> >> El 6 may 2020, a las 20:00, Pierre Jolivet <pierre.jolivet at enseeiht.fr>
> escribió:
> >>
> >> Stefano,
> >> Is this working for nsize > 1
> https://gitlab.com/petsc/petsc/-/blob/7e88e4dd44e2a5120b858cf9f19502ac359985be/src/mat/tests/ex70.c#L295
> >> I am now getting (in another example):
> >> [0]PETSC ERROR: Call MatProductSymbolic() first
> >> Instead of the previous:
> >> [0]PETSC ERROR: MatProductSetFromOptions_AB for A mpisbaij and B
> mpidense is not supported
> >>
>
> Pierre,
>
> Not sure what is going on if you do not tell me what to run. My branch
> stefanozampini/feature-add-hpackages is off master and has been recently
> rebased (includes the fixes I have made in maint too)
> BTW, I found your message below Jose’s answer and I never get your
> original message. Did you forget to send to petsc-dev?
>
>
>
> >> (But my branch is lagging behind maint, so maybe I’m missing some other
> fixes, take this with a grain of salt).
> >> Thanks,
> >> Pierre
> >>
> >>> On 6 May 2020, at 4:52 PM, Stefano Zampini <stefano.zampini at gmail.com>
> wrote:
> >>>
> >>> I have working support for MATSHELL here
> https://gitlab.com/petsc/petsc/-/commit/146e7f1ccf5f267b36079cac494077a23e8bbc45
> >>> Tested here
> https://gitlab.com/petsc/petsc/-/commit/c4fcaa45a01cc783c629913983b204a1cbcb3939
> >>>
> >>> Jose and Pierre, this code is supposed to work with CUDA, but I
> haven't tested it yet
> >>> Can you tell me if this fixes the issues for you to not have to loop
> over the columns of the dense matrix yourself?
> >>>
> >>> Il giorno mer 6 mag 2020 alle ore 10:09 Stefano Zampini <
> stefano.zampini at gmail.com> ha scritto:
> >>> Hong
> >>>
> >>> If the product is not supported, the type of C will never be set
> anyway, so you cannot call MatHasOperation after MatProductSetFromOptions.
> >>> The purpose of MatProductSetFromOptions is to populate the function
> pointers for symbolic and numeric phases. If not found, they should be set
> to null instead of erroring as it is now.
> >>> What I propose is to have MatProductHasOperation (not
> MatHasOperation): this function will be identical to MatHasOperation, with
> the only difference that does not call PetscValidType on the input mat.
> >>>
> >>> Meanwhile, I’m coding a basic MatMat (and MatTransposeMat) driver to
> loop over dense columns and apply MatMult. (Or MatMultTranspose) without
> memory movement.
> >>> This will be valid for all B matrices being of type dense (and its
> derivations), with C of type dense too. This in principle will fix Jose and
> Pierre’s issues (they can correct me if I’m wrong)
> >>>
> >>> However, we should definitely have a way for the user to enquire if a
> given operation is supported or not.
> >>>
> >>> Thanks
> >>> Stefano
> >>>
> >>>> On May 6, 2020, at 12:03 AM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
> >>>>
> >>>> Stefano:
> >>>> Now, we need address this bug report: enable
> MatHasOperation(C,MATOP_MAT_MULT,&flg) for matrix products, e.g., C=A*B,
> which is related to your issue https://gitlab.com/petsc/petsc/-/issues/608
> .
> >>>>
> >>>> In petsc-3.13:
> >>>> 1) MATOP_MAT_MULT, ..., MATOP_MATMAT_MULT are removed from the MATOP
> table (they are still listed in petscmat.h -- an overlook, I'll remove
> them).
> >>>> MATOP_MAT_MULT_SYMBOLIC/NUMERIC ... are still in the table.
> >>>> 2) MatHasOperation(C,...) must be called for the matrix product C,
> not matrix A or B (slepc needs to fix this after this reported bug is
> fixed).
> >>>>
> >>>> Like MatSetOption(), MatHasOperation() must be called AFTER
> MatSetType(). You moved MatSetType() from MatProductSetFromOptions() back
> to MatProductSymbolic() in your latest patch, thus user has to call
> MatHasOption() after MatProductSymbolic():
> >>>>
> >>>> MatProductCreate(A,B,NULL,&C);
> >>>> MatProductSetType(C,...);
> >>>> ...
> >>>> MatProductSetFromOptions(); //if the product is not supported for
> the given mat types, currently petsc crashes here, which we can replace
> with an error output
> >>>>
> >>>> MatProductSymbloc(); -> call MatSetType()
> >>>> MatHasOperation(C,MATOP_MAT_MULT,&flg)
> >>>>
> >>>> Question: how to call MatHasOperation(C,..) when MatProductSymbloc()
> is not supported?
> >>>>
> >>>> My fix to this bug:
> >>>> Resume MatSetType() in MatProductSetFromOptions(). Then user calls:
> >>>>
> >>>> MatProductCreate(A,B,NULL,&C);
> >>>> MatProductSetType(C,...);
> >>>> ...
> >>>> MatProductSetFromOptions(C); //if the product is not supported for
> the given mat types, C->ops->productsymbolic=NULL;
> >>>> MatHasOperation(C,MATOP_PRODUCTSYMBOLIC,&flg);
> >>>> if (flg) {
> >>>> MatProductSymbolic(C);
> >>>> ...
> >>>> } else {
> >>>> MatDestroy(&C);
> >>>> ...
> >>>> }
> >>>>
> >>>> Either you take care of this bug report, or let me know your thoughts
> about how to fix this bug.
> >>>> Hong
> >>>> From: Zhang, Hong <hzhang at mcs.anl.gov>
> >>>> Sent: Saturday, April 25, 2020 2:40 PM
> >>>> To: Pierre Jolivet <pierre.jolivet at enseeiht.fr>
> >>>> Cc: Jose E. Roman <jroman at dsic.upv.es>; Stefano Zampini <
> stefano.zampini at gmail.com>; petsc-dev <petsc-dev at mcs.anl.gov>; Smith,
> Barry F. <bsmith at mcs.anl.gov>
> >>>> Subject: Re: [petsc-dev] MATOP_MAT_MULT
> >>>>
> >>>> Pierre,
> >>>> When we do
> >>>> MatProductCreate: C = A*B; //C owns A and B, thus B->refct =2
> >>>> MatProductCreateWithMats: B = A*C; //If I let B own A and C, then
> C->refct=2
> >>>> Then
> >>>> MatDestroy(&B) and MatDestroy(&C) only reduce their refct from 2 to
> 1, thus memory leak.
> >>>> My solution is adding
> >>>> {
> >>>> matreference; /* do not add refct when using
> MatProductCreateWithMat() to void recursive references */
> >>>> } Mat_Product
> >>>> This flg prevents MatProductCreateWithMats() to increase reference
> counts, i.e., B does not own A and C to avoid reverse ownership. I am not
> sure this is a reasonable solution. Let me know if you have better solution.
> >>>> See ex109.c and ex195.c for tests.
> >>>> Hong
> >>>> From: Pierre Jolivet <pierre.jolivet at enseeiht.fr>
> >>>> Sent: Saturday, April 25, 2020 11:45 AM
> >>>> To: Zhang, Hong <hzhang at mcs.anl.gov>
> >>>> Cc: Jose E. Roman <jroman at dsic.upv.es>; Stefano Zampini <
> stefano.zampini at gmail.com>; petsc-dev <petsc-dev at mcs.anl.gov>; Smith,
> Barry F. <bsmith at mcs.anl.gov>
> >>>> Subject: Re: [petsc-dev] MATOP_MAT_MULT
> >>>>
> >>>> Hong,
> >>>> José didn’t report this, though he may have run into the same issue,
> I did.
> >>>> I’ll try the branch and get back at you on GitLab MR.
> >>>>
> >>>> Thanks,
> >>>> Pierre
> >>>>
> >>>>> On 25 Apr 2020, at 6:17 PM, Zhang, Hong <hzhang at mcs.anl.gov> wrote:
> >>>>>
> >>>>> Jose,
> >>>>>
> >>>>>>> I also now just tested some previously PETSC_VERSION_LT(3,13,0)
> running code with C=A*B, Dense=Nest*Dense, all previously allocated prior
> to a call to MatMatMult and scall = MAT_REUSE_MATRIX.
> >>>>>>> Sadly, it’s now broken. It is my fault for not having a test for
> this in https://gitlab.com/petsc/petsc/-/merge_requests/2069, sorry about
> that.
> >>>>>>> [0]PETSC ERROR: Call MatProductSymbolic() first
> >>>>>>> [0]PETSC ERROR: #1 MatProductNumeric() line 730 in
> /ccc/work/cont003/rndm/rndm/petsc/src/mat/interface/matproduct.c
> >>>>>>> [0]PETSC ERROR: #2 MatMatMult() line 9335 in
> /ccc/work/cont003/rndm/rndm/petsc/src/mat/interface/matrix.c
> >>>>>>>
> >>>>>>> Here is a reproducer (that will work OK with 3.12.4).
> >>>>>>> diff --git a/src/mat/tests/ex195.c b/src/mat/tests/ex195.c
> >>>>>>> index c72662bc3c..811de669c5 100644
> >>>>>>> --- a/src/mat/tests/ex195.c
> >>>>>>> +++ b/src/mat/tests/ex195.c
> >>>>>>> @@ -73,2 +73,3 @@ int main(int argc,char **args)
> >>>>>>> ierr =
> MatMatMult(nest,B,MAT_REUSE_MATRIX,PETSC_DEFAULT,&C);CHKERRQ(ierr);
> >>>>>>> + ierr =
> MatMatMult(nest,C,MAT_REUSE_MATRIX,PETSC_DEFAULT,&B);CHKERRQ(ierr);
> >>>>>>> ierr = MatMatMultEqual(nest,B,C,10,&equal);CHKERRQ(ierr);
> >>>>>>>
> >>>>>>> $ make -f gmakefile test searchin=mat_tests-ex195
> >>>>>>>
> >>>>>>> I believe this is very close to the topic at hand and issue #608,
> so maybe you could fix this as well in the same upcoming MR? Just let me
> know, I can have a crack it otherwise.
> >>>>>
> >>>>> This is a bug. I fixed it in the branch
> hzhang/fix-matproduct-reuse/maint. Can you test it?
> >>>>> Hong
> >>>
> >>>
> >>>
> >>> --
> >>> Stefano
> >>
> >
>
>
--
Stefano
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20200507/16bd5d23/attachment.html>
More information about the petsc-dev
mailing list