<div dir="ltr">Jose<div><br></div><div>I have just pushed some code to support MPI DENSE CUDA matrices and MatMatMult operations (basic loop over columns, without copying into vectors).</div><div>I have rebased against the latest master.</div><div>Let me know if it works for you. I will strip out the relevant commits and make a new MR.</div><div><br></div><div>Pierre, I have added a test for sbaij in parallel and it works nicely (automatically doing the loop over dense columns). Let me know if it works for you now.</div><div><br></div><div>Thanks</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 7 May 2020 at 00:17, Stefano Zampini <<a href="mailto:stefano.zampini@gmail.com">stefano.zampini@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
> <br>
> <br>
>> On 6 May 2020, at 20:00, Pierre Jolivet <<a href="mailto:pierre.jolivet@enseeiht.fr" target="_blank">pierre.jolivet@enseeiht.fr</a>> wrote:<br>
>> <br>
>> Stefano,<br>
>> Is this working for nsize > 1 <a href="https://gitlab.com/petsc/petsc/-/blob/7e88e4dd44e2a5120b858cf9f19502ac359985be/src/mat/tests/ex70.c#L295" rel="noreferrer" target="_blank">https://gitlab.com/petsc/petsc/-/blob/7e88e4dd44e2a5120b858cf9f19502ac359985be/src/mat/tests/ex70.c#L295</a><br>
>> I am now getting (in another example):<br>
>> [0]PETSC ERROR: Call MatProductSymbolic() first<br>
>> Instead of the previous:<br>
>> [0]PETSC ERROR: MatProductSetFromOptions_AB for A mpisbaij and B mpidense is not supported<br>
>> <br>
<br>
Pierre,<br>
<br>
Not sure what is going on if you do not tell me what to run. My branch stefanozampini/feature-add-hpackages is off master and has been recently rebased (includes the fixes I have made in maint too)<br>
BTW, I found your message below Jose’s answer and I never got your original message. Did you forget to send it to petsc-dev?<br>
<br>
<br>
<br>
>> (But my branch is lagging behind maint, so maybe I’m missing some other fixes; take this with a grain of salt).<br>
>> Thanks,<br>
>> Pierre<br>
>> <br>
>>> On 6 May 2020, at 4:52 PM, Stefano Zampini <<a href="mailto:stefano.zampini@gmail.com" target="_blank">stefano.zampini@gmail.com</a>> wrote:<br>
>>> <br>
>>> I have working support for MATSHELL here <a href="https://gitlab.com/petsc/petsc/-/commit/146e7f1ccf5f267b36079cac494077a23e8bbc45" rel="noreferrer" target="_blank">https://gitlab.com/petsc/petsc/-/commit/146e7f1ccf5f267b36079cac494077a23e8bbc45</a><br>
>>> Tested here <a href="https://gitlab.com/petsc/petsc/-/commit/c4fcaa45a01cc783c629913983b204a1cbcb3939" rel="noreferrer" target="_blank">https://gitlab.com/petsc/petsc/-/commit/c4fcaa45a01cc783c629913983b204a1cbcb3939</a><br>
>>> <br>
>>> Jose and Pierre, this code is supposed to work with CUDA, but I haven't tested it yet<br>
>>> Can you tell me if this fixes the issues for you to not have to loop over the columns of the dense matrix yourself?<br>
>>> <br>
>>> On Wed, 6 May 2020 at 10:09, Stefano Zampini <<a href="mailto:stefano.zampini@gmail.com" target="_blank">stefano.zampini@gmail.com</a>> wrote:<br>
>>> Hong<br>
>>> <br>
>>> If the product is not supported, the type of C will never be set anyway, so you cannot call MatHasOperation after MatProductSetFromOptions.<br>
>>> The purpose of MatProductSetFromOptions is to populate the function pointers for symbolic and numeric phases. If not found, they should be set to null instead of erroring as it is now.<br>
>>> What I propose is to have MatProductHasOperation (not MatHasOperation): this function will be identical to MatHasOperation, the only difference being that it does not call PetscValidType on the input mat.<br>
>>> <br>
>>> Meanwhile, I’m coding a basic MatMat (and MatTransposeMat) driver to loop over dense columns and apply MatMult (or MatMultTranspose) without memory movement.<br>
>>> This will be valid for all B matrices being of type dense (and its derivations), with C of type dense too. This in principle will fix Jose and Pierre’s issues (they can correct me if I’m wrong)<br>
>>> <br>
>>> However, we should definitely have a way for the user to enquire if a given operation is supported or not. <br>
>>> <br>
>>> Thanks<br>
>>> Stefano<br>
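<br>
The driver described above (loop over the columns of a dense B and apply the operator to each column in place) can be sketched in plain C as follows. This is only a toy model under stated assumptions, not PETSc code: the names MatVecFn, dense_product_by_columns, ScaleCtx and scale_mult are hypothetical, and the dense matrices are assumed to be stored column-major so that each column is already a contiguous vector.<br>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operator type: computes y = A*x for some operator
   described by ctx. It stands in for a generic "MatMult". */
typedef void (*MatVecFn)(const void *ctx, const double *x, double *y);

/* Toy analogue of C = A*B with B and C dense and column-major.
   ldb and ldc are the leading dimensions (distance between columns).
   Each column of B is passed to the operator as a vector and the result
   is written directly into the matching column of C, so no column is
   ever copied into a temporary vector. */
static void dense_product_by_columns(MatVecFn mult, const void *ctx,
                                     const double *B, size_t ldb,
                                     double *C, size_t ldc, size_t ncols)
{
  assert(mult != NULL);
  for (size_t j = 0; j < ncols; j++) {
    mult(ctx, B + j * ldb, C + j * ldc);
  }
}

/* Example operator for illustration only: A = alpha * I of size n. */
typedef struct {
  size_t n;
  double alpha;
} ScaleCtx;

static void scale_mult(const void *ctx, const double *x, double *y)
{
  const ScaleCtx *s = (const ScaleCtx *)ctx;
  for (size_t i = 0; i < s->n; i++) y[i] = s->alpha * x[i];
}
```

Because the driver only needs a matrix-vector callback, the same loop serves any operator type that can apply itself to a vector, which is the reason such a fallback can cover every case where B and C are dense.<br>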
>>> <br>
>>>> On May 6, 2020, at 12:03 AM, Zhang, Hong <<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>> wrote:<br>
>>>> <br>
>>>> Stefano:<br>
>>>> Now, we need to address this bug report: enable MatHasOperation(C,MATOP_MAT_MULT,&flg) for matrix products, e.g., C=A*B, which is related to your issue <a href="https://gitlab.com/petsc/petsc/-/issues/608" rel="noreferrer" target="_blank">https://gitlab.com/petsc/petsc/-/issues/608</a>.<br>
>>>> <br>
>>>> In petsc-3.13:<br>
>>>> 1) MATOP_MAT_MULT, ..., MATOP_MATMAT_MULT are removed from the MATOP table (they are still listed in petscmat.h -- an oversight, I'll remove them). <br>
>>>> MATOP_MAT_MULT_SYMBOLIC/NUMERIC ... are still in the table.<br>
>>>> 2) MatHasOperation(C,...) must be called for the matrix product C, not matrix A or B (slepc needs to fix this after this reported bug is fixed).<br>
>>>> <br>
>>>> Like MatSetOption(), MatHasOperation() must be called AFTER MatSetType(). You moved MatSetType() from MatProductSetFromOptions() back to MatProductSymbolic() in your latest patch, thus the user has to call MatHasOperation() after MatProductSymbolic():<br>
>>>> <br>
>>>> MatProductCreate(A,B,NULL,&C);<br>
>>>> MatProductSetType(C,...);<br>
>>>> ...<br>
>>>> MatProductSetFromOptions(C); //if the product is not supported for the given mat types, PETSc currently crashes here, which we can replace with an error message<br>
>>>> <br>
>>>> MatProductSymbolic(C); -> calls MatSetType()<br>
>>>> MatHasOperation(C,MATOP_MAT_MULT,&flg)<br>
>>>> <br>
>>>> Question: how to call MatHasOperation(C,..) when MatProductSymbolic() is not supported?<br>
>>>> <br>
>>>> My fix to this bug:<br>
>>>> Resume MatSetType() in MatProductSetFromOptions(). Then user calls:<br>
>>>> <br>
>>>> MatProductCreate(A,B,NULL,&C);<br>
>>>> MatProductSetType(C,...);<br>
>>>> ...<br>
>>>> MatProductSetFromOptions(C); //if the product is not supported for the given mat types, C->ops->productsymbolic=NULL;<br>
>>>> MatHasOperation(C,MATOP_PRODUCTSYMBOLIC,&flg);<br>
>>>> if (flg) { <br>
>>>> MatProductSymbolic(C);<br>
>>>> ...<br>
>>>> } else {<br>
>>>> MatDestroy(&C);<br>
>>>> ...<br>
>>>> }<br>
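<br>
The pattern in the sequence above (the setup step leaves the symbolic function pointer NULL for an unsupported product, and the caller queries it before going further) can be illustrated with a small self-contained C model. Nothing here is PETSc API: ToyProduct and the toy_product_* functions are hypothetical, and the rule "supported only when B is dense" is an arbitrary assumption chosen for the example.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a product matrix with an operations table. */
typedef struct ToyProduct ToyProduct;
struct ToyProduct {
  const char *atype;              /* type name of A (illustrative only) */
  const char *btype;              /* type name of B (illustrative only) */
  int (*symbolic)(ToyProduct *);  /* NULL means "product not supported" */
  bool symbolic_done;
};

static int symbolic_generic(ToyProduct *p)
{
  p->symbolic_done = true;
  return 0;
}

/* Analogue of the setup step: pick an implementation if one exists,
   otherwise leave the pointer NULL. It never raises an error itself.
   Assumed toy rule: only a "dense" B is supported. */
static void toy_product_set_from_options(ToyProduct *p)
{
  assert(p != NULL);
  p->symbolic = NULL;
  if (strcmp(p->btype, "dense") == 0) p->symbolic = symbolic_generic;
}

/* Analogue of the query: it only inspects the function pointer. */
static bool toy_product_has_symbolic(const ToyProduct *p)
{
  return p->symbolic != NULL;
}

/* Caller-side flow: set up, query, then run or report "unsupported".
   Returns 0 on success and 1 when the caller must use a fallback. */
static int toy_product_try_symbolic(ToyProduct *p)
{
  toy_product_set_from_options(p);
  if (!toy_product_has_symbolic(p)) return 1;
  return p->symbolic(p);
}
```

The point of the design is that an unsupported combination becomes a condition the caller can test and recover from, rather than an error raised inside the setup call.<br>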
>>>> <br>
>>>> Either you take care of this bug report, or let me know your thoughts about how to fix this bug.<br>
>>>> Hong<br>
>>>> From: Zhang, Hong <<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>><br>
>>>> Sent: Saturday, April 25, 2020 2:40 PM<br>
>>>> To: Pierre Jolivet <<a href="mailto:pierre.jolivet@enseeiht.fr" target="_blank">pierre.jolivet@enseeiht.fr</a>><br>
>>>> Cc: Jose E. Roman <<a href="mailto:jroman@dsic.upv.es" target="_blank">jroman@dsic.upv.es</a>>; Stefano Zampini <<a href="mailto:stefano.zampini@gmail.com" target="_blank">stefano.zampini@gmail.com</a>>; petsc-dev <<a href="mailto:petsc-dev@mcs.anl.gov" target="_blank">petsc-dev@mcs.anl.gov</a>>; Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>><br>
>>>> Subject: Re: [petsc-dev] MATOP_MAT_MULT<br>
>>>> <br>
>>>> Pierre,<br>
>>>> When we do <br>
>>>> MatProductCreate: C = A*B; //C owns A and B, thus B->refct =2<br>
>>>> MatProductCreateWithMat: B = A*C; //If I let B own A and C, then C->refct=2<br>
>>>> Then<br>
>>>> MatDestroy(&B) and MatDestroy(&C) only reduce their refct from 2 to 1, hence a memory leak. <br>
>>>> My solution is adding <br>
>>>> {<br>
>>>> matreference; /* do not add refct when using MatProductCreateWithMat() to avoid recursive references */<br>
>>>> } Mat_Product <br>
>>>> This flag prevents MatProductCreateWithMat() from increasing reference counts, i.e., B does not own A and C, which avoids reverse ownership. I am not sure this is a reasonable solution. Let me know if you have a better solution.<br>
>>>> See ex109.c and ex195.c for tests.<br>
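<br>
The reference cycle described above, and the effect of a flag that stops the second product from taking references, can be reproduced with a toy reference-counted object in plain C. This is a sketch under assumptions, not the PETSc implementation: ToyMat, toy_create, toy_destroy, toy_set_product and the counter toy_live are all hypothetical names.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted matrix that may record two operands. */
typedef struct ToyMat ToyMat;
struct ToyMat {
  int refct;
  ToyMat *opA, *opB;   /* operands of the product, or NULL */
  bool owns_operands;  /* true if this object took references on them */
};

static int toy_live = 0; /* number of objects not yet freed */

static ToyMat *toy_create(void)
{
  ToyMat *m = calloc(1, sizeof(*m));
  assert(m != NULL);
  m->refct = 1;
  toy_live++;
  return m;
}

/* Drop one reference and clear the caller's handle. The object is freed
   only when the count reaches zero; owned operands are released first. */
static void toy_destroy(ToyMat **pm)
{
  ToyMat *m = *pm;
  *pm = NULL;
  if (!m || --m->refct > 0) return;
  if (m->owns_operands) {
    ToyMat *a = m->opA, *b = m->opB;
    toy_destroy(&a);
    toy_destroy(&b);
  }
  free(m);
  toy_live--;
}

/* Record C = A*B. With take_refs == false the operands are only
   borrowed, which plays the role of the flag discussed above. */
static void toy_set_product(ToyMat *C, ToyMat *A, ToyMat *B, bool take_refs)
{
  C->opA = A;
  C->opB = B;
  C->owns_operands = take_refs;
  if (take_refs) {
    A->refct++;
    B->refct++;
  }
}
```

With both products taking references, B and C each hold a reference on the other, so destroying the user's handles leaves every count above zero and nothing is freed. When the second product only borrows its operands the cycle never forms. The borrowed pointers are then valid only while the operands are kept alive elsewhere, which is the trade-off of this approach.<br>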
>>>> Hong<br>
>>>> From: Pierre Jolivet <<a href="mailto:pierre.jolivet@enseeiht.fr" target="_blank">pierre.jolivet@enseeiht.fr</a>><br>
>>>> Sent: Saturday, April 25, 2020 11:45 AM<br>
>>>> To: Zhang, Hong <<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>><br>
>>>> Cc: Jose E. Roman <<a href="mailto:jroman@dsic.upv.es" target="_blank">jroman@dsic.upv.es</a>>; Stefano Zampini <<a href="mailto:stefano.zampini@gmail.com" target="_blank">stefano.zampini@gmail.com</a>>; petsc-dev <<a href="mailto:petsc-dev@mcs.anl.gov" target="_blank">petsc-dev@mcs.anl.gov</a>>; Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>><br>
>>>> Subject: Re: [petsc-dev] MATOP_MAT_MULT<br>
>>>> <br>
>>>> Hong,<br>
>>>> I reported this, not José, though he may have run into the same issue.<br>
>>>> I’ll try the branch and get back to you on the GitLab MR.<br>
>>>> <br>
>>>> Thanks,<br>
>>>> Pierre<br>
>>>> <br>
>>>>> On 25 Apr 2020, at 6:17 PM, Zhang, Hong <<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>> wrote:<br>
>>>>> <br>
>>>>> Jose,<br>
>>>>> <br>
>>>>>>> I also now just tested some previously PETSC_VERSION_LT(3,13,0) running code with C=A*B, Dense=Nest*Dense, all previously allocated prior to a call to MatMatMult and scall = MAT_REUSE_MATRIX.<br>
>>>>>>> Sadly, it’s now broken. It is my fault for not having a test for this in <a href="https://gitlab.com/petsc/petsc/-/merge_requests/2069" rel="noreferrer" target="_blank">https://gitlab.com/petsc/petsc/-/merge_requests/2069</a>, sorry about that.<br>
>>>>>>> [0]PETSC ERROR: Call MatProductSymbolic() first<br>
>>>>>>> [0]PETSC ERROR: #1 MatProductNumeric() line 730 in /ccc/work/cont003/rndm/rndm/petsc/src/mat/interface/matproduct.c<br>
>>>>>>> [0]PETSC ERROR: #2 MatMatMult() line 9335 in /ccc/work/cont003/rndm/rndm/petsc/src/mat/interface/matrix.c<br>
>>>>>>> <br>
>>>>>>> Here is a reproducer (that will work OK with 3.12.4).<br>
>>>>>>> diff --git a/src/mat/tests/ex195.c b/src/mat/tests/ex195.c<br>
>>>>>>> index c72662bc3c..811de669c5 100644<br>
>>>>>>> --- a/src/mat/tests/ex195.c<br>
>>>>>>> +++ b/src/mat/tests/ex195.c<br>
>>>>>>> @@ -73,2 +73,3 @@ int main(int argc,char **args)<br>
>>>>>>> ierr = MatMatMult(nest,B,MAT_REUSE_MATRIX,PETSC_DEFAULT,&C);CHKERRQ(ierr);<br>
>>>>>>> + ierr = MatMatMult(nest,C,MAT_REUSE_MATRIX,PETSC_DEFAULT,&B);CHKERRQ(ierr);<br>
>>>>>>> ierr = MatMatMultEqual(nest,B,C,10,&equal);CHKERRQ(ierr);<br>
>>>>>>> <br>
>>>>>>> $ make -f gmakefile test searchin=mat_tests-ex195<br>
>>>>>>> <br>
>>>>>>> I believe this is very close to the topic at hand and issue #608, so maybe you could fix this as well in the same upcoming MR? Just let me know, I can have a crack at it otherwise.<br>
>>>>> <br>
>>>>> This is a bug. I fixed it in the branch hzhang/fix-matproduct-reuse/maint. Can you test it?<br>
>>>>> Hong<br>
>>> <br>
>>> <br>
>>> <br>
>>> -- <br>
>>> Stefano<br>
>> <br>
> <br>
<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature">Stefano</div>