[petsc-users] Parallelizing a matrix-free code

Michael Werner michael.werner at dlr.de
Wed Oct 18 11:01:43 CDT 2017


Ah, never mind. I had simply misunderstood the creation of subvectors by
scattering/gathering. I thought it was necessary to collect the complete
vector on each of my processes in order to extract a subvector. After
rereading the corresponding section in the manual, I learned that this
isn't necessary.

So now it's possible to simply gather the correct values by their global
IDs, pass them to the external code, and then scatter the result back into
the parallel vector. Now my code is working as intended.
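
For reference, the relevant part of my multiply routine now looks roughly
like the sketch below (simplified; n_local, global_ids and cfd_apply() are
placeholders for the local size, the global point IDs and the
matrix-vector product of the external code on this rank):

    #include <petscvec.h>

    /* stand-in for the external CFD matrix-vector product on this rank */
    extern void cfd_apply(PetscInt n, const PetscScalar *in, PetscScalar *out);

    PetscErrorCode ApplyExternalOperator(Vec x, Vec y, PetscInt n_local,
                                         const PetscInt *global_ids)
    {
      Vec               x_work, y_work;
      IS                is;
      VecScatter        scat;
      const PetscScalar *xa;
      PetscScalar       *ya;
      PetscErrorCode    ierr;

      /* sequential work vectors holding exactly this rank's CFD points */
      ierr = VecCreateSeq(PETSC_COMM_SELF, n_local, &x_work);CHKERRQ(ierr);
      ierr = VecDuplicate(x_work, &y_work);CHKERRQ(ierr);

      /* gather the entries with this rank's global IDs from the parallel vector */
      ierr = ISCreateGeneral(PETSC_COMM_SELF, n_local, global_ids,
                             PETSC_COPY_VALUES, &is);CHKERRQ(ierr);
      ierr = VecScatterCreate(x, is, x_work, NULL, &scat);CHKERRQ(ierr);
      ierr = VecScatterBegin(scat, x, x_work, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
      ierr = VecScatterEnd(scat, x, x_work, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);

      /* external matrix-vector product on this rank's points */
      ierr = VecGetArrayRead(x_work, &xa);CHKERRQ(ierr);
      ierr = VecGetArray(y_work, &ya);CHKERRQ(ierr);
      cfd_apply(n_local, xa, ya);
      ierr = VecRestoreArrayRead(x_work, &xa);CHKERRQ(ierr);
      ierr = VecRestoreArray(y_work, &ya);CHKERRQ(ierr);

      /* scatter the result back into the parallel output vector
         (arguments are swapped for SCATTER_REVERSE) */
      ierr = VecScatterBegin(scat, y_work, y, INSERT_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);
      ierr = VecScatterEnd(scat, y_work, y, INSERT_VALUES, SCATTER_REVERSE);CHKERRQ(ierr);

      ierr = VecScatterDestroy(&scat);CHKERRQ(ierr);
      ierr = ISDestroy(&is);CHKERRQ(ierr);
      ierr = VecDestroy(&x_work);CHKERRQ(ierr);
      ierr = VecDestroy(&y_work);CHKERRQ(ierr);
      return 0;
    }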

Thanks for your help!

Kind regards,
Michael Werner

On 18.10.2017 at 12:01, Michael Werner wrote:
> Thank you for this explanation, it makes sense. And after I updated
> my code, the external CFD code runs without problems in parallel.
>
> However, now I'm back to the problem with the creation of the
> vectors/domains. By using the application ordering, I can map the
> correct points from PETSc to the corresponding points in my external
> code, at least as long as both use the same subdomain sizes. But
> sometimes they differ, and then the KSP breaks down because the
> solution vector it receives has a different size than it expects.
>
> An example:
> I have an unstructured grid with 800,000 datapoints.
>
> If I decompose this to run on 2 processors, PETSc delegates exactly
> 400,000 points to each process. However, the external code might
> assign 400,100 points to the first process and 399,900 to the second.
> As a result, PETSc expects a solution vector of size 400,000 on each
> process, but receives one of 400,100 and one of 399,900, leading to a
> breakdown.
>
> I suppose I could use VecScatterCreateToAll to collect all the values
> from the solution vectors of my external code, and then build from
> those a temporary vector that contains only the expected 400,000
> values to hand over to the KSP. But this would create a lot of
> communication between the processes and seems quite clunky.
>
> Is there a more elegant way? Is it maybe possible to manually assign 
> the size of the PETSc subdomains?
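>
> If I read VecSetSizes()/MatSetSizes() correctly, the local sizes can be
> given explicitly instead of PETSC_DECIDE. A rough sketch of what I have
> in mind, with n_local taken from the external code's partition (i.e.
> 400,100 or 399,900 in the example above):
>
>     Vec x;
>     Mat P;   /* explicitly stored Jacobian used as preconditioner */
>
>     VecCreate(PETSC_COMM_WORLD, &x);
>     VecSetSizes(x, n_local, PETSC_DECIDE);  /* fix the local size, let PETSc sum the global one */
>     VecSetFromOptions(x);
>
>     MatCreate(PETSC_COMM_WORLD, &P);
>     MatSetSizes(P, n_local, n_local, PETSC_DECIDE, PETSC_DECIDE);
>     MatSetFromOptions(P);
>
> But I don't know if that is the intended way to do it.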
>
> Kind regards,
> Michael Werner
>
> On 17.10.2017 at 12:31, Matthew Knepley wrote:
>> On Tue, Oct 17, 2017 at 6:08 AM, Michael Werner 
>> <michael.werner at dlr.de> wrote:
>>
>>     Because usually this code is called just once. It runs on
>>     multiple processes, but it still always processes the
>>     whole domain. I can't run it on only one subdomain. As I
>>     understand it now, when I call it from PETSc, this call is issued
>>     once per process, so I would end up running several competing
>>     instances of the computation on the whole domain.
>>
>>     But maybe that's only because I haven't completely understood how
>>     MPI really works in such cases...
>>
>>
>> No, it makes one call in which all processes participate. So you 
>> would call your external CFD routine once from all processes, passing 
>> in the MPI communicator.
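>>
>> Schematically, the shell multiply would look something like the sketch
>> below (cfd_matvec_mpi() stands in for your external routine, assumed to
>> be collective on the communicator it is given, and nloc is whatever
>> local size you choose):
>>
>>     #include <petscmat.h>
>>
>>     /* stand-in for your external, MPI-parallel matrix-vector product */
>>     extern void cfd_matvec_mpi(MPI_Comm comm, PetscInt n,
>>                                const PetscScalar *in, PetscScalar *out);
>>
>>     PetscErrorCode ShellMult(Mat A, Vec x, Vec y)
>>     {
>>       MPI_Comm          comm;
>>       PetscInt          nloc;
>>       const PetscScalar *xa;
>>       PetscScalar       *ya;
>>       PetscErrorCode    ierr;
>>
>>       ierr = PetscObjectGetComm((PetscObject)A, &comm);CHKERRQ(ierr);
>>       ierr = VecGetLocalSize(x, &nloc);CHKERRQ(ierr);
>>       ierr = VecGetArrayRead(x, &xa);CHKERRQ(ierr);
>>       ierr = VecGetArray(y, &ya);CHKERRQ(ierr);
>>       /* every rank enters this routine once per MatMult(); the external
>>          code does its own communication internally on comm */
>>       cfd_matvec_mpi(comm, nloc, xa, ya);
>>       ierr = VecRestoreArrayRead(x, &xa);CHKERRQ(ierr);
>>       ierr = VecRestoreArray(y, &ya);CHKERRQ(ierr);
>>       return 0;
>>     }
>>
>>     /* ... */
>>     MatCreateShell(PETSC_COMM_WORLD, nloc, nloc, PETSC_DETERMINE,
>>                    PETSC_DETERMINE, NULL, &A);
>>     MatShellSetOperation(A, MATOP_MULT, (void (*)(void))ShellMult);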
>>
>>    Matt
>>
>>     Kind regards,
>>     Michael
>>
>>     On 17.10.2017 at 11:50, Matthew Knepley wrote:
>>>     On Tue, Oct 17, 2017 at 5:46 AM, Michael Werner
>>>     <michael.werner at dlr.de> wrote:
>>>
>>>         I'm not sure what you mean by this question.
>>>         The external CFD code, if that is what you are referring to, can
>>>         be run in parallel.
>>>
>>>
>>>     Then why is it a problem that "in a parallel case, this call
>>>     obviously gets called once per process"?
>>>
>>>        Matt
>>>
>>>         On 17.10.2017 at 11:11, Matthew Knepley wrote:
>>>>         On Tue, Oct 17, 2017 at 4:21 AM, Michael Werner
>>>>         <michael.werner at dlr.de> wrote:
>>>>
>>>>             That's something I'm still struggling with. In the
>>>>             serial case, I can simply extract the values from the
>>>>             original grid, and since the ordering of the Jacobian
>>>>             is the same, there is no problem. In the parallel case
>>>>             this is still a more or less open question. That's why
>>>>             I thought about reordering the Jacobian. As long as the
>>>>             position of the individual IDs is the same for both, I
>>>>             don't have to care about their absolute position.
>>>>
>>>>             I also wanted to thank you for your previous answer; it
>>>>             seems that the application ordering might be what I'm
>>>>             looking for. However, in the meantime I stumbled upon
>>>>             another problem that I have to solve first. My new
>>>>             problem is that I call the external code within the
>>>>             shell matrix's multiply routine. But in a parallel case,
>>>>             this routine obviously gets called once per process. So
>>>>             right now I'm trying to circumvent this, so it might
>>>>             take a while before I'm able to come back to the
>>>>             original problem...
>>>>
>>>>
>>>>         I am not understanding. Is your original code parallel?
>>>>
>>>>           Thanks,
>>>>
>>>>              Matt
>>>>
>>>>             Kind regards,
>>>>             Michael
>>>>
>>>>             On 16.10.2017 at 17:25, Praveen C wrote:
>>>>>             I am interested in learning more about how this works.
>>>>>             How are the vectors created if the IDs are not
>>>>>             contiguous in a partition?
>>>>>
>>>>>             Thanks
>>>>>             praveen
>>>>>
>>>>>             On Mon, Oct 16, 2017 at 2:02 PM, Stefano Zampini
>>>>>             <stefano.zampini at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>                 2017-10-16 10:26 GMT+03:00 Michael Werner
>>>>>                 <michael.werner at dlr.de>:
>>>>>
>>>>>                     Hello,
>>>>>
>>>>>                     I'm having trouble parallelizing a
>>>>>                     matrix-free code with PETSc. In this code, I
>>>>>                     use an external CFD code to provide the
>>>>>                     matrix-vector product for an iterative solver
>>>>>                     in PETSc. To increase the convergence rate, I'm
>>>>>                     using an explicitly stored Jacobian matrix to
>>>>>                     precondition the solver. This works fine for
>>>>>                     serial runs. However, when I try to use
>>>>>                     multiple processes, I face the problem that
>>>>>                     PETSc decomposes the preconditioner matrix,
>>>>>                     and probably also the shell matrix, in a
>>>>>                     different way than the external CFD code
>>>>>                     decomposes the grid.
>>>>>
>>>>>                     The Jacobian matrix is built in such a way that
>>>>>                     its rows and columns correspond to the global
>>>>>                     IDs of the individual points in my CFD mesh.
>>>>>
>>>>>                     The CFD code decomposes the domain based on
>>>>>                     the proximity of points to each other, so that
>>>>>                     the resulting subgrids are contiguous in space.
>>>>>                     However, since it's an unstructured grid, those
>>>>>                     subgrids are not necessarily made up of points
>>>>>                     with consecutive global IDs. This is a problem,
>>>>>                     since PETSc seems to partition the matrix into
>>>>>                     contiguous slices of rows.
>>>>>
>>>>>                     I'm not sure what the best approach to this
>>>>>                     problem might be. Is it possible to tell
>>>>>                     PETSc exactly which rows/columns it should
>>>>>                     assign to the individual processes?
>>>>>
>>>>>
>>>>>                 If you are explicitly setting the values in your
>>>>>                 Jacobians via MatSetValues(), you can create an
>>>>>                 ISLocalToGlobalMapping
>>>>>
>>>>>                 http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/IS/ISLocalToGlobalMappingCreate.html
>>>>>
>>>>>                 that maps the numbering you use for the Jacobians
>>>>>                 to its counterpart in the CFD ordering, then call
>>>>>                 MatSetLocalToGlobalMapping(), and then use
>>>>>                 MatSetValuesLocal() with the same arguments you are
>>>>>                 passing to MatSetValues() now.
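>>>>>
>>>>>                 For example (just a sketch; nloc and cfd_global_ids
>>>>>                 stand for whatever local size and global numbering
>>>>>                 your code uses, and A is your Jacobian):
>>>>>
>>>>>                     ISLocalToGlobalMapping map;
>>>>>
>>>>>                     ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD, 1, nloc,
>>>>>                                                  cfd_global_ids,
>>>>>                                                  PETSC_COPY_VALUES, &map);
>>>>>                     MatSetLocalToGlobalMapping(A, map, map);
>>>>>                     /* same index arguments as your current
>>>>>                        MatSetValues() calls, now interpreted
>>>>>                        through the mapping */
>>>>>                     MatSetValuesLocal(A, 1, &row, ncols, cols, vals,
>>>>>                                       INSERT_VALUES);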
>>>>>
>>>>>                 Otherwise, you can play with the application
>>>>>                 ordering
>>>>>                 http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/AO/index.html
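>>>>>
>>>>>                 For instance (again only a sketch; app_ids holds
>>>>>                 your code's global IDs for the local points, and
>>>>>                 indices is an array of such IDs to translate):
>>>>>
>>>>>                     AO ao;
>>>>>
>>>>>                     AOCreateBasic(PETSC_COMM_WORLD, nloc, app_ids,
>>>>>                                   NULL, &ao);
>>>>>                     /* translate the entries of indices in place from
>>>>>                        the application (CFD) numbering to PETSc's
>>>>>                        numbering */
>>>>>                     AOApplicationToPetsc(ao, nloc, indices);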
>>>>>
>>>>>
>>>>>
>>>>>                 -- 
>>>>>                 Stefano
>>>>>
>>>>>
>>>>
>>>
>>
>>     -- 
>>
>>     ____________________________________________________
>>
>>     Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
>>     Institut für Aerodynamik und Strömungstechnik |Bunsenstr. 10 | 37073 Göttingen
>>
>>     Michael Werner
>>     Telefon 0551 709-2627 | Telefax 0551 709-2811 | Michael.Werner at dlr.de
>>     DLR.de
>>
>> -- 
>> What most experimenters take for granted before they begin their 
>> experiments is infinitely more interesting than any results to which 
>> their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/