[petsc-dev] Parallel calculation on GPU

Wed Aug 20 05:41:38 CDT 2014

On 08/20/14 12:11, Karl Rupp wrote:
> Hi Pierre,
>
>> I have a cluster with nodes of 2 sockets of 4 cores+1 GPU.
>>
>> Is there a way to run a calculation with 4*N MPI tasks where
>> my matrix is first built outside PETSc, then to solve the
>> linear system using PETSc Mat, Vec, KSP on only N MPI
>> tasks to address the N GPUs efficiently?
>
> as far as I can tell, this should be possible with a suitable 
> subcommunicator. The tricky piece, however, is to select the right MPI 
> ranks for this. Note that you generally have no guarantee on how the 
> MPI ranks are distributed across the nodes, so be prepared for 
> something fairly specific to your MPI installation.
Yes, I am ready to face this point too.
>
>
>> I am playing with the communicators without success, but I
>> am surely confusing things...
>
> To keep matters simple, try to get this scenario working with a purely 
> CPU-based solve. Once this works, the switch to GPUs should be just a 
> matter of passing the right flags. Have a look at PetscInitialize() here:
> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html 
>
> which mentions that you need to create the subcommunicator of 
> MPI_COMM_WORLD first.
>
I also started with a purely CPU-based solve, just as a test, but
without success. When I read this:

"If you wish PETSc code to run ONLY on a subcommunicator of 
MPI_COMM_WORLD, create that communicator first and assign it to
PETSC_COMM_WORLD BEFORE calling PetscInitialize().

Thus if you are running a four process job and two processes will run
PETSc and have PetscInitialize() and PetscFinalize() and two process
will not, then do this. If ALL processes in the job are using
PetscInitialize() and PetscFinalize() then you don't need to do this,
even if different subcommunicators of the job are doing different
things with PETSc."

I think I am not in this special scenario: since my matrix is initially
partitioned across 4 processes, I need to call PetscInitialize() on each
of the 4 processes in order to build the PETSc matrix with MatSetValues.
My goal is then to solve the linear system on only 2 processes... So
will building a sub-communicator really do the trick? Or am I missing
something?
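
To make the rank-selection step concrete, here is a minimal sketch in C.
It assumes that consecutive node-local ranks share a GPU (4 ranks per GPU
in the setup above), which depends on how the MPI installation places
ranks. The names is_solver_rank, owning_solver_rank, make_solver_comm and
the USE_MPI macro are hypothetical, not part of PETSc or MPI. The MPI part
relies on MPI_Comm_split_type with MPI_COMM_TYPE_SHARED (MPI-3) and is
only compiled when USE_MPI is defined.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if this node-local rank should join the
 * solver subcommunicator, 0 otherwise. Assumption: every group of
 * ranks_per_gpu consecutive node-local ranks shares one GPU, and the
 * first rank of each group drives it. */
int is_solver_rank(int local_rank, int ranks_per_gpu)
{
    assert(local_rank >= 0 && ranks_per_gpu > 0);
    return local_rank % ranks_per_gpu == 0;
}

/* Hypothetical helper: node-local rank of the solver rank that drives
 * the GPU shared by local_rank, i.e. where its matrix rows would go. */
int owning_solver_rank(int local_rank, int ranks_per_gpu)
{
    assert(local_rank >= 0 && ranks_per_gpu > 0);
    return local_rank - local_rank % ranks_per_gpu;
}

#ifdef USE_MPI /* build with an MPI-3 compiler wrapper and -DUSE_MPI */
#include <mpi.h>

/* Builds a subcommunicator of MPI_COMM_WORLD holding one rank per GPU.
 * Ranks that are not selected receive MPI_COMM_NULL. */
MPI_Comm make_solver_comm(int ranks_per_gpu)
{
    MPI_Comm node_comm, solver_comm;
    int world_rank, local_rank, color;

    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Group the ranks that share a node, then read the node-local rank. */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    MPI_Comm_rank(node_comm, &local_rank);
    MPI_Comm_free(&node_comm);

    /* Selected ranks share color 0; the others opt out of the split. */
    color = is_solver_rank(local_rank, ranks_per_gpu) ? 0 : MPI_UNDEFINED;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &solver_comm);
    return solver_comm;
}
#endif
```

With 8 ranks per node and ranks_per_gpu = 4, node-local ranks 0 and 4
would end up in the solver communicator, and ranks 1 to 3 and 5 to 7
would hand their rows to ranks 0 and 4 respectively. Whether such a
communicator can simply be passed to MatCreate() and KSPCreate() in
place of PETSC_COMM_WORLD is the open question here.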

Thanks Karli for your answer,

Pierre
> Best regards,
> Karli
>

-- 
*Trio_U support team*
Marthe ROUX (01 69 08 00 02) Saclay
Pierre LEDAC (04 38 78 91 49) Grenoble