[PETSC #15639] questions regarding using petsc in OpenMP

Barry Smith petsc-maint at mcs.anl.gov
Fri Dec 22 14:21:37 CST 2006


  Jin,

    I have added support for this to the development version of PETSc
http://www-unix.mcs.anl.gov/petsc/petsc-as/developers/index.html
see the manual pages for PCOPENMP, PetscOpenMPMerge(), and PetscOpenMPSpawn().

You will have to replace MPI_Init() with PetscInitialize() and all uses
of MPI_COMM_WORLD with PETSC_COMM_WORLD, then use MatCreateSeqAIJWithArrays()
to create the PETSc matrix and KSPCreate() as usual. Then run with the
options described in the manual pages and it will just work.
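
A minimal sketch in C of the sequence just described (not from the
original message): wrap the application's existing CSR arrays in a PETSc
matrix with MatCreateSeqAIJWithArrays(), then solve with a KSP configured
from the command line. The wrapper name solve_plane() and its arguments
are placeholders, and the calls follow the PETSc API of that era (e.g.,
KSPSetOperators() still takes a MatStructure flag and the destroy
routines take the object rather than its address).

  #include "petscksp.h"

  /* Placeholder wrapper: n rows, CSR arrays ia/ja/vals, rhs in, sol out */
  PetscErrorCode solve_plane(PetscInt n, PetscInt *ia, PetscInt *ja,
                             PetscScalar *vals, PetscScalar *rhs,
                             PetscScalar *sol)
  {
    Mat            A;
    Vec            b, x;
    KSP            ksp;
    PetscScalar    *varray;
    PetscInt       i;
    PetscErrorCode ierr;

    /* Wrap the CSR arrays without copying them; PETSC_COMM_SELF because
       this is a sequential matrix owned by this process */
    ierr = MatCreateSeqAIJWithArrays(PETSC_COMM_SELF,n,n,ia,ja,vals,&A);CHKERRQ(ierr);

    /* Copy the right-hand side into a PETSc vector */
    ierr = VecCreateSeq(PETSC_COMM_SELF,n,&b);CHKERRQ(ierr);
    ierr = VecDuplicate(b,&x);CHKERRQ(ierr);
    ierr = VecGetArray(b,&varray);CHKERRQ(ierr);
    for (i=0; i<n; i++) varray[i] = rhs[i];
    ierr = VecRestoreArray(b,&varray);CHKERRQ(ierr);

    /* Create the solver; the method, preconditioner, and the options
       described in the manual pages are picked up at runtime */
    ierr = KSPCreate(PETSC_COMM_SELF,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

    /* Copy the solution back into the application's array */
    ierr = VecGetArray(x,&varray);CHKERRQ(ierr);
    for (i=0; i<n; i++) sol[i] = varray[i];
    ierr = VecRestoreArray(x,&varray);CHKERRQ(ierr);

    ierr = KSPDestroy(ksp);CHKERRQ(ierr);
    ierr = VecDestroy(b);CHKERRQ(ierr);
    ierr = VecDestroy(x);CHKERRQ(ierr);
    ierr = MatDestroy(A);CHKERRQ(ierr);
    return 0;
  }

Instead of destroying them, the Mat and KSP handles could be kept by the
application and reused at the next physics time step, which matches the
interface Jin asks for in the quoted message below.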

   Good luck,

   Barry

Access the manual pages on the website at
http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/index.html
but note that they will not be updated until early Saturday morning.

On Thu, 7 Dec 2006, Jin Chen wrote:

> The application code is written in Fortran 90.
> 
> > You mean - you are using: 8x16 = 128 procs total?
> Yes. But OpenMP only runs on the 16 processors within each node, independently across the 8 nodes.
> 
> > However you can have a single MPI_COMM_WORLD with size=128
> > - and then split it up into 8 different communicators each with
> > size=16 - as required. You then create PETSc objects on each of these
> > communicators.
> Yes, this is what I want. But the 16 processors on each node are already
> tied up as one by OpenMP. How can this be done without affecting OpenMP?
> 
> Thanks,
> 
> -Jin-
> 
> On Thu, 7 Dec 2006, Satish Balay wrote:
> 
> > On Thu, 7 Dec 2006, Jin Chen wrote:
> > 
> > > 
> > > Hi
> > > 
> > > I have a mixed MPI and OpenMP application code.
> > > In the circular direction, I have 8 planes, represented by P0 ~ P7,
> > > which communicate via MPI.
> > > 
> > >      --P0--
> > >    /        \
> > >   P1        P7
> > >  /            \
> > > P2            P6
> > >  \            /
> > >   P3        P5
> > >    \        /
> > >      --P4--
> > > 
> > > Inside each plane, I am using OpenMP on a whole computer node.
> > > 
> > > So for this particular case on seaborg, I am using 8 nodes with MPI
> > > and 16 OpenMP threads on each node.
> > 
> > You mean - you are using: 8x16 = 128 procs total?
> > 
> > > 
> > > Right now the matrix, generated on each node using OpenMP,
> > > is so large that it takes forever for superlu_mt to solve it.
> > > Therefore, I hope we can move to petsc, since petsc offers more choices.
> > > 
> > > We just want petsc for solving Ax=b inside each plane on each node,
> > > independently between nodes, and do not want to change the whole code.
> > > So an interface like this:
> > > 
> > > -getting the matrix and rhs from the application code
> > > -calling petsc to solve Ax=b
> > > -keeping the matrix and preconditioner information in a pointer for the
> > >  next physics time step
> > > -returning the solution and the matrix and preconditioner pointers to
> > >  the application code
> > > 
> > > is good enough.
> > > 
> > > But petsc uses MPI while the matrix is generated using OpenMP on a whole
> > > node, so I need your help to make it work:
> > > 
> > > -how to initialize petsc independently inside each node?
> > >  Here we want petsc to use all 16 threads as 16 processors
> > > -how to create a petsc Mat object so that the matrix is distributed
> > >  across the 16 processors?
> > 
> > PETSc is MPI based - and you can't have eight separate MPI
> > instances. However you can have a single MPI_COMM_WORLD with size=128
> > - and then split it up into 8 different communicators each with
> > size=16 - as required. You then create PETSc objects on each of these
> > communicators.
> > 
> > What role will OpenMP play in this modified code? Once the
> > mpi_comm_size is changed from 8 to 128, OpenMP gets affected. You
> > might require some workarounds - so that only one process from each
> > sub-communicator [of size=16] does the OpenMP part.
> > 
> > Satish
> > 
> 
> 
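
A minimal C sketch of the communicator split Satish describes above (not
part of the original exchange). It assumes the 128 ranks are laid out
contiguously, 16 per plane; the names plane_comm and plane_rank are
illustrative.

  #include <mpi.h>

  int main(int argc, char **argv)
  {
    int      world_rank, plane, plane_rank;
    MPI_Comm plane_comm;

    MPI_Init(&argc,&argv);          /* or PetscInitialize(), per Barry's note */
    MPI_Comm_rank(MPI_COMM_WORLD,&world_rank);

    plane = world_rank/16;          /* color: plane index 0..7 */
    MPI_Comm_split(MPI_COMM_WORLD,plane,world_rank,&plane_comm);
    MPI_Comm_rank(plane_comm,&plane_rank);

    /* The workaround Satish mentions: only one rank per plane runs the
       OpenMP assembly, so the per-node thread usage is unchanged */
    if (plane_rank == 0) {
      /* ... OpenMP matrix generation for this plane ... */
    }

    /* PETSc objects (Mat, Vec, KSP) for each plane would be created on
       plane_comm rather than on PETSC_COMM_WORLD */

    MPI_Comm_free(&plane_comm);
    MPI_Finalize();
    return 0;
  }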



