[petsc-dev] OpenMP/Vec

Shri abhyshr at mcs.anl.gov
Wed Mar 7 13:22:02 CST 2012




> >
> > Rather than adding an nthreads slot to vectors, I put it in the
> > generic
> > PetscObject struct -- we're threading matrices too.
> >
> >
> > We are going with PetscLayout for distribution information with the
> > pthreads implementations.
> 
> Ah, that's a better approach. So probably the recently created
> seqpthread/tmap.c wants moving.

I've recently started adding the PetscThreadsLayout structure to manage the vector/matrix distribution
information for threads. See VecCreate_SeqPThread_Private() in vec/impls/seq/seqpthread/vecpthread.c
for how to set the PetscThreadsLayout information in PETSc Vecs. The PetscThreadsLayout structure might
still change, or we might decide to get rid of it and put all of its information in PetscLayout directly,
so please don't use it until it stabilizes.
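
To illustrate the kind of per-thread ownership information such a layout carries, here is a self-contained sketch; the struct, field, and function names are made up for illustration and are not the actual PetscThreadsLayout definition:

#include <petscsys.h>

/* Hypothetical thread layout: the thread count plus the contiguous range
   of local entries owned by each thread, trstarts[t] .. trstarts[t+1]-1. */
typedef struct {
  PetscInt  nthreads;
  PetscInt *trstarts;
} ThreadLayoutSketch;

/* Split n local entries into nearly equal contiguous chunks, one per thread. */
static PetscErrorCode ThreadLayoutSketchSetUp(ThreadLayoutSketch *tmap,PetscInt n)
{
  PetscErrorCode ierr;
  PetscInt       t,q = n/tmap->nthreads,r = n%tmap->nthreads;

  PetscFunctionBegin;
  ierr = PetscMalloc((tmap->nthreads+1)*sizeof(PetscInt),&tmap->trstarts);CHKERRQ(ierr);
  tmap->trstarts[0] = 0;
  for (t=0; t<tmap->nthreads; t++) tmap->trstarts[t+1] = tmap->trstarts[t] + q + ((t < r) ? 1 : 0);
  PetscFunctionReturn(0);
}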

> 
> > We have also discussed making "thread communicator" information (how
> > many
> > threads and their affinities) an attribute of an MPI_Comm. I'd be
> > interested to hear your opinions on that.
> 
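
For what it's worth, the standard MPI attribute-caching interface would be enough to hang that information off a communicator. A rough sketch, with a made-up struct and key name (this is not an existing PETSc interface):

#include <mpi.h>
#include <stddef.h>

/* Hypothetical per-communicator thread information. */
typedef struct {
  int  nthreads;
  int *affinities;   /* core id for each thread */
} ThreadCommInfo;

static int ThreadCommKeyval = MPI_KEYVAL_INVALID;

/* Cache a pointer to the thread information on the communicator. */
static int ThreadCommAttach(MPI_Comm comm,ThreadCommInfo *info)
{
  if (ThreadCommKeyval == MPI_KEYVAL_INVALID) {
    int ierr = MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN,MPI_COMM_NULL_DELETE_FN,&ThreadCommKeyval,NULL);
    if (ierr) return ierr;
  }
  return MPI_Comm_set_attr(comm,ThreadCommKeyval,info);
}

/* Look it up later; *info is NULL if nothing was attached. */
static int ThreadCommGet(MPI_Comm comm,ThreadCommInfo **info)
{
  int flag = 0,ierr = MPI_SUCCESS;
  *info = NULL;
  if (ThreadCommKeyval != MPI_KEYVAL_INVALID) {
    ierr = MPI_Comm_get_attr(comm,ThreadCommKeyval,info,&flag);
    if (!flag) *info = NULL;
  }
  return ierr;
}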

> > All, should we consider moving all the "kernels" into their own
> > functions and just doing parallel region dispatch differently for
> > OpenMP
> > and pthreads? I think that is the only reasonable way to have just
> > one
> > version of the code. If we committed to always using separate
> > side-effect-free kernels, we might be able to reuse many of them
> > with TBB
> > and OpenCL as well.

I like the idea of having a single parallel-region dispatch that calls the different kernel implementations.


> 
> It would probably be a good idea, even if only to cut down on
> typo-bugs. I imagine that, taking VecMax_Seq as an example, it would
> look something like this:
> 
> void VecMax_SeqKernel(const PetscScalar *xx, PetscInt start, PetscInt end,
>                       PetscInt *idx, PetscReal *z)
> {
>   PetscInt  i, j;
>   PetscReal max;
>
>   max = PetscRealPart(xx[start]);
>   j   = start;
>   for (i = start+1; i < end; i++) {
>     if (PetscRealPart(xx[i]) > max) {
>       max = PetscRealPart(xx[i]); j = i;
>     }
>   }
>   if (idx) *idx = j;
>   *z = max;
> }
> 
> And then you have three (?) wrapper implementations:
> 
> PetscErrorCode VecMax_Seq(Vec xin, PetscInt *idx, PetscReal *z)
> {
>   const PetscScalar *xx;
>
>   PetscFunctionBegin;
>   VecGetArrayRead(xin, &xx);
>   VecMax_SeqKernel(xx, 0, xin->map->n, idx, z);
>   VecRestoreArrayRead(xin, &xx);
>   PetscFunctionReturn(0);
> }
> 
> 
> PetscErrorCode VecMax_SeqOMP(Vec xin, PetscInt *idx, PetscReal *z)
> {
>   const PetscScalar *xx;
>   PetscInt          *jt;
>   PetscReal         *zt;
>
>   PetscFunctionBegin;
>   /* you could just call VecMax_Seq here if nthread == 1 */
>   VecGetArrayRead(xin, &xx);
>   PetscMalloc(xin->map->tmap->nthread * sizeof(PetscReal), &zt);
>   PetscMalloc(xin->map->tmap->nthread * sizeof(PetscInt), &jt);
>   VecOMPParallelBegin(xin, shared(xx, jt, zt));
>   PetscInt thread = PetscGetThreadNum();
>   zt[thread] = PETSC_MIN_REAL;
>   jt[thread] = -1;
>   VecMax_SeqKernel(xx, __start, __end, jt + thread, zt + thread);
>   VecOMPParallelEnd();
>   /* do reduction over the per-thread results to find the global max value/index */
>   ...;
>   VecRestoreArrayRead(xin, &xx);
>   PetscFree(zt);
>   PetscFree(jt);
>   PetscFunctionReturn(0);
> }
> 
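For completeness, the reduction elided above could just be a serial pass over the per-thread partial results. A sketch, assuming the zt/jt temporaries and the nthread field from the code above:

  /* serial reduction over the nthread partial results */
  {
    PetscInt  t;
    PetscReal maxv = zt[0];
    PetscInt  maxi = jt[0];
    for (t=1; t<xin->map->tmap->nthread; t++) {
      if (zt[t] > maxv) {maxv = zt[t]; maxi = jt[t];}
    }
    if (idx) *idx = maxi;
    *z = maxv;
  }
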
> And the Pthread version looks similar but has to wrap the kernel
> invocation
> in another level as is currently done, I guess.
> 
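To make that extra level of wrapping concrete: a pthread back end typically packs the kernel arguments into a struct and hands a trampoline with the signature pthreads expects to the thread pool. A sketch, assuming the VecMax_SeqKernel signature from the code above (the bundle and trampoline names are made up, not the actual seqpthread code):

#include <petscsys.h>
#include <pthread.h>

/* kernel from the sketch above */
void VecMax_SeqKernel(const PetscScalar*,PetscInt,PetscInt,PetscInt*,PetscReal*);

/* Argument bundle for one thread's share of the work. */
typedef struct {
  const PetscScalar *xx;
  PetscInt           start,end;
  PetscInt          *idx;
  PetscReal         *z;
} VecMax_KernelData;

/* Trampoline with the void *(*)(void *) signature that pthreads requires;
   it unpacks the bundle and calls the shared, side-effect-free kernel. */
static void *VecMax_KernelTrampoline(void *arg)
{
  VecMax_KernelData *d = (VecMax_KernelData*)arg;
  VecMax_SeqKernel(d->xx,d->start,d->end,d->idx,d->z);
  return NULL;
}

A pthread wrapper would then fill one VecMax_KernelData per thread, hand VecMax_KernelTrampoline to the thread pool (or to pthread_create/pthread_join directly), and finish with the same serial reduction as the OpenMP version.
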
> > Is there ever a case where we want to have _both_ OpenMP and
> > Pthreads
> > objects in the same application? Maybe, e.g. when servicing two
> > multiphysics applications, one of which chooses OpenMP and one of
> > which
> > chooses Pthreads (perhaps on different communicators so we don't
> > have to
> > fight with over-subscription).
> 
> See above. I think that currently the OMP and pthread implementations
> would
> play fine with one another, modulo over-subscription of threads. I
> think
> addressing the issue should probably be motivated by a concrete use
> case.
> 
> >
> >
> > I don't know how to write the correct tests for configure to probe
> > for
> > the existence of _Pragma, so that's still missing.
> >
> >
> > I'll do it.
> 
> Thanks.
> 
> Lawrence
> 
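
Regarding the configure probe for _Pragma: since _Pragma is a C99 operator, the check can be a plain compile test. A minimal sketch of such a test program (not the actual BuildSystem test); it compiles only if the compiler accepts the operator we need to emit "#pragma omp ..." lines from macros like VecOMPParallelBegin():

/* stringize the argument and hand it to the C99 _Pragma operator */
#define PETSC_PRAGMA_TEST(x) _Pragma(#x)

int main(void)
{
  PETSC_PRAGMA_TEST(omp parallel)
  {
  }
  return 0;
}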


