[petsc-dev] new work on to start on threading and PETSc

Jed Brown jed at jedbrown.org
Sun May 25 20:23:32 CDT 2014


Jed Brown <jed at jedbrown.org> writes:
> I can see OpenMP being handled in two different ways.  The OpenMP
> runtime usually manages its own thread pool.  When an "omp parallel"
> block is entered, the threads in the thread pool are assigned to work on
> the block.  In the existing threadcomm implementation, each kernel
> launch has its own omp parallel region.  An alternative implementation
> would be for the user to create a parallel region (at coarser
> granularity) and hand off (some or all of) the threads to PETSc.

This may need clarification.  In the existing implementation, we have:

  PetscErrorCode PetscThreadCommRunKernel_OpenMP(PetscThreadComm tcomm,PetscThreadCommJobCtx job)
  {
    PetscInt        trank=0;
  
    PetscFunctionBegin;
  #pragma omp parallel num_threads(tcomm->nworkThreads) shared(job) private(trank)
    {
      trank = omp_get_thread_num();
      PetscRunKernel(trank,job->nargs,job);
      job->job_status[trank] = THREAD_JOB_COMPLETED;
    }
    PetscFunctionReturn(0);
  }


so PETSc uses the OpenMP runtime in a very conventional way.  In the
alternative implementation, which I'll call the "pthread model", the
user creates the threads up-front and then reuses them for many
operations.  You can use the "pthread model" with threads obtained from
OpenMP.  It would look something like this:

   #pragma omp parallel
     {
       int trank = omp_get_thread_num();
       if (trank) {
         // This thread is joining the pool as a worker.
         // This function will not return until the thread is released from the pool.
         PetscThreadPoolJoin(pool);
       } else {
         // Make sure all worker threads have joined.
         PetscThreadPoolWait(pool, omp_get_num_threads());

         // Use the workers one or more times
         PetscThreadCommRunKernel(...);
         KSPSolve(...);             // contains many calls to PetscThreadCommRunKernel()

         PetscThreadPoolReturn(pool); // release the worker threads waiting in PetscThreadPoolJoin
       }
     }


Of course the worker threads could have been created in many other ways:
PETSc could create them itself and keep them alive until PetscFinalize,
or the user could create them using pthread_create, C11 thrd_create, etc.
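
For concreteness, here is a rough sketch of the same hand-off pattern
with workers created by pthread_create.  PetscThreadPoolJoin/Wait/Return
are the same hypothetical pool interface as in the example above, and
NWORKERS, worker_main, and RunWithPthreadWorkers are names made up for
illustration:

   #include <pthread.h>

   #define NWORKERS 3

   static void *worker_main(void *arg)
   {
     PetscThreadPool pool = (PetscThreadPool)arg;
     PetscThreadPoolJoin(pool);   /* blocks until PetscThreadPoolReturn() */
     return NULL;
   }

   static PetscErrorCode RunWithPthreadWorkers(PetscThreadPool pool)
   {
     pthread_t workers[NWORKERS];
     int       i;

     PetscFunctionBegin;
     for (i=0; i<NWORKERS; i++) pthread_create(&workers[i],NULL,worker_main,pool);

     PetscThreadPoolWait(pool,NWORKERS+1);  /* workers plus this (main) thread */
     KSPSolve(...);                         /* many PetscThreadCommRunKernel() calls */

     PetscThreadPoolReturn(pool);           /* release the workers from PetscThreadPoolJoin */
     for (i=0; i<NWORKERS; i++) pthread_join(workers[i],NULL);
     PetscFunctionReturn(0);
   }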


Note that error handling with OpenMP is messy because you can't
early-return from within an omp parallel block (and because error
conditions with threads are fundamentally hard).
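
One common workaround (not what threadcomm does today) is to have each
thread record its error code inside the parallel region and raise the
error only after the region has ended.  A minimal sketch, assuming an
OpenMP 3.1 max reduction and a hypothetical kernel SomeKernelThatCanFail
that returns a PetscErrorCode:

  static PetscErrorCode RunKernelCollectErrors(PetscThreadComm tcomm,PetscThreadCommJobCtx job)
  {
    PetscErrorCode ierr=0;

    PetscFunctionBegin;
    /* No return/CHKERRQ is allowed inside the parallel region, so each
       thread stores its error code and the largest one is kept. */
  #pragma omp parallel num_threads(tcomm->nworkThreads) shared(job) reduction(max:ierr)
    {
      int            trank = omp_get_thread_num();
      PetscErrorCode terr  = SomeKernelThatCanFail(trank,job); /* hypothetical kernel */
      if (terr) ierr = terr;
    }
    CHKERRQ(ierr);  /* raise the error only after the region has ended */
    PetscFunctionReturn(0);
  }

A per-thread status array like job->job_status in the existing
implementation could serve the same purpose as the reduction.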