[petsc-dev] new work on to start on threading and PETSc

Wed May 21 10:08:36 CDT 2014

Thanks for the suggestions.  I have been reading over the PETSc threadcomm code and trying to understand what it is doing, and I have a few questions.

First, do you have any suggestions for a debugger or profiler I could use to step through the parallel PETSc code and see what is happening?  I used TotalView at a previous job, but I haven't found a good replacement for it since going back to school last fall.

Regarding the pattern you suggested implementing, can you clarify how that would work in PETSc?  In particular, is the idea to allow the user to create threads, use those threads within PETSc, then return the threads to the user for future use?  Or is it something different?  

I think I understand in general how to create and use a thread pool with pthreads (although I haven't worked with pthreads much), but I am unsure how to create a thread pool with OpenMP and then have those threads stay active once the omp parallel region has completed.  When I have used OpenMP in the past, I generally placed the parallel pragma around the largest chunk of code I could, for example having each thread work on separate, independent iterations of a large, time-consuming loop.  Alternatively, I created the threads before an iterative loop and had each thread work on a different part of the arrays in each iteration, synchronizing at the end of each iteration so that all threads were at the same place.  I guess I am trying to figure out how the PETSc code I need to develop compares to the OpenMP code I have developed in the past.
Thanks.

Paul

________________________________________
From: Jed Brown [jed at jedbrown.org]
Sent: Monday, May 19, 2014 4:01 PM
To: Barry Smith; petsc-dev; Eller, Paul R
Subject: Re: [petsc-dev] new work on to start on threading and PETSc

Barry Smith <bsmith at mcs.anl.gov> writes:
>   Jed,
>
>    Paul started today. I talked about the big picture and pointed him to the thread comm code.

Okay, excellent.

>    What should he start with? I seem to recall you wanting to move the
>    thread initialization into PetscInitialize()? Would that be a good
>    task to get his feet wet with changing code?

I want to be able to create and relinquish threads rather than insist on
having our own thread pool.  I think it's worth reading about the design
here, as a related execution model that we would like to be able to
support.

  https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/380

Before tinkering with PetscInitialize (a hairy beast at times), we
should try to support patterns like this:

#pragma omp parallel
  {
    int trank = omp_get_thread_num();
    if (trank) {
      PetscThreadPoolJoin(pool); // thread becomes member of the pool
    } else {
      PetscThreadPoolWait(pool, omp_get_num_threads()); // make sure all threads have joined
      PetscThreadCommRunKernel(...);
      PetscThreadPoolReturn(pool); // threads waiting in PetscThreadPoolJoin return
    }
  }

The key point is that even though many PETSc objects may linger, the
user has possession of all threads after they return from joining the
pool.  Clearly this could use pthreads or another model.
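
As a rough illustration of the join/wait/return idea with pthreads, here is a
minimal sketch.  Everything named Demo* (and SumKernel, UserThread) is
hypothetical and made up for this example; none of it is existing PETSc API,
and the single mutex plus condition variable is just one assumed way to do
the handoff.  The pool is single-use to keep the sketch short.

```c
#include <pthread.h>

#define DEMO_MAX_THREADS 16

typedef void (*DemoKernel)(int rank, void *ctx);

typedef struct {                /* hypothetical pool, not a PETSc type */
  pthread_mutex_t lock;
  pthread_cond_t  cond;
  int             njoined;    /* threads currently lent to the pool */
  int             npending;   /* lent threads still running the current kernel */
  int             generation; /* bumped once per kernel launch */
  int             done;       /* set by DemoPoolReturn */
  DemoKernel      kernel;
  void           *ctx;
} DemoPool;

static void DemoPoolInit(DemoPool *p)
{
  pthread_mutex_init(&p->lock, NULL);
  pthread_cond_init(&p->cond, NULL);
  p->njoined = p->npending = p->generation = p->done = 0;
  p->kernel = NULL;
  p->ctx = NULL;
}

/* A user-owned thread lends itself to the pool; returns after DemoPoolReturn. */
static void DemoPoolJoin(DemoPool *p)
{
  pthread_mutex_lock(&p->lock);
  int rank = ++p->njoined;          /* lent threads get ranks 1..n; leader is 0 */
  int seen = p->generation;
  pthread_cond_broadcast(&p->cond); /* wake a leader blocked in DemoPoolWait */
  for (;;) {
    while (!p->done && p->generation == seen) pthread_cond_wait(&p->cond, &p->lock);
    if (p->generation == seen) break; /* no new work, so done must be set */
    seen = p->generation;
    DemoKernel k = p->kernel;
    void *ctx = p->ctx;
    pthread_mutex_unlock(&p->lock);
    k(rank, ctx);                     /* run the kernel outside the lock */
    pthread_mutex_lock(&p->lock);
    if (--p->npending == 0) pthread_cond_broadcast(&p->cond);
  }
  pthread_mutex_unlock(&p->lock);
}

/* Leader blocks until nthreads-1 other threads have joined. */
static void DemoPoolWait(DemoPool *p, int nthreads)
{
  pthread_mutex_lock(&p->lock);
  while (p->njoined < nthreads - 1) pthread_cond_wait(&p->cond, &p->lock);
  pthread_mutex_unlock(&p->lock);
}

/* Leader publishes a kernel, runs its own share as rank 0, waits for the rest. */
static void DemoPoolRunKernel(DemoPool *p, DemoKernel k, void *ctx)
{
  pthread_mutex_lock(&p->lock);
  p->kernel = k;
  p->ctx = ctx;
  p->npending = p->njoined;
  p->generation++;
  pthread_cond_broadcast(&p->cond);
  pthread_mutex_unlock(&p->lock);
  k(0, ctx);
  pthread_mutex_lock(&p->lock);
  while (p->npending > 0) pthread_cond_wait(&p->cond, &p->lock);
  pthread_mutex_unlock(&p->lock);
}

/* Leader releases every thread blocked in DemoPoolJoin back to its owner. */
static void DemoPoolReturn(DemoPool *p)
{
  pthread_mutex_lock(&p->lock);
  p->done = 1;
  pthread_cond_broadcast(&p->cond);
  pthread_mutex_unlock(&p->lock);
}

typedef struct { int nthreads; long n; long partial[DEMO_MAX_THREADS]; } SumCtx;

/* Example kernel: each rank sums a strided share of 0..n-1. */
static void SumKernel(int rank, void *ctx)
{
  SumCtx *s = (SumCtx *)ctx;
  long acc = 0;
  for (long i = rank; i < s->n; i += s->nthreads) acc += i;
  s->partial[rank] = acc;
}

static void *UserThread(void *arg) { DemoPoolJoin((DemoPool *)arg); return NULL; }

/* The "user" creates the threads, lends them to the pool, and gets them back. */
long DemoRun(int nthreads, long n)
{
  DemoPool  pool;
  pthread_t tid[DEMO_MAX_THREADS];
  SumCtx    s = {nthreads, n, {0}};
  long      total = 0;
  if (nthreads < 1 || nthreads > DEMO_MAX_THREADS) return -1;
  DemoPoolInit(&pool);
  for (int t = 1; t < nthreads; t++) pthread_create(&tid[t], NULL, UserThread, &pool);
  DemoPoolWait(&pool, nthreads);
  DemoPoolRunKernel(&pool, SumKernel, &s);
  DemoPoolReturn(&pool);
  for (int t = 1; t < nthreads; t++) pthread_join(tid[t], NULL); /* threads are the user's again */
  for (int t = 0; t < nthreads; t++) total += s.partial[t];
  pthread_cond_destroy(&pool.cond);
  pthread_mutex_destroy(&pool.lock);
  return total;
}
```

Calling DemoRun(4, 1000) from a main() sums 0..999 across four threads.  In
the OpenMP version above, the nonzero-rank threads of the parallel region
would play the role of UserThread and thread 0 the role of the leader, so no
thread needs to outlive the parallel region.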

I'm not sure whether a thread pool should be distinct from a thread
comm.  The tradeoff is fewer objects versus multi-purpose objects with
more conditionals.