[petsc-dev] Understanding Some Parallel Results with PETSc

Dave Nystrom dnystrom1 at comcast.net
Fri Feb 24 20:22:27 CST 2012


That is what I did for the GPU case, where I was running with 2 MPI processes
per node since I had 2 GPUs per node.  If that is not really the right way to
be doing this, I'll be happy to learn and try a better way.  Note that for the
CPU case, the runPetscProb_2 script uses a different binding.

This is recent work that was done in a hurry while trying to get ready for the
new machine coming in, so I need to go back and revisit the details to make
sure I understand the binding I am actually getting.
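
For reference, here is a minimal sketch of the kind of per-rank wrapper I have
in mind for the GPU case, assuming 2 ranks per node and that GPU i is attached
to NUMA node i.  That mapping is an assumption, not something I have verified
yet; it needs to be checked against numactl --hardware and the PCI bus IDs
reported by nvidia-smi, since, as you point out, NUMA node numbering need not
line up with MPI local ranks.  The script and variable names below are
placeholders, not the actual contents of runPetscProb_1:

  #!/bin/bash
  # gpu_wrapper.sh (hypothetical): bind each OpenMPI local rank to one GPU
  # and to the NUMA node that GPU is assumed to be attached to.
  rank=$OMPI_COMM_WORLD_LOCAL_RANK

  case $rank in
    0) gpu=0; node=0 ;;   # assumed: GPU 0 hangs off NUMA node 0
    1) gpu=1; node=1 ;;   # assumed: GPU 1 hangs off NUMA node 1
    *) echo "unexpected local rank $rank" >&2; exit 1 ;;
  esac

  # Restrict this rank to its GPU and to the matching NUMA node's cores and
  # memory, then run whatever command line was passed in.
  export CUDA_VISIBLE_DEVICES=$gpu
  exec numactl --cpunodebind=$node --membind=$node "$@"

It would be invoked as something like

  mpirun -npernode 2 ./gpu_wrapper.sh ex2 $KSP_ARGS

and the affinities actually in effect double-checked while the job is running,
either with the numa-maps script or with something as simple as

  for pid in $(pgrep ex2); do grep Cpus_allowed_list /proc/$pid/status; done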

Jed Brown writes:
 > Hmm, I'll look more carefully, but I think your binding is incorrect.
 > 
 > numactl -l --cpunodebind=$OMPI_COMM_WORLD_LOCAL_RANK ex2 $KSP_ARGS
 > 
 > NUMA node numbering is different from MPI ranks.
 > 
 > On Fri, Feb 24, 2012 at 12:11, Nystrom, William D <wdn at lanl.gov> wrote:
 > 
 > >  Hi Jed,
 > >
 > > Attached is a gzipped tarball of the stuff I used to run these two test
 > > problems with numactl.  Actually, I hacked them a bit this morning
 > > because I was running them in our test framework for doing acceptance
 > > testing of new systems, but the scripts in the tarball should give you
 > > all the info you need.  There is a top-level script called runPetsc that
 > > just invokes mpirun from openmpi and calls the wrapper scripts for using
 > > numactl.  You could actually dispense with the top-level script and just
 > > invoke the mpirun commands yourself; I include it as an easy way to
 > > document what I did.  The runPetscProb_1 script runs petsc on the gpus,
 > > using numactl to control the affinities of the gpus to the cpu numa
 > > nodes.  The runPetscProb_2 script runs petsc on the cpus, also using
 > > numactl.  Note that both of those wrapper scripts use openmpi variables.
 > > I'm not sure how one would do the same thing with another flavor of mpi,
 > > but I imagine it is possible.  Also, I'm not sure if there are more
 > > elegant ways to run with numactl than the wrapper script approach.
 > > Perhaps there are, but this is what we have been doing.
 > >
 > > I've also included a Perl script called numa-maps that is useful for
 > > actually checking the affinities that you get while running, in order to
 > > make sure that numactl is doing what you think it is doing.  I'm not sure
 > > where this script comes from.  I find it on some systems and not on
 > > others.
 > >
 > > I've also included logs with the output of cpuinfo, nvidia-smi, and
 > > uname -a to answer any questions you had about the system I was running
 > > on.
 > >
 > > Finally, I've included runPetscProb_1.log and runPetscProb_2.log, which
 > > contain the log_summary output for my latest runs on the gpu and cpu,
 > > respectively.  Using numactl reduced the runtime for the gpu case as
 > > well, but not as much as for the cpu case.  So the final result was that
 > > running the same problem using all of the gpu resources on a node was
 > > about 2.5x faster than using all of the cpu resources on the same number
 > > of nodes.
 > >
 > > Let me know if you need any more info.  I'm planning to use this stuff
 > > to help test a new gpu cluster that we have just started acceptance
 > > testing on.  It has the same basic hardware as the testbed cluster used
 > > for these results but has 308 nodes.  That should be interesting and
 > > fun.
 > >
 > >
 > > Thanks,
 > >
 > > Dave
 > >
 > >  --
 > > Dave Nystrom
 > > LANL HPC-5
 > > Phone: 505-667-7913
 > > Email: wdn at lanl.gov
 > > Smail: Mail Stop B272
 > >        Group HPC-5
 > >        Los Alamos National Laboratory
 > >        Los Alamos, NM 87545
 > >
 > >   ------------------------------
 > > From: petsc-dev-bounces at mcs.anl.gov [petsc-dev-bounces at mcs.anl.gov]
 > >   on behalf of Jed Brown [jedbrown at mcs.anl.gov]
 > > Sent: Thursday, February 23, 2012 10:43 PM
 > >
 > > To: For users of the development version of PETSc
 > > Cc: Dave Nystrom
 > >
 > > Subject: Re: [petsc-dev] Understanding Some Parallel Results with PETSc
 > >
 > >   On Thu, Feb 23, 2012 at 23:41, Dave Nystrom <dnystrom1 at comcast.net> wrote:
 > >
 > >> I could also send you my mpi/numactl command lines for gpu and cpu when
 > >> I am back in the office.
 > >>
 > >
 > > Yes, please.
 > >


