[mpich2-dev] patch for ticket #1511

Jeff Hammond jhammond at alcf.anl.gov
Thu May 3 08:35:31 CDT 2012


I submit the attached patch as a fix for ticket #1511
(https://trac.mcs.anl.gov/projects/mpich2/ticket/1511), which Dave
assigned to me a while ago at my request.

This patch allows all the RMA tests to run on an arbitrarily large
number of processes, rather than just 1, 2, or 3, as was required for
10 or so tests.  There may be other tests that need this treatment as
well, but the RMA tests were the most critical ones for BGQ
acceptance, hence that is where I started.

I have verified that the entire set of RMA tests passes with 4
processes with these changes (this was impossible before), at least
on my laptop.  Clang did not produce any compiler errors.

The solution implemented in my patch is simple: any rank beyond those
a test needs does nothing for that test.  I chose this approach
because there was no discernible value in replicating the test across
all pairs or triples of nodes, yet the bookkeeping required was
non-trivial.  It wasn't onerous either, but the benefit-to-cost ratio
was near zero.  I do not believe that the correctness of MPICH2
should depend on which rank runs the test code.

Pavan suggested that I instead parse all tests for the required
number of processes, store that in a database, and then write a
custom script that uses the database to invoke the tests on all
possible schedulers (PBS, Maui, LoadLeveler, Cobalt, etc.).  I decided
against this because I did not want to be restricted to running the
MPICH2 test suite through a particular script.  In fact, I currently
cannot figure out how to use the runtests script (a README would be
helpful), so I just invoke all the tests with "find -perm 755 -exec
mpiexec -np 4 {} \;" and find this highly satisfactory (although
execvp barfs on directories).
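
As a side note on the directory problem: a minimal sketch, assuming the
goal is only to stop find from handing directories to the launcher, is
to add "-type f" to the same invocation.  The function name
run_all_tests and the LAUNCHER variable are hypothetical names used
here for illustration and are not part of the MPICH2 test suite.

```shell
# Hypothetical helper (not part of MPICH2): launch every regular file
# with mode exactly 755 found under a directory (default: the current one).
# "-type f" skips directories, which usually have mode 755 as well and
# cannot be executed by the launcher.
# LAUNCHER is an assumed override knob; it defaults to "mpiexec -np 4".
run_all_tests() {
    dir=${1:-.}
    # LAUNCHER is left unquoted on purpose so that "mpiexec -np 4"
    # is split into separate words.
    find "$dir" -type f -perm 755 -exec ${LAUNCHER:-mpiexec -np 4} {} \;
}
```

Calling run_all_tests from the top of the test directory then behaves
like the one-liner above, minus the failed launches on directories.
Note that "-perm 755" matches only files whose mode is exactly 755, so
executables built with a different umask would be skipped.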

I believe that it is important to allow a test to be run completely
standalone on any number of processes (greater than the minimum, of
course) without knowledge of a special database.  It seems silly to
require a database query to launch a single test.

Anyway, I hope that this patch is accepted so that I can focus on the
rest of the BGQ MPI-related acceptance tests.

Best,

Jeff



-- 
Jeff Hammond
Argonne Leadership Computing Facility
University of Chicago Computation Institute
jhammond at alcf.anl.gov / (630) 252-5381
http://www.linkedin.com/in/jeffhammond
https://wiki.alcf.anl.gov/parts/index.php/User:Jhammond (in-progress)
https://wiki.alcf.anl.gov/old/index.php/User:Jhammond (deprecated)
https://wiki-old.alcf.anl.gov/index.php/User:Jhammond (deprecated)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: jeff.patch
Type: application/octet-stream
Size: 89257 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich2-dev/attachments/20120503/4f12fd23/attachment-0001.obj>