[mpich-discuss] Race condition in smpd?

John Fettig jfettig at illinois.edu
Fri Oct 24 09:44:08 CDT 2008


I'm trying to get "tight integration" of MPICH2 working with Rocks/SGE,
and am encountering the following problem.  When I launch the job on a
couple of machines, everything works fine.  In ~/.smpd, I have

phrase=somephrase

Here, the home directory is NFS shared between the compute nodes.

However, if I launch on significantly more than two machines, there
appears to be a race to write to this file, which causes it to be
truncated.  Subsequent machines then prompt for a phrase, and the job
fails because they are waiting for user input.
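
To illustrate the kind of race I suspect, here is a minimal Python
sketch.  It is not smpd's actual code; I have not confirmed how smpd
rewrites the file.  The function names are hypothetical, and the only
thing taken from my setup is the "phrase=..." line format.  It
contrasts a truncating rewrite, where a reader can see an empty file,
with writing a temporary file and renaming it into place.  Rename is
atomic on POSIX filesystems, though NFS client caching may still delay
when other nodes see the new file.

```python
import os
import tempfile


def read_phrase(path):
    """Return the value of the 'phrase=' line, or None if it is absent."""
    with open(path) as f:
        for line in f:
            if line.startswith("phrase="):
                return line[len("phrase="):].rstrip("\n")
    return None


def write_phrase_truncating(path, phrase):
    """Rewrite in place (hypothetical model of the racy behaviour)."""
    # Opening with "w" truncates the file to zero length immediately.
    with open(path, "w") as f:
        # A process on another node reading the file at this point sees
        # no phrase at all and would fall back to prompting the user.
        f.write("phrase=%s\n" % phrase)


def write_phrase_atomic(path, phrase):
    """Write to a temporary file, then rename it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be in the same directory (same filesystem)
    # for the rename to be atomic.  mkstemp creates it with mode 0600.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".smpd.")
    try:
        with os.fdopen(fd, "w") as f:
            f.write("phrase=%s\n" % phrase)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the complete new one.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With the second approach a concurrent reader never observes a
zero-length file, which is the symptom I am seeing.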

I have worked around this by assigning a phrase in the program that
starts smpd for every job; however, this is a hack, and I would rather
let users set their own phrase.  Are you aware of such a problem?

John



