[mpich-discuss] mpich2 hangs on Ubuntu beowulf cluster(with NFS)

Nicolas Rosner nrosner at gmail.com
Wed Jan 4 19:20:27 CST 2012


> the code we use is an old one

The fact that the code was written long ago can be the main or only
cause of some problems, but in practice it rarely is. (The code was
probably broken in the first place, maybe in some subtle way that
went unnoticed under the older MPI.)

Had the code been correct by, say, the MPI 1.1 standard, in most cases
it would still be correct by the latest standard.
(Backward compatibility is an important goal, carefully preserved by
those in charge of improving the standard over time.)

When legacy code fails, the real fault usually turns out to be more
than merely its age -- perhaps it was written under one implementation
of MPI and never tested with any other one, for instance?


> But this is an interesting topic so I will
> probably read about deadlock and
> I may come up with a solution

Yeah, it is interesting, and understanding what it is and how it works
is important. I think it's a wise decision to read up on the matter;
it should pay off.

Then again, while you are still learning the basics, fixing existing
deadlock problems in legacy code may not be so simple. Working through
some small, few-line examples and classic problems before tackling
real-world trouble with lots of distractions around it (at least
before doing so on your own, without help) can spare you some
frustration and speed up your learning, I think.
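
As one possible small example of that kind, here is a toy model of the
classic head-to-head exchange, where both ranks send first and receive
afterwards. It is plain Python with threads, not MPI: Channel,
run_exchange and eager_limit are hypothetical names made up for this
sketch. The assumption it models is that a standard-mode send may
return at once when the message is small enough to be buffered, but
may block until the matching receive is posted when it is not. Code
that relies on the buffering can work for years and then hang under a
different implementation, a different threshold, or a larger message.

```python
import queue
import threading


class Channel:
    """Toy one-way channel; a model only, not a real MPI API."""

    def __init__(self, eager_limit):
        self.eager_limit = eager_limit  # hypothetical threshold, in bytes
        self._q = queue.Queue()

    def send(self, payload, timeout):
        delivered = threading.Event()
        self._q.put((payload, delivered))
        if len(payload) <= self.eager_limit:
            return True  # small message: buffered, send returns at once
        # Large message: wait until the receiver has taken it.
        return delivered.wait(timeout)

    def recv(self, timeout):
        payload, delivered = self._q.get(timeout=timeout)
        delivered.set()
        return payload


def run_exchange(eager_limit, size, send_first=(True, True), timeout=0.5):
    """Two ranks swap one message each; report how each rank ended."""
    inbox = [Channel(eager_limit), Channel(eager_limit)]
    results = [None, None]

    def rank(me):
        peer = 1 - me
        steps = ["send", "recv"] if send_first[me] else ["recv", "send"]
        for step in steps:
            try:
                if step == "send":
                    done = inbox[peer].send(bytes(size), timeout)
                else:
                    inbox[me].recv(timeout)
                    done = True
            except queue.Empty:
                done = False
            if not done:
                # The timeout stands in for what would be a real hang.
                results[me] = "stuck in " + step
                return
        results[me] = "ok"

    threads = [threading.Thread(target=rank, args=(i,)) for i in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With eager_limit=1024, run_exchange(1024, 100) ends with both ranks
"ok", while run_exchange(1024, 4096) leaves both ranks "stuck in send":
same code, bigger message. Breaking the symmetry, for example with
send_first=(True, False), lets the large exchange complete. In real
MPI code the analogous fixes would be reordering the calls on one
side, or using MPI_Sendrecv or nonblocking calls.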

Good luck! N.

