[mpich-discuss] Cluster problem running MPI programs

Gus Correa gus at ldeo.columbia.edu
Tue Apr 10 21:42:32 CDT 2012


On 04/10/2012 09:01 PM, Pavan Balaji wrote:
>
> [please keep mpich-discuss cc'ed]
>
> Sorry, we can't help you with mpd.
>
> I'm not sure which tutorial you are following, but if you look at the
> README, it will give you step-by-step instructions for building and
> installing mpich2.
>
> -- Pavan

For what it is worth, this issue came up on the list
recently in connection with the following tutorial,
which is very detailed but unfortunately seems
to have become obsolete [particularly regarding mpd]:

https://help.ubuntu.com/community/MpichCluster

To configure/make/install MPICH2 and to run MPI programs,
you may be better off using the MPICH2
documentation instead:

http://www.mcs.anl.gov/research/projects/mpich2/documentation/index.php?s=docs

If you already have gfortran installed [or install it now],
you don't need to disable Fortran in MPICH2.
You don't need to 'overconfigure' MPICH2 either.
To get it up and running you may need at most to
set --prefix [if you don't want it to install in /usr/local]
and to point to the compilers [e.g. FC=gfortran].
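
A minimal sketch of what that might look like
[the prefix path below is only an example, pick your own]:

    # example install prefix -- use whatever path you prefer
    ./configure --prefix=/home/you/mpich2-install FC=gfortran
    make
    make install

Then put /home/you/mpich2-install/bin first in your PATH
on every node.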

'./configure -help' shows the options available.
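
Once it is installed [and with the new bin directory first in
your PATH on both nodes], compiling and launching through the
default hydra process manager looks roughly like this
[machinefile and cpi are just placeholder names]:

    # cpi.c is the example from the MPICH2 source tree;
    # machinefile is a plain text file you create yourself
    mpicc -o cpi cpi.c
    mpiexec -f machinefile -n 2 ./cpi

where machinefile simply lists your node hostnames,
one per line. No mpd ring is needed.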

My $0.02,
Gus Correa




>
> On 04/10/2012 07:58 PM, Brice Chaffin wrote:
>> I realize mpd is deprecated, but this being my first time, I was
>> following tutorials that relied on mpd, so I made the decision to use it
>> on my first run. I did very little configuration of mpich2, beyond
>> disabling Fortran options (since I have no Fortran compilers installed)
>> and setting the install directory. I am not familiar with all the config
>> options yet, so I may have left something important out. The problem
>> with the tutorials is that they assume everything works the first time.
>> I had to do some fine tuning afterwards, and may need to do some more to
>> correct this. I'm just not quite sure yet where the trouble is.
>>
>> On Tue, 2012-04-10 at 19:41 -0500, Pavan Balaji wrote:
>>> Hello,
>>>
>>> How did you configure mpich2? Please note that mpd is now deprecated
>>> and is not supported. In 1.4.1p1, mpd should not be built at all by
>>> default.
>>>
>>> -- Pavan
>>>
>>> On 04/10/2012 07:27 PM, Brice Chaffin wrote:
>>>> Hi all,
>>>>
>>>> I have built a small cluster, but seem to be having a problem.
>>>>
>>>> I am using Ubuntu Linux 11.04 server edition on two nodes, with an NFS
>>>> share for a common directory when running as a cluster.
>>>>
>>>> According to mpdtrace the ring is fully functional. Both machines are
>>>> recognized and communicating.
>>>>
>>>> I can run regular C programs compiled with gcc using mpiexec or mpirun,
>>>> and results are returned from both nodes. When running actual MPI
>>>> programs, such as the examples included with MPICH2, or ones I compile
>>>> myself with mpicc, I get this:
>>>>
>>>> rank 1 in job 8 node1_33851 caused collective abort of all ranks
>>>> exit status of rank 1: killed by signal 4
>>>>
>>>> I am including mpich2version output so you can see exactly how I built
>>>> it.
>>>>
>>>> MPICH2 Version: 1.4.1p1
>>>> MPICH2 Release date: Thu Sep 1 13:53:02 CDT 2011
>>>> MPICH2 Device: ch3:nemesis
>>>> MPICH2 configure: --disable-f77 --disable-fc --with-pm=mpd
>>>> --prefix=/home/bchaffin/mpich2
>>>> MPICH2 CC: gcc -O2
>>>> MPICH2 CXX: c++ -O2
>>>> MPICH2 F77:
>>>> MPICH2 FC:
>>>>
>>>> This is my first time working with a cluster, so any advice or
>>>> suggestions are more than welcome.
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> mpich-discuss mailing list mpich-discuss at mcs.anl.gov
>>>> To manage subscription options or unsubscribe:
>>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>>
>>
>>
>>
>


