Troubles with mpich installation

paolo.zini at ipcf.cnr.it paolo.zini at ipcf.cnr.it
Wed May 18 06:04:20 CDT 2005


Hi all.



I am having trouble with an MPICH installation on an Opteron cluster.



The configuration is:

HW

11 x Tyan 2882 motherboards, each with 2 Opteron 2.4 GHz processors, 4 GB RAM,
and 2 x 160 GB SATA disks arranged in hardware RAID 1 (using the onboard
controller)

1 D-Link gigabit switch

SW

SuSE 9.0

Portland compilers (cc and f77/f90)

MPICH 1.2.6, compiled with the Portland suite, both with and without the
patches available on the MPICH home page.



When I run the perftest programs, the buflimit test stops at 64K; mpptest
runs correctly on short messages but hangs silently, without errors, on long
messages.



The application program runs on two processors on a single node using the
ch_shmem device, but if I try to run it on a multi-node configuration using
the ch_p4 device, one of the processes dies silently after a time varying
from a few minutes to several hours.

Sometimes the OS itself hangs.
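
One way to check whether the hangs come from MPICH or from the network layer
underneath it might be a plain TCP ping-pong that sweeps message sizes across
the 64K boundary. The following is only a minimal sketch in Python; it is not
part of MPICH or perftest, and the function name tcp_pingpong, the message
sizes and the timeout are assumptions chosen for illustration. As written it
runs over loopback; on the cluster the echo side and the client side would
have to be started on two different nodes.

```python
import socket
import threading


def _recv_exact(conn, nbytes):
    """Read exactly nbytes from conn, or raise if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = conn.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def _echo_server(listener, nbytes, reps):
    """Accept one connection and echo back reps messages of nbytes each."""
    conn, _ = listener.accept()
    with conn:
        for _ in range(reps):
            conn.sendall(_recv_exact(conn, nbytes))


def tcp_pingpong(nbytes, reps=3, host="127.0.0.1", timeout=30.0):
    """Send nbytes to an echo server reps times.

    Returns True if every reply matches the payload. A stalled transfer
    raises a timeout error on the client side instead of hanging forever.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=_echo_server, args=(listener, nbytes, reps), daemon=True
    )
    server.start()

    payload = bytes(i % 251 for i in range(nbytes))
    ok = True
    with socket.create_connection((host, port), timeout=timeout) as client:
        for _ in range(reps):
            client.sendall(payload)
            if _recv_exact(client, nbytes) != payload:
                ok = False
    server.join(timeout)
    listener.close()
    return ok


if __name__ == "__main__":
    # Sweep sizes below, at and above the 64K point where buflimit stops.
    for size in (1024, 64 * 1024, 256 * 1024, 1024 * 1024):
        print(size, "bytes:", "ok" if tcp_pingpong(size) else "MISMATCH")
```

If a test of this kind also stalls or corrupts data between two nodes, the
problem is more likely in the network driver, the switch or the hardware
than in MPICH itself.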



Any suggestions?



Paolo Zini
IPCF institute of CNR
Pisa
Italy
tel +39 050 3152964
Paolo.Zini at ipcf.cnr.it



