[mpich-discuss] using mpich2/hydra with ssh

Gonçalo C. Justino goncalo.justino at gmail.com
Thu Feb 3 11:14:21 CST 2011


Dear all,

I've been googling and browsing this list's archives and still have a
problem using mpiexec.hydra. I'm just starting out with Hydra, so I'm
only using two nodes to make sure I get everything right. There was a thread
on this subject last year; I checked and covered all the issues it raised,
but the thread appears to have died.

The machines are at 192.168.0.1 and 192.168.0.3, both running Ubuntu 10.10,
and both have the program I want to use (NWChem, just in case) up and
running; it runs fine on each machine individually using mpiexec.hydra. I'm
using the latest version of MPICH2 (grabbed it last Monday).
I can ssh from each machine into the other and into itself (if it matters...).
The hosts file I use is:
192.168.0.3:6
192.168.0.1:4
I've also tried hostnames instead of addresses, but no good.
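For reference, Hydra's machinefile takes one host[:nprocs] entry per line. A
quick sanity check of the file can be sketched in Python (parse_machinefile
is a hypothetical helper written for this message, not part of MPICH):

```python
def parse_machinefile(path):
    """Parse a Hydra-style machinefile: one 'host[:nprocs]' line per entry."""
    hosts = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            # skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            host, _, count = line.partition(":")
            # a missing ':nprocs' suffix defaults to 1 process on that host
            hosts.append((host, int(count) if count else 1))
    return hosts
```

Running this over the file above should yield [("192.168.0.3", 6),
("192.168.0.1", 4)], i.e. 10 slots in total, matching -np 10.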

When I try to use both nodes from machine 3 I get
[proxy at sm] main (./pm/pmiserv/pmi_proxy.c:108): unable to connect to the
main server

From machine 1, the message is similar (machine 1 is sm, machine 3 is
sm-comp).
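As far as I can tell, "unable to connect to the main server" means the remote
proxy could not open a TCP connection back to the host running mpiexec. A
minimal reachability probe can be sketched in Python (can_reach is a
hypothetical helper; substitute the host:port pair mpiexec prints, e.g. the
PMI port string sm-comp:46338):

```python
import socket

def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers DNS failure, connection refused, and timeout
        return False

# Run this on each compute node while mpiexec is waiting, e.g.:
#   can_reach("sm-comp", 46338)
# If it returns False, the node cannot resolve or route to the mpiexec host.
```

If the probe fails only when given the hostname but succeeds with the IP
address, the nodes' /etc/hosts entries are the likely culprit.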

FYI, the command I use is
mpiexec.hydra -bootstrap ssh -f hosts -np 10
-v /opt/nwchem-6.0/bin/LINUX64/nwchem test.nw

and the output is pasted at the end of this message.

Any hint is welcome,

All the best,
Gonçalo

==================================================================================================
mpiexec options:
----------------
  Base path: /usr/bin/
  Proxy port: 9899
  Bootstrap server: ssh
  Debug level: 1
  Enable X: -1
  Working dir: /wip

  Global environment:
  -------------------
    LDFLAGS=-Lopt/gromacs-4.5.1-mopac7
    MPILIB=-lfmpich -lmpich -lpmpich
    MPI_INCLUDE=/usr/include/mpich2/
    TERM=xterm
    SHELL=/bin/bash

 XDG_SESSION_COOKIE=d59b7913861333d766cd7af400018fd6-1296733763.439332-419157494
    SSH_CLIENT=192.168.0.1 40484 22
    SSH_TTY=/dev/pts/3
    USER=goncalo
    LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/072/lib/intel64
    NWCHEM_TOP=/opt/nwchem-6.0/
    LS_COLORS=rs
    MPI_LIB=(null)
    USE_MPI=y
    LIB_DEFINES=-DDFLT_TOT_MEM
    TOOLROOT=/opt/x86_open64-4.2.4
    LIBS=-lmopac
    MAIL=/var/mail/goncalo

 PATH=/opt/x86_open64-4.2.4/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/opt/bin-linux-tinker:/home/goncalo/g03
    NWCHEM_MODULES=all
    PWD=/wip
    LANG=en_US.UTF-8
    GAUSS_SCRDIR=/home/goncalo/scratch
    GAUSS_EXEDIR=/home/goncalo/g03
    MPI_LOC=/usr/share/mpich2/
    g03root=/home/goncalo/g03
    SHLVL=1
    HOME=/home/goncalo
    FC=/opt/intel/Compiler/11.1/072/bin/intel64/ifort
    LOGNAME=goncalo
    LARGE_FILES=TRUE
    SSH_CONNECTION=192.168.0.1 40484 192.168.0.3 22
    LESSOPEN=| /usr/bin/lesspipe %s
    CC=/opt/intel/Compiler/11.1/072/bin/intel64/icc
    LESSCLOSE=/usr/bin/lesspipe %s %s
    NWCHEM_TARGET=LINUX64
    _=/usr/bin/mpiexec.hydra
    OLDPWD=/home/goncalo/DATA-RECOVERY


    Executable information:
    **********************
      Executable ID:  1
      -----------------
        Process count: 10
        Executable: /opt/nwchem-6.0/bin/LINUX64/nwchem test.nw

    Proxy information:
    *********************
      Proxy ID:  1
      -----------------
        Proxy name: 192.168.0.3
        Process count: 6
        Start PID: 0

        Proxy exec list:
        ....................
          Exec: /opt/nwchem-6.0/bin/LINUX64/nwchem; Process count: 6
          Exec: /opt/nwchem-6.0/bin/LINUX64/nwchem; Process count: 3
      Proxy ID:  2
      -----------------
        Proxy name: 192.168.0.1
        Process count: 1
        Start PID: 6

        Proxy exec list:
        ....................
          Exec: /opt/nwchem-6.0/bin/LINUX64/nwchem; Process count: 1

==================================================================================================

[mpiexec at sm-comp] Timeout set to -1 (-1 means infinite)
[mpiexec at sm-comp] Got a PMI port string of sm-comp:46338
[mpiexec at sm-comp] Got a proxy port string of sm-comp:59737
Arguments being passed to proxy 0:
--version 1.2.1p1 --hostname 192.168.0.3 --global-core-count 7 --wdir /wip
--pmi-port-str sm-comp:46338 --binding HYDRA_NULL --bindlib plpa
--ckpointlib none --ckpoint-prefix HYDRA_NULL --global-inherited-env 38
'LDFLAGS=-Lopt/gromacs-4.5.1-mopac7' 'MPILIB=-lfmpich -lmpich -lpmpich'
'MPI_INCLUDE=/usr/include/mpich2/' 'TERM=xterm' 'SHELL=/bin/bash'
'XDG_SESSION_COOKIE=d59b7913861333d766cd7af400018fd6-1296733763.439332-419157494'
'SSH_CLIENT=192.168.0.1 40484 22' 'SSH_TTY=/dev/pts/3' 'USER=goncalo'
'LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/072/lib/intel64'
'NWCHEM_TOP=/opt/nwchem-6.0/' 'LS_COLORS=rs' 'MPI_LIB=' 'USE_MPI=y'
'LIB_DEFINES=-DDFLT_TOT_MEM' 'TOOLROOT=/opt/x86_open64-4.2.4' 'LIBS=-lmopac'
'MAIL=/var/mail/goncalo'
'PATH=/opt/x86_open64-4.2.4/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/opt/bin-linux-tinker:/home/goncalo/g03'
'NWCHEM_MODULES=all' 'PWD=/wip' 'LANG=en_US.UTF-8'
'GAUSS_SCRDIR=/home/goncalo/scratch' 'GAUSS_EXEDIR=/home/goncalo/g03'
'MPI_LOC=/usr/share/mpich2/' 'g03root=/home/goncalo/g03' 'SHLVL=1'
'HOME=/home/goncalo' 'FC=/opt/intel/Compiler/11.1/072/bin/intel64/ifort'
'LOGNAME=goncalo' 'LARGE_FILES=TRUE' 'SSH_CONNECTION=192.168.0.1 40484
192.168.0.3 22' 'LESSOPEN=| /usr/bin/lesspipe %s'
'CC=/opt/intel/Compiler/11.1/072/bin/intel64/icc'
'LESSCLOSE=/usr/bin/lesspipe %s %s' 'NWCHEM_TARGET=LINUX64'
'_=/usr/bin/mpiexec.hydra' 'OLDPWD=/home/goncalo/DATA-RECOVERY'
--global-user-env 0 --global-system-env 0 --genv-prop all --start-pid 0
--proxy-core-count 6 --exec --exec-proc-count 6 --exec-local-env 0
--exec-env-prop HYDRA_NULL /opt/nwchem-6.0/bin/LINUX64/nwchem test.nw --exec
--exec-proc-count 3 --exec-local-env 0 --exec-env-prop HYDRA_NULL
/opt/nwchem-6.0/bin/LINUX64/nwchem test.nw

Arguments being passed to proxy 1:
--version 1.2.1p1 --hostname 192.168.0.1 --global-core-count 7 --wdir /wip
--pmi-port-str sm-comp:46338 --binding HYDRA_NULL --bindlib plpa
--ckpointlib none --ckpoint-prefix HYDRA_NULL --global-inherited-env 38
'LDFLAGS=-Lopt/gromacs-4.5.1-mopac7' 'MPILIB=-lfmpich -lmpich -lpmpich'
'MPI_INCLUDE=/usr/include/mpich2/' 'TERM=xterm' 'SHELL=/bin/bash'
'XDG_SESSION_COOKIE=d59b7913861333d766cd7af400018fd6-1296733763.439332-419157494'
'SSH_CLIENT=192.168.0.1 40484 22' 'SSH_TTY=/dev/pts/3' 'USER=goncalo'
'LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/072/lib/intel64'
'NWCHEM_TOP=/opt/nwchem-6.0/' 'LS_COLORS=rs' 'MPI_LIB=' 'USE_MPI=y'
'LIB_DEFINES=-DDFLT_TOT_MEM' 'TOOLROOT=/opt/x86_open64-4.2.4' 'LIBS=-lmopac'
'MAIL=/var/mail/goncalo'
'PATH=/opt/x86_open64-4.2.4/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/opt/bin-linux-tinker:/home/goncalo/g03'
'NWCHEM_MODULES=all' 'PWD=/wip' 'LANG=en_US.UTF-8'
'GAUSS_SCRDIR=/home/goncalo/scratch' 'GAUSS_EXEDIR=/home/goncalo/g03'
'MPI_LOC=/usr/share/mpich2/' 'g03root=/home/goncalo/g03' 'SHLVL=1'
'HOME=/home/goncalo' 'FC=/opt/intel/Compiler/11.1/072/bin/intel64/ifort'
'LOGNAME=goncalo' 'LARGE_FILES=TRUE' 'SSH_CONNECTION=192.168.0.1 40484
192.168.0.3 22' 'LESSOPEN=| /usr/bin/lesspipe %s'
'CC=/opt/intel/Compiler/11.1/072/bin/intel64/icc'
'LESSCLOSE=/usr/bin/lesspipe %s %s' 'NWCHEM_TARGET=LINUX64'
'_=/usr/bin/mpiexec.hydra' 'OLDPWD=/home/goncalo/DATA-RECOVERY'
--global-user-env 0 --global-system-env 0 --genv-prop all --start-pid 6
--proxy-core-count 1 --exec --exec-proc-count 1 --exec-local-env 0
--exec-env-prop HYDRA_NULL /opt/nwchem-6.0/bin/LINUX64/nwchem test.nw

[mpiexec at sm-comp] Launching process: /usr/bin/ssh -x 192.168.0.3
/usr/bin/pmi_proxy --launch-mode 1 --proxy-port sm-comp:59737 --debug
--bootstrap ssh --proxy-id 0
[mpiexec at sm-comp] Launching process: /usr/bin/ssh -x 192.168.0.1
/usr/bin/pmi_proxy --launch-mode 1 --proxy-port sm-comp:59737 --debug
--bootstrap ssh --proxy-id 1
[proxy at sm] HYDU_sock_connect (./utils/sock/sock.c:128): unable to get host
address (Connection timed out)
[proxy at sm] main (./pm/pmiserv/pmi_proxy.c:108): unable to connect to the
main server

