[Nek5000-users] MPI problems

nek5000-users at lists.mcs.anl.gov nek5000-users at lists.mcs.anl.gov
Fri Feb 28 10:15:30 CST 2014


Dear Nek’s 

I have a problem when running the code on more than one processor.  The problem just appeared after a recent update of the system (debian) on our cluster. The code was working perfectly before the update but now I cannot run jobs in parallel anymore. MPI works with other softwares, but not with NEK. In particular I have the following problem:

When I execute a parallel run using the script    nekmpi eddy_uv 4 

the command execute 4 different jobs running on a single processor rather than a single  job running on 4 processors. I attach the log to this mail (log.out) . After a few second three jobs are killed and only one remains active. A similar problem was also found on a new machine with a fresh installation of debian. It seems that the scripts is not able to set the correct value of the variable np (or_np).
Anyone found a similar problem ? Any explanation for such behavior ? Any advice to solve the problem  ? 

Thanks in advance 


Flavio





Platform
uname -a
Linux cfd 2.6.32-5-amd64 #1 SMP Mon Sep 23 22:14:43 UTC 2013 x86_64 GNU/Linux






ompi_info
                 Package: Open MPI manuel at ce170155 Distribution
                Open MPI: 1.4.2
   Open MPI SVN revision: r23093
   Open MPI release date: May 04, 2010
                Open RTE: 1.4.2
   Open RTE SVN revision: r23093
   Open RTE release date: May 04, 2010
                    OPAL: 1.4.2
       OPAL SVN revision: r23093
       OPAL release date: May 04, 2010
            Ident string: 1.4.2
                  Prefix: /usr
 Configured architecture: x86_64-pc-linux-gnu
          Configure host: ce170155
           Configured by: manuel
           Configured on: Wed Sep  1 15:58:32 UTC 2010
          Configure host: ce170155
                Built by: root
                Built on: Wed Sep  1 16:01:42 UTC 2010
              Built host: ce170155
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /usr/lib/ccache/gcc
            C++ compiler: g++
   C++ compiler absolute: /usr/lib/ccache/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
           Sparse Groups: no
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: yes
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: yes  (checkpoint thread: no)
           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.4.2)
              MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.4.2)
           MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.4.2)
               MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4.2)
               MCA carto: file (MCA v2.0, API v2.0, Component v1.4.2)
           MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.2)
           MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.4.2)
               MCA timer: linux (MCA v2.0, API v2.0, Component v1.4.2)
         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4.2)
         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA crs: none (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4.2)
              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4.2)
           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4.2)
           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: basic (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: inter (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: self (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: sm (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: sync (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.4.2)
                  MCA io: romio (MCA v2.0, API v2.0, Component v1.4.2)
               MCA mpool: fake (MCA v2.0, API v2.0, Component v1.4.2)
               MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4.2)
               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: crcpw (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: csum (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: v (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4.2)
              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: ofud (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: openib (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: self (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.2)
                MCA topo: unity (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4.2)
                MCA crcp: bkmrk (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4.2)
                MCA odls: default (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ras: slurm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ras: tm (MCA v2.0, API v2.0, Component v1.4.2)
               MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.4.2)
               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.4.2)
               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4.2)
               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA rml: ftrm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.4.2)
              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4.2)
              MCA routed: direct (MCA v2.0, API v2.0, Component v1.4.2)
              MCA routed: linear (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA plm: slurm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA plm: tm (MCA v2.0, API v2.0, Component v1.4.2)
               MCA snapc: full (MCA v2.0, API v2.0, Component v1.4.2)
               MCA filem: rsh (MCA v2.0, API v2.0, Component v1.4.2)
              MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: env (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: hnp (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: singleton (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: slurm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: tool (MCA v2.0, API v2.0, Component v1.4.2)
             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.4.2)
             MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.4.2)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0005.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: eddy_uv.rea
Type: application/octet-stream
Size: 131687 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0004.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0006.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: log.out
Type: application/octet-stream
Size: 97595 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0005.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0007.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SIZE
Type: application/octet-stream
Size: 3262 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0006.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0008.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compiler.out
Type: application/octet-stream
Size: 92365 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0007.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/nek5000-users/attachments/20140228/5093b074/attachment-0009.html>


More information about the Nek5000-users mailing list