[mpich-discuss] [mvapich-discuss] Announcing the Release of MVAPICH2 1.7RC2 and OSU Micro-Benchmarks (OMB) 3.4 (fwd)
Dhabaleswar Panda
panda at cse.ohio-state.edu
Mon Sep 19 21:46:37 CDT 2011
These releases might be of interest to some MPICH2 users, so I am
posting the announcement here.
Thanks,
DK
---------- Forwarded message ----------
Date: Mon, 19 Sep 2011 20:16:12 -0400 (EDT)
From: Dhabaleswar Panda <panda at cse.ohio-state.edu>
To: mvapich-discuss at cse.ohio-state.edu
Cc: Dhabaleswar Panda <panda at cse.ohio-state.edu>
Subject: [mvapich-discuss] Announcing the Release of MVAPICH2 1.7RC2 and
OSU Micro-Benchmarks (OMB) 3.4
The MVAPICH team is pleased to announce the release of MVAPICH2-1.7RC2
and OSU Micro-Benchmarks (OMB) 3.4.
Features, Enhancements, and Bug Fixes for MVAPICH2 1.7RC2 (since the
MVAPICH2-1.7RC1 release) are listed below.
* New Features and Enhancements
- Based on MPICH2-1.4.1p1
- Integrated Hybrid (UD-RC/XRC) design to get best performance
on large-scale systems with reduced/constant memory footprint
- Shared memory backed Windows for One-Sided Communication
- Support for truly passive locking for intra-node RMA in shared
memory and LIMIC based windows
- Integrated with Portable Hardware Locality (hwloc v1.2.1)
- Integrated with latest OSU Micro-benchmarks (3.4)
- Enhancements and tuned collectives (Allreduce and Allgatherv)
- MPI_THREAD_SINGLE provided by default and
MPI_THREAD_MULTIPLE as an option
- Enabling Checkpoint/Restart support in pure SMP mode
- Optimization for QDR cards
- On-demand connection management support with IB CM (RoCE interface)
- Optimization to limit number of RDMA Fast Path connections
for very large clusters (Nemesis interface)
- Multi-core-aware collective support (QLogic PSM interface)
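On the threading item above: in MPICH2-derived stacks the supported threading level is a build-time choice. A minimal sketch, assuming MVAPICH2 1.7 inherits MPICH2's `--enable-threads` configure option (the flag name and the install prefix are assumptions, not taken from this announcement):

```shell
# Assumed configure invocation (flag inherited from MPICH2-1.4.x);
# selects MPI_THREAD_MULTIPLE support instead of the default
# MPI_THREAD_SINGLE build.
./configure --enable-threads=multiple --prefix=/opt/mvapich2-1.7rc2
make && make install
```

An application should still request the level it needs at startup via the standard `MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided)` call and check the `provided` level rather than assume it.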
* Bug Fixes
- Fixes for code compilation warnings
- Compiler preference lists reordered to avoid mixing GCC and Intel
compilers if both are found by configure
- Fix a bug in transferring very large messages (>2GB)
- Thanks to Tibor Pausz from Univ. of Frankfurt for reporting it
- Fix a hang with One-Sided Put operation
- Fix a bug in ptmalloc integration
- Avoid double-free crash with mpispawn
- Avoid crash and print an error message in mpirun_rsh when the
hostfile is empty
- Checking for error codes in PMI design
- Verify programs can link with LiMIC2 at runtime
- Fix for compilation issue when BLCR or FTB installed in
non-system paths
- Fix an issue with RDMA-Migration
- Fix for memory leaks
- Fix an issue in supporting RoCE with the second port available on the HCA
- Thanks to Jeffrey Konz from HP for reporting it
- Fix for a hang with passive RMA tests (QLogic PSM interface)
The complete set of Features, Enhancements, and Bug Fixes for MVAPICH2
1.7RC2 (since the MVAPICH2-1.6 release) is listed below.
- Based on MPICH2-1.4.1p1
- Integrated Hybrid (UD-RC/XRC) design to get best performance
on large-scale systems with reduced/constant memory footprint
- CH3 shared memory channel for standalone hosts
(including laptops) without any InfiniBand adapters
- HugePage support
- Improved intra-node shared memory communication performance
- Shared memory backed Windows for One-Sided Communication
- Support for truly passive locking for intra-node RMA in shared
memory and LIMIC based windows
- Improved on-demand InfiniBand connection setup (CH3 and RoCE)
- Tuned RDMA Fast Path Buffer size to get better performance
with less memory footprint (CH3 and Nemesis)
- Supporting large data transfers (>2GB)
- Integrated with enhanced LiMIC2 (v0.5.5) to support Intra-node
large message (>2GB) transfers
- Optimized Fence synchronization (with and without
LIMIC2 support)
- Automatic intra-node communication parameter tuning
based on platform
- Efficient connection set-up for multi-core systems
- Enhanced designs and tuning for collectives
(bcast, reduce, barrier, gather, allreduce, allgather,
allgatherv and alltoall)
- MPI_THREAD_SINGLE provided by default and
MPI_THREAD_MULTIPLE as an option
- Fast process migration using RDMA
- Enabling Checkpoint/Restart support in pure SMP mode
- Compact shorthand for specifying blocks of processes
on the same host with mpirun_rsh
- Support for the latest stable version of hwloc (v1.2.1)
- Enhanced mpirun_rsh design to avoid race conditions,
support for fault-tolerance functionality and
improved debug messages
- Enhanced debugging config options to generate
core files and back-traces
- Automatic inter-node communication parameter tuning
based on platform and adapter detection (Nemesis)
- Integrated with latest OSU Micro-benchmarks (3.4)
- Improved performance for medium sized messages (QLogic PSM interface)
- Multi-core-aware collective support (QLogic PSM interface)
- Optimization for QDR cards
- Support for Chelsio T4 Adapter
- Support for Ekopath Compiler
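The mpirun_rsh shorthand mentioned in the list above can be sketched as follows; the `host:n` block notation and the host names are assumptions based on common mpirun_rsh usage, so consult the MVAPICH2 user guide for the exact syntax:

```shell
# Assumed shorthand: place a block of 4 consecutive ranks on each host
mpirun_rsh -np 8 node01:4 node02:4 ./a.out

# Equivalent launch with an explicit hostfile listing the hosts
mpirun_rsh -np 8 -hostfile ./hosts ./a.out
```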
* Bug Fixes
- Fixes in Checkpoint/Restart and Migration support
- Fix Restart when using automatic checkpoint
- Thanks to Alexandr for reporting this
- Handling very large one-sided transfers using RDMA
- Fixes for memory leaks
- Graceful handling of unknown HCAs
- Better handling of shmem file creation errors
- Fix for a hang in intra-node transfer
- Fix for a build error with --disable-weak-symbols
- Thanks to Peter Willis for reporting this issue
- Fixes for one-sided communication with passive target
synchronization
- Better handling of memory allocation and registration failures
- Fixes for compilation warnings
- Fix a bug that disallowed '=' in mpirun_rsh arguments
- Handling of non-contiguous transfer in Nemesis interface
- Bug fix in gather collective when ranks are in cyclic order
- Fix for the ignore_locks bug in MPI-IO with Lustre
- Compiler preference lists reordered to avoid mixing GCC and Intel
compilers if both are found by configure
- Fix a bug in transferring very large messages (>2GB)
- Thanks to Tibor Pausz from Univ. of Frankfurt for reporting it
- Fix a hang with One-Sided Put operation
- Fix a bug in ptmalloc integration
- Avoid double-free crash with mpispawn
- Avoid crash and print an error message in mpirun_rsh when the
hostfile is empty
- Checking for error codes in PMI design
- Verify programs can link with LiMIC2 at runtime
- Fix for compilation issue when BLCR or FTB installed in
non-system paths
- Fix an issue with RDMA-Migration
- Fix an issue in supporting RoCE with the second port available on the HCA
- Thanks to Jeffrey Konz from HP for reporting it
- Fix for a hang with passive RMA tests (QLogic PSM interface)
New Features, Enhancements, and Bug Fixes in OSU Micro-Benchmarks (OMB)
3.4 (since the OMB 3.3 release) are listed below.
* New Features and Enhancements
- Add passive one-sided communication benchmarks
- Update one-sided communication benchmarks to provide shared
memory hint in MPI_Alloc_mem calls
- Update one-sided communication benchmarks to use MPI_Alloc_mem
for buffer allocation
- Give default values to configure definitions (can now build
directly with mpicc)
- Update latency benchmarks to begin from 0-byte messages
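Since the configure definitions now have default values, an individual benchmark can plausibly be compiled straight from its source file with the MPI compiler wrapper. A usage sketch (the source file name and host names are assumptions):

```shell
# Build one benchmark directly with mpicc, without running configure
mpicc -o osu_latency osu_latency.c

# Run it between two processes on two hosts
mpirun_rsh -np 2 node01 node02 ./osu_latency
```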
* Bug Fixes
- Remove memory leaks in one-sided communication benchmarks
- Update benchmarks to touch buffers before using them for
communication
- Fix osu_get_bw test to use different buffers for concurrent
communication operations
- Fix compilation warnings
To download MVAPICH2-1.7RC2 and OSU Micro-Benchmarks (OMB) 3.4, access
the associated user guides, and browse the SVN repository, please visit
the following URL:
http://mvapich.cse.ohio-state.edu
All questions, feedback, bug reports, performance-tuning hints,
patches, and enhancements are welcome. Please post them to the
mvapich-discuss mailing list (mvapich-discuss at cse.ohio-state.edu).
We are also happy to report that the number of organizations using
MVAPICH/MVAPICH2 (and registered at the MVAPICH site) has crossed
1,700 worldwide (in 63 countries). The MVAPICH team extends its thanks
to all of these organizations.
Thanks,
The MVAPICH Team
_______________________________________________
mvapich-discuss mailing list
mvapich-discuss at cse.ohio-state.edu
http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss