[hpc-announce] Deadline Extension: SUSCOM Special Issue on RE-HPC - Paper due on May 15

Hongyang Sun hongyang.sun at vanderbilt.edu
Mon May 1 09:50:26 CDT 2017

Special Issue of Sustainable Computing: Informatics and Systems
(SUSCOM) on Resilience and/or Energy-aware techniques for
High-Performance Computing (RE-HPC)



Resilience and energy consumption have become two important concerns
for high-performance computing (HPC) systems. With the increasing core
count and technology miniaturization, today's large computing
platforms (datacenters, clusters, supercomputers, etc.) are
increasingly prone to failures. Faults are becoming the norm rather than
the exception. Besides the classical fail-stop errors (such as hardware
failures), soft errors (such as silent data corruptions, or SDCs)
constitute another threat that can no longer be ignored by the HPC
community. Another concern is energy. Presently, large computing
centers are among the largest consumers of energy, hence measures must
be taken to reduce energy consumption. Energy is needed not only to
power the individual cores but also to provide cooling for the system.
In today's datacenters, a large proportion of energy is spent on
cooling and thermal-related activities. It is anticipated that the
power dissipated to perform communications and I/O transfers will also
make up a much larger share of the overall power consumption. The
relative cost of communication is expected to increase dramatically,
both in terms of latency/overhead and of consumed energy. Re-designing
algorithms for HPC systems to ensure resilience and to reduce energy
consumption will be crucial to achieving sustained performance. The
interplay between resilience and energy must also be carefully addressed.
Better resilience often requires redundancy (replication and/or
checkpointing, rollback and recovery), which consumes extra energy.
Hot cores may lead to less resilient computing or increase the
probability of individual failures. On the other hand, reducing the
energy consumption via voltage/frequency scaling techniques will
increase the application running time, and hence the expected number
of failures during execution.

This Special Issue will encompass a broad range of topics related to
resilience and energy efficiency for HPC. Its objective is to
facilitate the exchange of valuable information and ideas among
researchers and practitioners. Topics of interest include (but are not
limited to):

●      Fault-tolerant algorithms, tools, and protocols

●      Checkpointing, replication, and recovery techniques

●      Detection and prediction of soft errors and SDCs

●      System reliability, testing, and verification

●      Resilience models, algorithms, and simulations

●      Energy-efficient scheduling and resource management

●      Power-aware runtime systems

●      Energy-efficient I/O, storage, and networking

●      Thermal behavior modeling, control, and management

●      Cooling-aware optimizations and evaluations

●      Tradeoffs between performance, reliability, energy, and temperature


General information for submitting papers to SUSCOM can be found at
http://www.journals.elsevier.com/sustainable-computing (please note
the “Guide for Authors” link).  Submissions to this Special Issue (SI)
should be made using Elsevier's editorial system at the journal
website (under the “submit your paper” link).  Please make sure to
select the “SI: RE-HPC” option for the type of the paper during the
submission process.  All submissions must be original and may not be
under review elsewhere. A submission based on one or more papers that appeared
elsewhere has to include major value-added extensions over what
appeared previously (at least 30% new conceptual material). Authors
are requested to attach any such earlier articles to the submitted
paper, along with a summary document explaining the enhancements made
in the journal version. All submitted papers will be peer-reviewed
using the normal
standards of SUSCOM.


●   Manuscript due date: May 15, 2017

●   First decision notification: August 1, 2017

●   Tentative publication schedule: December 2017


Anne Benoit, ENS-Lyon, France

Jean-Marc Pierson, University of Toulouse, France

Hongyang Sun, Vanderbilt University, USA

Questions may be sent to hongyang.sun at vanderbilt.edu
