[hpc-announce] CFP: The 7th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS) 2017 - June 26, 2017

Teranishi, Keita knteran at sandia.gov
Thu Jan 19 12:23:38 CST 2017

7th Workshop on Fault-Tolerance for HPC at eXtreme Scale (FTXS 2017)

In conjunction with
The 26th International ACM Symposium on
High Performance Distributed Computing (HPDC 2017)
Washington, D.C., USA, June 26 - June 30, 2017

Authors are invited to submit original papers on the research and practice
of fault-tolerance in extreme-scale (HPC) computing.  Resilience and
fault-tolerance remain a major concern for supercomputing, and advances in
this area are needed to allow applications to compute accurate (or within
error tolerance) answers in a timely and efficient manner in the presence
of degradations or failures of platform components (both hardware and
software).

Topics include, but are not limited to:
* Failure data analysis and field studies
* Power, performance, resilience (PPR) assessments / tradeoffs
* Novel fault-tolerance techniques and implementations
* Emerging hardware and software technology for resilience
* Silent data corruption (SDC) detection / correction techniques
* Advances in reliability monitoring, analysis, and control of highly complex systems
* Failure prediction, error preemption, and recovery techniques
* Fault-tolerant programming models
* Models for software and hardware reliability
* Metrics and standards for measuring, improving, and enforcing effective fault-tolerance
* Scalable Byzantine fault-tolerance and security from single-fault and fail-silent violations
* Atmospheric evaluations relevant to HPC systems (terrestrial neutrons, temperature, voltage, etc.)
* Near-threshold-voltage implications and evaluations for reliability
* Benchmarks and experimental environments including fault injection
* Frameworks and APIs for fault-tolerance and fault management

See https://sites.google.com/site/ftxsworkshop/home/ftxs-2017 and
http://www.hpdc.org/2017/ for more information.

Submissions are solicited in the following categories:
* Regular papers presenting innovative ideas improving the state of the
art of extreme-scale fault-tolerance.
* Experience papers presenting data and discussing issues seen on existing
extreme-scale systems, including some form of analysis and evaluation.

Submissions shall be sent electronically and must conform to the ACM
conference proceedings style.  Regular papers should not exceed eight (8)
pages, including all text, appendices, and figures.  Experience papers
should not exceed six (6) pages.

Submission of papers: March 16th, 2017
Author notification: April 20th, 2017
Camera-ready papers: May 5th, 2017
Workshop: June 26th, 2017 (expected)

Nathan DeBardeleben - Los Alamos National Laboratory

Keita Teranishi - Sandia National Laboratories
John Daly - Laboratory for Physical Sciences

Emmanuel Agullo – INRIA Bordeaux
Leonardo Bautista Gomez – Barcelona Supercomputing Center
Aurélien Bouteiller – University of Tennessee Knoxville
Robert Clay – Sandia National Laboratories
James Elliott – Sandia National Laboratories
Christian Engelmann – Oak Ridge National Laboratory
Kurt Ferreira – Sandia National Laboratories
Marc Gamell – Rutgers University
Qiang Guan – Los Alamos National Laboratory
Sudhanva Gurumurthi – AMD
Saurabh Hukerikar – Oak Ridge National Laboratory
Hideyuki Jitsumoto – Tokyo Institute of Technology
Zhiling Lan – Illinois Institute of Technology
Scott Levy – Sandia National Laboratories
Naoya Maruyama – RIKEN AICS
Bogdan Nicolae – Huawei Research Germany
Yves Robert – ENS Lyon & Univ. Tenn. Knoxville
Vilas Sridharan – AMD
Peter Strazdins – Australian National University
Abhinav Vishnu – Pacific Northwest National Lab.
Panruo Wu – University of California at Riverside

