[hpc-announce] [CFP] FTXS 2018 @ SC18

Scott Levy sllevy at sandia.gov
Thu Jun 7 13:08:28 CDT 2018


CALL FOR PAPERS
8th Workshop on Fault-Tolerance for HPC at eXtreme Scale (FTXS 2018)

In conjunction with The International Conference for
High Performance Computing, Networking, Storage, and Analysis (SC18)
Dallas, Texas, USA, November 11 - 16, 2018
https://sites.google.com/site/ftxsworkshop/home/ftxs-2018

Important Dates
* Submissions open: July 1, 2018
* Submission of papers: August 30, 2018
* Author notification: September 27, 2018
* Camera-ready papers: TBA
* Workshop: Friday, November 16, 2018

Authors are invited to submit original papers on the research and 
practice of fault-tolerance in extreme-scale distributed systems 
(primarily HPC systems, but including grid and cloud systems). 
Resilience and fault-tolerance remain major concerns for supercomputing,
and advances in this area are needed to allow applications to compute
accurate answers (or answers within an acceptable error tolerance) in a
timely and efficient manner in the presence of degradations or failures
of platform components (both hardware and software).

Topics include, but are not limited to:
* Failure data analysis and field studies
* Power, performance, resilience (PPR) assessments / tradeoffs
* Novel fault-tolerance techniques and implementations
* Emerging hardware and software technology for resilience
* Silent data corruption (SDC) detection / correction techniques
* Advances in reliability monitoring, analysis, and control of
   highly complex systems
* Failure prediction, error preemption, and recovery techniques
* Fault-tolerant programming models
* Models for software and hardware reliability
* Metrics and standards for measuring, improving, and enforcing
   effective fault-tolerance
* Scalable Byzantine fault-tolerance and security from single-fault and
   fail-silent violations
* Atmospheric evaluations relevant to HPC systems (terrestrial
   neutrons, temperature, voltage, etc.)
* Near-threshold-voltage implications and evaluations for reliability
* Benchmarks and experimental environments including fault injection
* Frameworks and APIs for fault-tolerance and fault management

PAPER SUBMISSIONS
Submissions are solicited in the following categories:
* Regular papers presenting innovative ideas improving the state of the
   art or discussing the issues seen on existing extreme-scale systems,
   including some form of analysis and evaluation.
* Extended abstracts proposing disruptive ideas and challenging
   assumptions in the field, including some form of preliminary results.
Extended abstracts will be evaluated separately and given shorter oral
presentations.

Submissions shall be sent electronically and must conform to the SC18
proceedings style.  Regular papers should not exceed ten (10) pages
including all text, appendices, figures, and references.  Extended 
abstract papers should not exceed six (6) pages.

The workshop's proceedings have been accepted for publication by IEEE
TCHPC (and inclusion in IEEE Xplore).

WORKSHOP CO-CHAIRS
Nathan DeBardeleben - Los Alamos National Laboratory
Scott Levy - Sandia National Laboratories

ORGANIZING COMMITTEE
Keita Teranishi - Sandia National Laboratories
John Daly - Laboratory for Physical Sciences

PROGRAM COMMITTEE
Rizwan Ashraf - Oak Ridge National Laboratory
Marc Gamell Balmana - Intel
Leonardo Bautista Gomez - Barcelona Supercomputing Center
Aurelien Bouteiller - University of Tennessee Knoxville
Robert Clay - Sandia National Laboratories
James Elliott - Sandia National Laboratories
Christian Engelmann - Oak Ridge National Laboratory
Kurt B. Ferreira - Sandia National Laboratories
Qiang Guan - Kent State University
Sudhanva Gurumurthi - AMD
Hideyuki Jitsumoto - Tokyo Institute of Technology
Zhiling Lan - Illinois Institute of Technology
Naoya Maruyama - Lawrence Livermore National Laboratory
Bogdan Nicolae - Argonne National Laboratory
Yves Robert - ENS Lyon & Univ. Tenn. Knoxville
Vilas Sridharan - AMD
Peter E. Strazdins - The Australian National University
Abhinav Vishnu - Pacific Northwest National Laboratory
Panruo Wu - University of California at Riverside

Questions? Contact Scott Levy (sllevy at sandia.gov).

