[hpc-announce] FTXS 2020 @ SC20 Call for Papers (Deadline extended)

Levy, Scott Larson sllevy at sandia.gov
Tue Aug 25 14:24:40 CDT 2020

We have extended the deadline for FTXS 2020.  Submissions are now due September 4, 2020 (AoE).

10th Workshop on Fault-Tolerance for HPC at eXtreme Scale (FTXS 2020)

NOTE: FTXS 2020 will be entirely virtual this year.  Details will be provided on our website.

In conjunction with The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC20)
Atlanta, Georgia, USA, November 15 - 20, 2020

Important Dates
* Submission of papers: September 4, 2020
* Author notification: September 27, 2020
* Camera-ready papers: TBA
* Workshop: November 15, 2020

Authors are invited to submit original papers on the research and practice of fault-tolerance in extreme-scale distributed systems (primarily HPC systems, but including grid and cloud systems).  Resilience and fault-tolerance remain a major concern for supercomputing, and advances in this area are needed to allow applications to compute accurate answers (or answers within an acceptable error tolerance) in a timely and efficient manner in the presence of degradations or failures of platform components (both hardware and software).

Topics include, but are not limited to:
* Failure data analysis and field studies
* Power, performance, resilience (PPR) assessments / tradeoffs
* Novel fault-tolerance techniques and implementations
* Emerging hardware and software technology for resilience
* Silent data corruption (SDC) detection / correction techniques
* Advances in reliability monitoring, analysis, and control of highly complex systems
* Failure prediction, error preemption, and recovery techniques
* Fault-tolerant programming models
* Models for software and hardware reliability
* Metrics and standards for measuring, improving, and enforcing effective fault-tolerance
* Scalable Byzantine fault-tolerance and security from single-fault and fail-silent violations
* Atmospheric evaluations relevant to HPC systems (terrestrial neutrons, temperature, voltage, etc.)
* Near-threshold-voltage implications and evaluations for reliability
* Benchmarks and experimental environments including fault injection
* Frameworks and APIs for fault-tolerance and fault management

Submissions are solicited in the following categories:
* Regular papers presenting innovative ideas improving the state of the art or discussing the
   issues seen on existing extreme-scale systems, including some form of analysis and
* Extended abstracts proposing disruptive ideas and challenging assumptions in the field,
   including some form of preliminary results.
Extended abstracts will be evaluated separately and given shorter oral presentations.

Submissions shall be sent electronically and must conform to the SC20 proceedings style.  Regular papers should not exceed ten (10) pages including all text, appendices, figures, and references.  Extended abstracts should not exceed six (6) pages.  These are *maximum* page lengths; shorter submissions are welcome.

Scott Levy - Sandia National Laboratories
Nathan DeBardeleben - Los Alamos National Laboratory

Keita Teranishi - Sandia National Laboratories
John Daly - Laboratory for Physical Sciences

Questions? Contact Scott Levy (sllevy at sandia.gov).
