[hpc-announce] FTXS @SC20 : Call for participation

Levy, Scott Larson sllevy at sandia.gov
Wed Nov 4 09:16:32 CST 2020

FTXS 2020 @ SC20
10:00a-1:30p Wednesday, November 11th, 2020


FTXS is an important forum for presenting and discussing cutting-edge research on fault tolerance for extreme-scale systems.  We have a very strong program this year, and we hope that you'll join us on Wednesday.

Opening Remarks

Improving Scalability of Silent-Error Resilience for Message-Passing Solvers via Local Recovery and Asynchrony (Kolla, Mayo, Teranishi, Armstrong)

Towards Distributed Software Resilience in Asynchronous Many-Task Programming Models (Gupta, Mayo, Lemoine, Kaiser)

Models for Resilience Design Patterns (Kumar, Engelmann)


From tasks graphs to asynchronous distributed checkpointing with local restart (Lion, Thibault)

A Generic Strategy for Node-Failure Resilience for Certain Iterative Linear Algebra Methods (Pachajoa, Ernstbrunner, Gansterer)

Checkpointing OpenSHMEM Programs Using Compiler Analysis (Shahneous Bari, Basu, Lu, Curtis, Chapman)

Closing Remarks

The Workshop Program is also available at: https://sites.google.com/site/ftxsworkshop/home/ftxs-2020
