           ScalA20: 11th Workshop on Latest Advances in
           Scalable Algorithms for Large-Scale Systems

                   held in conjunction with
     SC20: The International Conference on High Performance
           Computing, Networking, Storage and Analysis

     in cooperation with the IEEE Computer Society Technical
        Consortium on High Performance Computing (TCHPC)

               November 12, 2020 --- Virtual Location


Novel scalable scientific algorithms are needed in order to enable key
science applications to exploit the computational power of large-scale
systems. This is especially true for the current tier of leading petascale
machines and the road to exascale computing as HPC systems continue to scale
up in compute node and processor core count. These extreme-scale systems
require novel scientific algorithms to hide network and memory latency, have
very high computation/communication overlap, have minimal communication, and
have no synchronization points. With the advent of Big Data and AI in the
past few years, the need for scalable mathematical methods and algorithms
able to handle data- and compute-intensive applications at scale has become
even more important.

Scientific algorithms for multi-petaflop and exaflop systems also need to be
fault tolerant and fault resilient, since the probability of faults increases
with scale. Resilience at the system software and at the algorithmic level is
needed as a crosscutting effort. Finally, with the advent of heterogeneous
compute nodes that employ standard processors as well as GPGPUs, scientific
algorithms need to match these architectures to extract the most performance.
This includes different system-specific levels of parallelism as well as
co-scheduling of computation. Key science applications require novel
mathematical models and system software that address the scalability and
resilience challenges of current- and future-generation extreme-scale HPC
systems.

Workshop Chairs

- Vassil Alexandrov, Hartree Centre, Science and Technology Facilities
  Council, UK
- Al Geist, Oak Ridge National Laboratory, USA
- Jack Dongarra, University of Tennessee, Knoxville, USA

Workshop Program Chair

- Christian Engelmann, Oak Ridge National Laboratory, USA
  Contact: engelmannc at ornl.gov

Workshop Program

The workshop will be held as a live online session on Thursday, November 12, 2020, 10:00 - 18:30 in the US Eastern Time Zone. The live session will be recorded and made available on demand afterwards. The SC20 virtual platform can be found here: https://www.eventscribe.net/2020/SC20/index.asp?

All times are in the US Eastern Time Zone:

Session 1
10:00-10:10 Welcome
10:10-11:00 Keynote 1: "Performance Evaluation of The Supercomputer "Fugaku" and A64FX Manycore Processor," Mitsuhisa Sato (RIKEN Center for Computational Science, Japan).
11:00-11:25 Paper 1: "An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement," Takeshi Iwashita, Kengo Suzuki and Takeshi Fukaya.

11:25-11:50 Coffee break (coffee on your own)

Session 2
11:50-12:40 Keynote 2: "High Performance Data Analytics and Some Applications," Nahid Emad (University of Paris-Saclay, France).
12:40-13:05 Paper 2: "Two-stage Asynchronous Iterative Solvers for multi-GPU Clusters," Pratik Nayak, Terry Cojean and Hartwig Anzt.
13:05-13:30 Paper 3: "Revisiting exponential integrator methods for HPC with a mini-application," James Douglas Shanks.
13:30-13:55 Paper 4: "A Survey of Singular Value Decomposition Methods for Distributed Tall/Skinny Data," Drew Schmidt.

13:55-14:30 Lunch break (lunch on your own)

Session 3
14:30-15:20 Keynote 3: "ECP: Recent Experiences in Porting Complex Applications to Accelerator-based Systems," Andrew Siegel (Argonne National Laboratory, USA).
15:20-15:45 Paper 5: "Replacing Pivoting in Distributed Gaussian Elimination with Randomized Techniques," Neil Lindquist, Piotr Luszczek and Jack J. Dongarra.
15:45-16:10 Paper 6: "Recursive Basic Linear Algebra Operations on TensorCore GPU," Shaoshuai Zhang, Vivek Karihaloo and Panruo Wu.
16:10-16:35 Paper 7: "High-Order Finite Element Method using Standard and Device-Level Batch GEMM on GPUs," Natalie Beams, Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra, Tzanio Kolev and Yohann Dudouit.

16:35-17:00 Coffee break (coffee on your own)

Session 4
17:00-17:25 Paper 8: "A Fast Scalable Iterative Implicit Solver with Green's function-based Neural Networks," Tsuyoshi Ichimura, Kohei Fujita, Muneo Hori, Lalith Maddegedara, Naonori Ueda and Yuma Kikuchi.
17:25-17:50 Paper 9: "Implementation and Numerical techniques for One Eflop/s HPL-AI benchmark on Fugaku," Toshiyuki Imamura, Shuhei Kudo, Keigo Nitadori and Takuya Ina.
17:50-18:15 Paper 10: "Performance Analysis of a Quantum Monte Carlo Application on Multiple Hardware Architectures Using the HPX Runtime," Weile Wei, Arghya Chatterjee, Kevin Huck, Oscar Hernandez and Hartmut Kaiser.
18:15-18:30 Closing

The workshop program is also listed in the SC online program: https://sc20.supercomputing.org/session/?sess=sess214


