[hpc-announce] Save the date - Friday, May 21 - Call for participation - 2nd Workshop on High-Performance Storage - HPS 2021 - held virtually in conjunction with IPDPS 2021

Gabriel Antoniu gabriel.antoniu at inria.fr
Tue May 18 05:30:52 CDT 2021

CALL FOR PARTICIPATION
2nd Workshop on High-Performance Storage
HPS 2021 
Held virtually on May 21, 2021, in conjunction with IPDPS 2021
Created in 2020, HPS covers all aspects of high-performance I/O and storage, including storage hardware, storage systems, libraries, and I/O-intensive applications.

NOTE: It is assumed that all workshop attendees will be registered with IPDPS 2021 <https://www.ipdps.org> to obtain access to the proceedings and to all live events and recorded sessions of the conference. Up-to-date information for accessing the workshop will be posted on the workshop web site.

08:00 - 08:10: 	Welcome message from the chairs
08:10 - 09:00: 	Keynote: Designing High-Performance Storage for a World after Hard Drives - Glenn K. Lockwood, NERSC
Paper and Invited Talk Session
09:00 - 09:30:	Facilitating Staging-based Unstructured Mesh Processing to Support Hybrid In-Situ Workflows - Wang, Subedi, Dorier, Davis, Parashar
09:30 - 10:00:	The Storage System of the Fugaku Supercomputer  - Takuya Okamoto, Fujitsu
10:00 - 10:30:	Exploring MPI Collective I/O and File-per-process I/O for Checkpointing a Logical Inference Task - Fan, Micinski, Gilray, Kumar
10:30 - 11:00:	Meaningful Measurements? IO500’s 5th Year’s Search for Meaning - Jay Lofstead, Sandia National Laboratories


Keynote: Designing High-Performance Storage for a World after Hard Drives
Glenn K. Lockwood 
National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory

Abstract: We will discuss the architecture of the Perlmutter file system and the quantitative approach NERSC used to ensure that this all-flash file system would provide the best balance of capacity, performance, endurance, and stability for NERSC's 8,000 users. We will also discuss unresolved challenges in designing extreme-scale all-flash storage systems, then conclude with several promising future directions in storage systems design that NERSC will be pursuing over the next five years.

Bio: Glenn K. Lockwood is the principal storage architect at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory where he leads future storage systems design, I/O performance engineering, and many storage R&D activities across the center. He was a lead designer of the 35 PB all-NVMe Perlmutter file system, and he also played a key role in defining NERSC's Storage 2020 vision which culminated in the deployment of its 128 PB Community File System. In addition to storage systems design, Glenn is also actively engaged in the parallel I/O community; he represents NERSC on the HPSS Executive Committee, is a maintainer of the IOR and mdtest community benchmarks, and is a contributor to the Darshan I/O profiling library. Glenn holds a Ph.D. in materials science and a B.S. in ceramic engineering from Rutgers University.

Invited Talk:  The Storage System of the Fugaku Supercomputer
Takuya Okamoto, Fujitsu Ltd.
Abstract: The Fugaku supercomputer, developed by RIKEN and Fujitsu, is currently the world's fastest supercomputer. To provide a large-capacity, high-performance storage system, Fugaku adopts a 3-level hierarchical storage system: the 1st layer serves as a dedicated high-performance filesystem for each job execution, the 2nd layer provides large-capacity shared filesystems used by users and jobs, and the 3rd layer provides commercial cloud storage. The 2nd-layer storage adopts the Fujitsu Exabyte File System (FEFS), a Lustre-based filesystem developed during the K computer project. For the 1st-layer storage, we have developed a new filesystem called Lightweight Layered IO-Accelerator (LLIO). LLIO provides 3 types of area to jobs: a transparent cache of the 2nd-layer filesystem and 2 temporary filesystems. LLIO also provides an efficient file-copying command to relieve hotspots, which often become performance bottlenecks when large-scale jobs read shared input files. This talk presents an overview of the Fugaku storage system, LLIO functionalities, and their performance.
Bio: Takuya Okamoto received his B.E. and M.E. degrees from The University of Tokyo in 2014 and 2016, respectively. He then joined Fujitsu Ltd., where he has spent 5 years as a developer of the Lightweight Layered IO-Accelerator (LLIO).

Invited Talk:  Meaningful Measurements? IO500’s 5th Year’s Search for Meaning
Jay Lofstead, Sandia National Laboratories
Abstract: The IO500 was created to encourage users to submit information about their data centers, particularly their storage systems, by providing a competition for bragging rights. The initial workloads were well justified at the time, and the 10 Node Challenge was later added to increase participation. As the list has matured, reflection on the existing benchmarks, whether they remain relevant, and how to represent new workloads has prompted considerable discussion. This talk will examine the roots and motivations of the IO500 benchmark suite and the challenges it has revealed in making measurements that are meaningful beyond a competition.

Bio: Dr. Jay Lofstead is a Principal Member of Technical Staff in the Scalable System Software department of the Center for Computing Research at Sandia National Laboratories in Albuquerque, NM. His work focuses on infrastructure to support all varieties of simulation, scientific, and engineering workflows with a strong emphasis on IO, middleware, storage, transactions, operating system features to support workflows, containers, software engineering and reproducibility. He is co-founder of the IO-500 storage list. He also works extensively to support various student mentoring and diversity programs at several venues each year including outreach to both high school and college students. Jay graduated with a BS, MS, and PhD in Computer Science from Georgia Institute of Technology and was a recipient of a 2013 R&D 100 award for his work on the ADIOS IO library.
WORKSHOP OVERVIEW
Advances in storage are becoming increasingly critical because workloads on high-performance computing (HPC) and cloud systems are producing and consuming more data than ever before, and this trend will only intensify in the coming years. At the same time, recent decades have seen relatively few changes in the structure of parallel file systems, and limited interaction between the evolution of parallel file systems (e.g., Lustre, GPFS) and I/O support systems that take advantage of hierarchical storage layers (e.g., node-local burst buffers). Recently, however, the community has seen a large uptick in innovation in storage systems and I/O support software, for several reasons:
Technology: The availability of an increasing number of persistent solid-state storage technologies that can replace either memory or disk are creating new opportunities for the structure of storage systems.
Performance requirements: Disk-based parallel file systems cannot satisfy the performance needs of high-end systems. However, it is not clear how solid-state storage can best be used to achieve the needed performance, so new approaches for using solid-state storage in HPC systems are being designed and evaluated.
Application evolution: Data analysis applications, including graph analytics and machine learning, are becoming increasingly important both for scientific computing and for commercial computing. I/O is often a major bottleneck for such applications, both in cloud and HPC environments, especially when fast turnaround or tight integration of heavy computation and analysis is required.
Infrastructure evolution: In the future, HPC technology will no longer be deployed only in dedicated supercomputing centers. “Embedded HPC”, “HPC in the box”, “HPC in the loop”, “HPC in the cloud”, “HPC as a service”, and “near-real-time simulation” are concepts requiring new small-scale deployment environments for HPC. Creating a “continuum” of computing will require a federation of systems and functions with consistent mechanisms for managing I/O, storage, and data processing across all participating systems.
Virtualization and disaggregation: As virtualization and disaggregation become broadly used in cloud and HPC computing, the issue of virtualized storage has increasing importance and efforts will be needed to understand its implications for performance.
Our goals in the HPS Workshop are to bring together expert researchers and developers in storage and I/O from across HPC and cloud computing to discuss advances and possible solutions to the new challenges we face.

Workshop Chairs:
Gabriel Antoniu, Inria, France  - Chair - gabriel.antoniu at inria.fr <mailto:gabriel.antoniu at inria.fr> 
Marc Snir, University of Illinois at Urbana-Champaign, USA - Co-Chair - snir at illinois.edu <mailto:snir at illinois.edu> 
Program Co-Chairs:
Bogdan Nicolae, Argonne National Lab, USA - Chair - bogdan.nicolae at acm.org <mailto:bogdan.nicolae at acm.org>
Osamu Tatebe, University of Tsukuba, Japan, Co-Chair - tatebe at cs.tsukuba.ac.jp <mailto:tatebe at cs.tsukuba.ac.jp> 
Program Committee:
Angelos Bilas, FORTH, Greece
Suren Byna, LBNL, USA
Franck Cappello, ANL, USA
Jesus Carretero, Universidad Carlos III de Madrid, Spain
Toni Cortes, Barcelona Supercomputing Center, Spain
Kathryn Mohror, Lawrence Livermore National Laboratory, USA
Alexandru Costan, Inria and INSA Rennes, France
Matthieu Dorier, Argonne National Lab, USA
Dana Petcu, West University of Timisoara, Romania
Michael Schoettner, University of Düsseldorf, Germany
Domenico Talia, University of Calabria, Italy
Kento Sato, RIKEN, Japan
François Tessier, Inria, France
Weikuan Yu, Florida State University, USA

For additional details, see the workshop web site: https://sites.google.com/view/hps-2021/home

Gabriel Antoniu, Research Director, Inria
Head of the KerData Research Team
Inria, Rennes - Bretagne Atlantique Research Center
Campus de Beaulieu, 35042 Rennes cedex, France
Tel: +33 (0) 2 99 84 72 44, Fax: +33 (0) 2 99 84 71 71
