[hpc-announce] Call for Participation - ScienceCloud 2021: 11th Workshop on Scientific Cloud Computing (co-located with HPDC)

Alexandru Costan alexandru.costan at inria.fr
Mon Jun 14 07:31:56 CDT 2021

*** ScienceCloud 2021 *** 
11th Workshop on Scientific Cloud Computing 
Workshop Date: June 21, 2021 

Held in conjunction with ACM HPDC 2021, Stockholm, Sweden 
(Conference dates: June 21-25, 2021) 
Web: https://sites.google.com/view/science-cloud/ 

The workshop is scheduled virtually on June 21, 2021. Times are EDT (UTC - 4:00). 

• 09:00 - 09:10 Opening 
• 09:10 - 10:10 Keynote 
• Kate Keahey (University of Chicago / Argonne National Laboratory): Taking Science from Cloud to Edge 
• 10:10 - 10:30 Technical Paper Presentation 
• Quincy Wofford (Los Alamos National Laboratory), Patrick Bridges (University of New Mexico), Patrick Widener (Sandia National Laboratory): A Layered Approach for Modular Container Construction and Orchestration in HPC Environments 
• 10:30 - 10:50 Break 
• 10:50 - 11:50 eScience Invited Talks 
• Christoph Kessler (Linköping University): Portable high-level programming of heterogeneous parallel systems with SkePU 
• Amit Chourasia (San Diego Supercomputer Center): Democratizing Scientific Data Management with SeedMeLab 
• Ben van Werkhoven (eScience Center): GPU code optimization and auto-tuning made easy with Kernel Tuner 
• Eric Coulter (Indiana University): HPC From the Ground to the Cloud 
• 11:50 - 12:00 Closing 

Keynote speaker - Kate Keahey (University of Chicago / Argonne National Laboratory) 

Kate Keahey is one of the pioneers of infrastructure cloud computing. She created the Nimbus project, recognized as the first open source Infrastructure-as-a-Service implementation, and continues to work on research aligning cloud computing concepts with the needs of scientific datacenters and applications. To facilitate such research for the community at large, Kate leads the Chameleon project, providing a deeply reconfigurable, large-scale, and open experimental platform for Computer Science research. To foster the recognition of contributions to science made by software projects, Kate co-founded and serves as co-Editor-in-Chief of the SoftwareX journal, a new format designed to publish software contributions. Kate is a Scientist at Argonne National Laboratory and a Senior Fellow at the Computation Institute at the University of Chicago. 

Taking Science from Cloud to Edge 

New research ideas require an instrument where they can be developed, tested, and shared. To support Computer Science experiments, such an instrument has to support a diversity of hardware configurations, deployment at scale, deep reconfigurability, and mechanisms for sharing so that new results can trigger further innovation. Most importantly, since science does not stand still, such an instrument requires constant adaptation to support an ever-increasing range of experiments driven by emergent ideas and opportunities. 
The NSF-funded Chameleon testbed for systems research and education (www.chameleoncloud.org) has been developed to provide all of those capabilities. The testbed provides many thousands of cores and over 5 PB of storage hosted at three sites (University of Chicago, Texas Advanced Computing Center, and Northwestern) connected by 100 Gbps networks. The hardware consists of a large homogeneous partition to facilitate experiments at scale, alongside a diverse set of hardware including accelerators, storage hierarchy nodes with a mix of HDDs, SSDs, and NVMe, high-bandwidth I/O storage, SDN-enabled networking hardware, and edge devices. To support experiments ranging from work in operating systems through networking to edge computing, Chameleon provides a range of reconfigurability options from bare metal to virtual machine management. To date, the testbed has supported 5,000+ users and 700+ research and education projects and has just been renewed until the end of 2024. 
This talk will describe the goals, design strategy, and the existing and future capabilities of the testbed, as well as some of the research and education projects our users are working on. I will also describe how Chameleon is evolving to support new research directions, in particular edge and IoT-based research and applications. Finally, I will introduce the services and tools we created to support sharing of experiments, curricula, and other digitally expressed artifacts that allow science to be shared via active involvement and foster reproducibility. 

eScience Invited Talks 

Christoph Kessler: Portable high-level programming of heterogeneous parallel systems with SkePU 
We live in the era of parallel and heterogeneous computer systems, with multi- and many-core CPUs, GPUs, and other types of accelerators being omnipresent. The execution and programming models exposed by modern computer architectures are diverse, parallel, heterogeneous, distributed, and far removed from the sequential von Neumann model of the early days of computing. Yet the convenience of single-threaded programming, together with technical debt from legacy code, makes us mentally stick to programming interfaces that follow the familiar von Neumann model, typically extended with various platform-specific APIs that allow programmers to explicitly control parallelism and accelerator usage. High-level parallel programming based on generic, portable programming constructs known as algorithmic skeletons can raise the level of abstraction and bridge the semantic gap between a sequential-looking, platform-independent single-source program code and the heterogeneous, parallel hardware. We present the design principles of one such framework, the latest generation of our open-source programming framework SkePU for heterogeneous parallel systems and clusters. The SkePU high-level programming interface is based on modern C++, leveraging variadic template metaprogramming and a custom source-to-source pre-compiler. SkePU currently provides seven (fully variadic) skeletons for data-parallel patterns such as map, reduce, stencil, and scan, together with high-level data abstractions for skeleton call operands. SkePU can perform automated optimizations of the high-level execution flow, such as context-dependent selection of the best backend among the supported platforms, as well as operand data transfer and memory optimizations. 
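The skeleton idea described above can be illustrated with a minimal sketch (written in Python for brevity; SkePU's actual interface is modern C++, and all names below are illustrative, not SkePU's API). A skeleton is a generic, pre-parallelized computation pattern that is instantiated with a user function, while the calling code stays sequential-looking and platform-independent:

```python
from functools import reduce as _fold

def map_skeleton(user_func):
    """Map skeleton: applies user_func element-wise over its operands.
    A real framework would dispatch this to a parallel backend
    (OpenMP, CUDA, OpenCL, ...) instead of a sequential loop."""
    def run(*operands):
        return [user_func(*args) for args in zip(*operands)]
    return run

def reduce_skeleton(binary_op, identity):
    """Reduce skeleton: combines all elements with an associative binary_op."""
    def run(xs):
        return _fold(binary_op, xs, identity)
    return run

# User code: sequential-looking, no platform-specific details.
add = map_skeleton(lambda a, b: a + b)
total = reduce_skeleton(lambda a, b: a + b, 0)

v = add([1, 2, 3], [10, 20, 30])  # element-wise sum -> [11, 22, 33]
s = total(v)                      # reduction -> 66
```

Because the pattern (not the user code) owns the execution strategy, the framework is free to choose a backend per call, which is the hook SkePU uses for its context-dependent backend selection.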

Amit Chourasia: Democratizing Scientific Data Management with SeedMeLab 
Researchers have an increasing need to manage preliminary and transient results. This evolving and growing corpus of data needs to be paired with its contextual information and findings. However, this amalgamation of data, metadata, context, and insights is often highly fragmented and dispersed across many systems, including local/remote file servers, emails, presentations, and meeting notes. Much of this information becomes increasingly cumbersome to assimilate, use, and reuse over time. Researchers often create ad-hoc strategies to manage this data using a variety of loosely glued-together tools; however, such fragile systems run into many limitations and burden researchers with continued maintenance investment. In this talk I will introduce SeedMeLab, an open-source scientific data management system that overcomes the limitations of ad-hoc systems by providing a robust feature set that includes data annotation, data discussion, data visualization, and discoverability, along with modular extensibility. It also provides full ownership, access control, and branding; it can be deployed on-premises or in the cloud, and is also available as a managed service. 

Ben van Werkhoven: GPU code optimization and auto-tuning made easy with Kernel Tuner 
Graphics Processing Units (GPUs) have revolutionized the computing landscape in the past decade and are seen as one of the enabling factors in recent breakthroughs in Artificial Intelligence. While GPUs are used to enable scientific computing workloads in many fields, including climate modeling, artificial intelligence, and quantum physics, it is actually very hard to unlock the full computational power of the GPU. This is because there are many degrees of freedom in GPU programming, and often there are only a handful of specific combinations of thread block dimensions and other code optimization parameters, such as tiling or unrolling factors, that result in dramatically higher performance than other kernel configurations. To obtain such highly efficient kernels, it is often necessary to search vast and discontinuous search spaces consisting of all possible combinations of values for all tunable parameters, which is infeasible to do by hand. This talk gives a brief introduction to Kernel Tuner, an easy-to-use tool for testing and auto-tuning OpenCL, CUDA, Fortran, and C kernels, with support for many search optimization algorithms that accelerate the tuning process. 
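The combinatorial search space the abstract refers to can be sketched with a small toy (this is not Kernel Tuner's API; the parameter names and the benchmark function are hypothetical stand-ins for compiling and timing a real GPU kernel). Even three tunable parameters already produce dozens of configurations, and real kernels have many more:

```python
from itertools import product

# Hypothetical tunable parameters for a GPU kernel; every combination
# of values is one point in the (often discontinuous) search space.
tune_params = {
    "block_size_x": [32, 64, 128, 256],
    "tile_size":    [1, 2, 4],
    "unroll":       [0, 1],
}

def mock_benchmark(cfg):
    """Stand-in for compiling and timing one kernel configuration.
    An auto-tuner would instead measure real execution time on the GPU."""
    return (abs(cfg["block_size_x"] - 128) / 128
            + 1.0 / cfg["tile_size"]
            + 0.1 * cfg["unroll"])

names = list(tune_params)
space = [dict(zip(names, vals)) for vals in product(*tune_params.values())]
best = min(space, key=mock_benchmark)  # exhaustive search over all points
print(len(space), best)
```

Exhaustive enumeration like this is exactly what becomes infeasible as parameters multiply, which is why tools in this space also offer search optimization algorithms that sample the space instead of sweeping it.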

Eric Coulter: HPC From the Ground to the Cloud 
This talk describes how we've created and used a “Virtual Cluster toolkit” (VC) and “Container Template Library” (CTL) for building elastic, cloud-based HPC resources to support both Science Gateways and new HPC administrators. The VC toolkit has been invaluable for training new HPC administrators and users, in addition to creating production resources for approximately 25 projects, including SeaGrid, UltraScan, and multiple fast-track COVID-19 research projects. This was based on work from the XSEDE Cyberinfrastructure and Resource Integration (XCRI) team and the Cyberinfrastructure Integration Research Center (CIRC). These teams have been addressing the problem of disconnected and unequally distributed research computing on three fronts: by helping under-resourced institutions build XSEDE-like resources (XCRI), by providing container templates to ease installation and increase portability of scientific software (XCRI), and by enabling easy access to compute resources through science gateways (CIRC). The XCRI team provides both toolkits and hands-on consultation to growing institutions. CIRC supports scientific software developers through Science Gateways using the Airavata middleware, in addition to working with gateway providers to develop software and grow their communities. Collaboration between the two resulted in the creation of the toolkits, which have proven beneficial to users and researchers at the edges of the US cyberinfrastructure community. In particular, the VC and CTL support several needs surrounding scientific software development in areas less served by established community software, such as rapid development cycles, easing deployment to existing HPC systems, and efficient use of limited cloud resources. 
