[hpc-announce] CFP - DEADLINE EXTENDED (17 Nov 2023): 6th AccML Workshop at HiPEAC 2024
Jose Cano Reyes
Jose.CanoReyes at glasgow.ac.uk
Tue Nov 7 03:34:10 CST 2023
==================================================================
6th Workshop on Accelerated Machine Learning (AccML)
Co-located with the HiPEAC 2024 Conference
(https://www.hipeac.net/2024/munich/)
January 17, 2024
Munich, Germany
==================================================================
-------------------------------------------------------------------------
CALL FOR CONTRIBUTIONS
-------------------------------------------------------------------------
The remarkable performance achieved by machine learning in a variety of
application areas (natural language processing, computer vision, games,
etc.) has led to the emergence of heterogeneous architectures to
accelerate machine learning workloads. In parallel, production
deployment, model complexity, and diversity have pushed for
higher-productivity systems, more powerful programming abstractions,
software and system architectures, dedicated runtime systems and
numerical libraries, and deployment and analysis tools.
Deep learning models are generally memory- and compute-intensive,
for both training and inference. Accelerating these operations has
obvious advantages: first, it reduces energy consumption (e.g. in
data centers), and second, it makes these models usable on smaller
devices at the edge of the Internet. In addition, while convolutional
neural networks have motivated much of this effort, numerous
applications and models involve a wider variety of operations, network
architectures, and data processing. These applications and models
continually challenge computer architecture, the system stack, and
programming abstractions. The high level of interest in these areas
calls for a dedicated forum to discuss emerging acceleration techniques
and computation paradigms for machine learning algorithms, as well as
the applications of machine learning to the construction of such systems.
-------------------------------------------------------------------------
LINKS TO THE WORKSHOP PAGES
-------------------------------------------------------------------------
Organizers: https://accml.dcs.gla.ac.uk/
HiPEAC: https://www.hipeac.net/2024/munich/#/program/sessions/8090/
-------------------------------------------------------------------------
TOPICS
-------------------------------------------------------------------------
Topics of interest include (but are not limited to):
- Novel ML systems: heterogeneous multi/many-core systems, GPUs and FPGAs;
- Software ML acceleration: languages, primitives, libraries, compilers
and frameworks;
- Novel ML hardware accelerators and associated software;
- Emerging semiconductor technologies with applications to ML hardware
acceleration;
- ML for the construction and tuning of systems;
- Cloud and edge ML computing: hardware and software to accelerate
training and inference;
- Computing systems research addressing the privacy and security of
ML-dominated systems;
- ML techniques for more efficient model training and inference (e.g.
sparsity, pruning);
- Generative AI and its impact on computational resources.
-------------------------------------------------------------------------
INVITED SPEAKERS
-------------------------------------------------------------------------
- *Giuseppe Desoli (STMicroelectronics)*: Revolutionizing Edge AI:
Enabling Ultra-low-power and High-performance Inference with In-memory
Computing Embedded NPUs
_Abstract_: The increasing demand for Edge AI has led to the development
of complex cognitive applications on edge devices, where energy
efficiency and compute density are crucial. While HW Neural Processing
Units (NPUs) have already shown considerable benefits, the growing need
for more complex algorithms demands significant improvements. To address
the limitations of traditional Von Neumann architectures, novel designs
based on computational memories are being developed by industry and
academia. In this talk, we present STMicroelectronics' future directions
in designing NPUs that integrate digital and analog In-Memory Computing
(IMC) technology with high-efficiency dataflow inference engines capable
of accelerating a wide range of Deep Neural Networks (DNNs). Our
approach combines SRAM computational memory and phase change resistive
memories, and we discuss the architectural considerations and
purpose-designed compiler mapping algorithms required for practical
industrial applications, as well as some challenges we foresee in
harnessing the potential of In-memory Computing going forward.
- *John Kim (KAIST)*: Domain-Specific Networks for Accelerated Computing
_Abstract_: Domain-specific architectures are hardware computing engines
that are specialized for a particular application domain. As
domain-specific architectures become widely used, the interconnection
network can become the bottleneck for the system as the system scales.
In this talk, I will present the role of domain-specific interconnection
networks to enable scalable domain-specific architectures. In
particular, I will present the impact of the physical/logical topology
of the interconnection network on communication such as AllReduce in
domain-specific systems. I will also discuss the opportunity of
domain-specific interconnection networks and how they can be leveraged
to optimize overall system performance and efficiency. As a case study,
I will present the unique design of the Groq software-managed scale-out
system and how it adopts architectures from high-performance computing
to enable a domain-specific interconnection network.
- *Adam Paszke (Google)*: A Multi-Platform High-Productivity Language
for Accelerator Kernels
_Abstract_: Compute accelerators are the workhorses of modern scientific
computing and machine learning workloads. However, their ever-increasing
performance comes at the cost of increasing micro-architectural
complexity. Worse, it happens at a speed that makes it hard for both
compilers and low-level kernel authors to keep up. At the same time, the
increased complexity makes it even harder for a wider audience to author
high-performance software, leaving them almost entirely reliant on
high-level libraries and compilers. In this talk I plan to introduce
Pallas, a domain-specific language embedded in Python and built on top
of JAX. Pallas is heavily inspired by the recent development and success
of the Triton language and compiler, and aims to present users with a
high-productivity programming environment that is a minimal extension
over native JAX. For example, kernels can be implemented using the
familiar JAX-NumPy language, while a single line of code can be
sufficient to interface the kernel with a larger JAX program. Uniquely,
Pallas kernels support a subset of JAX program transformations, making
it possible to derive a number of interesting operators from a single
implementation. Finally, based on our experiments, Pallas can be
leveraged for high-performance code generation not only for GPUs, but
also for other accelerator architectures such as Google’s TPUs.
- *Ayse Coskun (Boston University)*: ML-Powered Diagnosis of Performance
Anomalies in Computer Systems
_Abstract_: Today’s large-scale computer systems that serve
high-performance computing and cloud workloads face challenges in
delivering predictable performance while maintaining efficiency,
resilience, and security. Much of computer system management has
traditionally relied on (manual) expert analysis and on policies built
on heuristics derived from such analysis. This talk will discuss a new
path toward designing ML-powered “automated analytics” methods for
large-scale computer systems and how to make strides towards a
longer-term vision where
computing systems are able to self-manage and improve. Specifically, the
talk will first cover how to systematically diagnose root causes of
performance “anomalies”, which cause substantial efficiency losses and
higher cost. Second, it will discuss how to identify applications
running on computing systems and discuss how such discoveries can help
reduce vulnerabilities and avoid unwanted applications. The talk will
also highlight how to apply ML in a practical and scalable way to help
understand complex systems, demonstrate methods to help standardize
the study of performance anomalies, discuss explainability of applied ML
methods in the context of computer systems, and point out future
directions in automating computer system management.
-------------------------------------------------------------------------
SUBMISSION
-------------------------------------------------------------------------
Papers will be reviewed by the workshop's technical program committee
according to criteria regarding the submission's quality, relevance to
the workshop's topics, and, foremost, its potential to spark discussions
about directions, insights, and solutions in the context of accelerating
machine learning. Research papers, case studies, and position papers are
all welcome.
In particular, we encourage authors to submit work-in-progress papers:
To facilitate sharing of thought-provoking ideas and high-potential
though preliminary research, authors are welcome to make submissions
describing early-stage, in-progress, and/or exploratory work in order to
elicit feedback, discover collaboration opportunities, and spark
productive discussions.
The workshop does not have formal proceedings.
-------------------------------------------------------------------------
IMPORTANT DATES
-------------------------------------------------------------------------
Submission deadline: November 17, 2023
Notification of decision: December 8, 2023
-------------------------------------------------------------------------
ORGANIZERS
-------------------------------------------------------------------------
José Cano (University of Glasgow)
Valentin Radu (University of Sheffield)
José L. Abellán (University of Murcia)
Marco Cornero (DeepMind)
Ulysse Beaugnon (Google)
Juliana Franco (DeepMind)