ISCA 2024 Tutorial on Memory-Centric Computing Systems (Half Day)

Tutorial Description

Processing-in-Memory (PIM) is a computing paradigm that aims at overcoming the data movement bottleneck (i.e., the waste of execution cycles and energy resulting from the back-and-forth data movement between memory units and compute units) by making memory (and storage) systems compute-capable.

Explored over several decades since the 1960s, PIM systems are now becoming a reality with the advent of the first commercial products and prototypes.

Several startups (e.g., UPMEM, NeuroBlade, Mythic, Syntiant, Aizip, Axelera, d-Matrix, Gyrfalcon Technology, MemComputing, SEMRON, SureCore, Synthara, TetraMem, EnCharge AI) are already commercializing real PIM hardware, each with its own design approach and target applications. Major vendors (e.g., Samsung, SK Hynix, Micron, Alibaba) have presented real PIM chip and system prototypes in the past several years.

Recent PIM products and prototypes place compute units near the memory arrays. New memory interfaces like CXL (Compute Express Link) help enable compute-capable memories. At the same time, academia and industry are actively exploring other types of PIM, e.g., approaches that exploit the analog operation of DRAM, SRAM, flash memory, and emerging non-volatile memories, as well as hybrid PIM architectures that combine processing capabilities of different types at different parts of the memory/storage hierarchy.

PIM can improve performance and energy efficiency for many modern applications, offering a commercially viable way of handling the huge amounts of data that bottleneck our computing systems, a problem exacerbated by workloads like AI/ML and genomics. In fact, workloads like large language model training and inference can potentially be “killer applications” for PIM.

However, there are many open questions spanning the entire computing stack and many challenges for widespread adoption. For example, it is critical to (1) develop programming frameworks and tools that can lower the learning curve and ease the adoption of PIM systems, (2) develop methods to identify what type of PIM would be useful for what workload, and (3) design system and security mechanisms that enable PIM at a wider scale. Studying the implications of PIM for all aspects of computing systems and workloads is a challenging and exciting field.

This tutorial focuses on the latest advances in PIM technology, spanning both hardware and software, including novel PIM ideas, different tools and frameworks to conduct PIM research, and programming techniques and optimization strategies for PIM kernels. We will (1) provide an introduction to PIM and the taxonomy of PIM systems, (2) give an overview and a rigorous analysis of existing PIM hardware from industry and academia, (3) provide and describe hardware and software infrastructures that can enable new and experienced researchers to conduct research in PIM systems, and (4) shed light on how to improve future PIM systems for emerging memory-bound workloads. The tutorial will also incorporate invited talks from leading industry and academic researchers in PIM systems.

Livestream

YouTube livestream

Organizers

Agenda (June 29, 2024)

Lectures (tentative schedule, time zone: GMT-3)

Tutorial Materials

Time | Speaker | Title | Materials
09:00am-09:20am | Prof. Onur Mutlu / Geraldo F. Oliveira | Memory-Centric Computing | (PDF) (PPT)
09:30am-09:50am | Professor Minsoo Rhu | Memory-Centric Computing Systems – For AI and Beyond | (PDF) (PPT)
10:00am-10:20am | N/A | Coffee Break |
10:30am-10:50am | Dr. Mohammad Sadr | Processing-Near-Memory: Real PNM Architectures | (PDF) (PPT)
11:00am-11:20am | Geraldo F. Oliveira | Processing-Using-Memory for Bulk Bitwise Operations | (PDF) (PPT)
11:30am-11:50am | Professor Saugata Ghose | RACER and ReRAM Processing-Using-Memory | (PDF) (PPT)
12:00pm-12:20pm | Geraldo F. Oliveira | Programming Techniques, Infrastructure, and Research Challenges for PIM | (PDF) (PPT)
12:20pm-12:30pm | Geraldo F. Oliveira | Closing Remarks | (PDF) (PPT)

Learning Materials

More Learning Materials