MICRO 2023 Real-World PIM Tutorial

Real-world Processing-in-Memory Systems for Modern Workloads

Tutorial Description

Processing-in-Memory (PIM) is a computing paradigm that aims at overcoming the data movement bottleneck (i.e., the waste of execution cycles and energy resulting from the back-and-forth data movement between memory units and compute units) by making memory compute-capable.

Explored over several decades since the 1960s, PIM systems are becoming a reality with the advent of the first commercial products and prototypes.

A number of startups (e.g., UPMEM, Neuroblade) are already commercializing real PIM hardware, each with its own design approach and target applications. Several major vendors (e.g., Samsung, SK Hynix, Alibaba) have presented real PIM chip prototypes in the last two years. Most of these architectures have in common that they place compute units near the memory arrays. This type of PIM is called processing near memory (PNM).

PIM can provide large improvements in both performance and energy consumption for many modern applications, thereby enabling a commercially viable way of dealing with the huge amounts of data that are bottlenecking our computing systems. Yet, it is critical to (1) study and understand the characteristics that make a workload suitable for a PIM architecture, (2) propose optimization strategies for PIM kernels, and (3) develop programming frameworks and tools that can lower the learning curve and ease the adoption of PIM.

This tutorial focuses on the latest advances in PIM technology, workload characterization for PIM, and programming and optimizing PIM kernels. We will (1) provide an introduction to PIM and a taxonomy of PIM systems, (2) give an overview and a rigorous analysis of existing real-world PIM hardware, (3) conduct hands-on labs about important workloads (machine learning, sparse linear algebra, bioinformatics, etc.) using real PIM systems, and (4) shed light on how to improve future PIM systems for such workloads.

Livestream

Zoom livestream

YouTube livestream

Organizers

Name	E-mail
Juan Gómez Luna	juan.gomez@safari.ethz.ch
Onur Mutlu	onur.mutlu@safari.ethz.ch
Ataberk Olgun	ataberk.olgun@safari.ethz.ch
Geraldo F. Oliveira	geraldod@safari.ethz.ch

Agenda (October 29, 2023)

Lectures (tentative schedule, time zone: EDT, GMT-4)

7:55am-8:00am, Dr. Juan Gómez Luna, “Welcome & Agenda.”
8:00am-9:20am, Prof. Onur Mutlu / Geraldo F. Oliveira, “Memory-centric Computing: Introduction to PIM as a Paradigm to Overcome the Data Movement Bottleneck.”
- PIM taxonomy: PNM (processing near memory) and PUM (processing using memory).
- DAMOV Workload Characterization Methodology.
9:20am-10:20am, Dr. Juan Gómez Luna, “Processing-Near-Memory: Real PNM.”
- PNM prototypes: Samsung HBM-PIM, SK Hynix AiM, Samsung AxDIMM, Alibaba HB-PNM.
- UPMEM PIM: Architecture Characterization, Programming.

Coffee break (10:20am-10:40am)

10:40am-11:20am, Prof. Youngsok Kim (Yonsei University), “PID-Join: A Fast In-Memory Join Algorithm for Commodity PIM-Enabled DIMMs.”
11:20am-12:00pm, Dr. Abu Sebastian (IBM Research - Zürich), “PUM Based on Memristive Devices: The IBM HERMES Project Chip.”

Lunch break (12:00pm-1:00pm)

1:00pm-2:00pm, Geraldo F. Oliveira, “Processing-Using-DRAM: Ambit, SIMDRAM, pLUTo.”
2:00pm-3:15pm, Dr. Juan Gómez Luna, “Accelerating Modern Workloads on a General-purpose PIM System: Machine Learning, Genomics…”
3:15pm-3:45pm, Dr. Juan Gómez Luna, “Adoption Issues: How to Enable PIM?”
3:45pm-4:15pm, Dr. Juan Gómez Luna, “SimplePIM: A Software Framework for High-level PIM Programming.”
4:15pm-5:00pm, Ataberk Olgun, “Processing-Using-Memory Prototypes: PiDRAM.”
5:00pm-5:10pm, Dr. Juan Gómez Luna, “Introduction/Preparation for Hands-on Labs.”
- Optional - Hands-on Lab: Programming and Understanding a Real PIM Architecture.

Tutorial Materials

Time	Speaker	Title	Materials
7:55am-8:00am	Dr. Juan Gómez Luna	Welcome & Agenda	(PDF) (PPT)
8:00am-9:20am	Prof. Onur Mutlu / Geraldo F. Oliveira	Memory-Centric Computing	(PDF) (PPT)
9:20am-10:20am	Dr. Juan Gómez Luna	Processing-Near-Memory: Real PNM Architectures / Programming General-purpose PIM	(PDF) (PPT)
10:40am-11:20am	Prof. Youngsok Kim	PID-Join: A Fast In-Memory Join Algorithm for Commodity PIM-Enabled DIMMs	(PDF) SIGMOD'2023
11:20am-12:00pm	Dr. Abu Sebastian	PUM Based on Memristive Devices: The IBM HERMES Project Chip	(PDF) (PPT), Lecture (ETH Zürich, Fall 2020), IBM Analog Hardware Acceleration Kit, Nature Nanotechnology (2020), Nature Electronics (2023), IEEE VLSI (2023), Nature Communications (2023)
1:00pm-2:00pm	Geraldo F. Oliveira	Processing-Using-DRAM: Ambit, SIMDRAM, pLUTo	(PDF) (PPT)
2:00pm-3:15pm	Dr. Juan Gómez Luna	Accelerating Modern Workloads on a General-purpose PIM System: Machine Learning, Genomics…	(PDF) (PPT)
3:15pm-3:45pm	Dr. Juan Gómez Luna	Adoption Issues: How to Enable PIM?	(PDF) (PPT)
3:45pm-4:15pm	Dr. Juan Gómez Luna	SimplePIM: A Software Framework for High-level PIM Programming	(PDF) (PPT)
4:15pm-5:00pm	Ataberk Olgun	Processing-Using-Memory Prototypes: PiDRAM	(PDF) (PPT)
5:00pm-5:10pm	Dr. Juan Gómez Luna	Hands-on Lab: Programming and Understanding a Real Processing-in-Memory Architecture	(Handout) (PDF) (PPT)

Learning Materials

Recommended Materials

Gómez-Luna, J., and Mutlu, O., Data-Centric Architectures: Fundamentally Improving Performance and Energy (227-0085-37L), ETH Zürich, Fall 2022.
- Course Website
- Lecture Playlist
Mutlu, O., Ghose, S., Gómez-Luna, J., and Ausavarungnirun, R. A Modern Primer on Processing in Memory. In Emerging Computing: From Devices to Systems, 2023.
- PDF (arXiv)
Gómez-Luna, J., El Hajj, I., Fernandez, I., Giannoula, C., Oliveira, G. F., and Mutlu, O. Benchmarking a New Paradigm: Experimental Analysis and Characterization of a Real Processing-in-Memory System. IEEE Access, 2022.
- PDF (arXiv)
- Repository (GitHub)
Giannoula, C., Fernandez, I., Gómez-Luna, J., Koziris, N., Goumas, G., and Mutlu, O. SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures. SIGMETRICS 2022.
- PDF (arXiv)
- Repository (GitHub)
Olgun, A., Gómez-Luna, J., Kanellopoulos, K., Salami, B., Hassan, H., Ergin, O., and Mutlu, O. PiDRAM: A Holistic End-to-end FPGA-based Framework for Processing-in-DRAM. ACM TACO, 2022.
- PDF (arXiv)
- Repository (GitHub)

More Learning Materials

Mutlu, O., Memory-Centric Computing (IMACAW Keynote Talk at DAC 2023), July 2023:
- PDF PPT Video
Processing-in-memory: A workload-driven perspective (summary paper about recent research in PIM):
- PDF
Processing Data Where It Makes Sense: Enabling In-Memory Computation (summary paper about recent research in PIM):
- PDF
Processing-in-Memory course (Spring 2022):
- Course website
UPMEM SDK documentation: The first real-world PIM architecture
