FEBRUARY 1st, 2026 (08:45–17:45)
Room: Collaroy

MCCSys 2026

4th Workshop on Memory-Centric Computing Systems

In conjunction with the 32nd International Symposium on
High-Performance Computer Architecture (HPCA 2026)

Sydney, Australia

Submission Deadline
December 20th, 2025
Notification Date
December 27th, 2025
Workshop Date
February 1st, 2026

About

Processing-in-Memory (PIM) is a computing paradigm that aims to overcome data movement bottlenecks by making memory systems compute-capable. Explored over several decades since the 1960s, PIM systems are now becoming a reality with the advent of the first commercial products and prototypes. PIM can improve performance and energy efficiency for many modern applications. However, open questions remain across the entire computing stack, and widespread adoption still faces significant challenges.

This combined tutorial and workshop will focus on the latest advances in PIM technology, spanning both hardware and software. It will include novel PIM ideas, different tools and frameworks for conducting PIM research, and programming techniques and optimization strategies for PIM kernels. First, a series of lectures and invited talks will introduce PIM, including an overview and a rigorous analysis of existing PIM hardware from industry and academia. Second, we will invite the broad PIM research community to submit and present their ongoing work on memory-centric systems. The program committee will favor papers that bring new insights into memory-centric systems or novel PIM-friendly applications, address key system integration challenges in academic or industry PIM architectures, or put forward controversial points of view on the memory-centric execution paradigm. We also welcome position papers, especially from industry, that outline design and process challenges affecting PIM systems, new PIM architectures, or system solutions for real state-of-the-art PIM devices.


Call for Presentations

This workshop consists of invited talks on the general topic of memory-centric computing systems. There are a limited number of slots for invited talks. If you would like to deliver a talk on related topics, please contact us by filling out this form.

We invite abstract submissions related to (but not limited to) the following topics in the context of memory-centric computing systems:


Key Dates

Submission Deadline: December 20, 2025 (23:59 AoE)
Notification of Acceptance: December 27, 2025
Workshop Date: February 1st, 2026 (Full Day)

Agenda & Workshop Materials

Time            Talk
08:45 – 08:55   Logistics/Welcome
08:55 – 10:30   Memory-Centric Computing: Solving Memory's Computing Problem
10:30 – 11:00   Coffee Break
11:00 – 11:35   Processing-using-Memory in Real DRAM Chips
11:35 – 12:10   MASTODON: Enabling Early-Stage Cross-Stack Simulation for Processing Using Memory Systems
12:10 – 12:45   Fault Injection Framework for Processing-using-Memory Architectures
12:45 – 13:45   Lunch Break
13:45 – 14:20   Offloading to CXL Computational Memory
14:20 – 14:55   Architectural and System Software Support for PIM Integrated Systems
14:55 – 15:30   Revisiting Main Memory-Based Covert and Side Channel Attacks in the Context of Processing-in-Memory
15:30 – 16:00   Coffee Break
16:00 – 16:35   PIMphony: Overcoming Bandwidth and Capacity Inefficiency in PIM-based Long-Context LLM Inference Systems
16:35 – 17:10   Storage-Centric Systems for Genomics and Metagenomics (Nika Mansouri Ghiasi)
17:10 – 17:40   Conduit: Programmer-Transparent Near-Data Processing Using Multiple Compute-Capable Resources in Solid State Drives (Rakesh Nadig)
17:40 – 17:45   Closing Remarks

Invited Speakers

Prof. Ada Gavrilovska

Georgia Institute of Technology
"Offloading to CXL Computational Memory"

Short Bio: Ada Gavrilovska is a Professor in the School of Computer Science at Georgia Tech. Her research is focused on designing systems for emerging technologies, and she develops new systems software solutions in response to new hardware, applications, and use cases. Her past research has considered the impact on systems software from programmable network processors, high-performance interconnects, multi/manycores, virtualization and cloud computing. Her recent research is driven by two major trends rooted in the exponential growth in demand for data and for ever-faster insights from such data – the proliferation of new memory system designs, and the shift to edge computing. She has served as program or general chair for OSDI’24, SOCC’22, HPDC’22, and USENIX ATC’20, and as an Associate Editor for the IEEE Transactions on Cloud Computing and the ACM Transactions on Computer Systems. Gavrilovska’s research has been supported by the NSF, the Department of Energy, the Semiconductor Research Corporation, and by multiple industry awards, including from Cisco, HPE, IBM, Intel, Intercontinental Exchange, LexisNexis, VMware, and others. She is currently a PI and Systems Software co-lead in the SRC/DARPA center for Processing with Intelligent Storage and Memories (PRISM).

Abstract: CXL-based Computational Memory (CCM) enables near-memory processing within expanded remote memory, and presents opportunities to address data movement costs associated with disaggregated memory systems and to accelerate overall performance. However, existing models for operation offload are limited in their ability to leverage the tradeoffs associated with different CXL protocols. This talk first examines these tradeoffs and demonstrates their impact on end-to-end performance and system efficiency for workloads with diverse data and processing requirements. Next, we will present a new proposal for an ‘Asynchronous Back-Streaming’ protocol which carefully layers data and control transfer operations on top of the underlying CXL protocols. We will describe a new system that realizes the asynchronous back-streaming model, supporting asynchronous data movement and lightweight pipelining in host-CCM interactions. Experimental results show that the approach reduces end-to-end runtime by up to 50.4%, and CCM and host idle times by an average of 22.11× and 3.85×, respectively. This talk is based on the following preprint: https://arxiv.org/pdf/2512.04449.

Prof. Saugata Ghose

University of Illinois Urbana-Champaign
"MASTODON: Enabling Early-Stage Cross-Stack Simulation for Processing Using Memory Systems"

Short Bio: Saugata Ghose is an assistant professor in Computing and Data Science at the University of Illinois Urbana-Champaign, where he leads the ARCANA Research Group. His research interests include data-centric computer architectures and systems, new interfaces between systems software and hardware architectures, and architectures for emerging platforms and application domains. He holds M.S. and Ph.D. degrees from Cornell University, and dual B.S. degrees from Binghamton University. Among his awards, he was named an Intel Rising Star, has been inducted into the ISCA and HPCA Halls of Fame, and was a Wimmer Faculty Fellow while at Carnegie Mellon University. For more information, please visit his website at https://ghose.cs.illinois.edu/.

Abstract: Processing Using Memory (PUM) systems have attracted significant research attention over the past decade as a potential solution to the data movement bottlenecks in modern systems. Despite this research interest, there continues to be a disconnect between the device and the architecture communities working on PUM, even as we recognize the need to co-design devices, circuits, and architectures to achieve viable end products. This results in a slow and serialized design cycle: device physicists keep improving a “hero” device until it is mature, and only after that do architects see an appealing-enough device and try to design an (often overly optimistic and impractical) architecture. True collaborative co-design, if it happens, comes years after the initial device work, leading to many failed devices and architectures. We aim to avoid these pitfalls of designing isolated devices and hero-device-based architectures, which can ignore issues such as manufacturing variability, cycle-to-cycle variation, aggregated faults, and data lifetimes, among other issues. To that end, we have been developing MASTODON, a cross-stack simulator that can directly take in Verilog-A models developed by device physicists, including both early-stage models and mature models, and can capture the potential of these devices at scale. This includes modeling array topologies, interconnects, device drivers and peripherals, parasitics, and read/write/in-memory logic functions, all while executing end-to-end applications and supporting multiple control path models. MASTODON enables application-to-device studies with orders of magnitude simulation speed improvements over industrial SPICE-based simulation, with minimal error. We hope to demonstrate how MASTODON can help usher in needed cross-stack collaboration to bring PUM architectures closer to reality.

Dr. João Paulo Cardoso de Lima

Technische Universität Dresden
"Fault Injection Framework for Processing-using-Memory Architectures"

Short Bio: I received my bachelor's degree in Computer Engineering from the Federal University of Santa Catarina (UFSC) in 2017, followed by a master's degree in 2019 and a Ph.D. in Computer Science from the Federal University of Rio Grande do Sul (UFRGS) in 2025. I'm currently a postdoctoral researcher at the Chair for Compiler Construction, where I research and develop code optimizations for emerging AI systems as part of the ScaDS.AI Dresden/Leipzig center. My main research interests include Processing-in-Memory architectures, system design, hardware/software co-design, design automation tools, compilers, reliability evaluation, and fault tolerance methods. On the application side, I am particularly interested in efficient methods for machine learning algorithms through memory-centric optimizations, whether compiler-driven, hand-tuned, or enabled by domain-specific tools, for energy efficiency.

Abstract: Processing-using-Memory (PuM) enables bulk bitwise operations inside memory arrays, reducing data movement and exposing massive parallelism. However, PuM systems are highly vulnerable to device- and circuit-level faults, whose impact must ultimately be evaluated at the application level. Existing reliability evaluation approaches are either prohibitively slow or do not scale to large tensor workloads. We present a scalable fault-injection framework for PuM architectures that enables cross-layer reliability evaluation from in-memory operations to end applications. The framework consists of two complementary components: (i) a Python-based library with NumPy-like syntax that provides gate-level-accurate operation fault simulation, and (ii) an LLVM-based fault injection engine that operates at the intermediate-representation level, requires no application recoding, and supports fast, large-scale experiments. Together, they enable low-design-time, high-level fault simulation with fast execution and application-level evaluation.
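To give a flavor of the kind of NumPy-style, application-level fault simulation the abstract describes, the sketch below injects independent bit flips into the result of a bulk bitwise AND. This is a minimal illustrative model written by the editors: the function names and the uniform bit-flip fault model are assumptions for exposition, not the framework's actual API or fault model.

```python
import numpy as np

def inject_bit_flips(arr: np.ndarray, fault_rate: float,
                     rng: np.random.Generator) -> np.ndarray:
    """Flip each bit of a uint8 array independently with probability fault_rate."""
    bits = np.unpackbits(arr)
    flips = rng.random(bits.shape) < fault_rate  # Bernoulli mask per bit
    return np.packbits(bits ^ flips)

def faulty_and(a: np.ndarray, b: np.ndarray, fault_rate: float,
               seed: int = 0) -> np.ndarray:
    """Bulk bitwise AND followed by bit-flip fault injection (illustrative model)."""
    rng = np.random.default_rng(seed)
    return inject_bit_flips(a & b, fault_rate, rng)

# Compare a faulty execution against the golden (fault-free) result.
a = np.full(1024, 0xFF, dtype=np.uint8)
b = np.full(1024, 0x0F, dtype=np.uint8)
golden = a & b
faulty = faulty_and(a, b, fault_rate=1e-3)
errors = int(np.unpackbits(golden ^ faulty).sum())
print(f"{errors} faulty bits out of {golden.size * 8}")
```

In a real framework of this kind, the same golden-vs-faulty comparison would be lifted from a single operation to an end application (e.g., accuracy loss of a neural network under a given fault rate).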

Dongjae Lee

Korea Advanced Institute of Science & Technology (KAIST)
"Architectural and System Software Support for PIM Integrated Systems"

Short Bio: Dongjae Lee is a third-year Ph.D. student at KAIST, advised by Professor Minsoo Rhu. He received his B.S. degree in Electrical Engineering from Korea University and his M.S. from KAIST. His research primarily focuses on system-level support for processing-in-memory (PIM) architectures. For more information, please visit his website (https://sites.google.com/view/dongjaelee/).

Abstract: Processing-in-Memory (PIM) is a promising solution for memory-intensive workloads, yet its adoption remains limited by system-level bottlenecks in data movement and memory management. This talk explores two critical pillars of PIM integration. First, we address host-to-PIM transfer overheads, which we identified through a detailed characterization of real-world PIM systems. To mitigate these overheads, we propose PIM-MMU, a hardware/software co-design featuring a dedicated copy engine and heterogeneity-aware mapping. This design significantly improves both throughput and energy efficiency. Second, because dynamic allocation is currently not adequately supported on PIM devices, we introduce PIM-malloc, a fast and scalable memory allocator for general-purpose PIM hardware. Our evaluation across representative workloads demonstrates that PIM-malloc effectively enhances system performance and programmability.

Kyungmo Koo

Hanyang University
"PIMphony: Overcoming Bandwidth and Capacity Inefficiency in PIM-based Long-Context LLM Inference Systems"

Short Bio: Kyungmo Koo is a second-year Ph.D. student in the integrated M.S./Ph.D. program at Hanyang University, advised by Professor Jungwook Choi, and a member of the AIHA (AI Hardware and Algorithm) Lab. He received his B.S. degree from Hanyang University. His research focuses on optimizing PIM-based memory-centric systems, with an emphasis on system architecture and end-to-end performance/efficiency.

Abstract: Long-context LLM inference is increasingly constrained by memory bandwidth and capacity. As context length grows, the KV cache footprint expands, limiting batching and keeping inference strongly memory-bound. Processing-in-Memory (PIM), particularly designs specialized for GEMV, offers a promising way to exploit high internal memory bandwidth to alleviate these bottlenecks. However, in long-context scenarios, existing PIM-based approaches often suffer from bandwidth underutilization and capacity inefficiency due to suboptimal data placement, static command scheduling and memory management. We present PIMphony, a system that co-optimizes bandwidth and capacity utilization in PIM-based long-context LLM serving to significantly improve inference performance.

Livestream

🔴 Can't attend in person? Join us live!

The workshop will be livestreamed on YouTube. A replay will also be available afterwards.

▶️ Watch on YouTube

Organizers

Ismail Emir Yuksel

ETH Zürich

Ismail Emir Yuksel is a 2nd-year PhD student in the SAFARI Research Group at ETH Zurich under the supervision of Prof. Onur Mutlu. His broader research interests are in computer architecture, processing-in-memory, and hardware security, focusing on understanding, enhancing, and exploiting fundamental computational capabilities of modern DRAM architectures. His recent publications show that commodity DRAM chips, without any modification to the chip itself (only with modifications to the memory controller), are able to execute bulk-bitwise computation and data movement operations (including NAND, NOR, NOT, AND, OR, MAJority, multi-row copy, and initialization functions) in a reasonably robust manner.
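The bitwise three-input majority (MAJ) mentioned above is the key primitive in this line of work: real DRAM chips compute it by activating multiple rows simultaneously, and AND/OR fall out as special cases. The sketch below is only a behavioral software model of that primitive (function and variable names are ours), not the in-DRAM mechanism itself:

```python
import numpy as np

def maj3(r0: np.ndarray, r1: np.ndarray, r2: np.ndarray) -> np.ndarray:
    """Bitwise 3-input majority: each result bit is 1 iff at least two
    of the three corresponding input bits are 1. Behavioral model only;
    real chips compute MAJ via simultaneous multi-row activation."""
    return (r0 & r1) | (r0 & r2) | (r1 & r2)

# Fixing one operand to all-zeros reduces MAJ to AND; all-ones reduces it
# to OR -- the standard trick for building AND/OR from the MAJ primitive.
row_a = np.array([0b1100, 0b1010], dtype=np.uint8)
row_b = np.array([0b1010, 0b0110], dtype=np.uint8)
zeros = np.zeros_like(row_a)
ones = np.full_like(row_a, 0xFF)
assert np.array_equal(maj3(row_a, row_b, zeros), row_a & row_b)
assert np.array_equal(maj3(row_a, row_b, ones), row_a | row_b)
```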

F. Nisa Bostanci

ETH Zürich

F. Nisa Bostanci is a fourth-year PhD student in the SAFARI Research Group at ETH Zurich, under the supervision of Prof. Onur Mutlu. She is broadly interested in computer architecture and, more specifically, in the security, reliability, and safety (robustness) of memory systems; in emerging memory and computation paradigms, including Processing-in-Memory (PIM) architectures; and in designing effective and efficient solutions to address robustness issues in modern and future systems. Her recent works uncover and mitigate new security vulnerabilities that emerge with the adoption of read disturbance solutions and PIM architectures to aid in designing robust future systems.

Ataberk Olgun

ETH Zürich

Ataberk Olgun is a senior PhD student at ETH Zurich, working with Prof. Onur Mutlu. His broad research interests include designing secure, high-performance, and energy-efficient DRAM architectures. Especially with the worsening RowHammer vulnerability, it is increasingly difficult to design new DRAM architectures that satisfy all three characteristics. His current research focuses on i) deeply understanding and ii) efficiently mitigating the RowHammer vulnerability in modern systems.

Dr. Zhiheng Yue

ETH Zürich

Zhiheng Yue is a postdoctoral researcher at ETH Zurich, working with Prof. Onur Mutlu. He received the B.S. degree in electronic science and technology from the Beijing University of Posts and Telecommunications, Beijing, China, in 2017, the M.S. degree in electrical and computer engineering from the University of Michigan, Ann Arbor, MI, USA, in 2019, and the Ph.D. degree in electronic science and technology from Tsinghua University, Beijing, in 2024. His current research interests include deep learning, processing-in-memory, AI acceleration, 3D stacking, and very-large-scale integration (VLSI) design.

Dr. Mohammad Sadrosadati

ETH Zürich

Mohammad Sadrosadati received the B.Sc., M.Sc., and Ph.D. degrees in Computer Engineering from Sharif University of Technology, Tehran, Iran, in 2012, 2014, and 2019, respectively. He spent one year, from April 2017 to April 2018, as an academic guest at ETH Zurich, hosted by Prof. Onur Mutlu, during his Ph.D. program. He is currently a senior researcher and lecturer at ETH Zurich, working under the supervision of Prof. Onur Mutlu. His research interests are in the areas of heterogeneous computing, processing-in-memory, memory systems, and interconnection networks. For his achievements in improving the energy efficiency of GPUs, he received the prestigious Khwarizmi Youth Award as its first laureate in 2020.

Dr. Geraldo F. Oliveira

ETH Zürich

Geraldo F. Oliveira received a B.S. degree in computer science from the Federal University of Viçosa, Viçosa, Brazil, in 2015, an M.S. degree in computer science from the Federal University of Rio Grande do Sul, Porto Alegre, Brazil, in 2017, and a Ph.D. degree in computer science from ETH Zürich, Zürich, Switzerland, in 2025, advised by Prof. Onur Mutlu. His current research interests include system support for processing-in-memory and processing-using-memory architectures, data-centric accelerators for emerging applications, approximate computing, and emerging memory systems for consumer devices. He has several publications on these topics.

Prof. Onur Mutlu

ETH Zürich

Onur Mutlu is a Professor of Computer Science at ETH Zurich. He previously held the William D. and Nancy W. Strecker Early Career Professorship at Carnegie Mellon University. His research interests are in computer architecture, computing systems, hardware security, memory & storage systems, and bioinformatics, with a major focus on designing fundamentally energy-efficient, high-performance, and robust computing systems. He started the Computer Architecture Group at Microsoft Research (2006-2009), and held product, research and visiting positions at Intel Corporation, Advanced Micro Devices, VMware, Google, and Stanford University. He received various honors for his research, including the 2025 IEEE Computer Society Harry H. Goode Memorial Award “for seminal contributions to computer architecture research and practice, especially in memory systems.” He is an ACM Fellow, IEEE Fellow, and an elected member of the Academy of Europe. He enjoys teaching, mentoring, and enabling & democratizing access to high-quality research and education. He has supervised 25 PhD graduates, many of whom received major dissertation awards, 18 postdoctoral trainees, and more than 70 Master’s and Bachelor’s students. His computer architecture and digital logic design course lectures and materials are freely available on YouTube and his research group makes a wide variety of artifacts freely available online. For more information, please see his webpage at https://people.inf.ethz.ch/omutlu/.


Past Editions


Event Location

Venue

International Convention Centre Sydney (ICC Sydney)

Iron Wharf Place
Sydney
Australia

The workshop will be held in conjunction with HPCA 2026.

For registration and accommodation information, please visit the HPCA 2026 website.


Contact

For questions about the workshop, please contact the organizers:

General Inquiries: ismail.yuksel@safari.ethz.ch

SAFARI Research Group: safari.ethz.ch