[ESL] Quant-PIM: An Energy-efficient Processing-in-memory Accelerator …

SMRL 0 953 2021.01.07 11:17

Young Seo Lee, Eui-Young Chung, Young-Ho Gong, and Sung Woo Chung, "Quant-PIM: An Energy-efficient Processing-in-memory Accelerator for Layer-wise Quantized Neural Networks", IEEE Embedded Systems Letters, vol.13, no.4, pp.162-165, December 2021.

Abstract

Layer-wise quantized neural networks (QNNs), which adopt different precisions for weights or activations in a layer-wise manner, have emerged as a promising approach for embedded systems. Layer-wise QNNs deploy only the required number of data bits for computation (e.g., convolution of weights and activations), which in turn reduces computation energy compared to conventional QNNs. However, layer-wise QNNs still consume a large amount of energy in conventional memory systems, since memory accesses are not optimized for the required precision of each layer. To address this problem, we propose Quant-PIM, an energy-efficient processing-in-memory (PIM) accelerator for layer-wise QNNs. Quant-PIM selectively reads only the required data bits within a data word depending on the precision, by deploying modified I/O gating logic in a 3D stacked memory. Thus, Quant-PIM significantly reduces energy consumption for memory accesses. In addition, Quant-PIM improves the performance of layer-wise QNNs. When the required precision is half of the weight (or activation) size or less, Quant-PIM reads two data blocks in a single read operation by exploiting the memory bandwidth saved by the selective memory access, thus providing higher compute throughput. Our simulation results show that Quant-PIM reduces system energy by 39.1% to 50.4% compared to the PIM system with 16-bit quantized precision, without accuracy loss.
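
The two access rules described in the abstract (read only the bits a layer's precision requires, and fetch two data blocks per read when the precision is at most half the word size) can be illustrated with a small counting sketch. This is not the paper's simulator or energy model. The 16-bit word size is taken from the baseline named in the abstract, while the function names and the example per-layer precisions are hypothetical and chosen only for illustration.

```python
WORD_BITS = 16  # assumption: word size matches the 16-bit baseline in the abstract


def blocks_per_read(precision_bits, word_bits=WORD_BITS):
    """Data blocks fetched by one read under the rule stated in the abstract.

    Two blocks when the precision is half the word size or less, else one.
    """
    if not 1 <= precision_bits <= word_bits:
        raise ValueError("precision must be between 1 and the word size")
    return 2 if 2 * precision_bits <= word_bits else 1


def fraction_of_bits_read(precision_bits, word_bits=WORD_BITS):
    """Share of each data word that is actually read with selective access."""
    if not 1 <= precision_bits <= word_bits:
        raise ValueError("precision must be between 1 and the word size")
    return precision_bits / word_bits


def summarize(layer_precisions, word_bits=WORD_BITS):
    """Per-layer view of both rules for a list of layer precisions (in bits)."""
    return [
        {
            "layer": index,
            "precision_bits": bits,
            "fraction_of_bits_read": fraction_of_bits_read(bits, word_bits),
            "blocks_per_read": blocks_per_read(bits, word_bits),
        }
        for index, bits in enumerate(layer_precisions)
    ]


if __name__ == "__main__":
    # Hypothetical per-layer precisions; not taken from the paper.
    for row in summarize([16, 12, 8, 4]):
        print(row)
```

The fraction of bits read is only a count of transferred bits; it should not be read as the energy saving, which the paper obtains from simulation.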


