TAIT: One-Shot Full-Integer Lightweight DNN Quantization via Tunable Activation Imbalance Transfer

Weixiong Jiang, Heng Yu, Xinzhe Liu, Hao Sun, Rui Li, Yajun Ha

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

8 Citations (Scopus)

Abstract

Parameter quantization and depthwise convolution are both essential for delivering high-accuracy, lightweight, and resource-friendly deep neural networks (DNNs) on edge-AI devices. However, combining the two can have adverse effects: the resulting network either suffers significant accuracy loss or requires long finetuning. Moreover, contemporary quantization methods are applied only to weight and activation values, not to bias and scaling-factor values, which makes them less practical for ASIC/FPGA accelerators. To address these issues, we propose a novel quantization framework optimized for depthwise convolution networks. We discover that the uniformity of the value range within a tensor can serve as a predictor of the tensor's quantization error. Guided by this predictor, we develop a mechanism called Tunable Activation Imbalance Transfer (TAIT), which tunes the value-range uniformity between an activated feature map and the weights that follow it. TAIT also supports full-integer quantization. We demonstrate TAIT on SkyNet and deploy it on an FPGA. Compared to the state of the art, our quantization framework and system design achieve 2.2%+ IoU, 2.4× speed, and 1.8× energy-efficiency improvements without requiring any finetuning.
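The idea of moving range imbalance from an activation into the weights that consume it can be illustrated with a small sketch. This is not the authors' algorithm: the function names, the exponent `alpha`, the toy data, and the choice of symmetric per-tensor 8-bit quantization are all assumptions made for illustration, and the weights are kept in floating point for brevity. It only shows that rescaling channels by positive factors leaves a following pointwise layer's output unchanged while making the activation's per-channel ranges more uniform, which in this toy case lowers the error of per-tensor quantization.

```python
# Illustrative sketch only (not the TAIT algorithm from the paper).
# All names and the `alpha` knob are hypothetical.

def channel_ranges(act):
    """Largest absolute value in each channel (act: list of channels)."""
    return [max(abs(v) for v in ch) for ch in act]

def imbalance(act):
    """Ratio of largest to smallest channel range; 1.0 means uniform.
    Assumes no channel is all zeros."""
    ranges = channel_ranges(act)
    return max(ranges) / min(ranges)

def transfer(act, weights, alpha):
    """Shift range imbalance from the activation into the next weights.
    alpha=0 changes nothing; alpha=1 fully equalizes the channel ranges."""
    ranges = channel_ranges(act)
    ref = max(ranges)
    scales = [(r / ref) ** alpha for r in ranges]
    new_act = [[v / s for v in ch] for ch, s in zip(act, scales)]
    new_weights = [w * s for w, s in zip(weights, scales)]
    return new_act, new_weights

def pointwise(act, weights):
    """1x1 convolution with a single output channel."""
    n = len(act[0])
    return [sum(w * ch[i] for w, ch in zip(weights, act)) for i in range(n)]

def quantize_tensor(act, num_bits=8):
    """Symmetric per-tensor quantization: one scale shared by all channels."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(channel_ranges(act)) / qmax
    return [[round(v / scale) * scale for v in ch] for ch in act]

def output_error(act, weights, num_bits=8):
    """Worst-case output error caused by quantizing the activation."""
    exact = pointwise(act, weights)
    approx = pointwise(quantize_tensor(act, num_bits), weights)
    return max(abs(e - a) for e, a in zip(exact, approx))

# Toy data: channel 1 has a 64x smaller range than channel 0.
act = [[8.0, -4.0, 2.0, 1.0], [0.125, -0.0625, 0.03125, 0.015625]]
weights = [1.0, 64.0]

for alpha in (0.0, 0.5, 1.0):
    a, w = transfer(act, weights, alpha)
    print(alpha, imbalance(a), output_error(a, w))
```

In a real network the rescaled weights are quantized too, so fully equalizing the activation can simply move the error into the weights; a tunable exponent such as the hypothetical `alpha` above is one way to express that trade-off.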

Original language: English
Title of host publication: 2021 58th ACM/IEEE Design Automation Conference, DAC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1027-1032
Number of pages: 6
ISBN (Electronic): 9781665432740
Publication status: Published - 5 Dec 2021
Event: 58th ACM/IEEE Design Automation Conference, DAC 2021 - San Francisco, United States
Duration: 5 Dec 2021 to 9 Dec 2021

Publication series

Name: Proceedings - Design Automation Conference
Volume: 2021-December
ISSN (Print): 0738-100X

Conference

Conference: 58th ACM/IEEE Design Automation Conference, DAC 2021
Country/Territory: United States
City: San Francisco
Period: 5/12/21 to 9/12/21

ASJC Scopus subject areas

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modelling and Simulation
