ECE PhD Proposal Review: Mengshu Sun

February 14, 2022 @ 10:00 am - 11:00 am

PhD Proposal Review: Deep Learning Acceleration on Edge Devices with Algorithm/System Co-Design

Mengshu Sun

Location: Zoom Link

Abstract: As deep learning has succeeded in a broad range of applications in recent years, there is an increasing trend towards deploying deep neural networks (DNNs) on edge devices such as FPGAs and mobile devices. However, a significant gap remains between the extraordinary accuracy of state-of-the-art DNNs and their efficient implementation on edge devices, because these devices have limited resources while DNNs are computation- and memory-intensive. With the goal of accelerating inference while maintaining the accuracy of DNNs, I investigate in this dissertation the efficient implementation of deep learning on low-power, resource-constrained devices, leveraging algorithm/system co-design techniques that combine hardware-friendly DNN compression algorithms with system design optimizations.

In the first part of this dissertation, I explore DNN compression algorithms based on weight pruning and quantization. For weight pruning, novel structured and fine-grained sparsity schemes are proposed and obtained with the reweighted regularization pruning algorithm, and then incorporated into acceleration frameworks on both FPGAs and mobile devices so that the acceleration rate of sparse models approaches the pruning rate in GFLOPs relative to the unpruned models. For quantization, intra-layer mixed precision/scheme weight quantization is proposed to boost the utilization of heterogeneous FPGA resources and thereby improve FPGA throughput, by assigning multiple precisions and/or multiple schemes at the filter level within each layer while maintaining the same ratio of filters with different quantization assignments across all layers.
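The filter-level assignment described above can be illustrated with a small sketch. This is not the author's implementation: the function names, the 4-bit/8-bit pair, the symmetric uniform quantizer, and the rule for choosing which filters receive the lower precision (smallest maximum absolute weight) are all assumptions made for illustration. The only property taken from the abstract is that each layer splits its filters between quantization assignments using the same ratio.

```python
from typing import List, Tuple


def quantize_filter(weights: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantization of one filter (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1              # e.g. 7 levels per sign for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0.0] * len(weights)
    scale = max_abs / levels                   # per-filter step size
    return [round(w / scale) * scale for w in weights]


def assign_bits(filters: List[List[float]], low_ratio: float,
                low_bits: int = 4, high_bits: int = 8) -> List[int]:
    """Pick a bit-width per filter so that `low_ratio` of them use `low_bits`."""
    n_low = round(low_ratio * len(filters))
    # Hypothetical selection rule: filters with the smallest weight range
    # are assumed to tolerate the lower precision best.
    order = sorted(range(len(filters)),
                   key=lambda i: max(abs(w) for w in filters[i]))
    bits = [high_bits] * len(filters)
    for i in order[:n_low]:
        bits[i] = low_bits
    return bits


def quantize_layer(filters: List[List[float]],
                   low_ratio: float) -> Tuple[List[List[float]], List[int]]:
    """Quantize every filter of a layer with its assigned precision."""
    bits = assign_bits(filters, low_ratio)
    quantized = [quantize_filter(f, b) for f, b in zip(filters, bits)]
    return quantized, bits
```

Calling `quantize_layer` with the same `low_ratio` for every layer keeps the proportion of low-precision filters constant across the network, which is the property the abstract associates with balanced use of heterogeneous FPGA resources.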

In the second part of this dissertation, I study system implementations, proposing an automatic DNN acceleration framework that generates DNN accelerators satisfying a target frame rate (FPS). Unlike previous approaches that start from model quantization and then optimize the FPS of the hardware implementation, this framework estimates the FPS using FPGA resource utilization analysis and performance analysis modules. The bit-width is reduced until the target FPS is met, and the ratio is automatically determined to guide both the quantization process and the accelerator implementation on hardware. A resource utilization model is developed to overcome the difficulty of estimating LUT consumption, and a novel computing engine for DNNs is designed with various optimization techniques in support of DNN compression to improve computation parallelism and resource utilization efficiency.
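The search described above, lowering the bit-width until an estimated FPS reaches the target, can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the framework itself: `search_bit_width`, its parameters, and the starting and minimum bit-widths are hypothetical, and `toy_estimate_fps` is a made-up stand-in for the resource utilization and performance analysis modules, which the abstract does not specify.

```python
from typing import Callable, Optional


def search_bit_width(target_fps: float,
                     estimate_fps: Callable[[int], float],
                     start_bits: int = 16,
                     min_bits: int = 2) -> Optional[int]:
    """Return the largest bit-width whose estimated FPS meets the target.

    Bit-widths are tried from `start_bits` downwards; None is returned
    when even `min_bits` cannot reach `target_fps`.
    """
    for bits in range(start_bits, min_bits - 1, -1):
        if estimate_fps(bits) >= target_fps:
            return bits
    return None


def toy_estimate_fps(bits: int) -> float:
    """Hypothetical estimator: throughput inversely proportional to bit-width.

    A real estimator would model DSP/LUT usage and memory bandwidth;
    the constant 960 is arbitrary and chosen only for the example.
    """
    return 960.0 / bits


if __name__ == "__main__":
    print(search_bit_width(100.0, toy_estimate_fps))
```

Starting from the highest precision and stopping at the first bit-width that meets the target keeps as much numerical precision as the frame-rate constraint allows, which matches the ordering stated in the abstract (FPS target first, quantization second).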

Details

Date:
February 14, 2022
Time:
10:00 am - 11:00 am
Website:
https://northeastern.zoom.us/j/8201462640?pwd=QVB2T1dsWGJQVUtZUzlEWjFlN3d4QT09#success

Other

Department
Electrical and Computer Engineering
Topics
MS/PhD Thesis Defense