
Zhenglun Kong PhD Dissertation Defense

July 23, 2024 @ 9:00 pm - 10:00 pm

Name:
Zhenglun Kong

Title:
Towards Efficient Deep Learning for Vision and Language Applications

Date:
7/23/2024

Time:
9:00 PM

Committee Members:
Prof. Yanzhi Wang (Advisor)
Prof. David Kaeli
Prof. Dakuo Wang
Prof. Weiyan Shi

Abstract:
Machine learning and AI have advanced rapidly in recent years, leading to numerous applications across diverse fields such as autonomous vehicles, entertainment, science, healthcare, and assistive technologies, significantly enhancing daily life. However, this progress has been accompanied by a dramatic increase in the size of deep neural network (DNN) models, which poses considerable economic challenges. The substantial costs of training, inference, and deployment for large vision and language models demand extensive computational resources and time, proving especially taxing for smaller organizations and individuals. This also complicates deployment on resource-constrained devices and in areas with limited infrastructure.

A major challenge is deploying AI models on devices with limited capacity, such as wearables, sensors, and mobile phones. These edge devices, which often operate offline and require real-time processing, are critical for many applications but struggle to support large models. My dissertation research addresses these pressing issues with the aim of enabling the practical implementation of AI. By tackling fundamental AI challenges from four angles, we preserve the effectiveness of AI models while adapting them for constrained environments:

1. Managing Massive Computation: We introduce a novel token pruning framework that reduces the latency of Vision Transformers (ViTs) on mobile devices by up to 41% compared to existing work. Additionally, we propose a quantization framework for large language models (LLMs), achieving an on-device speedup of up to 2.55x over FP16 counterparts across multiple edge devices. (Both ideas are sketched, for illustration only, after this list.)

2. Mitigating Training Costs: We develop fast, accurate, and memory-efficient training methods that use a hierarchical data redundancy reduction scheme, achieving up to a 40% speedup in ViT pre-training with minimal accuracy loss. (A toy instance of the idea is sketched after this list.)

3. Merging Multiple Models: We propose an efficient way to merge multiple LLMs, yielding a more capable and robust LLM while keeping the model size fixed and reducing knowledge interference. (A baseline version of merging is sketched after this list.)

4. Co-designing Speed-aware Deep Neural Networks: We account for memory access cost, degree of parallelism, and practical latency when designing 2D and 3D object detection models for real-world deployment. (A simple latency benchmark is sketched after this list.)

By addressing these areas, my research aims to enable the effective and efficient use of AI models in constrained environments, ensuring their practical implementation across various applications.
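
The abstract does not describe the pruning criterion itself. As a purely illustrative sketch (not the dissertation's actual framework), attention-based token pruning for a ViT can be as simple as keeping the patch tokens that the [CLS] token attends to most; the PyTorch function below assumes the attention weights are already available:

    import torch

    def prune_tokens(tokens: torch.Tensor, cls_attn: torch.Tensor, keep_ratio: float = 0.7):
        """Illustrative only: keep the patch tokens most attended to by [CLS].

        tokens:   (B, N, D) patch embeddings, excluding the [CLS] token
        cls_attn: (B, N) attention weights from [CLS] to each patch token
        """
        B, N, D = tokens.shape
        num_keep = max(1, int(N * keep_ratio))
        idx = cls_attn.topk(num_keep, dim=1).indices   # (B, num_keep) highest-scoring tokens
        idx = idx.unsqueeze(-1).expand(-1, -1, D)      # (B, num_keep, D) for gather
        return tokens.gather(dim=1, index=idx)

Dropping tokens shrinks the quadratic cost of self-attention in every later layer, which is the general mechanism behind latency savings of this kind.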
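
The quantization framework is likewise not detailed here. A minimal sketch of symmetric per-tensor int8 weight quantization, a standard building block assumed for illustration rather than taken from the dissertation, looks like this:

    import torch

    def quantize_int8(w: torch.Tensor):
        """Symmetric per-tensor quantization: w ~= scale * w_q, with int8 w_q."""
        scale = w.abs().max().clamp(min=1e-8) / 127.0
        w_q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
        return w_q, scale

    def dequantize_int8(w_q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
        """Recover an approximate float tensor from the quantized weights."""
        return w_q.float() * scale

Int8 storage halves memory relative to FP16 and enables integer kernels on edge hardware, which is the kind of effect behind the reported on-device speedups.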
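
The hierarchical data redundancy reduction scheme of item 2 is also only named in the abstract. One toy instance of the underlying idea, a hypothetical greedy deduplication of training samples by embedding similarity, is:

    import torch
    import torch.nn.functional as F

    def deduplicate(embeddings: torch.Tensor, threshold: float = 0.95):
        """Greedily keep sample indices whose embedding is dissimilar to all kept ones."""
        emb = F.normalize(embeddings, dim=1)  # unit vectors, so dot product = cosine
        kept = []
        for i in range(emb.size(0)):
            if all(torch.dot(emb[i], emb[j]).item() < threshold for j in kept):
                kept.append(i)
        return kept

Training on the kept subset does less work per pass; the dissertation's scheme presumably applies such reduction at several levels of granularity, hence "hierarchical."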
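
For model merging (item 3), the simplest baseline that keeps the model size fixed is uniform parameter averaging of checkpoints that share an architecture. The dissertation's method is certainly more refined, but this sketch conveys the setting:

    import torch

    def average_state_dicts(state_dicts):
        """Uniformly average the parameters of several same-architecture models."""
        merged = {}
        for key in state_dicts[0]:
            merged[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
        return merged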
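
Finally, "practical latency" in item 4 means measured wall-clock time rather than FLOP counts. A routine benchmark loop (illustrative, CPU-side) is:

    import time
    import torch

    @torch.no_grad()
    def measure_latency_ms(model, x, warmup=10, iters=50):
        """Average wall-clock milliseconds per forward pass."""
        model.eval()
        for _ in range(warmup):      # warm up caches and lazy initialization
            model(x)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        return (time.perf_counter() - start) / iters * 1000.0

Two models with identical FLOP counts can differ widely in measured latency because of memory access patterns and parallelism, which is why the co-design optimizes against measurements like this rather than FLOPs alone.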

Other

Department
Electrical and Computer Engineering
Topics
MS/PhD Thesis Defense
Audience
Graduate, PhD