Bin Sun’s PhD Dissertation Defense
December 9, 2022 @ 11:00 am - 1:00 pm
“Factorization guided Lightweight Neural Networks for Visual Analysis”
Prof. Yun Fu (Advisor)
Prof. Ming Shao
Prof. Lili Su
Deep learning has become popular in recent years, primarily due to powerful computing devices such as GPUs. However, many applications such as face alignment, image classification, and gesture recognition need to be deployed to multimedia devices, smartphones, or embedded systems with limited resources. Thus, there is an urgent need for high-performance but memory-efficient deep learning models. To this end, we designed several lightweight deep learning models for different tasks using factorization strategies.
Specifically, we constructed a lightweight face alignment model by proposing a factorization-based deep convolution module named Depthwise Separable Block (DSB) and a light but practical module based on the spatial configuration of faces. Experiments on four popular datasets verify that the resulting model, Block Mobilenet, achieves better overall performance with a storage size of less than 1 MB.
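To make the storage saving concrete, the sketch below compares the weight count of a standard convolution with that of a generic depthwise separable convolution (a depthwise k x k filter per channel followed by a 1 x 1 pointwise convolution). It illustrates the general factorization only; the exact layout of the DSB is not described here, and the channel sizes are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k = 64, 128, 3
full = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {full}, separable: {separable}")
```

For these sizes the separable layer needs 8,768 weights instead of 73,728, which is the kind of reduction that makes sub-megabyte models plausible.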
Beyond the face analysis application, we also explored a general, lightweight deep learning module for image classification based on low-rank pointwise residual (LRPR) convolution, called LRPRNet. Essentially, LRPR uses a low-rank approximation to factorize the pointwise convolution while keeping depthwise convolutions as the residual module to rectify the LRPR module. Moreover, LRPR is quite general and can be directly applied to many existing network architectures.
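The low-rank idea can be illustrated on a single pixel, where a pointwise (1 x 1) convolution is just a matrix applied to the channel vector. The sketch below replaces a c_out x c_in matrix with two thin factors of rank r. The sizes, values, and helper names are hypothetical, and the depthwise residual branch that LRPR uses for rectification is deliberately left out.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def count(matrix):
    """Number of weights stored in a matrix."""
    return sum(len(row) for row in matrix)


# Hypothetical sizes: 4 input channels, 4 output channels, rank 1.
down = [[1, 0, -1, 2]]          # r x c_in: project channels down to rank r
up = [[1], [2], [0], [-1]]      # c_out x r: expand back to c_out channels
full = matmul(up, down)         # equivalent c_out x c_in pointwise weight

pixel = [3, 1, 2, 1]            # channel vector at one spatial location
low_rank_out = matvec(up, matvec(down, pixel))
full_out = matvec(full, pixel)

print(low_rank_out == full_out)             # both paths give the same output
print(count(down) + count(up), count(full)) # weights: factored vs. full
```

Here the full matrix is exactly rank 1, so the two paths agree. In a trained network the factorization is an approximation, which is where a residual branch can help recover the lost capacity.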
Given the success of the factorization strategy on image-based data, we extended factorization to time-sequence data for Sign Language Recognition (SLR). We ranked first in the SLR challenge with the help of our proposed Separable Spatial-Temporal Convolution Network (SSTCN), which divides a 3D convolution on joint features into several stages, helping it achieve higher accuracy with fewer parameters.
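As a rough illustration of why splitting a 3D convolution saves parameters, the sketch below compares a full k_t x k x k convolution with a two-stage split into a spatial 1 x k x k convolution followed by a temporal k_t x 1 x 1 convolution. This is a generic spatial-temporal separation under assumed sizes, not the specific staging used in SSTCN.

```python
def conv3d_params(c_in, c_out, k_t, k_s):
    """Weights in a full 3D convolution with a k_t x k_s x k_s kernel."""
    return c_in * c_out * k_t * k_s * k_s


def separable_conv3d_params(c_in, c_mid, c_out, k_t, k_s):
    """Weights in a spatial (1 x k_s x k_s) conv followed by a temporal (k_t x 1 x 1) conv."""
    spatial = c_in * c_mid * k_s * k_s
    temporal = c_mid * c_out * k_t
    return spatial + temporal


# Hypothetical sizes: 64 channels throughout, 3 x 3 x 3 kernel.
full_3d = conv3d_params(64, 64, 3, 3)
split_3d = separable_conv3d_params(64, 64, 64, 3, 3)
print(f"full 3D: {full_3d}, spatial + temporal: {split_3d}")
```

With these numbers the split uses 49,152 weights against 110,592 for the full kernel, since the cost grows with k^2 + k_t rather than k^2 * k_t.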
We also factorized the features for single-image super-resolution (SISR). Factorizing the features reduces the feature size and therefore the computation cost. However, reducing the spatial size is counterintuitive for the super-resolution task. Through our exploration, we developed a network named Hybrid Pixel-Unshuffled Network (HPUN), which factorizes the features to stay lightweight while maintaining high performance. Specifically, we used the pixel-unshuffle operation to factorize the input features. After the factorization, we improved performance with grouped convolution, max-pooling, and self-residual. Experiments on popular benchmarks showed that the factorization strategy can achieve state-of-the-art (SOTA) performance on SISR.
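Pixel-unshuffle trades spatial resolution for channels without discarding any values, which is why it can shrink feature maps in a super-resolution setting. The minimal pure-Python sketch below rearranges a C x H x W nested list into (C * r * r) x (H / r) x (W / r). The function name and the channel ordering are assumptions made for illustration, and the rest of HPUN (grouped convolution, max-pooling, self-residual) is not shown.

```python
def pixel_unshuffle(x, r):
    """Rearrange a C x H x W nested list into (C*r*r) x (H/r) x (W/r)."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % r or width % r:
        raise ValueError("height and width must be divisible by r")
    out = []
    for c in range(channels):
        for i in range(r):          # row offset inside each r x r patch
            for j in range(r):      # column offset inside each r x r patch
                out.append([[x[c][h * r + i][w * r + j]
                             for w in range(width // r)]
                            for h in range(height // r)])
    return out


# One 4 x 4 channel becomes four 2 x 2 channels when r = 2.
image = [[[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]]
out = pixel_unshuffle(image, 2)
for channel in out:
    print(channel)
```

Each output channel collects the pixels that share the same position inside their 2 x 2 patch, so later convolutions run on maps with a quarter of the spatial area while all 16 input values are kept.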