Patent for Faster and More Efficient AI Applications on Smartphones
ECE Professor Yanzhi Wang was awarded a patent for “Computer-implemented methods and systems for DNN weight pruning for real-time execution on mobile devices.”
Abstract (Source: USPTO)
A computer-implemented method is disclosed for compressing a deep neural network (DNN) model by DNN weight pruning to accelerate DNN inference on mobile devices. The method includes the steps of (a) performing an intra-convolution kernel pruning of the DNN model wherein a fixed number of weights are pruned in each convolution kernel of the DNN model to generate sparse convolution patterns; (b) performing inter-convolution kernel pruning of the DNN model to generate connectivity sparsity, wherein inter-convolution kernel pruning comprises cutting connections between given input and output channels of the DNN model to remove corresponding kernels; and (c) training the DNN model compressed in steps (a) and (b).
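The two pruning steps in the abstract can be sketched in a few lines of NumPy. This is an illustration only, not the patented implementation. The abstract does not say how weights or kernels are selected, so the sketch assumes a simple magnitude criterion: the smallest-magnitude weights are zeroed within each kernel, and the kernels with the smallest L2 norm are removed. The function names, the 3x3 kernel size, and the pruning amounts are all hypothetical, and the retraining in step (c) is omitted.

```python
import numpy as np

def prune_within_kernels(weights, num_pruned):
    """Step (a), sketched: zero a fixed number of weights in every kernel.

    weights has shape (out_channels, in_channels, kh, kw).
    Assumption: the num_pruned smallest-magnitude weights are the ones removed.
    """
    out_c, in_c, kh, kw = weights.shape
    flat = weights.reshape(out_c, in_c, kh * kw).copy()
    # Indices of the num_pruned smallest |w| inside each kernel.
    idx = np.argsort(np.abs(flat), axis=-1)[..., :num_pruned]
    np.put_along_axis(flat, idx, 0.0, axis=-1)
    return flat.reshape(weights.shape)

def prune_whole_kernels(weights, fraction):
    """Step (b), sketched: cut input/output channel connections by zeroing kernels.

    Assumption: the kernels with the smallest L2 norm are the ones removed.
    """
    out_c, in_c, _, _ = weights.shape
    norms = np.sqrt((weights ** 2).sum(axis=(2, 3)))  # one norm per kernel
    num_removed = int(fraction * out_c * in_c)
    pruned = weights.copy()
    if num_removed > 0:
        order = np.argsort(norms, axis=None)[:num_removed]
        out_idx, in_idx = np.unravel_index(order, norms.shape)
        pruned[out_idx, in_idx] = 0.0
    return pruned

# Hypothetical layer: 8 output channels, 4 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4, 3, 3))
w = prune_within_kernels(w, num_pruned=5)   # each kernel keeps 4 of 9 weights
w = prune_whole_kernels(w, fraction=0.25)   # a quarter of the kernels are removed
nonzero_per_kernel = np.count_nonzero(w, axis=(2, 3))
```

After both steps, every surviving kernel has the same number of nonzero weights, and the removed kernels are entirely zero. That regularity is what distinguishes this scheme from unstructured pruning, where the nonzero count differs from kernel to kernel.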
