Lin & Kaeli to Work on $800K NSF Grant Collaboration
Electrical and Computer Engineering (ECE) Assistant Professor Xue (Shelley) Lin and College of Engineering Distinguished Professor David Kaeli will collaborate with City University of New York researchers to design an efficient deep learning system under a three-year, $800,000 National Science Foundation grant.
If successful, the project, titled “A Framework of Simultaneous Acceleration and Storage Reduction on Deep Neural Networks Using Structured Matrices,” will have a profound impact on a variety of deep learning applications, with significant implications for the autonomous systems and spaces of the future.
Deep learning models, a subset of machine learning algorithms, use a network structure composed of multiple layers, known as deep neural networks (DNNs), designed to extract features at multiple levels of abstraction. Deep learning requires training DNNs on large amounts of data, from which they learn to make decisions about new data.
The algorithms applied by Google for image recognition and article searches or by Amazon in its recommendations of future consumer purchases based on past purchase history are among the most familiar examples of deep learning systems in use today.
Expanding Deep Learning Applications
“DNNs require millions of parameters to perform complicated tasks,” says principal investigator Lin. “That, in turn, demands a lot of computation and parameter storage resources from the computing platform, which can potentially limit the use of deep learning in many applications. Our project will provide for efficient implementation in terms of computation speed and parameter storage such that we can seat deep learning systems into computing platforms that don’t have enough computing or storage resources.”
Lin notes that ultimately the research project will promote wider applications of deep learning systems. Among these applications are self-driving cars, unmanned aerial vehicles, and wearable devices. “These applications may need deep learning to perform specific tasks, but they are limited by their storage and computing power,” says Lin, “so we need to compress the model and accelerate computation.”
Under the National Science Foundation grant, the Northeastern team—which includes two graduate students—will develop the general framework, focusing on software development and different hardware platforms. CCNY’s team, which has expertise in matrix theory, will prove out the approach. “They will provide a theoretical foundation for our methods,” says Lin, “and that it can guarantee accuracy while compressing the model and accelerating computing. They will prove that our new DNN models can be applied to different applications with high accuracy.”
Given the growing trend toward autonomous systems, Lin says the project “is an important first step in the development of future autonomous systems and future ‘smart spaces’ such as smart homes and buildings.”
Abstract Source: NSF
Deep neural networks (DNNs) have emerged as a class of powerful techniques for learning solutions in a number of challenging problem domains, including computer vision, natural language processing and bioinformatics. These solutions have been enabled mainly because we now have computational accelerators able to sift through the myriad of data required to train a neural network. As the size of DNN models continues to grow, computational and memory resource requirements for training will also grow, limiting deployment of deep learning in many practical applications.
Leveraging the theory of structured matrices, this project will develop a general framework for efficient DNN training and inference, providing a significant reduction in algorithmic complexity measures in terms of both computation and storage.
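To make the idea of trading a dense weight matrix for a structured one concrete, here is a minimal sketch in Python with NumPy. It uses a circulant matrix, one well-known family of structured matrices, purely as an illustration; the abstract does not say which structures the project will use. The function names `circulant_dense` and `circulant_matvec` are hypothetical. An n-by-n circulant matrix is fully described by its first column, so it needs n stored values instead of n², and its product with a vector can be computed with fast Fourier transforms in O(n log n) operations rather than O(n²).

```python
import numpy as np

def circulant_dense(c):
    """Build the full n x n circulant matrix whose first column is c."""
    n = len(c)
    # Column j is c cyclically shifted down by j positions.
    return np.stack([np.roll(c, j) for j in range(n)], axis=1)

def circulant_matvec(c, x):
    """Multiply the circulant matrix defined by c with x using FFTs.

    A circulant matrix-vector product is a circular convolution, which
    becomes an elementwise product in the Fourier domain.
    """
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# Illustrative sizes and random data (assumptions, not from the project).
rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)  # the only parameters the structured layer stores
x = rng.standard_normal(n)  # an input vector

dense = circulant_dense(c)
y_dense = dense @ x               # O(n^2) multiply with n^2 stored values
y_fast = circulant_matvec(c, x)   # O(n log n) multiply with n stored values

stored_dense = dense.size
stored_structured = c.size
```

Both products agree up to floating-point error, while the structured form stores 8 values where the dense matrix stores 64. The gap widens as n grows, which is the kind of simultaneous storage and computation saving the abstract describes.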
The project, if successful, should fundamentally impact a broad class of deep learning applications. It will explore accelerating this new structure for deep learning algorithms targeting emerging accelerator architectures, and will evaluate the benefits of these advances across a number of application domains, including big data analytics, cognitive systems, unmanned vehicles and aerial systems, and wearable devices. The interdisciplinary nature of this project bridges the areas of matrix theory, machine learning, and computer architecture, and will affect education at both Northeastern and CCNY, including the involvement of underrepresented and undergraduate students in the rich array of research tasks.
The project will: (1) for the first time, develop a general theoretical framework for structured matrix-based DNN models and perform detailed analysis and investigation of error bounds, convergence, fast training algorithms, etc.; (2) develop low-space-cost and high-speed inference and training schemes for the fully connected layers of DNNs; (3) impose a weight tensor with structure and enable convolutional layers with low computational and space cost; (4) develop high-performance and energy-efficient implementations of deep learning systems on high-performance parallel platforms, low-power embedded platforms, as well as emerging computing paradigms and devices; and (5) perform a comprehensive evaluation of the proposed approaches on different performance metrics across a variety of platforms. The project will deliver tuned implementations targeting a range of computational platforms, including ASICs, FPGAs, GPUs and cloud servers. The hardware optimizations will focus on producing high-speed and low-cost implementations of deep learning systems.