Dr. Yanzhi Wang
Assistant Professor, 2015-present
329 Dana, 360 Huntington Avenue
Boston, MA 02115
Yanzhi Wang is currently an assistant professor in the Department of Electrical and Computer Engineering at Northeastern University. He received his Ph.D. degree in Computer Engineering from the University of Southern California (USC) in 2014, under the supervision of Prof. Massoud Pedram. He received the Ming Hsieh Scholar Award (the highest honor in the EE Dept. of USC) for his Ph.D. study. He received his B.S. degree in Electronic Engineering from Tsinghua University in 2009, with distinction from both the university and Beijing city.
Dr. Wang’s group works on both algorithms and actual implementations (FPGAs, circuit tapeouts, mobile and embedded systems, GPUs, emerging devices, and UAVs). His current research interests are the following:
- Energy-efficient and high-performance implementation of deep learning systems
- Model compression of deep neural networks (DNNs)
- Neuromorphic computing and non-von Neumann computing paradigms
- Cyber-security in deep learning systems
His research has maintained the highest model compression rates on representative DNNs since 09/2018 (ECCV18, ASPLOS19, ICCV19), and also achieves the highest performance and energy efficiency in DNN implementations on many platforms (FPGA19, ISLPED19, HPCA19). Notably, his work on AQFP superconducting-based DNN inference acceleration, which is validated through cryogenic testing, has by far the highest energy efficiency among all hardware devices (ISCA19). His work has been published broadly in top conference and journal venues (e.g., ASPLOS, ISCA, MICRO, HPCA, ISSCC, AAAI, ICML, CVPR, ICLR, IJCAI, ECCV, ACM MM, ICDM, DAC, ICCAD, FPGA, LCTES, CCS, VLDB, ICDCS, TComputer, TCAD, JSAC, Nature SP, etc.), and has been cited over 4,900 times according to Google Scholar. He has received four Best Paper Awards, seven further Best Paper Nominations, and three Popular Paper recognitions in IEEE TCAD. Several of his group’s works have been adopted by industry.
Yanzhi’s first Ph.D. student, Dr. Caiwen Ding, graduated in June 2019 and will become a tenure-track assistant professor in the Dept. of CSE at the University of Connecticut. His second Ph.D. student, Ning Liu, will start as a superstar employee at DiDi AI Research (DiDi Inc.).
Yanzhi’s current research interests mainly focus on DNN model compression and energy-efficient implementation (on various platforms). His group has made the following two key contributions. The first is a systematic, unified DNN model compression framework (ECCV18, ASPLOS19, etc.) based on the powerful optimization tool ADMM (Alternating Direction Method of Multipliers), which applies to non-structured and various types of structured weight pruning as well as to weight quantization of DNNs. It achieves unprecedented model compression rates on representative DNNs, consistently outperforming competing methods. When weight pruning and quantization are combined, we achieve up to 6,645X weight storage reduction without accuracy loss (ICML19), which is two orders of magnitude higher than prior methods. Our most recent results suggest that non-structured weight pruning is not desirable on any hardware platform.
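To make the ADMM idea concrete, the following is a minimal sketch of ADMM-based weight pruning on a toy least-squares problem. It is not the group's implementation: the loss, the function names (`project_top_k`, `admm_prune`), the penalty `rho`, and the iteration count are all assumptions chosen for illustration. The sketch splits the problem into a loss-minimization step on the weights `w`, a projection step that keeps only the `k` largest-magnitude entries in an auxiliary copy `z`, and a dual update on `u`.

```python
import numpy as np


def project_top_k(w, k):
    """Euclidean projection onto vectors with at most k nonzeros:
    keep the k largest-magnitude entries and zero the rest."""
    z = np.zeros_like(w)
    if k > 0:
        keep = np.argsort(np.abs(w))[-k:]
        z[keep] = w[keep]
    return z


def admm_prune(X, y, k, rho=1.0, iters=100):
    """Toy ADMM pruning for min 0.5*||X w - y||^2 s.t. w has <= k nonzeros.

    Hypothetical helper for illustration only; a real DNN would replace
    the closed-form w-update with a few epochs of SGD on the network loss
    plus the quadratic penalty.
    """
    n = X.shape[1]
    w = np.zeros(n)
    z = np.zeros(n)  # sparse auxiliary copy of the weights
    u = np.zeros(n)  # scaled dual variable
    A = X.T @ X + rho * np.eye(n)
    Xty = X.T @ y
    for _ in range(iters):
        # w-step: minimize loss + (rho/2) * ||w - z + u||^2 (closed form here)
        w = np.linalg.solve(A, Xty + rho * (z - u))
        # z-step: project onto the sparsity constraint set
        z = project_top_k(w + u, k)
        # dual update
        u = u + w - z
    return z  # z satisfies the sparsity constraint exactly


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[[1, 4, 7]] = [2.0, -3.0, 1.5]
y = X @ w_true

w_pruned = admm_prune(X, y, k=3)
print(np.count_nonzero(w_pruned), "nonzero weights out of", w_pruned.size)
```

Returning `z` rather than `w` guarantees the result meets the sparsity constraint. Structured pruning or quantization would, under the same assumptions, change only the projection step.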
Based on the systematic model compression framework, the second contribution is the development of model compression and acceleration techniques for various hardware platforms (FPGA19, ISLPED19, HPCA19). The mapping framework from the algorithm level to the hardware level must account for unique hardware characteristics. The hardware implementation should not become over-specialized to a specific model compression technique, but should be general to both uncompressed and compressed DNNs. Compilers should act as the bridge between the DNN algorithm and hardware levels, maintaining the highest degree of parallelism (ICML19). Our ultimate goal is to develop DNN model compression that is desirable at all of the theory, algorithm, compiler, and hardware levels.
A brief summary of the achievements of Yanzhi’s group on hardware acceleration of DNNs:
- For FPGA, currently the highest performance and energy efficiency in RNN and YOLO-based object detection implementations (FPGA19, HPCA19).
- For ASIC, the first solid-state tapeout of a block-circulant-based DNN acceleration framework, and the first for stochastic computing-based DNNs. Also (one of) the most energy-efficient implementations. (AAAI19, ASPLOS17)
- For emerging devices, the first DNN acceleration using superconducting technology, with the highest energy efficiency among all hardware devices: 4 to 5 orders of magnitude higher than CMOS-based implementations. (ISCA19)
- For mobile devices, currently the fastest in DNN acceleration (e.g., 26ms inference time for ResNet-50 and less than 10ms for MobileNet-V2). All DNNs can potentially be real-time on mobile devices through algorithm-compiler-hardware co-design. (ICML19)
- 07/2019 Our work “Non-Structured DNN Weight Pruning Considered Harmful” is on arXiv. It integrates our most recent progress on DNN weight pruning and weight quantization. It reaches the strong conclusion that non-structured DNN weight pruning is not preferred on any platform. We suggest not continuing work on sparsity-aware DNN acceleration with non-structured weight pruning.
- 07/2019 Our collaborative paper with YNU “AQFP: Towards building extremely energy-efficient circuits and systems” has been accepted by Nature Scientific Reports.
- 06/2019 Yanzhi serves as session chair of DAC 2019.
- 06/2019 Yanzhi presents the EDA tool for superconducting electronics at the DAC 2019 Birds-of-a-Feather meeting “Open-Source Academic EDA Software”.
- 06/2019 Yanzhi organizes a panel on superconducting EDA at the SLIP workshop co-located with DAC 2019.
- More news