
Dr. Yanzhi Wang

Associate Professor and Faculty Fellow

Department of Electrical & Computer Engineering, College of Engineering,

Khoury College of Computer Science (Affiliated),

Northeastern University

B.S. (Tsinghua), Ph.D. (University of Southern California)

329 Dana, 360 Huntington Avenue
Boston, MA 02115
Phone: 617.373.8805
Email: yanz.wang@northeastern.edu

Youtube channel and Bilibili channel:


About:

Yanzhi Wang is currently an Associate Professor and Faculty Fellow in the Department of Electrical and Computer Engineering, and Khoury College of Computer Science (Affiliated), at Northeastern University. He received his Ph.D. degree in Computer Engineering from the University of Southern California (USC) in 2014, under the supervision of Prof. Massoud Pedram. He received the Ming Hsieh Scholar Award (the highest honor in the EE Dept. of USC) for his Ph.D. study. He received his B.S. degree in Electronic Engineering from Tsinghua University in 2009, with distinction from both the university and Beijing city.

Dr. Wang’s group works on both algorithms and actual implementations (mobile and embedded systems, FPGAs, circuit tapeouts, GPUs, emerging devices, and UAVs). His current research interests are the following:

  • Real-time and energy-efficient deep learning and artificial intelligence systems
  • Model compression and mobile acceleration of deep neural networks (DNNs)
  • Deep learning acceleration for autonomous driving
  • Neuromorphic computing and non-von Neumann computing paradigms
  • Cyber-security in deep learning systems

As a brief list of technical achievements, his research (i) has achieved and maintained the highest model compression rates on representative DNNs since 09/2018 (ECCV18, ASPLOS19, ICCV19, ISLPED19, ASP-DAC20, AAAI20-1, AAAI20-2, DAC21, CVPR21, ICLR22, etc.), (ii) achieves, for the first time, real-time and fastest execution of representative large-scale DNNs on an off-the-shelf mobile device (ASPLOS20, AAAI20, ICML19, IJCAI20, ECCV20, DAC20, AAAI21-1, AAAI21-2, NeurIPS21, PLDI21, MICRO22, CACM, TPAMI, etc.), and (iii) achieves the highest performance/energy efficiency in DNN implementations on many platforms (FPGA19, ISLPED19, AAAI19, HPCA19, ISSCC19, ASP-DAC20, DATE20, AAAI20, PLDI20, ICS20, IJCAI20, PACT20, HPCA21, JSSC21). It is worth mentioning that his work on AQFP superconducting-based DNN inference acceleration, which has been validated through cryogenic testing, has by far the highest energy efficiency among all hardware devices (ISCA19, ICCAD18, ICCAD20, DAC22).

His research has been published broadly in top conference and journal venues, including (i) EDA, solid-state circuit and system conferences such as DAC, ICCAD, DATE, ISLPED, FPGA, LCTES, ISSCC, RTAS, etc., (ii) architecture and computer system conferences such as ASPLOS, ISCA, MICRO, HPCA, CCS, VLDB, PLDI, ICS, PACT, INFOCOM, ICDCS, etc., (iii) machine learning algorithm conferences such as AAAI, CVPR, NeurIPS, ICML, ICCV, ICLR, IJCAI, ECCV, ACM MM, ICDM, etc., and (iv) IEEE and ACM transactions (including Communications of ACM, Proc. of IEEE, JSSC, TPAMI, etc.) and Nature and Science series journals. He ranks No. 2 in CSRankings at Northeastern University over the past 10 years, and around No. 35 throughout the U.S. His work has been cited more than 12,000 times according to Google Scholar, with an H-index of 55.

He has received six Best Paper or Top Paper Awards (ISLPED’14, IEEE CLOUD’14, ISVLSI’14, ICASSP’17, KDD Workshop’19, ICLR Workshop’21), one Communications of ACM Featured Article, another 12 Best Paper Nominations (GLS-VLSI’13, IEEE TCAD’13, ASP-DAC’15, ISLPED’17, ASP-DAC’17, ISQED’18, ASP-DAC’18, DATE’19, ICCAD’19, DATE’20, DATE’21, ICLR Workshop’22), and four Popular Papers in IEEE TCAD. He received the U.S. Army Research Office Young Investigator Award, IEEE TC-SDM Early Career Award, Faculty Fellow Award, Martin W. Essigmann Excellence in Teaching Award, etc. In addition, his group has received the Massachusetts Acorn Innovation Award, Google Equipment Research Award, MathWorks Faculty Award, MIT Tech Review TR35 China Finalist, Ming Hsieh Scholar Award, Young Student Support Award of DAC (for himself and six of his Ph.D. students), DAC Service Award, etc. His group and students have received first place in the ISLPED Design Contest twice (2012, 2020), first place in the Student Research Competition at CGO 2021, and awards in multiple other contests such as the Low Power Computer Vision Challenge 2019 and the NeurIPS MicroNet Challenge 2019.

Yanzhi has delivered over 120 invited technical presentations on real-time and efficient deep learning systems. His research has been broadly featured and cited by around 600 media outlets, including Boston Globe, Communications of ACM (three times), VentureBeat, The Register, Medium, The New Yorker, Wired, NEU News, Import AI, Italian National TV, MRS TV, Quartz, ODSC, MIT Tech Review, TechTalks, IBM Research Blog, ScienceDaily, AAAS, CNET, ZDNet, New Atlas, Tencent News, and Sina News, to name a few.

Yanzhi’s first Ph.D. student, Caiwen Ding, graduated in June 2019 and became a tenure-track assistant professor in the Dept. of CSE at the University of Connecticut. The second Ph.D. student, Ning Liu, will start as a superstar employee at DiDi AI Research (DiDi Inc.). The third Ph.D. student, Ao Ren, is joining the School of CS at Chongqing University as a full professor (with tenure). The fourth Ph.D. student, Ruizhe Cai, has joined Facebook Infrastructure. The fifth Ph.D. student, Sheng Lin, has joined Tencent U.S. as a research scientist. The postdoc/visiting scholar, Chen Pan, has joined the Dept. of CSE at Texas A&M Corpus Christi as a tenure-track assistant professor. His co-advised Ph.D. student, Tianyun Zhang, has joined the Dept. of ECE at Cleveland State University as an assistant professor. His Ph.D. student Xiaolong Ma will join the Dept. of ECE at Clemson University as an assistant professor.

Ph.D., Postdoc, and Visiting Scholar/Student Positions Available: Northeastern University has been rising thanks to the strong leadership and efforts of its faculty members. The university is located between the famous Museum of Fine Arts (MFA), the Boston Symphony, and Berklee College of Music, the best location in Boston! Please apply to NEU.


CoCoPIE (A Representative Contribution):

Assuming hardware is the major constraint for enabling real mobile intelligence, the industry has mainly dedicated its efforts to developing specialized hardware accelerators for machine learning inference. Billions of dollars have been spent to fuel this intelligent hardware race. We challenge this assumption. Drawing on the recent real-time AI optimization framework CoCoPIE, we maintain that with effective compression-compiler co-design, it is possible to enable real-time artificial intelligence (AI) on mainstream end devices without special hardware.

The principle of compression-compilation co-design is to design the compression of deep learning models and their compilation to executables hand in hand. This synergistic method can effectively optimize both the size and speed of deep learning models, and can also dramatically shorten the tuning time of the compression process, greatly reducing the time to market of AI products. CoCoPIE holds numerous records in mobile AI: it is the first framework to support all kinds of DNNs, including CNNs, RNNs, transformers, and language models; it is the fastest DNN pruning and acceleration framework, up to 180X faster than current frameworks such as TensorFlow-Lite; a majority of representative DNNs and applications can be executed in real time, for the first time, on off-the-shelf mobile devices; and the CoCoPIE framework on general-purpose mobile devices even outperforms a number of representative ASIC and FPGA solutions in terms of energy efficiency and/or performance.

More Info about CoCoPIE: Official webpage https://www.cocopie.ai/; CoCoPIE Youtube Channel here and Bilibili Channel here; CoCoPIE description paper and demonstration paper.


Two Representative Contributions:

Yanzhi’s group has made the following two key contributions to DNN model compression and acceleration. The first is a systematic, unified DNN model compression framework (ECCV18, ASPLOS19, ICCV19, ISLPED19, ASP-DAC20, AAAI20-1, AAAI20-2, DAC21, CVPR21, ICLR22, etc.) based on the powerful mathematical optimization tool ADMM (Alternating Direction Method of Multipliers) and other advanced mathematical optimization tools, which applies to non-structured and various types of structured weight pruning as well as weight quantization of DNNs. It achieves unprecedented model compression rates on representative DNNs, consistently outperforming competing methods. When weight pruning and quantization are combined, we achieve up to 6,645X weight storage reduction without accuracy loss, which is two orders of magnitude higher than prior methods. Our most recent results (on arXiv) suggest that non-structured weight pruning is not desirable on any hardware platform.
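To make the ADMM idea concrete, here is a minimal sketch of ADMM-style weight pruning on a toy problem. It is an illustration under stated assumptions, not the group's actual framework: the loss is assumed to be a simple quadratic 0.5*||w - w0||^2 so that the weight update has a closed form (in real DNN training this step would be a few epochs of SGD), and the names `project_topk` and `admm_prune` are hypothetical.

```python
def project_topk(values, k):
    """Euclidean projection onto vectors with at most k nonzeros:
    keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def admm_prune(w0, k, rho=1.0, iters=50):
    """Toy ADMM for: minimize 0.5*||w - w0||^2 subject to w having <= k nonzeros.

    Assumption: the quadratic loss stands in for the DNN training loss.
    w is the trainable weight vector, z is its sparse copy, u is the scaled dual.
    """
    z = project_topk(w0, k)
    u = [0.0] * len(w0)
    for _ in range(iters):
        # w-step: closed-form minimizer of loss + (rho/2)*||w - z + u||^2
        w = [(a + rho * (zi - ui)) / (1.0 + rho) for a, zi, ui in zip(w0, z, u)]
        # z-step: project (w + u) onto the sparsity constraint set
        z = project_topk([wi + ui for wi, ui in zip(w, u)], k)
        # dual update: accumulate the remaining disagreement between w and z
        u = [ui + wi - zi for ui, wi, zi in zip(u, w, z)]
    return z


pruned = admm_prune([0.9, -0.05, 0.4, 0.01, -0.7, 0.2], k=3)
print(pruned)
```

The same three-step loop carries over to structured pruning or quantization by swapping the projection in the z-step for a projection onto the corresponding constraint set.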

Recently, the second major contribution has been made (ASPLOS20, AAAI20, ICML19, IJCAI20, ECCV20, DAC20, AAAI21-1, AAAI21-2, NeurIPS21, PLDI21, MICRO22, CACM, TPAMI, etc.) based on the ADMM solution framework. The compiler has been identified as the bridge between DNN algorithm-level compression and hardware-level acceleration, maintaining the highest possible degree of parallelism without compromising accuracy. Using mobile devices (embedded CPU/GPU) as an example, we have developed a novel category of fine-grained structured pruning schemes, such as pattern-based and block-based pruning, possessing both flexibility (and hence high accuracy) and regularity (and hence hardware parallelism and acceleration). Accuracy and hardware performance are no longer a tradeoff. Rather, it is possible for DNN model compression to be desirable at all of the theory, algorithm, compiler, and hardware levels. For mobile devices, we achieve undoubtedly the fastest DNN acceleration (e.g., 1.6ms on an off-the-shelf mobile phone with over 80% ImageNet accuracy), even outperforming prior work on FPGA and ASIC in many cases. All DNNs can potentially run in real time on mobile devices through our algorithm-compiler-hardware co-design.
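As a rough illustration of pattern-based pruning, the sketch below assigns each 3x3 convolution kernel one pattern from a small fixed library and zeroes every weight outside it. The four-entry pattern library, the magnitude-based selection rule, and the name `apply_best_pattern` are all assumptions made for this example; the group's papers describe how patterns are actually designed and selected.

```python
# Hypothetical pattern library: each pattern keeps 4 of the 9 positions
# (row, col) in a 3x3 kernel and always includes the center (1, 1).
PATTERNS = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},
    {(0, 1), (0, 2), (1, 1), (1, 2)},
    {(1, 0), (1, 1), (2, 0), (2, 1)},
    {(1, 1), (1, 2), (2, 1), (2, 2)},
]


def apply_best_pattern(kernel):
    """Pick the pattern that preserves the most squared weight magnitude
    and return (pattern index, pruned 3x3 kernel)."""
    def kept_energy(pattern):
        return sum(kernel[r][c] ** 2 for r, c in pattern)

    best = max(range(len(PATTERNS)), key=lambda i: kept_energy(PATTERNS[i]))
    pruned = [
        [kernel[r][c] if (r, c) in PATTERNS[best] else 0.0 for c in range(3)]
        for r in range(3)
    ]
    return best, pruned


kernel = [
    [0.1, -0.8, 0.6],
    [0.0, 0.9, -0.5],
    [0.2, 0.1, 0.05],
]
pattern_id, pruned_kernel = apply_best_pattern(kernel)
print(pattern_id, pruned_kernel)
```

Because every kernel ends up with one of a few known shapes, a compiler can group kernels by pattern and generate regular, branch-free code for each group, which is where the regularity mentioned above comes from.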


Recent News:  

  • 09/2022 [Paper] Our major survey paper “Survey: Exploiting Data Redundancy for Optimization of Deep Learning” accepted in ACM Computing Surveys, 2022 (Impact Factor 14.32).
  • 09/2022 [Talk] Yanzhi remotely presented compression-compilation co-design for real-time DNN acceleration to NeuroXess Inc.
  • 09/2022 [Paper] One collaborative paper “Pruning Adversarially Robust Neural Networks without Adversarial Examples” accepted in ICDM 2022.
  • 08/2022 [Committee] Yanzhi will serve as committee member of MLSys 2023.
  • 08/2022 [Paper] One paper “Memristor-Based Spectral Decomposition of Matrices and Applications” accepted in IEEE Trans. on Computers 2022.
  • 08/2022 [Award] Our group has received the 2nd place award in HAET Workshop at ICLR 2022.
  • 08/2022 [Student] Co-advised Ph.D. student Mengshu Sun has accepted an offer as Associate Professor at Beijing University of Technology, starting in Fall 2022.
  • 07/2022 [Paper] One collaborative paper on mobile DSP acceleration of DNNs accepted in MICRO 2022.
  • 07/2022 [Paper] Two papers “All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management”, “Quantum Neural Network Compression” accepted in ICCAD 2022.
  • 07/2022 [Panel] Yanzhi serves as a panelist at the ICML Continual Learning Workshop, 2022.
  • 07/2022 [Paper] Our major survey paper “A survey for deep reinforcement learning in markovian cyber-physical systems: Common problems and solutions” accepted in Elsevier Neural Networks Journal, 2022 (Impact Factor 9.7).
  • 07/2022 [Committee] Yanzhi will serve as committee member of ICLR 2023.
  • 07/2022 [Committee] Yanzhi serves as an organizer of the ROAD4NN workshop co-located with DAC 2022.
  • 07/2022 [Panel] Yanzhi serves as a panelist at the DAC Early Career Panel, 2022.
  • 07/2022 [Paper] Three leading papers accepted in ECCV 2022 from our group, including “Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution”, “SPViT: Enabling Faster Vision Transformers via Soft Token Pruning”, and “You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding”.
  • 06/2022 [Promotion] Yanzhi received tenure and was promoted to the rank of Associate Professor.
  • 06/2022 [Organizer] Yanzhi will co-organize the HALO Workshop co-located with ICCAD 2022.
  • 06/2022 [Talk] Yanzhi presented superconducting design automation and DNN acceleration at DisCOVER workshop at ISCA 2022.
  • 06/2022 [Paper] One collaborative paper “A 28nm 198.9 TOPS/W Fault-Tolerant Stochastic Computing Neural Network Processor” accepted in IEEE Solid-State Circuits Letters 2022.
  • 05/2022 [Award] Yanzhi is awarded Faculty Fellow in College of Engineering.
  • 05/2022 [Talk] Yanzhi remotely presented compression-compilation co-design for real-time DNN acceleration to Bose Inc.
  • 05/2022 [Grant] Yanzhi’s group has received a research gift grant from Snap Inc. U.S. Thanks Snap!
  • 05/2022 [Paper] One collaborative paper “Coarsening the granularity: towards structurally sparse lottery tickets” accepted in ICML 2022.
  • 05/2022 [Paper] One collaborative paper “Neural Network-based OFDM Receiver for Resource Constrained IoT Devices” accepted in IEEE Internet of Things Magazine 2022.
  • 04/2022 [Grant] We are awarded the $15M NSF Expedition Award for “Expeditions: DISCoVER: Design and Integration of Superconducting Computation for Ventures beyond Exascale Realization.” Thanks NSF!
  • 04/2022 [Student] Ph.D. student Xiaolong Ma has accepted an offer as a Tenure-Track Assistant Professor in the Department of Electrical and Computer Engineering at Clemson University, starting in Fall 2022.
  • 04/2022 [Paper] Three leading papers accepted in IJCAI 2022 from our group, including “Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization”, “Real-Time Portrait Stylization on the Edge”, and “Learning to generate image source-agnostic universal adversarial perturbations”.
  • 04/2022 [Committee] Yanzhi will serve as committee (ERC) member of HPCA 2022.
  • 04/2022 [Committee] Yanzhi will serve as Associate Editor of IEEE TCAS-II.
  • 04/2022 [Talk] Yanzhi remotely presented compression-compilation co-design for real-time DNN acceleration to Army Research Office.
  • 03/2022 [Committee] Yanzhi will serve as committee member of NeurIPS 2022.
  • 03/2022 [Paper] Mengshu and Zhengang’s work “Hardware-Friendly Acceleration for Deep Neural Networks with Micro-Structured Compression” accepted in FCCM 2022.
  • 03/2022 [Paper] One collaborative paper “Enabling Level-4 Autonomous Driving on a Single $1k Off-the-Shelf Card” accepted in RTAS (Industry Paper) 2022.
  • 03/2022 [Paper] One collaborative paper “Optimizing data layout for training deep neural networks” accepted in the Proc. of WWW, 2022.
  • 03/2022 [Panel] Yanzhi serves as a panelist at the MHI Scholar Panel at USC, 2022.
  • 02/2022 [Paper] Three leading papers accepted in DAC 2022 from our group, including “TAAS: A Timing-Aware Analytical Strategy for AQFP-Capable Placement Automation”, “FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization”, and “Hardware-efficient stochastic rounding unit design for DNN training”.
  • 01/2022 [Paper] Three papers (leading or collaborative) accepted in ICLR 2022 from our group, including “F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization”, “Effective Model Sparsification by Scheduled Grow-and-Prune Methods”, and “Reverse Engineering of Imperceptible Adversarial Image Perturbations”.
  • 01/2022 [Paper] One collaborative paper “DCT-RAM: A Driver-Free Process-In-Memory 8T SRAM Macro with Multi-Bit Charge-Domain Computation and Time-Domain Quantization” accepted in CICC 2022.

Research Sponsors: