Yizeng Han

Ph.D. Candidate, advised by Prof. Gao Huang and Prof. Shiji Song.
Department of Automation, Tsinghua University.


  • Ph.D., Tsinghua University, 2018 - present.
  • B.E., Tsinghua University, 2014 - 2018.

  • Research Experience

  • Intern, Georgia Institute of Technology, 06/2017 - 08/2017

  • Research Interest

    My research focuses on machine learning and computer vision, in particular deep learning, efficient inference and dynamic neural networks.


  • 10/2022: Awarded the National Scholarship by the Ministry of Education of China.
  • 09/2022: Our work (latency-aware spatial-wise dynamic networks) was accepted to NeurIPS 2022.
  • 07/2022: Our work (learning to weight samples for dynamic early-exiting networks) was accepted to ECCV 2022.

  • Recent Publications & Preprints (Google Scholar)

    Dynamic Neural Networks: A Survey. [BAAI Community][Synced (机器之心) online talk][Bilibili][slides]
    Yizeng Han*, Gao Huang*, Shiji Song, Le Yang, Honghui Wang, Yulin Wang.
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, IF=24.314), 2021
    In this survey, we comprehensively review the rapidly developing area of dynamic neural networks. The important research problems, e.g., architecture design, decision-making schemes, and optimization techniques, are reviewed systematically. We also discuss open problems in this field together with interesting directions for future research.
    Latency-aware Spatial-wise Dynamic Networks. [code] [slide]
    Yizeng Han*, Zhihang Yuan*, Yifan Pu*, Chenhao Xue, Shiji Song, Guangyu Sun, Gao Huang.
    Conference on Neural Information Processing Systems (NeurIPS), 2022
    In this paper, we use a latency predictor to guide both the algorithm design and the scheduling optimization of spatial-wise dynamic networks on various hardware platforms. We show that "coarse-grained" spatially adaptive computation effectively reduces memory access cost and is more efficient than pixel-level dynamic operations.
    Learning to Weight Samples for Dynamic Early-exiting Networks. [code][slides][poster][video]
    Yizeng Han*, Yifan Pu*, Zihang Lai, Chaofei Wang, Shiji Song, Junfen Cao, Wenhui Huang, Chao Deng, Gao Huang.
    European Conference on Computer Vision (ECCV), 2022
    In this paper, we propose to bridge the gap between training and testing of dynamic early-exiting networks by sample weighting. By bringing the adaptive behavior during inference into the training phase, we show that the proposed weighting mechanism consistently improves the trade-off between classification accuracy and inference efficiency.
    Spatially Adaptive Feature Refinement for Efficient Inference.
    Yizeng Han, Gao Huang, Shiji Song, Le Yang, Yitian Zhang, Haojun Jiang.
    IEEE Transactions on Image Processing (TIP, IF=11.041), 2021
    We propose to perform efficient inference by adaptively fusing information from two branches: one conducts standard convolution on input features at a lower resolution, and the other selectively refines a set of regions at the original resolution. Experiments on classification, object detection and semantic segmentation validate that the proposed spatially adaptive feature refinement (SAR) consistently improves network performance and efficiency.
    Resolution Adaptive Networks for Efficient Inference. [code]
    Le Yang*, Yizeng Han*, Xi Chen*, Shiji Song, Jifeng Dai, Gao Huang.
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
    We focus on the spatial redundancy of images, and propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying “easy” inputs, while only some “hard” samples need spatially detailed information. Empirically, we demonstrate the effectiveness of the proposed RANet in both the anytime prediction setting and the budgeted batch classification setting.
    Adaptive Focus for Efficient Video Recognition. [code]
    Yulin Wang, Zhaoxi Chen, Haojun Jiang, Shiji Song, Yizeng Han, Gao Huang.
    IEEE/CVF International Conference on Computer Vision (ICCV, Oral), 2021
    In this paper, we explore the spatial redundancy in video recognition with the aim of improving computational efficiency. Extensive experiments on five benchmark datasets, i.e., ActivityNet, FCVID, Mini-Kinetics, Something-Something V1&V2, demonstrate that our method is significantly more efficient than competitive baselines.
    Towards Learning Spatially Discriminative Feature Representations.
    Chaofei Wang*, Jiayu Xiao*, Yizeng Han, Qisen Yang, Shiji Song, Gao Huang.
    IEEE/CVF International Conference on Computer Vision (ICCV), 2021
    We propose CAM-loss to constrain the embedded feature maps using class activation maps (CAMs), which indicate the spatially discriminative regions of an image for particular categories. Experimental results show that CAM-loss is applicable to a variety of network structures and can be combined with mainstream regularization methods to improve image classification performance.
    * Equal Contribution.


    • National Scholarship, Ministry of Education of China, 2022
    • Comprehensive Merit Scholarship, Tsinghua University, 2016 and 2017
    • Academic Excellence Scholarship, Tsinghua University, 2015


    • hanyz18 at mails dot tsinghua dot edu dot cn.
    • 616 Centre Main Building, Tsinghua University, Beijing 100084, China.