Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (2): 323-336. DOI: 10.3778/j.issn.1673-9418.2106004

• Surveys and Frontiers •

Progress on Human-Object Interaction Detection of Deep Learning

RUAN Chenzhao, ZHANG Xiangsen, LIU Ke, ZHAO Zengshun+

  1. College of Electronic and Information Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China
  • Received: 2021-06-01 Revised: 2021-08-06 Online: 2022-02-01 Published: 2021-08-19
  • About author: RUAN Chenzhao, born in 1996, M.S. candidate. His research interests include computer vision and image processing.
    ZHANG Xiangsen, born in 1997, M.S. candidate. His research interests include deep learning and image processing.
    LIU Ke, born in 1998, M.S. candidate. His research interests include deep learning and image processing.
    ZHAO Zengshun, born in 1975, Ph.D., associate professor. His research interests include computer vision, intelligent robots and machine learning.
  • Supported by:
    Postdoctoral Science Foundation Funded Project of China (2015T80717); Natural Science Foundation of Shandong Province (ZR2020MF086)


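  • Corresponding author: + E-mail:
  • About author: RUAN Chenzhao (born 1996), male, from Zibo, Shandong, M.S. candidate. His research interests include computer vision and image processing.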


Human-object interaction (HOI) detection takes an image as input and detects the people and objects that interact in it, together with the verbs that describe their interactions. It is a newer computer vision task, alongside object detection, image segmentation and object tracking, that aims at a deeper understanding of images. To fill the gap left by the lack of a review of deep-learning-based HOI detection, this paper classifies and analyzes existing HOI detection methods. First, the early methods are briefly summarized; two-stage and one-stage methods are then surveyed according to model structure, and representative algorithms are introduced and analyzed. The focus is on two-stage methods, which are divided into three categories: attention-aware methods, graph-model methods, and methods based on pose and body parts. The basic ideas, advantages and disadvantages of each category are summarized. In addition, the evaluation metrics, the benchmark datasets for HOI detection and the experimental results of most existing methods are presented in detail, and the results obtained by the different types of methods are described. Finally, the main challenges facing HOI detection are summarized and future research directions are discussed.

Key words: human-object interaction (HOI) detection, computer vision, object detection, deep learning



