Journal of Frontiers of Computer Science and Technology ›› 2022, Vol. 16 ›› Issue (5): 991-1007.DOI: 10.3778/j.issn.1673-9418.2110022
• Surveys and Frontiers •
YANG Gang1, ZHANG Yushu1, SONG Zhen2,+
Received: 2021-10-13
Revised: 2022-01-06
Online: 2022-05-01
Published: 2022-05-19
About author:
YANG Gang, born in 1977, native of Changzhi, Shanxi, Ph.D., associate professor, member of CCF. His research interests include computer graphics, virtual reality, etc.
Corresponding author:
+ E-mail: songzhen@zhongxi.cn
YANG Gang, ZHANG Yushu, SONG Zhen. Human Action Recognition and Evaluation—Differences, Connections and Research Progress[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(5): 991-1007.
URL: http://fcst.ceaj.org/EN/10.3778/j.issn.1673-9418.2110022
Dataset | Year | Classes | Samples | Modality | Description |
---|---|---|---|---|---|
UCF101 [27] | 2012 | 101 | 13 320 | RGB | Video clips collected from BBC/ESPN broadcast channels and YouTube |
HMDB51 [28] | 2011 | 51 | 6 766 | RGB | Everyday human actions collected from various Internet sources and digital videos |
YouTube-8M | 2016 | 4 716 | 8 000 000 | RGB | 8 million YouTube videos with video-level annotations labeled with 4 800 knowledge-graph entities |
MuHAVi [29] | 2010 | 17 | 1 904 | RGB | Human action videos, including manually annotated silhouette data |
ActivityNet | 2016 | 200 | 20 000 | RGB | 200 everyday activity classes, about 700 h of video in total, averaging 1.5 action annotations per video |
MSR Action 3D [30] | 2010 | 20 | 567 | RGB+D & skeleton | 20 actions performed 2-3 times each by 10 subjects; 567 depth-map sequences at 640×240 resolution |
NTU RGB+D [31] | 2016 | 60 | 56 000 | RGB & RGB+D & skeleton | Three main categories: (1) daily actions; (2) health-related actions; (3) two-person interactions |
NTU RGB+D 120 [32] | 2019 | 120 | 114 480 | RGB & RGB+D & skeleton | Three main categories: (1) daily actions; (2) health-related actions; (3) two-person interactions |
G3D [33] | 2012 | 20 | 10 | RGB & RGB+D & skeleton | A series of gaming actions captured with Microsoft Kinect |
Table 1 Commonly used publicly available action recognition datasets
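For the skeleton modalities in Table 1, a sample is commonly stored as a (frames, joints, 3) array of 3D joint coordinates. The sketch below is illustrative only (the shapes and the choice of root joint are assumptions, not tied to any particular dataset loader) and shows a typical root-centering normalization applied before recognition or evaluation:

```python
import numpy as np

def center_on_root(sequence, root_joint=0):
    """Translate every frame so the chosen root joint (e.g. the pelvis)
    sits at the origin, removing global position so that only the pose
    itself is compared across samples."""
    seq = np.asarray(sequence, dtype=float)  # shape: (frames, joints, 3)
    return seq - seq[:, root_joint:root_joint + 1, :]
```

Relative joint offsets within each frame are unchanged by this translation, which is why it is a safe first preprocessing step for both recognition and evaluation pipelines.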
Category | Method type | Representative works and methods | Advantages and disadvantages |
---|---|---|---|
Statistical-model-based methods | Template matching | ASM, AAM, MHI, MEI; matching based on 2D grid template features; DTW (dynamic time warping) | Simple to implement with low computational cost, but low accuracy and poor robustness |
 | State space: HMMs | HMMs, HHMMs, S-HSMM; double-layer HMM based on multi-scale features | Higher accuracy, but poor robustness and high computational cost |
 | State space: DBN | Du et al. [65] | Higher accuracy with lower computational cost, but high design complexity and poor robustness |
 | Support vector machines | Pontil et al. [69] | High accuracy and low design complexity, but poor robustness and difficult to apply to large-scale training sets |
Deep-learning-based methods | CNN | Mohamed et al. [72] | Very high accuracy, strong robustness, strong on high-dimensional data, but computationally expensive and requires parameter tuning |
 | Two-stream networks | Simonyan et al. [76] | Very high accuracy and strong robustness, but computationally expensive and slow |
 | CNN-LSTM | Donahue et al. [79] (LRCN); unsupervised LSTM model; LSCN [80] | Very high accuracy, strong robustness, and fast |
Table 2 Summary of action classification methods
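The template-matching row of Table 2 lists DTW. As a minimal sketch of the textbook algorithm (not necessarily the exact variant used in the surveyed works), dynamic time warping aligns a test sequence to a reference template non-linearly in time and scores the match by the accumulated per-frame cost:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping between two 1-D sequences.
    Returns the minimal accumulated |a[i] - b[j]| cost over all
    monotonic alignments of the two sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # skip a frame of b
                                 cost[i, j - 1],      # skip a frame of a
                                 cost[i - 1, j - 1])  # match both frames
    return cost[n, m]
```

Because the alignment is non-linear, the same motion performed at a slower tempo still matches its template with low cost, which is exactly why DTW is attractive for template-based action matching despite its low accuracy on varied data.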
Method category | Work | Evaluated action | Criterion/method |
---|---|---|---|
Visualization tools for action evaluation | Chen [14] | Golf swing | Joint angles |
 | Li [15] | Badminton swing | Chebyshev distance |
 | Wang [19] | Peking opera | Separate scoring by experts and machine |
Expert knowledge in feature description | Chen [14] | Golf swing | Joint angles |
 | Zhang et al. [82] | Competitive aerobics | Human-body dynamics |
 | Alexiadis et al. [83] | Dance | Quaternion features |
 | Patrona et al. [26] | Medical training | Dynamic weighting, kinetic-energy descriptors |
Action specifications defined from expert knowledge | Li [84] | Developmental coordination disorder | CNN with temporal filtering |
 | Richter et al. [85] | Hip abduction, hip extension and hip flexion | Rule-based and label-based |
 | Xu [86] | 24-form Tai Chi | CCA |
Big-data-based action evaluation | Lv et al. [16] | Gymnastics | Big data |
Table 3 Summary of action evaluation methods
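Two of the criteria in Table 3, joint angles and the Chebyshev distance, are simple enough to state directly. The sketch below is a hedged illustration (the function names and the per-joint (x, y, z) representation are assumptions for this example, not the surveyed systems' actual interfaces):

```python
import numpy as np

def joint_angle(parent, joint, child):
    """Angle in degrees at `joint` between the segments joint->parent and
    joint->child, e.g. the elbow angle from shoulder, elbow and wrist."""
    u = np.asarray(parent, float) - np.asarray(joint, float)
    v = np.asarray(child, float) - np.asarray(joint, float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def chebyshev_pose_distance(pose_a, pose_b):
    """Chebyshev (L-infinity) distance between two poses given as (J, 3)
    joint arrays: the single largest per-coordinate joint deviation
    dominates the score."""
    diff = np.asarray(pose_a, float) - np.asarray(pose_b, float)
    return float(np.max(np.abs(diff)))
```

The Chebyshev distance makes one badly placed joint dominate the score, which suits evaluation settings where any single large deviation from the reference posture should be flagged.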
[1] DURIC Z, GRAY W, HEISHMAN R, et al. Integrating perceptual and cognitive modeling for adaptive and intelligent human-computer interaction[J]. Proceedings of the IEEE, 2002, 90(7): 1272-1289.
[2] KWAK S, HAN B, HAN J H. Scenario-based video event recognition by constraint flow[C]// Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, Jun 20-25, 2011. Washington: IEEE Computer Society, 2011: 3345-3352.
[3] GAUR U, ZHU Y, SONG B, et al. A “string of feature graphs” model for recognition of complex activities in natural videos[C]// Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Nov 6-13, 2011. Washington: IEEE Computer Society, 2011: 2595-2602.
[4] PARK S, AGGARWAL J K. Recognition of two-person interactions using a hierarchical Bayesian network[C]// Proceedings of the 2003 ACM SIGMM International Workshop on Video Surveillance. New York: ACM, 2003: 65-76.
[5] JUNEJO I, DEXTER E, LAPTEV I, et al. View-independent action recognition from temporal self-similarities[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(1): 172-185.
[6] THANGALI A, NASH J P, SCLAROFF S, et al. Exploiting phonological constraints for handshape inference in ASL video[C]// Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, Jun 20-25, 2011. Washington: IEEE Computer Society, 2011: 521-528.
[7] FAN J C, ZHOU G M. The research of gesture recognition based on Kinect skeleton tracking technology[J]. Journal of Anhui Agricultural Sciences, 2014, 42(11): 3444-3446.
[8] COOPER H, BOWDEN R. Large lexicon detection of sign language[C]// LNCS 4796: Proceedings of the 2007 IEEE International Workshop on Human-Computer Interaction, Rio de Janeiro, Oct 20, 2007. Berlin, Heidelberg: Springer, 2007: 88-97.
[9] CHANG Y J, CHEN S F, HUANG J D. A Kinect-based system for physical rehabilitation: a pilot study for young adults with motor disabilities[J]. Research in Developmental Disabilities, 2011, 32: 2566-2570.
[10] LI S B. Algorithm of human posture action recognition and imitation for robots[D]. Shanghai: Shanghai Jiaotong University, 2013.
[11] REHG J M, ABOWD G D, ROZGA A, et al. Decoding children’s social behavior[C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, Jun 23-28, 2013. Washington: IEEE Computer Society, 2013: 3414-3421.
[12] PRESTI L L, SCLAROFF S, ROZGA A. Joint alignment and modeling of correlated behavior streams[C]// Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Dec 1-8, 2013. Washington: IEEE Computer Society, 2013: 730-737.
[13] JOHANSSON G. Visual perception of biological motion and a model for its analysis[J]. Perception & Psychophysics, 1973, 14: 201-211.
[14] CHEN X M. An action evaluating system based on 3D human posture[D]. Hangzhou: Zhejiang University, 2018.
[15] LI K. Capture, recognition and analysis of badminton player’s swing[D]. Chengdu: University of Electronic Science and Technology of China, 2017.
[16] LV M, WAN L C. Design of sports competition aided evaluation system based on big data and motion recognition algorithm[J]. Electronic Design Engineering, 2019, 27(16): 6-10.
[17] KAO C I, SPIRO I, SEUNGKYU L, et al. Dancing with Turks[C]// Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, Brisbane, Oct 26-30, 2015. New York: ACM, 2015: 241-250.
[18] SCOTT J, COLLINS R, FUNK C, et al. 4D model-based spatio-temporal alignment of scripted Taiji Quan sequences[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Oct 22-29, 2017. Washington: IEEE Computer Society, 2017: 795-804.
[19] WANG T J. Using 3D motion capture to study Chinese opera performance movements[J]. Research in Arts Education, 2018(35): 69-92.
[20] XU G Y, CAO Y Y. Action recognition and activity understanding: a review[J]. Journal of Image and Graphics, 2009, 14(2): 189-195.
[21] WU D, SHARMA N, BLUMENSTEIN M. Recent advances in video-based human action recognition using deep learning: a review[C]// Proceedings of the 2017 International Joint Conference on Neural Networks, Anchorage, May 14-19, 2017. Piscataway: IEEE, 2017: 2865-2872.
[22] PRESTI L L, MARCO L C. 3D skeleton-based human action classification: a survey[J]. Pattern Recognition, 2016, 53: 130-147.
[23] HUANG G F, LI Y. A survey of human action and pose recognition[J]. Computer Knowledge and Technology, 2013, 9(1): 133-135.
[24] TIAN Y, LI F D. Research review on human body gesture recognition based on depth data[J]. Computer Engineering and Applications, 2020, 56(4): 1-8.
[25] HUANG Q Q, ZHOU F Y, LIU M Z. Survey of human action recognition algorithms based on video[J]. Application Research of Computers, 2020, 37(11): 3213-3219.
[26] PATRONA F, CHATZITOFIS A, ZARPALAS D, et al. Motion analysis: action detection, recognition and evaluation based on motion capture data[J]. Pattern Recognition, 2018, 76: 612-622.
[27] SOOMRO K, ZAMIR A R, SHAH M, et al. UCF101: a dataset of 101 human actions classes from videos in the wild[J]. arXiv:1212.0402, 2012.
[28] KUEHNE H, JHUANG H, GARROTE E, et al. HMDB: a large video database for human motion recognition[C]// Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Nov 6-13, 2011. Washington: IEEE Computer Society, 2011: 2556-2563.
[29] SINGH S, VELASTIN S A, RAGHEB H. MuHAVi: a multi-camera human action video dataset for the evaluation of action recognition methods[C]// Proceedings of the 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance, Washington, Aug 29-Sep 1, 2010. Washington: IEEE Computer Society, 2010: 48-55.
[30] LI W Q, ZHANG Z Y, LIU Z C, et al. Action recognition based on a bag of 3D points[C]// Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, Jun 13-18, 2010. Washington: IEEE Computer Society, 2010: 9-14.
[31] SHAHROUDY A, LIU J, TIAN-TSONG N G, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 1010-1019.
[32] LIU J, SHAHROUDY A, PEREZ M, et al. NTU RGB+D 120: a large-scale benchmark for 3D human activity understanding[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 42: 2684-2701.
[33] BLOOM V, MAKRIS D, ARGYRIOU V. G3D: a gaming action dataset and real time action recognition evaluation framework[C]// Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Providence, Jun 16-21, 2012. Washington: IEEE Computer Society, 2012: 7-12.
[34] DABOV K, FOI A, KATKOVNIK V, et al. Image denoising by sparse 3D transform-domain collaborative filtering[J]. IEEE Transactions on Image Processing, 2007, 16: 2080-2095.
[35] MAGGIONI M, BORACCHI G, FOI A. Video denoising using separable 4D nonlocal spatiotemporal transforms[J]. Proceedings of SPIE-The International Society for Optical Engineering, 2011, 7870(3): 1-12.
[36] DAVY A, EHRET T, MOREL J M, et al. Non-local video denoising by CNN[J]. arXiv:1811.12758, 2018.
[37] ARIAS P, MOREL J M. Video denoising via empirical Bayesian estimation of space-time patches[J]. Journal of Mathematical Imaging & Vision, 2018, 60(1): 70-93.
[38] TASSANO M, DELON J, VEIT T. DVDNet: a fast network for deep video denoising[C]// Proceedings of the 2019 IEEE International Conference on Image Processing, Taipei, China, Sep 22-25, 2019. Piscataway: IEEE, 2019: 1805-1809.
[39] TASSANO M, DELON J, VEIT T. FastDVDnet: towards real-time deep video denoising without flow estimation[J]. arXiv:1907.01361v2, 2019.
[40] PING W, ZHENG N, ZHAO Y, et al. Concurrent action detection with structural prediction[C]// Proceedings of the 2013 International Conference on Computer Vision, Sydney, Dec 1-8, 2013. Washington: IEEE Computer Society, 2013: 3136-3143.
[41] WU D, SHAO L. Leveraging hierarchical parametric networks for skeletal joints based action segmentation and recognition[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, Jun 23-28, 2014. Washington: IEEE Computer Society, 2014: 724-731.
[42] WANG C, WANG Y, YUILLE A L. An approach to pose-based action recognition[C]// Proceedings of the 2013 Conference on Computer Vision and Pattern Recognition, Portland, Jun 23-28, 2013. Washington: IEEE Computer Society, 2013: 915-922.
[43] SEDMIDUBSKY J, ELIAS P, BUDIKOVA P, et al. Content-based management of human motion data: survey and challenges[J]. IEEE Access, 2021, 9: 64241-64255.
[44] OJALA T, PIETIKÄINEN M, MÄENPÄÄ T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24: 971-987.
[45] TANG C, TANG L G, LIU B. A survey of image feature detecting and matching methods[J]. Journal of Nanjing University of Information Science & Technology, 2020, 12(3): 261-273.
[46] BOBICK A F, DAVIS J W. The recognition of human movement using temporal templates[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23: 257-267.
[47] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, Jun 20-25, 2005. Washington: IEEE Computer Society, 2005: 886-893.
[48] LAPTEV I. On space-time interest points[J]. International Journal of Computer Vision, 2005, 64(2/3): 107-123.
[49] LAPTEV I, MARSZALEK M, SCHMID C, et al. Learning realistic human actions from movies[C]// Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Anchorage, Jun 24-26, 2008. Washington: IEEE Computer Society, 2008: 1-8.
[50] WANG H, ULLAH M M, KLÄSER A, et al. Evaluation of local spatio-temporal features for action recognition[C]// Proceedings of the 2009 British Machine Vision Conference, London, Sep 7-10, 2009. London: The British Machine Vision Association, 2009: 1-11.
[51] BAUMANN J, WESSEL R, KRÜGER B, et al. Action graph: a versatile data structure for action recognition[C]// Proceedings of the 9th International Conference on Computer Graphics Theory and Applications, Lisbon, Jan 5-8, 2014: 325-334.
[52] BARNACHON M, BOUAKAZ S, BOUFAMA B, et al. A real-time system for motion retrieval and interpretation[J]. Pattern Recognition Letters, 2013, 34(15): 1789-1798.
[53] MASOOD S Z, TAPPEN M F, et al. Exploring the trade-off between accuracy and observational latency in action recognition[J]. International Journal of Computer Vision, 2013, 101(3): 420-436.
[54] MÜLLER M, RÖDER T, CLAUSEN M. Efficient content-based retrieval of motion capture data[J]. ACM Transactions on Graphics, 2005, 24(3): 677-685.
[55] CHERON G, LAPTEV I, SCHMID C. P-CNN: pose-based CNN features for action recognition[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Dec 7-13, 2015. Washington: IEEE Computer Society, 2015: 3218-3226.
[56] YAN S, XIONG Y, LIN D, et al. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence, the 30th Innovative Applications of Artificial Intelligence, and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, Feb 2-7, 2018. Menlo Park: AAAI, 2018: 7444-7452.
[57] REN B, LIU M, DING R, et al. A survey on 3D skeleton-based action recognition using learning method[J]. arXiv:2002.05907, 2020.
[58] ZHANG P, XUE J, LAN C, et al. Adding attentiveness to the neurons in recurrent neural networks[C]// LNCS 11213: Proceedings of the 15th European Conference on Computer Vision, Munich, Sep 8-14, 2018. Cham: Springer, 2018: 136-152.
[59] LANITIS A, TAYLOR C J, COOTES T F. Automatic interpretation and coding of face images using flexible models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(7): 743-756.
[60] COOTES T F, EDWARDS G J, TAYLOR C J. Active appearance models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(6): 681-685.
[61] BOBICK A F, WILSON A D. A state-based approach to the representation and recognition of gesture[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(12): 1325-1337.
[62] YAMATO J, OHYA J, ISHII K. Recognizing human action in time sequential images using hidden Markov model[C]// Proceedings of the 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Champaign, Jun 15-18, 1992. Washington: IEEE Computer Society, 1992: 379-385.
[63] NGUYEN N T, PHUNG D Q, VENKATESH S, et al. Learning and detecting activities from movement trajectories using the hierarchical hidden Markov model[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, Jun 20-26, 2005. Washington: IEEE Computer Society, 2005: 955-960.
[64] MEI X, HU S, XU S S, et al. Multi-scale feature based double-layer HMM and its application in behavior recognition[J]. CAAI Transactions on Intelligent Systems, 2012, 7(6): 512-517.
[65] DU Y T, CHEN F, XU W L, et al. Recognizing interaction activities using dynamic Bayesian network[C]// Proceedings of the 18th International Conference on Pattern Recognition, Hong Kong, China, Aug 20-24, 2006. Washington: IEEE Computer Society, 2006: 618-621.
[66] OLIVER N, HORVITZ E. A comparison of HMMs and dynamic Bayesian networks for recognizing office activities[C]// LNCS 3538: Proceedings of the 10th International Conference on User Modeling, Edinburgh, Jul 24-29, 2005. Berlin, Heidelberg: Springer, 2005: 199-209.
[67] ZHOU Z H. Machine learning[M]. Beijing: Tsinghua University Press, 2016.
[68] LI H. Statistical learning methods[M]. Beijing: Tsinghua University Press, 2012.
[69] PONTIL M, VERRI A. Support vector machines for 3D object recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(6): 637-646.
[70] MANZI A, CAVALLO F, DARIO P. A 3D human posture approach for activity recognition based on depth camera[C]// LNCS 9914: Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Oct 8-10, 15-16, 2016. Cham: Springer, 2016: 432-447.
[71] SCHÜLDT C, LAPTEV I, CAPUTO B. Recognizing human actions: a local SVM approach[C]// Proceedings of the 17th International Conference on Pattern Recognition, Cambridge, Aug 23-26, 2004. Washington: IEEE Computer Society, 2004: 32-36.
[72] MOHAMED E, ISMAIL C, WASSIM B, et al. Posture recognition using an RGB-D camera: exploring 3D body modeling and deep learning approaches[C]// Proceedings of the 2018 IEEE Life Sciences Conference, Montreal, Oct 28-30, 2018. Piscataway: IEEE, 2018: 69-72.
[73] LIU S L, GU J H, WANG H Y, et al. Human behavior recognition based on associative partition and ST-GCN[J]. Computer Engineering and Applications, 2021, 57(13): 168-175.
[74] JI S W, XU W, YANG M, et al. 3D convolutional neural networks for human action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1): 221-231.
[75] LI Y X, XIE L B. Human action recognition based on depth motion map and dense trajectory[J]. Computer Engineering and Applications, 2020, 56(3): 194-200.
[76] SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[C]// Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, Montreal, Dec 8-13, 2014: 568-576.
[77] FEICHTENHOFER C, PINZ A, ZISSERMAN A, et al. Convolutional two-stream network fusion for video action recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Jun 27-30, 2016. Washington: IEEE Computer Society, 2016: 1933-1941.
[78] SHI X B, LI Y Y, LIU F, et al. T-STAM: end-to-end action recognition model based on two-stream network with spatio-temporal attention mechanism[J]. Application Research of Computers, 2020, 38(3): 1235-1239.
[79] DONAHUE J, HENDRICKS L A, GUADARRAMA S, et al. Long-term recurrent convolutional networks for visual recognition and description[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 677-691.
[80] YANG K, WANG J Y, QI Q, et al. LSCN: concerning long and short sequence together for action recognition[J]. Acta Electronica Sinica, 2020, 48(3): 503-509.
[81] PENG W, HONG X P, CHEN H Y, et al. Learning graph convolutional network for skeleton-based human action recognition by neural searching[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence, the 32nd Innovative Applications of Artificial Intelligence Conference, the 10th AAAI Symposium on Educational Advances in Artificial Intelligence, New York, Feb 7-12, 2020. Menlo Park: AAAI, 2020: 2669-2676.
[82] ZHANG X Y, LIU L, ZHAO X L. Kinematic analysis on different technical characteristics of C289 in aerobic gymnastics[J]. Journal of Beijing Sport University, 2017, 40(10): 99-105.
[83] ALEXIADIS D S, DARAS P. Quaternionic signal processing techniques for automatic evaluation of dance performances from MoCap data[J]. IEEE Transactions on Multimedia, 2014, 16(5): 1391-1406.
[84] LI R M. Research on fine classification and evaluation of human action based on visual data[D]. Beijing: University of Chinese Academy of Sciences, 2020.
[85] RICHTER J, WIEDE C, HEINKEL U, et al. Motion evaluation of therapy exercises by means of skeleton normalisation, incremental dynamic time warping and machine learning: a comparison of a rule-based and a machine-learning-based approach[C]// Proceedings of VISIGRAPP 14th International Conference on Computer Vision Theory and Applications, Prague, Feb 25-27, 2019: 497-504.
[86] XU Z. Taiji boxing assist teaching and evaluation method based on whole body motion capture[D]. Zhengzhou: Zhengzhou University, 2018.