Electronic Science and Technology, 2023, Vol. 36, Issue (1): 28-37. doi: 10.16180/j.cnki.issn1007-7820.2023.01.005
ZHANG Manjie1, YANG Fangyan1, JI Yunfeng2
Received: 2021-06-03
Online: 2023-01-15
Published: 2023-01-17
About the authors: ZHANG Manjie (1997-), female, master's student. Research interest: digital image processing. | JI Yunfeng (1990-), male, PhD, lecturer. Research interest: table tennis robots.
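Abstract:
Given a single RGB image, human pose estimation localizes body keypoints to estimate the position of the body and of its joints. Ball games are fast-paced, and judging the legality of an athlete's technique by subjective observation cannot avoid errors. This paper therefore uses athlete pose analysis based on human pose estimation to assist training and officiating, which effectively avoids the mislocalization of athlete poses that subjective human judgment causes in traditional systems. Current research on human pose estimation falls into two main categories: methods based on traditional algorithms and methods based on deep learning. Deep learning methods are further divided into single-person and multi-person pose estimation. Deep-learning-based pose estimation builds neural networks that learn to extract features from images, and the resulting methods are compared and analyzed on the mainstream pose estimation datasets. Applying human pose estimation to ball sports provides a scientific reference for athletes' daily training and helps ensure fairness and impartiality in competition.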
ZHANG Manjie, YANG Fangyan, JI Yunfeng. Research Progress of Body Posture Estimation in Ball Games[J]. Electronic Science and Technology, 2023, 36(1): 28-37.
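Table 1
Research progress in sports video analysis
Source | Year | Sport | Detection target | Objective |
---|---|---|---|---|
Ref. [4] | 2008 | Soccer | Color features | Shot classification |
Ref. [5] | 2012 | Soccer, golf | Color features | Video sequence classification |
Ref. [6] | 2013 | Basketball | Color and region features | Video segmentation |
Ref. [7] | 2014 | Soccer | Optical flow and color features | Video segmentation, event classification |
Ref. [8] | 2016 | Soccer | Appearance and motion models | Occlusion handling, multi-object tracking |
Ref. [9] | 2017 | Soccer | Video and external text information | Video analysis using high-level features |
Ref. [10] | 2019 | Basketball | Video classification and contextual information | Player movement tracking and analysis |
Ref. [11] | 2018 | Basketball | Text and video frames | Slow-motion extraction |
Ref. [12] | 2018 | Swimming | Athlete joints | Athlete pose rectification |
Ref. [13] | 2019 | - | Athlete pose | Athlete action recognition |
Ref. [14] | 2020 | Table tennis | Player 2D pose | Predicting the ball's landing point |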
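Table 2
Human pose estimation datasets
Year | Dataset | Number of samples | Annotation | Scenario |
---|---|---|---|---|
2010 | LSP | Training: 1,000; testing: 1,000 | 14 annotated joints; 2,000 pose annotations | Single-person |
2013 | FLIC | Training: 3,987; testing: 1,016 | 10 upper-body joints | Single-person |
2014 | MPII | 2.5×10⁴ | 16 joints; covers 410 human activities; more than 40,000 people | Single-person/multi-person |
2014 | MSCOCO | More than 3.3×10⁵ | 17 joints; 100,000 people | Multi-person |
2017 | AI Challenger | Training: 2.1×10⁵; validation: 3×10⁴; testing: 3×10⁴ | 14 annotated joints; currently the largest image dataset for human pose estimation | Multi-person |
2019 | CrowdPose | Training: 10×10³; validation: 2×10³; testing: 8×10³ | 14 annotated joints; 80,000 people; suited to crowded scenes | Multi-person |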
Table 3
Comparison of single-person human pose estimation methods (performance in %)
| Method | Network | Author | FLIC (Elbow) | FLIC (Wrist) | LSP (PCP) | MPII (PCKh@0.5) |
|---|---|---|---|---|---|---|
| Coordinate regression | DeepPose | Toshev | 92.3 | 82.0 | 61.0 | - |
| | IEF | Carreira | - | - | 72.5 | 81.3 |
| Heatmap detection | CNN and graphical model | Tompson | 95.2 | 91.2 | 67.2 | - |
| | Stacked hourglass network | Newell | 99.0 | 97.0 | - | 90.9 |
| | PRMs | Yang | - | - | 93.9 | 92.0 |
| | HRNet | Sun | - | - | - | 92.3 |
| | WASP | UniPose | - | - | 72.8 | 92.7 |
| Hybrid of regression and detection | Coordinate Net and Heatmap Net in series | Bulat | - | - | 90.7 | 89.7 |
| | DS-CNN | Fan | - | - | 84.0 | - |
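The PCKh@0.5 figures reported for MPII in the table above count a predicted joint as correct when it lies within half the head segment length of its ground-truth position. The following is a minimal sketch of that rule, not the official MPII evaluation code; the function name `pckh`, its parameters, and the sample coordinates are hypothetical and chosen only for illustration.

```python
import math


def pckh(pred, gt, head_size, alpha=0.5, visible=None):
    """Fraction of annotated joints predicted within alpha * head_size of the ground truth.

    pred, gt: lists of (x, y) joint coordinates in pixels, in the same joint order.
    head_size: head segment length in pixels (assumed to be supplied by the caller).
    visible: optional list of booleans; joints marked False are skipped.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must list the same joints")
    if visible is None:
        visible = [True] * len(gt)
    threshold = alpha * head_size
    hits = 0
    total = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue  # unannotated joints are skipped, not counted as misses
        total += 1
        if math.hypot(px - gx, py - gy) <= threshold:
            hits += 1
    return hits / total if total else float("nan")


# Hypothetical three-joint example with a head segment length of 20 pixels.
gt = [(100.0, 100.0), (150.0, 120.0), (200.0, 180.0)]
pred = [(103.0, 104.0), (150.0, 135.0), (200.0, 181.0)]
score = pckh(pred, gt, head_size=20.0)
```

In this example the joint errors are 5, 15, and 1 pixels against a threshold of 10 pixels, so two of the three joints count as correct. Normalizing by head size rather than by a fixed pixel distance keeps the score comparable across people who appear at different scales in the image.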
Table 4
Comparison of multi-person human pose estimation methods (performance in %)
| Method | Ref. | Multi-person pose estimation method | Main network | MPII (PCKh@0.5) | MSCOCO | mAP |
|---|---|---|---|---|---|---|
| Top-down | [ | RCNN | ResNet-101 | - | - | 64.9 |
| | [32] | CPN | GlobalNet, RefineNet | - | 72.1 | - |
| | [ | HRNet | 3D HRNet | - | - | 83.8 |
| | [ | Faster-RCNN | AlignPS | - | - | 94.0 |
| | [ | OpenPose | CPM | 75.6 | 61.8 | - |
| | [ | CRF | FCN | - | - | 79.1 |
| Bottom-up | [ | HRNet | HigherHRNet | - | 74.9 | - |
| | [ | HRNet | W48 | - | 77.7 | - |
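Average-precision scores on MSCOCO keypoints are conventionally built on object keypoint similarity (OKS), which plays the role that box overlap plays in object detection. Assuming the MSCOCO figures in the table above follow that standard protocol, the sketch below shows how OKS is computed for one person. The function name `oks` and the per-keypoint constants in the example are hypothetical; the real evaluation uses constants published with the dataset.

```python
import math


def oks(pred, gt, visibility, area, kappas):
    """Object keypoint similarity between one predicted and one ground-truth pose.

    pred, gt: lists of (x, y) keypoint coordinates in pixels.
    visibility: per-keypoint flags; keypoints with a flag <= 0 are unlabeled and ignored.
    area: ground-truth object area in squared pixels (the scale term s^2).
    kappas: per-keypoint falloff constants (hypothetical values in the example below).
    """
    total = 0.0
    labeled = 0
    for (px, py), (gx, gy), v, k in zip(pred, gt, visibility, kappas):
        if v <= 0:
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        # Gaussian falloff: exp(-d^2 / (2 * s^2 * k^2))
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        labeled += 1
    return total / labeled if labeled else 0.0


# Hypothetical two-keypoint pose on an object covering 50 x 50 pixels.
gt_pose = [(10.0, 10.0), (20.0, 20.0)]
pred_pose = [(10.0, 10.0), (23.0, 24.0)]
similarity = oks(pred_pose, gt_pose, visibility=[2, 2], area=2500.0, kappas=[0.1, 0.1])
```

A prediction is then treated as a match when its OKS exceeds a chosen threshold, and average precision is accumulated over a range of thresholds. Because the falloff scales with object area, a 5-pixel error costs more on a small, distant player than on one who fills the frame.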
[1] | Zhou Jin, Meng Xiangyin, Ye Meisong. Tennis hawkeye system based on computer vision and LabVIEW platform[J]. Transducer and Microsystem Technologies, 2018, 37(7):102-104. |
[2] | Lai J H, Chien S Y. Tennis video with semantic scalability[C]. San Diego: Proceedings of the IEEE International Symposium on Multimedia, 2009. |
[3] | Eltoukhy M, Asfour S, Thompson C, et al. Evaluation of the performance of digital video analysis of human motion: Dartfish tracking system[J]. International Journal of Scientific and Engineering Research, 2012, 3(3):1-6. |
[4] | Yu Junqing, Wang Ning. Shot classification for soccer video based on sub-window region[J]. Journal of Image and Graphics, 2008, 13(7):152-157. |
[5] | Hanna J, Patlar F, Akbulut A, et al. HMM based classification of sports videos using color feature[C]. Sofia: Proceedings of the IEEE International Conference Intelligent Systems, 2012. |
[6] | Zhu Yingying, Liu Jianwu, Song Na. Analysis and detection of redundancy data on basketball video[J]. Journal of Chinese Computer Systems, 2010, 31(9):1873-1876. |
[7] | Pandya D S, Zaveri M A. A novel framework for semantic analysis of an illumination-variant soccer video[J]. EURASIP Journal on Image and Video Processing, 2014, 2014(1):49-56. doi: 10.1186/1687-5281-2014-49 |
[8] | Baysal S, Duygulu P. Sentioscope: A soccer player tracking system using model field particles[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(7):1350-1362. doi: 10.1109/TCSVT.2015.2455713 |
[9] | Wang Z, Yu J, He Y. Soccer video event annotation by synchronization of attack-defense clips and match reports with coarse-grained time information[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2017, 27(5):1104-1117. doi: 10.1109/TCSVT.2016.2515280 |
[10] | Yoon Y, Hwang H, Choi Y, et al. Analyzing basketball movements and pass relationships using realtime object tracking techniques based on deep learning[J]. IEEE Access, 2019, 7(99):56564-56576. doi: 10.1109/ACCESS.2019.2913953 |
[11] | Chen H T, He Y Z, Hsu C C. Computer-assisted yoga training system[J]. Multimedia Tools and Applications, 2018, 77(18):23969-23991. doi: 10.1007/s11042-018-5721-2 |
[12] | Zecha D, Einfalt M, Eggert C, et al. Kinematic pose rectification for performance analysis and retrieval in sports[C]. Salt Lake City: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018. |
[13] | Kong L, Huang D, Qin J, et al. A joint framework for athlete tracking and action recognition in sports videos[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(2):532-548. doi: 10.1109/TCSVT.2019.2893318 |
[14] | Wu E, Koike H. FuturePong: Real-time table tennis trajectory forecasting using pose prediction network[C]. Honolulu: Proceedings of the Conference on Human Factors in Computing Systems, 2020. |
[15] | Fischler M A, Elschlager R A. The representation and matching of pictorial structures[J]. IEEE Transactions on Computers, 1973, 22(1):67-92. |
[16] | Dalal N, Triggs B. Histograms of oriented gradients for human detection[C]. San Diego: Proceedings of the International Conference on Computer Vision and Pattern Recognition, 2005. |
[17] | Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2):91-110. doi: 10.1023/B:VISI.0000029664.99615.94 |
[18] | Yang Y, Ramanan D. Articulated pose estimation with flexible mixtures-of-parts[C]. Colorado Springs: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2011. |
[19] | Toshev A, Szegedy C. DeepPose: Human pose estimation via deep neural networks[C]. Portland: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013. |
[20] | Carreira J, Agrawal P, Fragkiadaki K, et al. Human pose estimation with iterative error feedback[C]. Las Vegas: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016. |
[21] | Tompson J, Jain A, Lecun Y, et al. Joint training of a convolutional network and a graphical model for human pose estimation[C]. Columbus: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014. |
[22] | Li S Z. Markov random field modeling in image analysis[M]. New York: Springer-Verlag, 2001. |
[23] | Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation[C]. Amsterdam: Proceedings of the European Conference on Computer Vision, 2016. |
[24] | Yang W, Li S, Ouyang W, et al. Learning feature pyramids for human pose estimation[C]. Venice: Proceedings of the IEEE International Conference on Computer Vision, 2017. |
[25] | Sun K, Xiao B, Liu J, et al. Deep high-resolution representation learning for human pose estimation[C]. Long Beach: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019. |
[26] | Zhang J, Chen Z, Tao D. Towards high performance human keypoint detection[C]. Seattle: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020. |
[27] | Artacho B, Savakis A. UniPose: Unified human pose estimation in single images and videos[C]. Seattle: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020. |
[28] | Bulat A, Tzimiropoulos G. Human pose estimation via convolutional part heatmap regression[C]. Amsterdam: Proceedings of the European Conference on Computer Vision, 2016. |
[29] | Fan X, Zheng K, Lin Y, et al. Combining local appearance and holistic view: Dual-source deep neural networks for human pose estimation[C]. Boston: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015. |
[30] | Papandreou G, Zhu T, Kanazawa N, et al. Towards accurate multi-person pose estimation in the wild[C]. Honolulu: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. |
[31] | Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031 pmid: 27295650 |
[32] | Chen Y, Wang Z, Peng Y, et al. Cascaded pyramid network for multi-person pose estimation[C]. Salt Lake City: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018. |
[33] | Wang M, Tighe J, Modolo D. Combining detection and tracking for human pose estimation in videos[C]. Seattle: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020. |
[34] | Huang J, Zhu Z, Guo F, et al. The devil is in the details: Delving into unbiased data processing for human pose estimation[C]. Seattle: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020. |
[35] | Yan Y, Li J, Qin J, et al. Anchor-free person search[C]. Nashville: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021. |
[36] | Ma X, Su J, Wang C, et al. Context modeling in 3D human pose estimation: A unified perspective[C]. Nashville: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021. |
[37] | Cao Z, Simon T, Wei S E, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]. Honolulu: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. |
[38] | Xia F, Peng W, Chen X, et al. Joint multi-person pose estimation and semantic part segmentation[C]. Honolulu: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. |
[39] | Cheng B, Xiao B, Wang J, et al. HigherHRNet: Scale-Aware representation learning for bottom-up human pose estimation[C]. Seattle: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020. |
[40] | Fabbri M, Lanzi F, Calderara S, et al. Compressed volumetric heatmaps for multi-person 3D pose estimation[C]. Seattle: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020. |
[41] | Luo Z, Wang Z, Huang Y, et al. Rethinking the heatmap regression for bottom-up human pose estimation[C]. Nashville: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021. |
[42] | Ye Fei, Liu Zilong. Pedestrian detection based on improved YOLOv3 algorithm[J]. Electronic Science and Technology, 2021, 34(1):5-9. |
[43] | Johnson S, Everingham M. Clustered pose and nonlinear appearance models for human pose estimation[C]. Aberystwyth: Proceedings of the British Machine Vision Conference, 2010. |
[44] | Sapp B, Taskar B. MODEC: Multimodal decomposable models for human pose estimation[C]. Portland: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013. |
[45] | Andriluka M, Pishchulin L, Gehler P, et al. Human pose estimation: New benchmark and state of the art analysis[C]. Columbus: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014. |
[46] | Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: Common objects in context[C]. Zurich: Proceedings of the European Conference on Computer Vision, 2014. |
[47] | Wu J, Zheng H, Zhao B, et al. AI Challenger: A large-scale dataset for going deeper in image understanding[C]. Honolulu: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017. |
[48] | Li J, Wang C, Zhu H, et al. CrowdPose: Efficient crowded scenes pose estimation and a new benchmark[C]. Long Beach: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019. |