Transformer-based 3D Human Pose Estimation and Action Achievement Evaluation
CLC number: TP391 TH86

Funding: National Key R&D Program of China (2023YFF1203503); Shanghai Natural Science Foundation (22ZR1424200)



    Abstract:

    To address the challenges of human pose analysis and assessment in domains such as human-computer interaction and medical rehabilitation, this paper presents a Transformer-based method for 3D human pose estimation and for evaluating action achievement. First, the key points of the human pose and their joint angles are defined, and, building on the deep pose estimation network (DPEN), a Transformer-based 3D human pose estimation model (TPEM) is proposed and constructed; the Transformer enables better extraction of long-term temporal features of human pose. Second, the 3D pose estimates produced by TPEM are used to design a dynamic time warping algorithm based on weighted 3D joint angles, which temporally aligns pose keyframes of different individuals performing the same action; on this basis, an action achievement evaluation method is proposed that scores how fully an action is accomplished. Finally, experiments on multiple datasets validate the approach: TPEM achieves a mean per-joint position error of 37.3 mm on the Human3.6M dataset, and the weighted 3D joint-angle dynamic time warping algorithm yields an average error of 5.08 frames on the Fit3D dataset. These results demonstrate the feasibility and effectiveness of the proposed approach for 3D human pose estimation and action achievement evaluation.
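    The paper's own implementation is not reproduced here. As an illustration only, the two core ideas named in the abstract — computing joint angles from 3D key points and aligning two angle sequences with a weighted dynamic time warping — can be sketched as follows (all function names, the distance form, and the uniform weights are assumptions for this sketch, not the authors' code):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (radians) at joint b formed by 3D key points a-b-c."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def weighted_dtw(seq_a, seq_b, weights):
    """DTW cost between two per-frame joint-angle sequences.

    seq_a: (T1, J) angles, seq_b: (T2, J) angles, weights: (J,) per-angle
    weights. Returns the accumulated cost of the optimal temporal alignment.
    """
    seq_a, seq_b, weights = map(np.asarray, (seq_a, seq_b, weights))
    t1, t2 = len(seq_a), len(seq_b)
    cost = np.full((t1 + 1, t2 + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, t1 + 1):
        for j in range(1, t2 + 1):
            # weighted Euclidean distance between the two frames' angle vectors
            d = np.sqrt(np.sum(weights * (seq_a[i - 1] - seq_b[j - 1]) ** 2))
            # standard DTW recurrence: match, insert, or delete a frame
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[t1, t2]
```

    A low alignment cost means the two performances of the action match closely frame-for-frame after warping, which is the quantity an achievement score could be derived from; how the weights are chosen per joint and how the cost maps to a score are specific to the paper and not shown here.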

Cite this article:

杨傲雷, 周应宏, 杨帮华, 徐昱琳. Transformer-based 3D human pose estimation and action achievement evaluation [J]. 仪器仪表学报 (Chinese Journal of Scientific Instrument), 2024, 45(4): 136-144.

History
  • Online publication date: 2024-07-15