Action quality assessment (AQA) applies computer vision to quantitatively assess how well a human action is performed. Current AQA approaches are end-to-end neural models, which lack transparency and tend to be biased because they are trained on subjective human judgements as ground truth. To address these issues, we introduce a neuro-symbolic paradigm for AQA, which uses neural networks to abstract interpretable symbols from video data and makes quality assessments by applying rules to those symbols. We take diving as our case study. We found that domain experts prefer our system and find it more informative than purely neural approaches to AQA in diving. Our system also achieves state-of-the-art action recognition and temporal segmentation, and automatically generates a detailed report that breaks the dive down into its elements and provides objective scoring with visual evidence. As verified by a group of domain experts, this report may be used to assist judges in scoring, help train judges, and provide feedback to divers. Annotated training data and code: https://github.com/laurenok24/NSAQA.
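To make the neuro-symbolic split concrete, the following is a minimal sketch of the symbolic half only: it assumes a neural stage has already produced per-frame symbols and shows how rules could turn them into a per-element report. Every name here (`FrameSymbols`, `segment`, `score_segment`, `build_report`), the chosen symbols, and the thresholds are hypothetical illustrations, not the rules or API of the NSAQA code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FrameSymbols:
    """Hypothetical per-frame symbols that a neural stage might emit."""
    phase: str             # illustrative labels, e.g. "takeoff", "somersault", "entry"
    knee_angle_deg: float  # illustrative pose-derived joint angle
    splash_area: float     # illustrative fraction of the frame covered by splash


def segment(frames):
    """Group consecutive frames that share a phase label."""
    segments = []
    for f in frames:
        if segments and segments[-1][0] == f.phase:
            segments[-1][1].append(f)
        else:
            segments.append((f.phase, [f]))
    return segments


def score_segment(phase, frames):
    """Apply an illustrative rule to one segment; return (score, evidence)."""
    if phase == "somersault":
        # Assumed rule: penalize deviation of the mean knee angle from 180 degrees.
        avg = mean(f.knee_angle_deg for f in frames)
        score = max(0.0, 10.0 - abs(180.0 - avg) / 9.0)
        return score, f"mean knee angle {avg:.1f} deg"
    if phase == "entry":
        # Assumed rule: a smaller peak splash scores higher.
        peak = max(f.splash_area for f in frames)
        score = max(0.0, 10.0 * (1.0 - peak))
        return score, f"peak splash area {peak:.2f}"
    # No rule defined for this phase in the sketch.
    return None, "no rule defined"


def build_report(frames):
    """Produce one report entry per detected element of the action."""
    report = []
    for phase, seg_frames in segment(frames):
        score, evidence = score_segment(phase, seg_frames)
        report.append({"phase": phase, "frames": len(seg_frames),
                       "score": score, "evidence": evidence})
    return report
```

Because each score is computed by an explicit rule over named symbols, every entry in the report can carry the measurement that produced it, which is the kind of transparency the paradigm aims for.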