Human-centred systems require an understanding of human actions in the physical world. Temporally extended sequences of actions are intentional and structured, yet existing methods for recognising which actions are performed often do not attempt to capture that structure, particularly how the actions are executed. Capturing it, however, is crucial for assessing how well an action is executed and how it differs from other actions. To capture the internal mechanics of actions, we introduce EXACT, a domain-specific language that represents human motions as underspecified motion programs, which are interpreted as reward-generating functions for zero-shot policy inference using forward-backward representations. By leveraging the compositional nature of EXACT motion programs, we combine individual policies into an executable neuro-symbolic model that uses program structure for compositional modelling. We evaluate the utility of the proposed pipeline for creating executable action models on motion-capture data, for the tasks of human action segmentation and action anomaly detection. Our results show that, compared with monolithic, task-specific approaches, executable action models improve data efficiency and capture intuitive relationships between actions.