Let us reconsider the real-world scenarios that require human motion prediction, such as human-robot collaboration. Existing work simplifies the task into a one-off process: forecasting a short future sequence (usually no longer than 1 second) from a historical observed one. However, this simplification may fail to meet practical needs, because motion prediction in real applications is not an isolated "observe then predict" unit but a consecutive process composed of many rounds of such units, semi-overlapped along the entire sequence. As time goes on, the part predicted in the previous round has its corresponding ground truth observable in the new round, yet the deviation between the two is neither exploited nor capturable under the existing isolated learning paradigm. In this paper, we propose DeFeeNet, a simple yet effective network that can be added to existing one-off prediction models to realize deviation perception and feedback when they are applied to the consecutive motion prediction task. At each prediction round, the deviation generated by the previous unit is first encoded by DeFeeNet and then incorporated into the existing predictor to enable deviation-aware prediction, which, for the first time, allows information to be transmitted across adjacent prediction units. We design two versions of DeFeeNet, one MLP-based and one GRU-based. On Human3.6M and the more complicated BABEL, experimental results indicate that our proposed network improves consecutive human motion prediction performance regardless of the underlying base model.
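To make the consecutive "observe then predict" loop concrete, the following is a minimal NumPy sketch of the data flow only. It is not the DeFeeNet architecture: `base_predictor`, `encode_deviation`, and `run_consecutive` are hypothetical names, the predictor is a toy that repeats the last observed pose, and the learned MLP or GRU deviation encoder is replaced by a simple per-dimension mean error that is accumulated across rounds. The sketch also assumes that rounds advance by exactly one prediction horizon, so that the last `horizon` observed frames of a round are the ground truth for the previous round's prediction.

```python
import numpy as np


def base_predictor(observed, horizon, correction=None):
    """Toy stand-in for a one-off predictor (assumption, not a real model).

    Repeats the last observed pose for `horizon` frames and, if given,
    adds a per-dimension correction derived from past deviations.
    """
    pred = np.repeat(observed[-1:], horizon, axis=0)  # shape (horizon, D)
    if correction is not None:
        pred = pred + correction  # (D,) broadcasts over the horizon axis
    return pred


def encode_deviation(prev_pred, prev_truth):
    """Placeholder for a deviation encoder (assumption).

    Here: mean signed error per pose dimension. A learned encoder would
    replace this hand-written rule.
    """
    return (prev_truth - prev_pred).mean(axis=0)


def run_consecutive(sequence, obs_len, horizon):
    """Run semi-overlapped prediction rounds over `sequence` of shape (T, D).

    Returns the mean absolute error of each round.
    """
    if obs_len < horizon:
        raise ValueError("obs_len must be at least horizon in this sketch")
    errors = []
    prev_pred = None
    correction = np.zeros(sequence.shape[1])
    # Assumed stride: each round starts `horizon` frames after the previous one.
    for start in range(0, len(sequence) - obs_len - horizon + 1, horizon):
        observed = sequence[start:start + obs_len]
        target = sequence[start + obs_len:start + obs_len + horizon]
        if prev_pred is not None:
            # The frames predicted last round are now observed, so the
            # deviation can be measured and fed back into this round.
            prev_truth = observed[-horizon:]
            correction = correction + encode_deviation(prev_pred, prev_truth)
        pred = base_predictor(observed, horizon, correction)
        errors.append(float(np.abs(pred - target).mean()))
        prev_pred = pred
    return errors


if __name__ == "__main__":
    # Synthetic constant-velocity "motion" with 3 pose dimensions.
    seq = np.arange(20, dtype=float)[:, None] * np.ones((1, 3))
    print(run_consecutive(seq, obs_len=8, horizon=4))
```

On this synthetic constant-velocity sequence, the first round has no feedback available, while later rounds reuse the measured deviation and reach a lower error. That behaviour reflects the toy setup only and says nothing about the reported results on Human3.6M or BABEL.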