Let us rethink the real-world scenarios that require human motion prediction techniques, such as human-robot collaboration. Current works simplify the task of predicting human motion into a one-off process of forecasting a short future sequence (usually no longer than 1 second) from a historical observed one. However, this simplification may fail to meet practical needs, because it neglects the fact that motion prediction in real applications is not an isolated ``observe then predict'' unit but a consecutive process composed of many rounds of such units, semi-overlapped along the entire sequence. As time goes on, the predicted part of the previous round has its corresponding ground truth observable in the new round, but the deviation between them is neither exploited nor capturable by the existing isolated learning fashion. In this paper, we propose DeFeeNet, a simple yet effective network that can be added to existing one-off prediction models to realize deviation perception and feedback when applied to the consecutive motion prediction task. At each prediction round, the deviation generated by the previous unit is first encoded by DeFeeNet and then incorporated into the existing predictor to enable deviation-aware prediction, which, for the first time, allows information to be transmitted across adjacent prediction units. We design two versions of DeFeeNet, one MLP-based and one GRU-based. On Human3.6M and the more complicated BABEL dataset, experimental results indicate that our proposed network improves consecutive human motion prediction performance regardless of the basic model.
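To make the consecutive, semi-overlapped setting concrete, the following is a minimal toy sketch of deviation feedback across adjacent prediction rounds. It is not the DeFeeNet architecture: the learned MLP or GRU encoder is replaced by a hand-written running correction, the base predictor is a simple constant-velocity extrapolation, and all names (`base_predict`, `run_consecutive`, `gain`) are hypothetical. It assumes the stride between rounds equals the prediction horizon, so the frames predicted in one round are exactly the last observed frames of the next.

```python
import numpy as np


def base_predict(history, horizon):
    """Stand-in one-off predictor (assumption): constant-velocity extrapolation.

    history: array of shape (obs_len, dim); returns array of shape (horizon, dim).
    """
    velocity = history[-1] - history[-2]
    steps = np.arange(1, horizon + 1)[:, None]
    return history[-1] + steps * velocity


def run_consecutive(sequence, obs_len, horizon, use_feedback, gain=1.0):
    """Run semi-overlapped "observe then predict" rounds over one sequence.

    Returns the mean per-frame L2 error of each round.
    """
    # The previous round's predicted frames must lie inside the new observation.
    assert obs_len >= horizon
    correction = np.zeros((horizon, sequence.shape[1]))
    prev_pred = None
    errors = []
    for start in range(0, len(sequence) - obs_len - horizon + 1, horizon):
        history = sequence[start:start + obs_len]
        target = sequence[start + obs_len:start + obs_len + horizon]
        if use_feedback and prev_pred is not None:
            # Ground truth for the previous prediction is now observed:
            # it is the last `horizon` frames of the current history.
            deviation = history[-horizon:] - prev_pred
            # Toy stand-in for a learned deviation encoder (assumption):
            # accumulate the observed deviation into a running correction.
            correction = correction + gain * deviation
        pred = base_predict(history, horizon) + correction
        errors.append(float(np.linalg.norm(pred - target, axis=1).mean()))
        prev_pred = pred
    return errors


# Synthetic 3-D trajectory with constant acceleration, which the
# constant-velocity predictor systematically underestimates.
t = np.arange(60, dtype=float)
sequence = 0.005 * t[:, None] ** 2 * np.array([1.0, 2.0, 3.0])

isolated = run_consecutive(sequence, obs_len=10, horizon=5, use_feedback=False)
with_feedback = run_consecutive(sequence, obs_len=10, horizon=5, use_feedback=True)
print("isolated rounds:      ", np.round(isolated, 4))
print("with deviation feedback:", np.round(with_feedback, 4))
```

On this synthetic trajectory the isolated predictor repeats the same systematic error every round, whereas the feedback variant can correct it from the second round onward, since only then does a previous deviation exist. Real motion data would not be this regular, which is presumably why a learned encoder is used rather than a fixed rule.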