Ensuring the safety of human workers who share a workspace with robots is of utmost importance. Although accurate pose prediction models can help prevent collisions between human workers and robots, they remain susceptible to critical errors. In this study, we propose a novel approach, deep ensembles of temporal graph neural networks (DE-TGN), that not only forecasts human motion accurately but also provides a measure of prediction uncertainty. By leveraging deep ensembles and stochastic Monte-Carlo dropout sampling, we construct a volumetric field, built from covariance ellipsoids, that represents the range of potential future human poses. To validate our framework, we conducted experiments on three motion capture datasets, including Human3.6M, and on two human-robot interaction scenarios, and achieved state-of-the-art prediction error. Moreover, we found that deep ensembles not only enable us to quantify uncertainty but also improve the accuracy of our predictions.
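To make the covariance-ellipsoid idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each ensemble member, run with dropout active for several stochastic passes, yields one predicted 3D position per joint and future time step, and that these predictions are pooled into a single sample set. The function names `covariance_ellipsoid` and `inside_ellipsoid`, the scaling parameter `n_std`, and the synthetic samples are all hypothetical and chosen only for illustration.

```python
import numpy as np


def covariance_ellipsoid(samples, n_std=2.0):
    """Fit an ellipsoid to sampled 3D predictions of a single joint.

    samples: array of shape (S, 3), pooled over ensemble members and
             dropout passes (an assumption about how samples are gathered).
    n_std:   hypothetical scaling; semi-axis length in standard deviations.
    Returns (center, radii, axes); the columns of `axes` are the
    principal directions of the ellipsoid.
    """
    samples = np.asarray(samples, dtype=float)
    center = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)        # 3x3 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)     # symmetric eigendecomposition
    radii = n_std * np.sqrt(np.clip(eigvals, 0.0, None))
    return center, radii, eigvecs


def inside_ellipsoid(point, center, radii, axes, eps=1e-9):
    """Return True if `point` lies inside the ellipsoid (boundary included)."""
    # Express the offset from the center in the ellipsoid's principal frame.
    local = (np.asarray(point, dtype=float) - center) @ axes
    return bool(np.sum((local / (radii + eps)) ** 2) <= 1.0)


# Synthetic stand-in for model output: 5 ensemble members x 20 dropout
# passes, each giving one predicted 3D position for one joint.
rng = np.random.default_rng(0)
member_means = rng.normal(loc=[0.4, 1.1, 0.9], scale=0.02, size=(5, 1, 3))
passes = member_means + rng.normal(scale=0.03, size=(5, 20, 3))
pooled = passes.reshape(-1, 3)                 # (100, 3) pooled samples

center, radii, axes = covariance_ellipsoid(pooled, n_std=2.0)
robot_point = np.array([0.9, 1.1, 0.9])        # hypothetical robot position
print("possible overlap:", inside_ellipsoid(robot_point, center, radii, axes))
```

Repeating this fit for every joint and forecast step would give a set of ellipsoids whose union approximates a volumetric region of likely future poses; how the paper combines them into its field is not specified in the abstract.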