We introduce HARPER, a novel dataset for 3D body pose estimation and forecasting in dyadic interactions between users and Spot, the quadruped robot manufactured by Boston Dynamics. The key novelty is the focus on the robot's perspective, i.e., on the data captured by the robot's sensors. These sensors make 3D body pose analysis challenging because, being close to the ground, they capture humans only partially. The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users. The corpus contains not only the recordings of Spot's built-in stereo cameras, but also those of a 6-camera OptiTrack system, with all recordings synchronized. This yields ground-truth skeletal representations with sub-millimeter precision. In addition, the corpus includes reproducible benchmarks on 3D Human Pose Estimation, Human Pose Forecasting, and Collision Prediction, all based on publicly available baseline approaches. This enables future HARPER users to rigorously compare their results with those we provide in this work.