Prediction skills can be crucial for tasks in which a robot has limited time to act or limited joint actuation power. In such scenarios, a vision system with a fixed, possibly too low, sampling rate can miss informative points, which slows the convergence of the prediction and reduces its accuracy. In this paper, we propose to exploit the low latency, motion-driven sampling, and data compression properties of event cameras to overcome these issues. As a use case, we use a Panda robotic arm to intercept a ball bouncing on a table. To predict the interception point, we adopt a Stateful LSTM network, an LSTM variant that does not require a fixed input length; this suits both the event-driven paradigm and the problem at hand, where the length of the trajectory is not defined in advance. We train the network in simulation to speed up dataset acquisition and then fine-tune the models on real trajectories. Experimental results demonstrate that dense spatial sampling (i.e., event cameras) significantly increases the number of intercepted trajectories compared to fixed temporal sampling (i.e., frame-based cameras).
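The stateful idea mentioned above can be illustrated with a minimal sketch: the recurrent state is carried from one call to the next, so samples can be fed as they arrive and the trajectory length never has to be fixed. This is not the network used in the paper. The cell below is a plain-Python LSTM with random, untrained weights and no output layer, and the names `TinyLSTMCell` and `run`, as well as the (x, y, t) input layout, are assumptions made only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """Small LSTM cell in plain Python (hypothetical, illustrative only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # One weight matrix and one bias vector per gate:
        # i = input, f = forget, g = candidate, o = output.
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                      for _ in range(hidden_size)] for g in "ifgo"}
        self.b = {g: [0.0] * hidden_size for g in "ifgo"}

    def zero_state(self):
        # Hidden state h and cell state c, both initialised to zero.
        return [0.0] * self.hidden_size, [0.0] * self.hidden_size

    def step(self, x, state):
        """Consume one sample and return the updated (h, c) state."""
        h, c = state
        z = list(x) + list(h)

        def gate(name, act):
            return [act(sum(w * v for w, v in zip(row, z)) + b)
                    for row, b in zip(self.W[name], self.b[name])]

        i = gate("i", sigmoid)
        f = gate("f", sigmoid)
        g = gate("g", math.tanh)
        o = gate("o", sigmoid)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def run(cell, samples, state=None):
    """Feed samples one at a time, starting from `state` if it is given."""
    state = cell.zero_state() if state is None else state
    for x in samples:
        state = cell.step(x, state)
    return state


# Assumed input layout: one (x, y, t) sample per trajectory point.
cell = TinyLSTMCell(input_size=3, hidden_size=4)
trajectory = [(0.1 * k, 0.5 - 0.05 * k, 0.01 * k) for k in range(10)]

# Processing the whole trajectory in one call...
full = run(cell, trajectory)

# ...matches processing it in two chunks while carrying the state over.
partial = run(cell, trajectory[:3])
resumed = run(cell, trajectory[3:], partial)
```

Because the state is passed explicitly, `resumed` equals `full`: the chunk boundaries do not affect the result, which is what allows a variable number of samples per trajectory. In a trained model, a readout layer on the hidden state would produce the prediction after each new sample.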