Robotic table tennis is a representative benchmark for high-speed, closed-loop robotic control in dynamic environments, where accurate and fast prediction of the ball state is critical for reliable planning and control. Physics-based approaches rely heavily on accurate parameter identification and a precise initial state, while learning-based methods often struggle to capture long-range temporal dependencies and are typically trained on limited or simulated data. We propose a transformer-based framework for table tennis ball state prediction that uses attention mechanisms to model long-range temporal correlations directly from historical observations, without relying on explicit flight or bounce models. To support robust learning and generalization, we collected a large-scale real-world dataset covering players of varying skill levels and diverse ball cannon configurations. The combination of a high-capacity transformer architecture and extensive real-world data enables accurate long-horizon forecasting. Building on this capability, we introduce a plug-and-play sim-to-real transfer strategy, Swap Predictor at Deployment (SPAD), which replaces the physics-based simulator used during training with the proposed real-world-trained predictor at deployment, improving the sim-to-real transferability of the policy without retraining. We demonstrate that this simple substitution narrows the sim-to-real gap while preserving the efficiency and scalability of simulation-based training.
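The SPAD idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`BallPredictor`, `PhysicsPredictor`, `LearnedPredictor`, `PaddlePolicy`, `swap_predictor`) is hypothetical, the physics model is a drag-free ballistic stand-in for the training simulator, and the learned predictor is any callable standing in for the trained transformer. The only point shown is that the policy depends on a predictor interface, so the predictor can be exchanged at deployment while the policy parameters stay untouched.

```python
from dataclasses import dataclass, replace
from typing import Callable, Protocol, Sequence, Tuple

# Ball state as (x, y, z, vx, vy, vz); this layout is an assumption.
State = Tuple[float, float, float, float, float, float]


class BallPredictor(Protocol):
    """Interface shared by the simulator model and the learned model."""

    def predict(self, history: Sequence[State], horizon_s: float) -> State:
        ...


@dataclass
class PhysicsPredictor:
    """Stand-in for the training-time simulator: ballistic flight, no drag,
    spin, or bounce (a deliberate simplification for illustration)."""

    gravity: float = 9.81

    def predict(self, history: Sequence[State], horizon_s: float) -> State:
        x, y, z, vx, vy, vz = history[-1]  # uses only the latest state
        t = horizon_s
        return (
            x + vx * t,
            y + vy * t,
            z + vz * t - 0.5 * self.gravity * t * t,
            vx,
            vy,
            vz - self.gravity * t,
        )


@dataclass
class LearnedPredictor:
    """Wraps a trained sequence model exposed as a callable (hypothetical).
    Unlike the physics stand-in, it receives the full observation history."""

    model: Callable[[Sequence[State], float], State]

    def predict(self, history: Sequence[State], horizon_s: float) -> State:
        return self.model(history, horizon_s)


@dataclass
class PaddlePolicy:
    """Toy policy: aim the paddle at the predicted ball position plus a
    fixed offset. The offset stands in for trained policy parameters."""

    predictor: BallPredictor
    offset: Tuple[float, float, float] = (0.0, 0.0, 0.0)

    def act(self, history: Sequence[State], horizon_s: float) -> Tuple[float, float, float]:
        x, y, z, *_ = self.predictor.predict(history, horizon_s)
        ox, oy, oz = self.offset
        return (x + ox, y + oy, z + oz)


def swap_predictor(policy: PaddlePolicy, new_predictor: BallPredictor) -> PaddlePolicy:
    """Return a copy of the policy with a different predictor.
    The policy parameters (here, `offset`) are carried over unchanged."""
    return replace(policy, predictor=new_predictor)
```

In use, a policy would be built with `PhysicsPredictor()` during simulated training and then passed through `swap_predictor(policy, LearnedPredictor(model))` before running on the robot. Keeping the swap behind a shared interface is what makes the substitution plug-and-play in this sketch.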