The 2025 BEHAVIOR Challenge is designed to rigorously track progress toward solving long-horizon tasks by physical agents in simulated environments. BEHAVIOR-1K focuses on the everyday household tasks that people most want robots to assist with. These tasks introduce long-horizon mobile manipulation challenges in realistic settings, bridging the gap between current research and real-world, human-centric applications. This report presents our solution to the 2025 BEHAVIOR Challenge, which placed a very close 2nd and substantially outperformed the remaining submissions. Building on $π_{0.5}$, we develop our solution systematically by studying the effects of training techniques and data. Through careful ablation studies, we reveal scaling benefits in both the pre-training and post-training phases, leading to a validation Q-score of 0.345, significantly surpassing previous state-of-the-art performance. We summarize practical lessons and design recommendations that we hope will provide actionable insights for the broader embodied AI community when adapting powerful foundation models to complex embodied scenarios. Project page: https://github.com/mli0603/openpi-comet