While generalist robot policies hold significant promise for learning diverse manipulation skills through imitation, their performance is often hindered by the long-tail distribution of training demonstrations. Policies learned on such data, which is heavily skewed towards a few data-rich head tasks, frequently generalize poorly when confronted with the vast number of data-scarce tail tasks. In this work, we conduct a comprehensive analysis of the pervasive long-tail challenge inherent in policy learning. Our analysis begins by demonstrating the inefficacy of conventional long-tail learning strategies (e.g., re-sampling) for improving the policy's performance on tail tasks. We then uncover the underlying mechanism of this failure, revealing that data scarcity on tail tasks directly impairs the policy's spatial reasoning capability. To overcome this, we introduce Approaching-Phase Augmentation (APA), a simple yet effective scheme that transfers knowledge from data-rich head tasks to data-scarce tail tasks without requiring external demonstrations. Extensive experiments on both simulated and real-world manipulation tasks demonstrate the effectiveness of APA. Our code and demos are publicly available at: https://mldxy.github.io/Project-VLA-long-tail/.