We present DriveGPT, a scalable behavior model for autonomous driving. We model driving as a sequential decision-making task and train a transformer model to predict future agent states as tokens in an autoregressive fashion. We scale up our model parameters and training data by multiple orders of magnitude, enabling us to explore the scaling properties in terms of dataset size, model parameters, and compute. We evaluate DriveGPT across different scales in a planning task, through both quantitative metrics and qualitative examples, including closed-loop driving in complex real-world scenarios. In a separate prediction task, DriveGPT outperforms a state-of-the-art baseline and improves further when pretrained on a large-scale dataset, further validating the benefits of data scaling.
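To make the autoregressive formulation concrete, the following is a minimal sketch of token-by-token rollout over discretized agent states. It is not DriveGPT's implementation: the vocabulary size, the `toy_logits` scoring function (a stand-in for the transformer), and the helper names `softmax` and `rollout` are all assumptions introduced for illustration.

```python
import math
import random

VOCAB_SIZE = 8  # hypothetical number of discrete agent-state tokens


def toy_logits(history):
    """Hypothetical stand-in for a transformer: score each candidate next token.

    Tokens close to the most recent token get higher scores, loosely
    mimicking smooth motion. A real model would attend over the full history.
    """
    last = history[-1]
    return [-abs(tok - last) for tok in range(VOCAB_SIZE)]


def softmax(logits):
    """Convert scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rollout(context, steps, rng):
    """Sample `steps` future tokens, feeding each sample back as input."""
    tokens = list(context)
    for _ in range(steps):
        probs = softmax(toy_logits(tokens))
        next_tok = rng.choices(range(VOCAB_SIZE), weights=probs, k=1)[0]
        tokens.append(next_tok)
    return tokens[len(context):]


# Example: three observed state tokens, five sampled future tokens.
future = rollout(context=[3, 3, 4], steps=5, rng=random.Random(0))
```

Each sampled token is appended to the history before the next step, which is what distinguishes autoregressive decoding from predicting the whole future trajectory in one shot.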