Designing robust machine learning systems remains an open problem, and the field needs benchmark problems that both capture environmental changes and support evaluation on a downstream task. In this work, we introduce AVOIDDS, a realistic object detection benchmark for the vision-based aircraft detect-and-avoid problem. We provide a labeled dataset of 72,000 photorealistic images of intruder aircraft under various lighting conditions, weather conditions, relative geometries, and geographic locations. We also provide an interface that evaluates trained models on slices of this dataset to identify how performance changes with environmental conditions. Finally, we implement a fully integrated, closed-loop simulator of the vision-based detect-and-avoid problem to evaluate trained models on the downstream collision avoidance task. This benchmark will enable further research in the design of robust machine learning systems for use in safety-critical applications. The AVOIDDS dataset and code are publicly available at https://purl.stanford.edu/hj293cv5980 and https://github.com/sisl/VisionBasedAircraftDAA, respectively.
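As a rough illustration of what evaluating a model on slices of a dataset can look like, the sketch below groups per-image detection counts by one metadata field and reports precision and recall for each group. It is not the AVOIDDS interface: the function name `metrics_by_slice`, the record fields (`weather`, `tp`, `fp`, `fn`), and the sample values are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def metrics_by_slice(records, key):
    """Aggregate detection counts per value of `key`, then compute precision and recall.

    Each record is a dict holding a metadata field (hypothetical names) and the
    true-positive, false-positive, and false-negative counts for one image.
    """
    totals = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for record in records:
        bucket = totals[record[key]]
        for count in ("tp", "fp", "fn"):
            bucket[count] += record[count]

    results = {}
    for value, c in totals.items():
        predicted = c["tp"] + c["fp"]  # detections the model produced
        actual = c["tp"] + c["fn"]     # intruders actually present
        results[value] = {
            "precision": c["tp"] / predicted if predicted else 0.0,
            "recall": c["tp"] / actual if actual else 0.0,
        }
    return results


# Made-up per-image outcomes, for illustration only.
records = [
    {"weather": "clear", "tp": 1, "fp": 0, "fn": 0},
    {"weather": "clear", "tp": 1, "fp": 1, "fn": 0},
    {"weather": "overcast", "tp": 0, "fp": 0, "fn": 1},
    {"weather": "overcast", "tp": 1, "fp": 0, "fn": 1},
]

result = metrics_by_slice(records, "weather")
for value, metrics in sorted(result.items()):
    print(value, metrics)
```

Comparing the per-slice numbers side by side is what exposes a drop in performance under a particular condition; the same grouping could be applied to any other metadata field, such as time of day or location, assuming the labels carry it.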