Deploying advanced driver assistance systems (ADAS) and automated driving systems (ADS) across countries remains challenging: differences in legislation, traffic infrastructure, and visual conventions introduce domain shifts that degrade perception performance. Traditional cross-country data collection relies on extensive on-road driving, which makes identifying representative locations costly and inefficient. To address this, we propose a street-view-guided data acquisition strategy that uses publicly available imagery to identify places of interest (POIs). We introduce two POI scoring methods: a KNN-based feature-distance approach using a vision foundation model, and a visual-attribution approach using a vision-language model. To enable repeatable evaluation, we adopt a collect-detect protocol and construct a co-located dataset by pairing the Zenseact Open Dataset with Mapillary street-view images. Experiments on traffic sign detection, a task particularly sensitive to cross-country variations in sign appearance, show that our approach achieves performance comparable to random sampling while using only half of the target-domain data. We further provide cost estimates for full-country analysis, showing that large-scale street-view processing remains economically feasible. These results highlight the potential of street-view-guided data acquisition for efficient, cost-effective cross-country model adaptation.
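To make the KNN-based feature-distance idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each image has already been embedded by some vision foundation model, that a candidate location is scored by the mean Euclidean distance to its k nearest source-domain features, and that the highest-scoring locations are collected first. The function names `knn_novelty_scores` and `select_top_locations`, the choice of Euclidean distance, and the toy random embeddings are all hypothetical.

```python
import numpy as np


def knn_novelty_scores(source_feats, candidate_feats, k=5):
    """Mean Euclidean distance from each candidate to its k nearest source features.

    Assumption: a larger score means the candidate looks less like the source domain.
    """
    source = np.asarray(source_feats, dtype=float)
    cand = np.asarray(candidate_feats, dtype=float)
    k = min(k, len(source))
    # Pairwise distances, shape (n_candidates, n_source).
    dists = np.linalg.norm(cand[:, None, :] - source[None, :, :], axis=-1)
    # Keep the k smallest distances per candidate (order within the k is irrelevant).
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)


def select_top_locations(scores, budget):
    """Indices of the `budget` highest-scoring candidates, highest first."""
    order = np.argsort(-np.asarray(scores), kind="stable")
    return order[:budget]


# Toy embeddings standing in for foundation-model features (hypothetical data).
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 16))   # source-country images
near = rng.normal(0.0, 1.0, size=(5, 16))       # street-view images similar to source
far = rng.normal(6.0, 1.0, size=(5, 16))        # street-view images unlike source
candidates = np.vstack([near, far])

scores = knn_novelty_scores(source, candidates, k=5)
picked = select_top_locations(scores, budget=5)
```

In this toy setup the shifted candidates (indices 5 to 9) receive the largest scores and are the ones selected. With real embeddings, the distance metric, the value of k, and how per-image scores are aggregated into a per-location score are design choices that the abstract does not specify.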