This paper details the challenges of applying two computer vision systems, a supervised EfficientDET model and the unsupervised RX spectral classifier, to 98.9 GB of drone imagery from the Wu-Murad wilderness search and rescue (WSAR) effort in Japan, and identifies 3 directions for future research. At least 19 approaches and 3 datasets have been proposed for locating missing persons in drone imagery, but only 3 approaches (2 unsupervised and 1 of unknown structure) are reported in the literature as having been used in an actual WSAR operation. Of the proposed approaches, the EfficientDET architecture and the unsupervised RX spectral classifier were selected as the most appropriate for this setting. The EfficientDET model was applied to the HERIDAL dataset and, despite achieving performance that is statistically equivalent to the state of the art, it fails to translate to the real world, producing both false positives (e.g., identifying tree limbs and rocks as people) and false negatives (e.g., failing to identify members of the search team). The poor results in practice of algorithms that performed well on datasets suggest 3 areas for future research: more realistic datasets for wilderness SAR, computer vision models capable of seamlessly handling the variety of imagery collected during actual WSAR operations, and better alignment on performance measures.
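For readers unfamiliar with the RX spectral classifier, the following is a minimal sketch of the general idea, assuming the standard global Reed-Xiaoli formulation: each pixel is scored by its squared Mahalanobis distance from the mean color of the whole frame, so pixels whose spectrum is unusual for the scene (such as brightly colored clothing against vegetation) score highest. The function name `rx_scores` and the synthetic frame are hypothetical illustrations, not the implementation or data used in this work.

```python
import numpy as np


def rx_scores(image):
    """Global RX anomaly score per pixel (squared Mahalanobis distance).

    image: array of shape (height, width, bands), e.g. an RGB drone frame.
    Returns an array of shape (height, width); larger means more anomalous.
    """
    h, w, bands = image.shape
    pixels = image.reshape(-1, bands).astype(np.float64)

    # Background statistics estimated from every pixel in the frame.
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    cov = np.atleast_2d(np.cov(centered, rowvar=False))

    # The pseudo-inverse tolerates a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)

    # Row-wise quadratic form: (x - mean)^T * cov_inv * (x - mean).
    scores = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return scores.reshape(h, w)


# Synthetic example (an assumption for illustration): a noisy, uniform
# background with a single strongly red pixel standing in for a target.
rng = np.random.default_rng(0)
frame = rng.normal(loc=100.0, scale=5.0, size=(32, 32, 3))
frame[10, 20] = [255.0, 0.0, 0.0]

scores = rx_scores(frame)
top_pixel = np.unravel_index(scores.argmax(), scores.shape)
print("most anomalous pixel (row, col):", top_pixel)
```

Because the detector only flags spectral outliers and has no notion of what a person looks like, any visually unusual object can score highly, which is one reason such output still needs human review in a real search.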