Detecting and quantifying marine pollution, and macro-plastics in particular, is an increasingly pressing problem because this pollution directly affects ecosystems and human health. Efforts to quantify marine pollution often rely on sparse and expensive beach surveys, which are difficult to conduct at large scale. Remote sensing can address this gap by regularly monitoring coastal areas and detecting marine debris, which provides reliable estimates of plastic pollution. Medium-resolution satellite data of coastal areas is readily available and can be leveraged to detect aggregations of marine debris that contain plastic litter. In this work, we present a marine debris detector built on a deep segmentation model that outputs a marine debris probability for every pixel. We train this detector on a combination of annotated marine debris datasets and evaluate it on specifically selected test sites where the detected marine debris is highly likely to contain plastic pollution. We demonstrate quantitatively and qualitatively that a deep learning model trained on this multi-source dataset outperforms existing detection models trained on previous datasets by a large margin. Consistent with the principles of data-centric AI, our experiments show that this performance stems from our dataset design, with extensive sampling of negative examples and label refinements, rather than from the particular deep learning model. We hope to accelerate advances in the large-scale automated detection of marine debris, which is a step towards quantifying and monitoring marine litter with remote sensing at global scales. We release the model weights and training source code at https://github.com/marccoru/marinedebrisdetector
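To make the notion of a pixel-level probability output concrete, the following minimal sketch shows how raw per-pixel scores (logits) from a segmentation model could be turned into a probability map and then into a binary debris mask. It is an illustration only and not the released detector: the function names, the example logits, the sigmoid output activation, and the 0.5 threshold are all assumptions made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def debris_probabilities(logits):
    """Map a 2D grid of per-pixel logits to per-pixel probabilities."""
    return [[sigmoid(value) for value in row] for row in logits]


def debris_mask(probabilities, threshold=0.5):
    """Binarize a probability map; the default threshold is an assumption."""
    return [[p >= threshold for p in row] for row in probabilities]


# Hypothetical 2x3 patch of raw model outputs, not real satellite data.
logits = [
    [-4.0, 0.0, 3.0],
    [-1.0, 2.0, -6.0],
]

probs = debris_probabilities(logits)  # per-pixel probability of debris
mask = debris_mask(probs)             # True where debris is predicted
```

In practice the threshold would be tuned on validation data, since it trades off missed debris against false alarms over open water.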