Small, amorphous waste objects such as biological droppings and microtrash can be difficult to see, especially in cluttered scenes, yet they matter for environmental cleanliness, public health, and autonomous cleanup. We introduce "ScatSpotter": a new dataset of images annotated with polygons around dog feces, collected to train and study object detection and segmentation systems for small, potentially camouflaged outdoor waste. We gathered data in mostly urban environments using a "before/after/negative" (BAN) protocol: for a given location, we capture an image with the object present, an image from the same viewpoint after removal, and a nearby negative scene that often contains visually similar confusers. Image collection began in 2020. This paper focuses on two dataset checkpoints, from 2024 and 2025. The dataset contains over 9000 images and 6000 polygon annotations. Of the author-captured images, we held out 691 for validation and used the rest for training. Via community participation we obtained a 121-image test set that, while small, is independent of the author-collected images and provides some confidence in generalization across photographers, devices, and locations. Because the test set is small, we report both validation and test results. We explore the difficulty of the dataset using off-the-shelf ViT, Mask R-CNN, YOLO-v9, and DINO-v2 models. Zero-shot DINO performs poorly, indicating limited foundation-model coverage of this category. Tuned DINO is the best model, with a box-level average precision of 0.69 on the 691-image validation set and 0.7 on the test set. These results establish strong baselines and quantify the remaining difficulty of detecting small, camouflaged waste objects. To support open access to models and data, we compare centralized and decentralized distribution mechanisms and discuss trade-offs for sharing scientific data. Code and project details are hosted on GitHub.
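To make the reported box-level average precision concrete, the following is a minimal sketch of how such a score can be computed. It is not the authors' evaluation code: the abstract does not state which AP variant was used, so the IoU threshold of 0.5, the greedy one-to-one matching, and the non-interpolated summation are all assumptions, and the names `box_iou` and `average_precision` are hypothetical. Note that a detection on an image with no annotations, such as a BAN negative, counts as a false positive.

```python
from collections import defaultdict


def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truth, iou_thresh=0.5):
    """Non-interpolated AP over a set of images (assumed variant).

    predictions:  list of (image_id, score, box)
    ground_truth: dict mapping image_id -> list of boxes
    """
    total_true = sum(len(boxes) for boxes in ground_truth.values())
    if total_true == 0:
        return 0.0

    matched = defaultdict(set)  # image_id -> indices of already-matched truth boxes
    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0

    # Visit detections from most to least confident.
    for image_id, _score, box in sorted(predictions, key=lambda p: -p[1]):
        best_iou, best_idx = 0.0, None
        for idx, truth in enumerate(ground_truth.get(image_id, [])):
            if idx in matched[image_id]:
                continue  # each truth box may be matched only once
            iou = box_iou(box, truth)
            if iou > best_iou:
                best_iou, best_idx = iou, idx

        if best_idx is not None and best_iou >= iou_thresh:
            matched[image_id].add(best_idx)
            tp += 1
        else:
            fp += 1  # includes any detection on an unannotated (negative) image

        recall = tp / total_true
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision  # area added at this step
        prev_recall = recall

    return ap


# Toy example: two annotated images plus one spurious detection.
truth = {"a": [(0, 0, 10, 10)], "b": [(0, 0, 10, 10)]}
preds = [
    ("a", 0.9, (0, 0, 10, 10)),    # exact match
    ("a", 0.8, (50, 50, 60, 60)),  # no overlap with any truth box
    ("b", 0.7, (1, 1, 10, 10)),    # IoU 0.81 with the truth box
]
print(round(average_precision(preds, truth), 4))
```

Benchmark toolkits often use interpolated or multi-threshold variants instead, so scores from this sketch are not directly comparable to the 0.69 and 0.7 reported above.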