In agricultural environments, viewpoint planning can be a critical capability for a robot with visual sensors to obtain informative observations of objects of interest (e.g., fruits) within complex plant structures under random occlusions. Although recent studies on active vision have shown potential for agricultural tasks, each model has been designed and validated in a unique environment that cannot easily be replicated for benchmarking novel methods developed later. In this paper, we introduce a dataset, called DAVIS-Ag, to promote more extensive research on Domain-inspired Active VISion in Agriculture. Specifically, we leverage the open-source "AgML" framework and the 3D plant simulator "Helios" to produce 502K RGB images from 30K densely sampled spatial locations in 632 synthetic orchards. Plant environments of strawberries, tomatoes, and grapes are considered at two scales (i.e., Single-Plant and Multi-Plant). Each image is also provided with useful labels, including (1) bounding boxes and (2) instance segmentation masks for all identifiable fruits, as well as (3) pointers to the images of viewpoints reachable by a single action, so as to simulate active viewpoint selection at each time step. Using DAVIS-Ag, we visualize motivating examples in which fruit visibility changes dramatically depending on the camera pose, primarily due to occlusions by other plant components such as leaves. Furthermore, we present several baseline models with experimental results for benchmarking on the task of target visibility maximization. Transferability to real strawberry environments is also investigated to demonstrate the feasibility of using the dataset for prototyping real-world solutions. To support future research, our dataset is publicly available online: https://github.com/ctyeong/DAVIS-Ag.
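The viewpoint structure described above (per-image fruit annotations plus pointers to the viewpoints reachable by one action) can be sketched as a small graph-traversal loop. Note that the field names (`fruit_boxes`, `neighbors`) and the greedy policy below are illustrative assumptions for exposition, not the dataset's actual schema or any of the paper's baseline models.

```python
# Hypothetical sketch of the viewpoint graph DAVIS-Ag describes: each image
# (viewpoint) carries fruit annotations and pointers to the viewpoints
# reachable by a single action. Field names are assumptions, not the
# dataset's actual schema.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Viewpoint:
    image_path: str
    fruit_boxes: List[Tuple[int, int, int, int]]  # (x, y, w, h) per visible fruit
    # Maps an action name (e.g., "left", "up") to the id of the next viewpoint.
    neighbors: Dict[str, str] = field(default_factory=dict)


def greedy_visibility_walk(views: Dict[str, Viewpoint],
                           start_id: str, steps: int = 3) -> List[str]:
    """Naive greedy policy: at each time step, move to the reachable
    viewpoint showing the most fruits. A learned policy would replace
    this scoring rule in a real visibility-maximization baseline."""
    current = start_id
    path = [current]
    for _ in range(steps):
        candidates = list(views[current].neighbors.values())
        if not candidates:
            break  # no reachable viewpoint from here
        current = max(candidates, key=lambda v: len(views[v].fruit_boxes))
        path.append(current)
    return path
```

For example, on a three-node toy graph where one neighbor shows two fruits and another shows none, the greedy walk moves toward the fruit-rich viewpoint first.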