Unmanned aerial vehicles (UAVs) are used to monitor changes in forest environments because they are lightweight and provide a wide variety of surveillance data. However, the raw recordings alone do not provide the scene understanding needed to assess the degree of deforestation. Deep learning algorithms must be trained on large amounts of data to produce accurate interpretations, but annotated ground-truth forest imagery is not available. To address this problem, we introduce a new large aerial dataset for forest inspection that contains both real-world and virtual recordings of natural environments, with densely annotated semantic segmentation labels and depth maps, captured under different illumination conditions and at various altitudes and recording angles. We evaluate two multi-scale neural networks (HRNet and the PointFlow network) on the semantic segmentation task, studying the impact of the acquisition conditions and the capability of transfer learning from virtual to real data. Our results show that the best performance is obtained when training on a dataset containing a large variety of scenarios, rather than separating the data into specific categories. We also develop a framework to assess the degree of deforestation of an area.
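As an illustration of how a semantic segmentation output could feed a deforestation assessment, the following is a minimal sketch, not the framework described above. It assumes, purely for illustration, that the deforestation degree of a frame is the fraction of labeled pixels that do not belong to a forest class. The class IDs and the function name `deforestation_degree` are hypothetical; the dataset's actual label map and the authors' actual metric are not given here.

```python
import numpy as np

# Hypothetical class IDs: the real label map of the dataset is not specified here.
FOREST_CLASSES = (1,)   # e.g. tree canopy
IGNORE_CLASSES = (0,)   # e.g. unlabeled pixels or sky


def deforestation_degree(mask, forest_classes=FOREST_CLASSES,
                         ignore_classes=IGNORE_CLASSES):
    """Return the fraction of valid pixels that are NOT forest.

    `mask` is a 2D array of integer class IDs (one per pixel), such as the
    argmax of a segmentation network's output. This definition is an
    assumption made for illustration only.
    """
    mask = np.asarray(mask)
    # Pixels in the ignore classes do not count toward the ratio.
    valid = ~np.isin(mask, ignore_classes)
    n_valid = int(valid.sum())
    if n_valid == 0:
        # No usable pixels: the ratio is undefined.
        return float("nan")
    forest = np.isin(mask, forest_classes) & valid
    return 1.0 - float(forest.sum()) / n_valid


if __name__ == "__main__":
    # Toy 2x3 prediction: two forest pixels, three non-forest, one ignored.
    toy_mask = np.array([[1, 1, 2],
                         [0, 2, 2]])
    print(deforestation_degree(toy_mask))
```

In practice such a per-frame ratio would likely need to account for altitude and viewing angle, since oblique views distort the ground area covered by each pixel; the depth maps in the dataset could plausibly help with that, though this sketch does not attempt it.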