Mapping agencies are increasingly adopting Aerial Lidar Scanning (ALS) as a new tool to monitor territory and support public policies. Processing ALS data at scale requires efficient point classification methods that perform well over highly diverse territories. To evaluate them, researchers need large annotated Lidar datasets; however, current Lidar benchmark datasets have restricted scope and often cover a single urban area. To bridge this data gap, we present the FRench ALS Clouds from TArgeted Landscapes (FRACTAL) dataset: an ultra-large-scale aerial Lidar dataset made of 100,000 dense point clouds with high-quality labels for 7 semantic classes and spanning 250 km$^2$. FRACTAL is built upon France's nationwide open Lidar data. It achieves spatial and semantic diversity via a sampling scheme that explicitly concentrates rare classes and challenging landscapes from five French regions. It should support the development of 3D deep learning approaches for large-scale land monitoring. We describe the nature of the source data, the sampling workflow, and the content of the resulting dataset, and we provide an initial evaluation of segmentation performance using a performant 3D neural architecture.