Paved2Paradise: Cost-Effective and Scalable LiDAR Simulation by Factoring the Real World

To achieve strong real-world performance, neural networks must be trained on large, diverse datasets; however, obtaining and annotating such datasets is costly and time-consuming, particularly for 3D point clouds. In this paper, we describe Paved2Paradise, a simple, cost-effective approach for generating fully labeled, diverse, and realistic lidar datasets from scratch while requiring minimal human annotation. Our key insight is that, by deliberately collecting separate "background" and "object" datasets (i.e., "factoring the real world"), we can intelligently combine them to produce a combinatorially large and diverse training set. The Paved2Paradise pipeline thus consists of four steps: (1) collecting copious background data, (2) recording individuals from the desired object class(es) performing different behaviors in an isolated environment (like a parking lot), (3) bootstrapping labels for the object dataset, and (4) generating samples by placing objects at arbitrary locations in backgrounds. To demonstrate the utility of Paved2Paradise, we generated synthetic datasets for two tasks: (1) human detection in orchards (a task for which no public data exists) and (2) pedestrian detection in urban environments. Qualitatively, we find that a model trained exclusively on Paved2Paradise synthetic data is highly effective at detecting humans in orchards, including when individuals are heavily occluded by tree branches. Quantitatively, a model trained on Paved2Paradise data that sources backgrounds from KITTI performs comparably to a model trained on the actual KITTI dataset. These results suggest that the Paved2Paradise synthetic data pipeline can help accelerate point cloud model development in sectors where acquiring lidar datasets has previously been cost-prohibitive.
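Step (4) of the pipeline, placing a labeled object scan at an arbitrary location in a background scan, can be illustrated with a short sketch. This is not the authors' implementation: the function names `place_object` and `generate_sample`, the point representation as `(x, y, z)` tuples, the `max_range` parameter, and the assumption that each object scan is centered on the origin are all hypothetical choices made for illustration. The sketch also leaves out sensor effects such as occlusion and range-dependent point density, which a realistic simulation would have to model.

```python
import math
import random


def place_object(background, obj_points, x, y, yaw):
    """Rotate an object scan about the z-axis, shift it to (x, y),
    and merge it with a background scan.

    Points are (x, y, z) tuples. The object scan is assumed (for this
    sketch) to be centered on the origin in the x/y plane.
    Returns the merged points, per-point labels (0 = background,
    1 = object), and an axis-aligned box around the placed object.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    placed = [
        (c * px - s * py + x, s * px + c * py + y, pz)
        for px, py, pz in obj_points
    ]
    points = list(background) + placed
    labels = [0] * len(background) + [1] * len(placed)
    xs, ys, zs = zip(*placed)
    box = (min(xs), min(ys), min(zs), max(xs), max(ys), max(zs))
    return points, labels, box


def generate_sample(backgrounds, objects, rng, max_range=10.0):
    """Draw one background, one object, and one random pose.

    Every (background, object, pose) combination is a distinct labeled
    sample, which is where the combinatorial growth comes from.
    """
    background = rng.choice(backgrounds)
    obj = rng.choice(objects)
    x = rng.uniform(-max_range, max_range)
    y = rng.uniform(-max_range, max_range)
    yaw = rng.uniform(0.0, 2.0 * math.pi)
    return place_object(background, obj, x, y, yaw)
```

Because the label (here, the box and the per-point classes) is derived from the placement itself, each generated sample is annotated with no additional human effort; only the object scans need labels up front.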
