Scene flow estimation predicts the 3D motion at each point in successive LiDAR scans. This detailed, point-level information can help autonomous vehicles accurately predict and understand dynamic changes in their surroundings. Current state-of-the-art methods require annotated data to train scene flow networks, and the expense of labeling inherently limits their scalability. Self-supervised approaches can overcome the above limitations, yet face two principal challenges that hinder optimal performance: point distribution imbalance and disregard for object-level motion constraints. In this paper, we propose SeFlow, a self-supervised method that integrates efficient dynamic classification into a learning-based scene flow pipeline. We demonstrate that classifying static and dynamic points helps design targeted objective functions for different motion patterns. We also emphasize the importance of internal cluster consistency and correct object point association to refine the scene flow estimation, in particular on object details. Our real-time capable method achieves state-of-the-art performance on the self-supervised scene flow task on the Argoverse 2 and Waymo datasets. The code is open-sourced at https://github.com/KTH-RPL/SeFlow along with trained model weights.
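To make the two key ideas concrete, the following is a minimal sketch of how a dynamic/static split can drive separate loss terms. This is an illustrative reconstruction, not the paper's implementation: the function names (`dynamic_mask`, `seflow_style_loss`), the distance threshold, and the brute-force nearest-neighbor search are all assumptions chosen for clarity.

```python
import numpy as np

def dynamic_mask(points_t0, points_t1, ego_motion, thresh=0.5):
    """Hypothetical dynamic classification: a point in scan t0 is labeled
    dynamic if, after ego-motion compensation, its nearest neighbor in
    scan t1 is farther than `thresh` meters (threshold is an assumption)."""
    # Apply the 4x4 ego-motion transform to scan t0.
    compensated = points_t0 @ ego_motion[:3, :3].T + ego_motion[:3, 3]
    # Brute-force nearest-neighbor distances (illustrative; a KD-tree
    # would be used in practice for real point-cloud sizes).
    d = np.linalg.norm(compensated[:, None, :] - points_t1[None, :, :], axis=-1)
    return d.min(axis=1) > thresh

def seflow_style_loss(pred_flow, points_t0, points_t1, mask_dyn):
    """Targeted objectives per motion class, as described in the abstract:
    static points are penalized for any predicted motion, while dynamic
    points are pulled toward the next scan (a Chamfer-style term)."""
    mask_static = ~mask_dyn
    loss_static = (np.linalg.norm(pred_flow[mask_static], axis=-1).mean()
                   if mask_static.any() else 0.0)
    if mask_dyn.any():
        warped = points_t0[mask_dyn] + pred_flow[mask_dyn]
        d = np.linalg.norm(warped[:, None, :] - points_t1[None, :, :], axis=-1)
        loss_dyn = d.min(axis=1).mean()
    else:
        loss_dyn = 0.0
    return loss_static + loss_dyn
```

Separating the two terms is what lets the pipeline weight rare dynamic points differently from the dominant static background, addressing the point distribution imbalance mentioned above.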