Scene flow estimation recovers a scene's 3D motion field by predicting the motion of each point in the scene, which supports downstream tasks in autonomous driving. Many networks that take large-scale point clouds as input voxelize them into a pseudo-image so that they can run in real time. Voxelization, however, often discards point-specific features, and recovering those features for scene flow is a challenge. We introduce DeFlow, which transitions from voxel-based features to point features using Gated Recurrent Unit (GRU) refinement. To further improve scene flow estimation, we formulate a novel loss function that accounts for the data imbalance between static and dynamic points. Evaluations on the Argoverse 2 scene flow task show that DeFlow achieves state-of-the-art results on large-scale point cloud data, with better performance and efficiency than other networks. The code is open-sourced at https://github.com/KTH-RPL/deflow.
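To illustrate why the imbalance between static and dynamic points matters, here is a minimal sketch of a group-balanced endpoint error. It is not the loss proposed in the paper: the function name `balanced_epe` and the `static_thresh` value are hypothetical, chosen only for illustration. The idea is that averaging the error within each motion group, rather than over all points at once, keeps the many static points from drowning out the few dynamic ones.

```python
import numpy as np


def balanced_epe(pred_flow, gt_flow, static_thresh=0.05):
    """Sum of per-group mean endpoint errors (illustrative, not DeFlow's loss).

    pred_flow, gt_flow: (N, 3) arrays of per-point flow vectors.
    static_thresh: hypothetical cut-off on ground-truth flow magnitude
                   separating static from dynamic points.
    """
    pred_flow = np.asarray(pred_flow, dtype=float)
    gt_flow = np.asarray(gt_flow, dtype=float)

    # Per-point endpoint error: Euclidean distance between flows.
    err = np.linalg.norm(pred_flow - gt_flow, axis=1)

    # Split points by ground-truth motion magnitude.
    dynamic = np.linalg.norm(gt_flow, axis=1) > static_thresh

    # Average within each non-empty group, then sum the group means.
    total = 0.0
    for mask in (dynamic, ~dynamic):
        if mask.any():
            total += err[mask].mean()
    return total


# Nine static points and one dynamic point; the prediction is all zeros.
gt = np.zeros((10, 3))
gt[0] = [1.0, 0.0, 0.0]
pred = np.zeros((10, 3))

plain = np.linalg.norm(pred - gt, axis=1).mean()
balanced = balanced_epe(pred, gt)
```

In this toy case the plain mean hides the error on the single moving point behind nine correct static points, while the group-balanced value exposes it in full.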