Recent weakly-supervised methods for scene flow estimation from LiDAR point clouds are limited to explicit reasoning at the object level. These methods perform multiple iterative optimizations for each rigid object, which makes them sensitive to the robustness of the clustering. In this paper, we propose EgoFlowNet, a point-level scene flow estimation network trained in a weakly-supervised manner and without object-based abstraction. Our approach predicts a binary segmentation mask that implicitly drives two parallel branches for ego-motion and scene flow. Unlike previous methods, we provide both branches with all input points and carefully integrate the binary mask into the feature extraction and the losses. We also use a shared cost volume with local refinement that is updated at multiple scales without explicit clustering or rigidity assumptions. On realistic KITTI scenes, we show that our EgoFlowNet performs better than state-of-the-art methods in the presence of ground surface points.
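To make the mask-driven two-branch design concrete, the following is a minimal PyTorch sketch, not the actual EgoFlowNet architecture: a shared point encoder predicts a soft binary mask, and both an ego-motion head and a scene-flow head receive features from all points, weighted by that mask. All module names, feature dimensions, and the pooled 6-DoF ego-motion parametrization are assumptions made for illustration; the paper's cost volume and multi-scale refinement are omitted.

```python
import torch
import torch.nn as nn

class TwoBranchHead(nn.Module):
    """Toy sketch (assumed, not the published architecture): a shared
    encoder, a per-point static/dynamic mask, and two parallel branches
    that both see all points, softly weighted by the mask."""
    def __init__(self, in_dim=64, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mask_head = nn.Linear(hidden, 1)   # per-point static/dynamic logit
        self.ego_head = nn.Linear(hidden, 6)    # pooled -> 6-DoF ego-motion (hypothetical parametrization)
        self.flow_head = nn.Linear(hidden, 3)   # per-point 3D scene flow

    def forward(self, feats):                   # feats: (N, in_dim) per-point features
        h = self.encoder(feats)                 # shared features feed both branches
        mask = torch.sigmoid(self.mask_head(h)) # (N, 1): ~1 static, ~0 dynamic
        # ego-motion branch: emphasize static points, pool, regress one rigid motion
        ego = self.ego_head((mask * h).mean(dim=0))   # (6,)
        # scene-flow branch: per-point flow, emphasizing dynamic points
        flow = (1.0 - mask) * self.flow_head(h)       # (N, 3)
        return mask, ego, flow

net = TwoBranchHead()
mask, ego, flow = net(torch.randn(1024, 64))    # 1024 points, 64-dim features
```

The key property the sketch preserves is that neither branch operates on a hard, clustered subset of the points: the mask only reweights a shared representation, so no explicit per-object optimization is needed.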