We introduce the Temporally Incremental Disparity Estimation Network (TIDE-Net), a learning-based technique for disparity computation in mono-camera structured light systems. In our hardware setting, a static pattern is projected onto a dynamic scene and captured by a monocular camera. Unlike most previous disparity estimation methods, which operate frame by frame, our network acquires disparity maps in a temporally incremental way. Specifically, we exploit the deformation of the projected pattern across the captured image sequence (termed pattern flow) to model temporal information. Notably, pattern flow is a special form of optical flow that reflects disparity changes along the epipolar line. We propose and implement TIDE-Net, a recurrent architecture tailored for pattern flow. For each incoming frame, our model fuses the correlation volume computed from the current frame with the disparity from the previous frame, warped by pattern flow. From the fused features, the final stage of TIDE-Net estimates the residual disparity, rather than the full disparity that many previous methods estimate. Interestingly, this design brings clear empirical advantages in efficiency and generalization ability. Trained only on synthetic data, our model outperforms several state-of-the-art models on unseen real data in extensive evaluations covering both accuracy and efficiency metrics. The code is available at https://github.com/CodePointer/TIDENet.
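The temporally incremental update described above can be illustrated with a minimal NumPy sketch. It is not the TIDE-Net implementation: it assumes a rectified setup in which epipolar lines coincide with image rows, treats pattern flow as a per-pixel horizontal displacement defined at the current frame, and uses hypothetical names (`warp_rows`, `incremental_update`). In the network the residual would be predicted from the fused features; here it is simply passed in as an array.

```python
import numpy as np


def warp_rows(prev_disp, flow_x):
    """Backward-warp a disparity map along image rows.

    Assumption (not from the paper): the pixel at column x in the current
    frame came from column x - flow_x in the previous frame. Values are
    linearly interpolated, and source columns are clamped to the image.
    """
    h, w = prev_disp.shape
    src = np.arange(w, dtype=np.float64)[None, :] - flow_x  # source columns
    src = np.clip(src, 0.0, w - 1.0)
    x0 = np.floor(src).astype(int)          # left neighbour
    x1 = np.minimum(x0 + 1, w - 1)          # right neighbour (clamped)
    alpha = src - x0                        # interpolation weight
    rows = np.arange(h)[:, None]
    return (1.0 - alpha) * prev_disp[rows, x0] + alpha * prev_disp[rows, x1]


def incremental_update(prev_disp, flow_x, residual):
    """Current disparity = previous disparity warped by flow + residual."""
    return warp_rows(prev_disp, flow_x) + residual


if __name__ == "__main__":
    # Toy data: a horizontal disparity ramp shifted right by one pixel.
    prev = np.tile(np.arange(5.0), (2, 1))
    flow = np.ones((2, 5))
    res = np.full((2, 5), 0.5)
    print(incremental_update(prev, flow, res))
```

Because only a residual has to be estimated on top of the warped previous estimate, the per-frame correction is typically small when the scene changes smoothly, which is one plausible reading of the efficiency advantage claimed above.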