The Optical Image Stabilization (OIS) system in mobile devices reduces image blur by steering the lens to compensate for hand jitter. However, OIS dynamically changes the intrinsic camera parameters (i.e., the $\mathrm{K}$ matrix), which hinders accurate camera pose estimation and 3D reconstruction. Here we propose a novel neural network-based approach that estimates the $\mathrm{K}$ matrix in real time, so that pose estimation or scene reconstruction can run at the camera's native resolution for the highest accuracy on mobile devices. Our network design takes a gratified projection model discrepancy feature and 3D point positions as inputs and employs a Multi-Layer Perceptron (MLP) to approximate the $f_{\mathrm{K}}$ manifold. We also design a unique training scheme for this network by introducing a Back-propagated PnP (BPnP) layer, so that reprojection error can be adopted as the loss function. The training process uses precise calibration patterns to capture an accurate $f_{\mathrm{K}}$ manifold, but the trained network can be used anywhere. We name the proposed Dynamic Intrinsic Manifold Estimation network DIME-Net, and we have implemented and tested it on three different mobile devices. In all cases, DIME-Net reduces reprojection error by at least $64\%$, indicating that our design is successful.
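To make the role of the $\mathrm{K}$ matrix concrete, the following minimal sketch shows how a small OIS-induced shift of the principal point turns into reprojection error when a fixed calibration is used. It is an illustration of the standard pinhole projection model only, not the DIME-Net implementation; the focal length, principal point, shift magnitude, pose, and point layout are all hypothetical values chosen for the example.

```python
import numpy as np

def project(K, R, t, points_3d):
    """Project Nx3 world points to Nx2 pixel coordinates (pinhole model)."""
    cam = points_3d @ R.T + t          # world frame -> camera frame
    uvw = cam @ K.T                    # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]    # perspective division

def mean_reprojection_error(K, R, t, points_3d, observed_px):
    """Mean Euclidean pixel distance between projected and observed points."""
    residual = project(K, R, t, points_3d) - observed_px
    return float(np.mean(np.linalg.norm(residual, axis=1)))

# Hypothetical static calibration (assumed values, not from the paper).
K_static = np.array([[1500.0, 0.0, 960.0],
                     [0.0, 1500.0, 540.0],
                     [0.0, 0.0, 1.0]])

# Assumed effect of OIS for this frame: the principal point moves by (+4, -3) px.
K_ois = K_static.copy()
K_ois[0, 2] += 4.0
K_ois[1, 2] -= 3.0

# Hypothetical planar calibration target 2 m in front of the camera.
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
points_3d = np.array([[x, y, 0.0]
                      for x in (-0.2, 0.0, 0.2)
                      for y in (-0.1, 0.0, 0.1)])

# The image is formed with the OIS-shifted intrinsics.
observed_px = project(K_ois, R, t, points_3d)

err_static = mean_reprojection_error(K_static, R, t, points_3d, observed_px)
err_ois = mean_reprojection_error(K_ois, R, t, points_3d, observed_px)
print(f"static K: {err_static:.3f} px, per-frame K: {err_ois:.3f} px")
```

In this toy setup the pose is held fixed, so the whole error comes from the stale intrinsics; in practice a pose solver would absorb part of the shift into a biased pose estimate, which is the problem a per-frame estimate of $\mathrm{K}$ is meant to avoid.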