Video stabilization is the problem of transforming a shaky video into a visually pleasing one. Striking a good trade-off between visual quality and computational speed remains an open challenge in video stabilization. Inspired by the analogy between wobbly frames and jigsaw puzzles, we propose an iterative optimization-based learning approach to video stabilization, trained on synthetic datasets, that consists of two interacting submodules: motion trajectory smoothing and full-frame outpainting. First, we develop a two-level (coarse-to-fine) stabilizing algorithm based on the probabilistic flow field. The confidence map associated with the estimated optical flow is exploited to guide the search for shared regions through backpropagation. Second, we take a divide-and-conquer approach and propose a novel multiframe fusion strategy to render full-frame stabilized views. An important new insight from our iterative optimization approach is that the target video can be interpreted as a fixed point of a nonlinear mapping for video stabilization. We formulate video stabilization as the problem of minimizing jerkiness in motion trajectories, and fixed-point theory guarantees the convergence of the resulting iteration. Extensive experimental results demonstrate the superiority of the proposed approach in both computational speed and visual quality. The code will be available on GitHub.
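To make the fixed-point view concrete, the following is a minimal sketch, not the proposed method: it smooths a one-dimensional camera trajectory by repeatedly applying an update map until the trajectory stops changing. The quadratic objective, the weight `lam`, and the function names `smooth_step`, `smooth_trajectory`, and `jerkiness` are all assumptions introduced for illustration. The map here is linear, whereas the abstract describes a nonlinear mapping that is learned.

```python
# Illustrative assumption: minimize
#   E(p) = sum_t (p[t] - c[t])**2 + lam * sum_t (p[t+1] - p[t])**2
# where c is the shaky trajectory and p is the smoothed one.
# Setting dE/dp[t] = 0 gives the Jacobi-style update used below.

def smooth_step(p, c, lam):
    """Apply the update map once; its fixed point minimizes E(p)."""
    n = len(p)
    out = []
    for t in range(n):
        # Neighbors of frame t that exist (endpoints have only one).
        nbrs = [p[s] for s in (t - 1, t + 1) if 0 <= s < n]
        out.append((c[t] + lam * sum(nbrs)) / (1.0 + lam * len(nbrs)))
    return out

def smooth_trajectory(c, lam=5.0, tol=1e-9, max_iter=10000):
    """Iterate smooth_step from p = c until successive iterates agree.

    Assumes c is a non-empty sequence of numbers.
    Returns the smoothed trajectory and the number of iterations used.
    """
    p = list(c)
    for it in range(1, max_iter + 1):
        p_next = smooth_step(p, c, lam)
        if max(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next, it
        p = p_next
    return p, max_iter

def jerkiness(p):
    """Sum of squared frame-to-frame displacements."""
    return sum((b - a) ** 2 for a, b in zip(p, p[1:]))

shaky = [0.0, 1.0, -0.5, 1.5, 0.2, 2.0, 0.8, 2.5]
smooth, iters = smooth_trajectory(shaky, lam=5.0)
print(iters, jerkiness(shaky), jerkiness(smooth))
```

In this toy setting the update is a contraction in the maximum norm, because each output is a weighted average in which the neighbors carry total weight `lam * k / (1 + lam * k) < 1`, where `k` is the number of neighbors. The iteration therefore converges to a unique fixed point from any starting trajectory. A larger `lam` yields a smoother result but slower convergence.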