Restoring the original, flat appearance of a printed document from casual photographs of bent and wrinkled pages is a common problem. In this paper we propose a novel method for grid-based single-image document unwarping. Our method performs geometric distortion correction via a fully convolutional deep neural network that learns to predict the 3D grid mesh of the document and the corresponding 2D unwarping grid in a dual-task fashion, implicitly encoding the coupling between the shape of a 3D piece of paper and its 2D image. To allow unwarping models to train on data that is more realistic in appearance than the commonly used synthetic Doc3D dataset, we create and publish our own dataset, called UVDoc, which combines pseudo-photorealistic document images with physically accurate 3D shape and unwarping function annotations. Our dataset is labeled with all the information necessary to train our unwarping network, so there is no need to engineer separate loss functions to cope with the lack of ground truth typical of in-the-wild document datasets. We perform an in-depth evaluation which demonstrates that, with the inclusion of our novel pseudo-photorealistic dataset, our relatively small network architecture achieves state-of-the-art results on the DocUNet benchmark. We show that the pseudo-photorealistic nature of our UVDoc dataset allows for new and better evaluation methods, such as lighting-corrected MS-SSIM. We provide a novel benchmark dataset that facilitates such evaluations, and propose a metric that quantifies line straightness after unwarping. Our code, results and UVDoc dataset are available at https://github.com/tanguymagne/UVDoc.
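To make the idea of a 2D unwarping grid concrete, the following is a minimal sketch of how a coarse grid could be applied to an image by backward mapping with bilinear interpolation. It is an illustration only, not the code from the UVDoc repository. The function names (`bilerp`, `sample_2d`, `unwarp`), the split of the grid into `grid_u` and `grid_v`, and the convention that grid values are normalized source coordinates in [0, 1] are all assumptions made for this example; the abstract does not specify the grid resolution or coordinate convention.

```python
def bilerp(a, b, c, d, tx, ty):
    """Blend four corner values: a=top-left, b=top-right, c=bottom-left, d=bottom-right."""
    top = a + (b - a) * tx
    bottom = c + (d - c) * tx
    return top + (bottom - top) * ty


def sample_2d(field, x, y):
    """Bilinearly sample a 2D list `field` at fractional column x and row y."""
    h, w = len(field), len(field[0])
    # Clamp so that samples outside the field reuse the border values.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    return bilerp(field[y0][x0], field[y0][x1],
                  field[y1][x0], field[y1][x1], x - x0, y - y0)


def unwarp(image, grid_u, grid_v, out_h, out_w):
    """Backward-map a grayscale image through a coarse unwarping grid.

    Assumed convention (hypothetical): grid_u[i][j] and grid_v[i][j] hold the
    normalized horizontal and vertical source coordinates, in [0, 1], of the
    point that should land at grid node (i, j) of the flattened output.
    """
    in_h, in_w = len(image), len(image[0])
    gh, gw = len(grid_u), len(grid_u[0])
    out = []
    for r in range(out_h):
        # Position of this output row in grid-node units.
        gy = r / (out_h - 1) * (gh - 1) if out_h > 1 else 0.0
        row = []
        for c in range(out_w):
            gx = c / (out_w - 1) * (gw - 1) if out_w > 1 else 0.0
            # Upsample the coarse grid to this pixel, then look up the source.
            u = sample_2d(grid_u, gx, gy)
            v = sample_2d(grid_v, gx, gy)
            row.append(sample_2d(image, u * (in_w - 1), v * (in_h - 1)))
        out.append(row)
    return out
```

With an identity grid (`grid_u = [[0, 1], [0, 1]]`, `grid_v = [[0, 0], [1, 1]]`) the output reproduces the input, and swapping the columns of `grid_u` mirrors the image horizontally. In a deep learning framework the same resampling would normally be delegated to a built-in operator rather than written by hand.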