UVDoc: Neural Grid-based Document Unwarping

Restoring the original, flat appearance of a printed document from casual photographs of bent and wrinkled pages is a common everyday problem. In this paper we propose a novel method for grid-based single-image document unwarping. Our method performs geometric distortion correction via a fully convolutional deep neural network that learns to predict the 3D grid mesh of the document and the corresponding 2D unwarping grid in a dual-task fashion, implicitly encoding the coupling between the shape of a 3D piece of paper and its 2D image. To allow unwarping models to train on data that is more realistic in appearance than the commonly used synthetic Doc3D dataset, we create and publish our own dataset, called UVDoc, which combines pseudo-photorealistic document images with physically accurate 3D shape and unwarping function annotations. Our dataset is labeled with all the information necessary to train our unwarping network, so there is no need to engineer separate loss functions to cope with the lack of ground truth typically found in document-in-the-wild datasets. We perform an in-depth evaluation demonstrating that, with the inclusion of our novel pseudo-photorealistic dataset, our relatively small network architecture achieves state-of-the-art results on the DocUNet benchmark. We show that the pseudo-photorealistic nature of our UVDoc dataset allows for new and better evaluation methods, such as lighting-corrected MS-SSIM. We provide a novel benchmark dataset that facilitates such evaluations, and propose a metric that quantifies line straightness after unwarping. Our code, results and UVDoc dataset are available at https://github.com/tanguymagne/UVDoc.
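
To make the idea of a 2D unwarping grid concrete, the following is a minimal pure-Python sketch of how a coarse grid could be used to resample a warped image into a flat one. It is not the authors' implementation. The function names (`bilinear`, `grid_point`, `unwarp`), the grayscale list-of-lists image format, and the convention that each grid node stores a normalized (x, y) source coordinate in [0, 1] are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: grid[r][c] = (x, y), a normalized source coordinate in [0, 1].
Grid = List[List[Tuple[float, float]]]
# Assumption: a grayscale image stored as rows of pixel intensities.
Image = List[List[float]]


def bilinear(img: Image, x: float, y: float) -> float:
    """Sample img at continuous pixel coordinates (x, y), clamped to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def grid_point(grid: Grid, u: float, v: float) -> Tuple[float, float]:
    """Interpolate the coarse grid (at least 2x2) at output position (u, v) in [0, 1]."""
    rows, cols = len(grid), len(grid[0])
    gx, gy = u * (cols - 1), v * (rows - 1)
    c0, r0 = min(int(gx), cols - 2), min(int(gy), rows - 2)
    fx, fy = gx - c0, gy - r0
    coords = []
    for k in (0, 1):  # k = 0 is the x component, k = 1 the y component
        top = grid[r0][c0][k] + (grid[r0][c0 + 1][k] - grid[r0][c0][k]) * fx
        bottom = grid[r0 + 1][c0][k] + (grid[r0 + 1][c0 + 1][k] - grid[r0 + 1][c0][k]) * fx
        coords.append(top + (bottom - top) * fy)
    return coords[0], coords[1]


def unwarp(img: Image, grid: Grid, out_h: int, out_w: int) -> Image:
    """For each output pixel, look up its source location in the grid and sample there."""
    h, w = len(img), len(img[0])
    result = []
    for i in range(out_h):
        v = i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            u = j / (out_w - 1) if out_w > 1 else 0.0
            sx, sy = grid_point(grid, u, v)
            row.append(bilinear(img, sx * (w - 1), sy * (h - 1)))
        result.append(row)
    return result


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    # A 2x2 grid whose nodes point at the image corners leaves the image unchanged.
    identity = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]]
    print(unwarp(image, identity, 3, 3))
```

In this sketch the grid can be much coarser than the image, since intermediate source locations are interpolated between grid nodes. A practical pipeline would more likely use a tensor library's grid-sampling routine, whose coordinate conventions may differ from the [0, 1] range assumed here.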

