Images degraded by geometric distortions pose a significant challenge to imaging and computer vision tasks such as object recognition. Deep learning-based imaging models often perform poorly on geometrically distorted images. In this paper, we propose the deformation-invariant neural network (DINN), a framework to address imaging tasks on geometrically distorted images. DINN outputs consistent latent features for images that are geometrically distorted but represent the same underlying object or scene. The core idea is to incorporate a simple component, called the quasiconformal transformer network (QCTN), into existing deep networks for imaging tasks. The QCTN is a deep neural network that outputs a quasiconformal map, which transforms a geometrically distorted image into an improved version closer to the distribution of natural or clean images. The QCTN first outputs a Beltrami coefficient, which measures the quasiconformality of the output deformation map; by controlling the Beltrami coefficient, the local geometric distortion under the quasiconformal mapping can be controlled. The QCTN is lightweight and simple, and can be readily integrated into existing deep neural networks to enhance their performance. Leveraging our framework, we have developed an image classification network that achieves accurate classification of distorted images. We have also applied the framework to restore images geometrically distorted by atmospheric turbulence and water turbulence; DINN outperforms existing GAN-based restoration methods in these scenarios, demonstrating the effectiveness of the proposed framework. Additionally, we apply the framework to 1:1 verification of human face images under atmospheric turbulence and achieve satisfactory performance, further demonstrating the efficacy of our approach.
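To make the central quantity concrete: the Beltrami coefficient of a planar map f (viewed as complex-valued, f = f_x + i·f_y) is μ = ∂f/∂z̄ divided by ∂f/∂z; μ = 0 means the map is conformal (angle-preserving), and |μ| < 1 everywhere characterizes an orientation-preserving quasiconformal map. The following is a minimal numerical sketch of this definition using finite differences on a grid; it illustrates the quantity the QCTN predicts, not the paper's actual implementation, and the function name and grid setup are our own.

```python
import numpy as np

def beltrami_coefficient(fx_grid, fy_grid, h=1.0):
    """Estimate the Beltrami coefficient mu of a planar map f = (fx, fy)
    sampled on a regular grid with spacing h, via finite differences.

    mu = f_zbar / f_z, where f is treated as complex-valued: f = fx + i*fy.
    mu = 0 corresponds to a conformal map; |mu| < 1 everywhere corresponds
    to an orientation-preserving quasiconformal map, with larger |mu|
    meaning more local geometric distortion.
    """
    f = fx_grid + 1j * fy_grid
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (cols, x)
    df_dy, df_dx = np.gradient(f, h)
    f_z    = 0.5 * (df_dx - 1j * df_dy)   # Wirtinger derivative d/dz
    f_zbar = 0.5 * (df_dx + 1j * df_dy)   # Wirtinger derivative d/dzbar
    return f_zbar / f_z

# Example: the affine stretch f(x, y) = (2x, y) has constant mu = 1/3,
# a standard worked case for the Beltrami coefficient.
y, x = np.mgrid[0:8, 0:8].astype(float)
mu = beltrami_coefficient(2 * x, y)
```

In this sketch the distortion is uniform, so μ is constant; in the DINN setting the Beltrami coefficient is spatially varying and predicted by the network, which then constrains the local distortion of the recovered deformation map.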