Perspective distortion (PD) causes substantial changes in the shape, size, orientation, angles, and other spatial relationships of visual concepts in images. Precisely estimating camera intrinsic and extrinsic parameters is challenging, which hinders synthesizing perspective distortion, and the unavailability of dedicated training data poses a critical barrier to developing robust computer vision methods. Moreover, distortion-correction methods turn other computer vision tasks into multi-step pipelines and fall short in performance. In this work, we propose mitigating perspective distortion (MPD) by employing fine-grained parameter control over a specific family of Möbius transforms to model real-world distortion, without estimating camera intrinsic and extrinsic parameters and without requiring actual distorted data. We also present ImageNet-PD, a dedicated perspectively distorted benchmark dataset for evaluating the robustness of deep learning models. The proposed method outperforms prior approaches on the existing benchmarks ImageNet-E and ImageNet-X, and it significantly improves performance on ImageNet-PD while maintaining consistent performance on the standard data distribution. Notably, our method improves performance on three PD-affected real-world applications, crowd counting, fisheye image recognition, and person re-identification, and on one PD-affected challenging computer vision task: object detection. The source code, dataset, and models are available on the project webpage at https://prakashchhipa.github.io/projects/mpd.
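The abstract does not spell out the fine-grained parameter control used in MPD, but the underlying operation, warping an image with a Möbius transform f(z) = (az + b)/(cz + d) on complex pixel coordinates, can be illustrated. Below is a minimal sketch, not the paper's implementation: it assumes pixel coordinates normalised to roughly [-1, 1] on the complex plane, nearest-neighbour sampling via the inverse transform, and zero fill outside the source image. The function name `mobius_warp` and the parameter conventions are our own.

```python
import numpy as np

def mobius_warp(image, a, b, c, d):
    """Warp an image with the Mobius transform f(z) = (a z + b) / (c z + d).

    Output pixels are filled by inverse-mapping their locations,
    f^{-1}(w) = (d w - b) / (-c w + a), and sampling the nearest source
    pixel; locations mapping outside the source are left as zeros.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalise pixel grid to a [-1, 1] x [-1, 1] patch of the complex plane.
    z = (xs - w / 2) / (w / 2) + 1j * (ys - h / 2) / (h / 2)
    # Inverse Mobius transform of every output location.
    src = (d * z - b) / (-c * z + a)
    # Back to (integer) pixel coordinates; nearest-neighbour rounding.
    sx = np.round(src.real * (w / 2) + w / 2).astype(int)
    sy = np.round(src.imag * (h / 2) + h / 2).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(image)
    out[valid] = image[sy[valid], sx[valid]]
    return out
```

With the identity parameters (a, b, c, d) = (1, 0, 0, 1) the warp returns the input unchanged; perturbing b or c bends straight lines into circular arcs, which is what makes this transform family a plausible stand-in for real perspective distortion. A practical version would also use interpolated sampling and guard against zeros of the denominator.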