Beyond Convolution: A Taxonomy of Structured Operators for Learning-Based Image Processing

The convolution operator is the fundamental building block of modern convolutional neural networks (CNNs), owing to its simplicity, translational equivariance, and efficient implementation. However, its structure as a fixed, linear, locally averaging operator limits its ability to capture structured signal properties such as low-rank decompositions, adaptive basis representations, and non-uniform spatial dependencies. This paper presents a systematic taxonomy of operators that extend or replace the standard convolution in learning-based image processing pipelines. We organise the landscape of alternative operators into five families: (i) decomposition-based operators, which separate structural and noise components through singular value or tensor decompositions; (ii) adaptive weighted operators, which modulate kernel contributions as a function of spatial position or signal content; (iii) basis-adaptive operators, which optimise the analysis bases jointly with the network weights; (iv) integral and kernel operators, which generalise the convolution to position-dependent and non-linear kernels; and (v) attention-based operators, which relax the locality assumption entirely. For each family, we provide a formal definition, a discussion of its structural properties relative to the convolution, and a critical analysis of the tasks for which the operator is most appropriate. We further compare all families along five dimensions (linearity, locality, equivariance, computational cost, and suitability for image-to-image and image-to-label tasks) and outline the open challenges and future directions of this research area.
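
To make the contrast between the convolution and family (i) concrete, the following minimal NumPy sketch applies a fixed 3x3 averaging kernel and a rank-truncated singular value decomposition to the same image. It is an illustration under stated assumptions, not a definition taken from the taxonomy: the function names `circular_conv2d` and `truncated_svd` are hypothetical, circular boundary handling is assumed so that translational equivariance holds exactly, and plain rank truncation is used as one simple instance of a decomposition-based operator.

```python
import numpy as np


def circular_conv2d(image, kernel):
    """Convolve `image` with a fixed kernel, using wrap-around boundaries."""
    kh, kw = kernel.shape
    out = np.zeros_like(image, dtype=float)
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap contributes a shifted, scaled copy of the image.
            shifted = np.roll(image, shift=(i - kh // 2, j - kw // 2), axis=(0, 1))
            out += kernel[i, j] * shifted
    return out


def truncated_svd(image, rank):
    """Keep only the `rank` largest singular components of the image matrix."""
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
b = rng.standard_normal((8, 8))
box = np.full((3, 3), 1.0 / 9.0)  # assumed example of a locally averaging kernel

# The convolution is linear and commutes with (circular) translations.
conv_is_linear = np.allclose(
    circular_conv2d(a + b, box), circular_conv2d(a, box) + circular_conv2d(b, box)
)
conv_is_equivariant = np.allclose(
    circular_conv2d(np.roll(a, 2, axis=1), box),
    np.roll(circular_conv2d(a, box), 2, axis=1),
)

# Rank truncation depends on the input's own singular vectors, so it is not additive.
svd_is_additive = np.allclose(
    truncated_svd(a + b, 2), truncated_svd(a, 2) + truncated_svd(b, 2)
)
```

In this sketch the convolution applies the same weights to every input, whereas the truncated decomposition selects its components from the input itself, which is why the additivity check is expected to fail for generic inputs.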
