Orthogonal convolutional layers are the workhorse of multiple areas in machine learning, such as adversarial robustness, normalizing flows, GANs, and Lipschitz-constrained models. Their ability to preserve norms and ensure stable gradient propagation makes them valuable for a large range of problems. Despite their promise, deploying orthogonal convolutions in large-scale applications remains a significant challenge due to computational overhead and limited support for modern features like strides, dilations, group convolutions, and transposed convolutions. In this paper, we introduce AOC (Adaptive Orthogonal Convolution), a scalable method for constructing orthogonal convolutions that effectively overcomes these limitations. This advancement unlocks the construction of architectures that were previously considered impractical. We demonstrate through our experiments that our method produces expressive models that become increasingly efficient as they scale. To foster further advancement, we provide an open-source library implementing this method, available at https://github.com/thib-s/orthogonium.