U-Nets are a go-to, state-of-the-art neural architecture for numerous tasks involving continuous signals on a square, such as images and Partial Differential Equations (PDEs); however, their design and architecture are understudied. In this paper, we provide a framework for designing and analysing general U-Net architectures. We present theoretical results which characterise the role of the encoder and decoder in a U-Net, their high-resolution scaling limits, and their conjugacy to ResNets via preconditioning. We propose Multi-ResNets, U-Nets with a simplified, wavelet-based encoder without learnable parameters. Further, we show how to design novel U-Net architectures which encode function constraints, natural bases, or the geometry of the data. In diffusion models, our framework enables us to identify that high-frequency information is dominated by noise exponentially faster than low-frequency information, and to show how U-Nets with average pooling exploit this. In our experiments, we demonstrate that Multi-ResNets achieve competitive and often superior performance compared to classical U-Nets in image segmentation, PDE surrogate modelling, and generative modelling with diffusion models. Our U-Net framework paves the way to study the theoretical properties of U-Nets and to design natural, scalable neural architectures for a multitude of problems beyond the square.
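To make the idea of an encoder without learnable parameters concrete, the following is a minimal sketch, assuming a single-channel image with even side lengths and using repeated 2x2 average pooling as a stand-in for the wavelet-based projection. The function names `average_pool_2x2` and `parameter_free_encoder` are hypothetical and chosen for illustration only; this is not the implementation used in the paper.

```python
import numpy as np


def average_pool_2x2(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    height, width = image.shape
    if height % 2 or width % 2:
        raise ValueError("both side lengths must be even")
    # Split each axis into (blocks, 2) and average over the two block axes.
    blocks = image.reshape(height // 2, 2, width // 2, 2)
    return blocks.mean(axis=(1, 3))


def parameter_free_encoder(image, num_levels):
    """Return the input at successively coarser resolutions.

    Hypothetical helper: the list is ordered from finest to coarsest and
    contains num_levels + 1 arrays. Nothing here is learned.
    """
    pyramid = [image]
    for _ in range(num_levels):
        pyramid.append(average_pool_2x2(pyramid[-1]))
    return pyramid


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Pure Gaussian noise: each pooling step averages four independent
    # samples, so the noise standard deviation roughly halves per level.
    noise = rng.normal(size=(256, 256))
    for level, coarse in enumerate(parameter_free_encoder(noise, 3)):
        print(level, coarse.shape, float(coarse.std()))
```

Each pooling step averages four independent noise samples, so the noise standard deviation roughly halves per level while the coarse content of a signal is retained. This gives some intuition for why average pooling suits noisy inputs, although it is only an illustration and not the paper's formal result.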