U-Nets are a go-to, state-of-the-art neural architecture across numerous tasks for continuous signals on a square, such as images and Partial Differential Equations (PDEs); however, their design and architecture are understudied. In this paper, we provide a framework for designing and analysing general U-Net architectures. We present theoretical results which characterise the role of the encoder and decoder in a U-Net, their high-resolution scaling limits, and their conjugacy to ResNets via preconditioning. We propose Multi-ResNets, U-Nets with a simplified, wavelet-based encoder without learnable parameters. Further, we show how to design novel U-Net architectures which encode function constraints, natural bases, or the geometry of the data. In diffusion models, our framework enables us to identify that high-frequency information is dominated by noise exponentially faster than low-frequency information, and to show how U-Nets with average pooling exploit this. In our experiments, we demonstrate that Multi-ResNets achieve competitive and often superior performance compared to classical U-Nets in image segmentation, PDE surrogate modelling, and generative modelling with diffusion models. Our U-Net framework paves the way to study the theoretical properties of U-Nets and to design natural, scalable neural architectures for a multitude of problems beyond the square.
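To make the idea of an encoder without learnable parameters concrete, the following is a minimal sketch and not the authors' implementation. It assumes the encoder can be illustrated by repeated 2x2 average pooling, which matches the Haar wavelet low-pass coefficients up to a normalisation constant. The function names `average_pool_2x2` and `parameter_free_encoder`, the constant test signal, and the unit-variance Gaussian noise are all illustrative assumptions.

```python
import numpy as np


def average_pool_2x2(x):
    """Downsample a 2D array by averaging non-overlapping 2x2 blocks."""
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    # Split each axis into (blocks, 2) and average over the two size-2 axes.
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def parameter_free_encoder(x, levels):
    """Return [x_0, ..., x_levels]; each entry has half the resolution of the previous one.

    Hypothetical helper: there are no weights here, only fixed averaging.
    """
    pyramid = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        pyramid.append(average_pool_2x2(pyramid[-1]))
    return pyramid


# Illustrative data (an assumption): a constant signal plus i.i.d. Gaussian noise.
rng = np.random.default_rng(0)
signal = np.ones((64, 64))
noisy = signal + rng.normal(scale=1.0, size=signal.shape)

pyramid = parameter_free_encoder(noisy, levels=3)
shapes = [level.shape for level in pyramid]

# Residual noise at each resolution; averaging 4 independent samples
# halves the noise standard deviation per level, in expectation.
noise_std = [float(np.std(level - 1.0)) for level in pyramid]
```

In this toy setting the coarse levels retain the low-frequency signal while the noise shrinks with each pooling step, which gives some intuition for why average pooling may suit noisy inputs such as those in diffusion models.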