Normalising flows are generative models that transform a complex density into a simpler one through bijective transformations, enabling both density estimation and data generation from a single model. However, the requirement for bijectivity imposes the use of specialised architectures. In the context of image modelling, the predominant choice has been the Glow-based architecture, whereas alternative architectures remain largely unexplored in the research community. In this work, we propose a novel architecture called MixerFlow, based on the MLP-Mixer architecture, further unifying the generative and discriminative modelling architectures. MixerFlow offers an efficient mechanism for weight sharing in flow-based models. Our results demonstrate comparable or superior density estimation on image datasets and good scaling as the image resolution increases, making MixerFlow a simple yet powerful alternative to Glow-based architectures. We also show that MixerFlow provides more informative embeddings than Glow-based architectures and can integrate many structured transformations, such as splines or Kolmogorov-Arnold Networks.
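To make the opening claim concrete, the following is a minimal sketch of how a single bijection yields both density estimation and sampling. It is not MixerFlow or Glow: the class `AffineFlow`, its parameters `shift` and `scale`, and the standard normal base density are assumptions chosen purely for illustration. The log-density follows the change-of-variables formula, log p(x) = log p_Z(f(x)) + log |det ∂f/∂x|.

```python
import math
import random


class AffineFlow:
    """Hypothetical elementwise affine bijection z = (x - shift) / scale.

    The base density is assumed to be a standard normal in each dimension.
    """

    def __init__(self, shift, scale):
        self.shift = list(shift)
        self.scale = list(scale)  # entries must be nonzero for invertibility

    def forward(self, x):
        # Map data x to the simpler base space and return log|det dz/dx|.
        z = [(xi - b) / s for xi, b, s in zip(x, self.shift, self.scale)]
        log_det = -sum(math.log(abs(s)) for s in self.scale)
        return z, log_det

    def inverse(self, z):
        # Map base-space points back to data space.
        return [zi * s + b for zi, b, s in zip(z, self.shift, self.scale)]

    def log_prob(self, x):
        # Density estimation: base log-density plus the log-determinant term.
        z, log_det = self.forward(x)
        base = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi) for zi in z)
        return base + log_det

    def sample(self, rng):
        # Data generation: draw from the base density, then invert the map.
        z = [rng.gauss(0.0, 1.0) for _ in self.shift]
        return self.inverse(z)


flow = AffineFlow(shift=[1.0, -2.0], scale=[2.0, 0.5])
x = flow.sample(random.Random(0))
log_density = flow.log_prob(x)
```

The same parameters serve both directions, which is what bijectivity buys; practical flows replace the affine map with deeper invertible layers whose log-determinants remain tractable.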