Incorporating geometric inductive biases into models can aid interpretability and generalization, but encoding data into a specific geometric structure can be challenging because of the topological constraints this imposes. In this paper, we theoretically and empirically characterize obstructions to training encoders with geometric latent spaces. We show that local optima can arise from singularities (e.g., self-intersections) or from an incorrect degree or winding number. We then discuss how normalizing flows can potentially circumvent these obstructions by defining multimodal variational distributions. Inspired by this observation, we propose a new flow-based model that maps data points to multimodal distributions over geometric spaces, and we empirically evaluate it on two domains. We observe improved stability during training and a higher chance of converging to a homeomorphic encoder.
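To make the winding-number obstruction concrete, the following is a minimal sketch, not the model or code from the paper. It assumes a hypothetical encoder whose latent space is the circle and whose outputs are sampled as angles along a closed loop in the data. The function name `winding_number` and the sample maps are illustrative assumptions. A homeomorphism of the circle has degree +1 or -1, and because the degree is a homotopy invariant, an encoder that starts with any other degree cannot be continuously deformed into a homeomorphic one.

```python
import numpy as np

def winding_number(angles):
    """Degree of a closed loop on the circle, given sampled output angles in radians.

    Assumes the samples are dense enough that consecutive outputs are
    less than pi apart along the circle.
    """
    angles = np.asarray(angles, dtype=float)
    # Differences between consecutive samples, closing the loop back to the start.
    steps = np.diff(np.append(angles, angles[0]))
    # Wrap each step into [-pi, pi) so we follow the shortest arc between samples.
    steps = (steps + np.pi) % (2 * np.pi) - np.pi
    # The total signed angle travelled is an integer multiple of 2*pi.
    return int(np.rint(steps.sum() / (2 * np.pi)))

# Input angles sampled once around the data loop.
theta = np.linspace(0.0, 2 * np.pi, 200, endpoint=False)

print(winding_number(theta))          # identity map: degree 1
print(winding_number(-theta))         # orientation-reversing map: degree -1
print(winding_number(2 * theta))      # wraps the circle twice: degree 2
print(winding_number(np.sin(theta)))  # never completes a turn: degree 0
```

Only the first two maps have the degree of a homeomorphism. The last two illustrate the kind of incorrect degree that gradient-based training cannot repair through small continuous updates.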