A core problem in machine learning is learning expressive latent variables, in a flexible and interpretable fashion, for prediction on complex data composed of multiple sub-components. Here, we develop an approach that improves expressiveness, provides partial interpretability, and is not restricted to specific applications. The key idea is to dynamically separate data samples in the latent space and thereby enhance output diversity. Our dynamic latent separation method, inspired by atomic physics, relies on the jointly learned structures of each data sample, which also reveal the importance of each sub-component for distinguishing data samples. This approach, which we call atom modeling, requires no supervision of the latent space and allows us to learn additional, partially interpretable representations alongside the model's original objective. We empirically demonstrate that the algorithm also improves the performance of small- to large-scale models on various classification and generation problems.