Vector-based Representation is the Key: A Study on Disentanglement and Compositional Generalization

Recognizing elementary underlying concepts from observations (disentanglement) and generating novel combinations of these concepts (compositional generalization) are fundamental human abilities that support rapid knowledge acquisition and generalization to new tasks, yet deep learning models struggle with both. In pursuit of human-like intelligence, many works on disentangled representation learning have been proposed, and some recent studies address compositional generalization. However, few works study the relationship between disentanglement and compositional generalization, and the results reported so far are inconsistent. In this paper, we evaluate several typical disentangled representation learning methods in terms of both disentanglement and compositional generalization, and we provide an important insight: vector-based representation (using a vector instead of a scalar to represent a concept) is the key to achieving both good disentanglement and strong compositional generalization. This insight resonates with neuroscience research showing that the brain encodes information in the activity of neuron populations rather than in individual neurons. Motivated by this observation, we further propose a method that reforms scalar-based disentanglement methods ($\beta$-TCVAE and FactorVAE) into vector-based ones to improve both capabilities. We also investigate the impact of the dimensionality of the vector-based representation and address one important question: whether better disentanglement implies higher compositional generalization. In summary, our study demonstrates that it is possible to achieve both good concept recognition and novel concept composition, an important step towards human-like intelligence.
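To make the distinction between scalar-based and vector-based representation concrete, the following minimal Python sketch contrasts the two layouts and shows how a novel combination can be formed by swapping one concept between two codes. It illustrates the data layout only and is not the method proposed in the paper: the sizes `NUM_CONCEPTS` and `VECTOR_DIM`, the helper `swap_concept`, and the random codes are all hypothetical stand-ins for what a trained encoder would produce.

```python
import copy
import random

random.seed(0)

# Hypothetical sizes, chosen only for illustration (not taken from the paper).
NUM_CONCEPTS = 3  # e.g. shape, color, position
VECTOR_DIM = 4    # dimensions per concept in the vector-based layout


def random_scalar_code():
    """Scalar-based layout: one number per concept."""
    return [random.gauss(0.0, 1.0) for _ in range(NUM_CONCEPTS)]


def random_vector_code():
    """Vector-based layout: one VECTOR_DIM-dimensional vector per concept."""
    return [
        [random.gauss(0.0, 1.0) for _ in range(VECTOR_DIM)]
        for _ in range(NUM_CONCEPTS)
    ]


def swap_concept(code_a, code_b, concept_index):
    """Return a copy of code_a whose entry for one concept comes from code_b."""
    mixed = copy.deepcopy(code_a)
    mixed[concept_index] = copy.deepcopy(code_b[concept_index])
    return mixed


# Random codes stand in for encoder outputs of two different observations.
scalar_a, scalar_b = random_scalar_code(), random_scalar_code()
vector_a, vector_b = random_vector_code(), random_vector_code()

# Compose a new code: concept 1 from observation b, the rest from observation a.
scalar_mix = swap_concept(scalar_a, scalar_b, concept_index=1)
vector_mix = swap_concept(vector_a, vector_b, concept_index=1)
```

In both layouts the swap replaces exactly one concept slot. The only difference is that a slot holds a single number in the scalar case and a whole vector in the vector case; whether a decoder renders the mixed code as a plausible novel combination is the compositional generalization question the paper studies.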
