Learning structural information from observational data is central to producing new knowledge outside the training corpus. This holds for mechanistic understanding in scientific discovery as well as for flexible test-time compositional generation. We therefore study how language models learn abstract structures and use the learnt structural information at test time. To ensure a controlled setup, we design a natural language dataset based on linguistic structural transformations. We show empirically that the emergence of structural learning correlates with performance on complex reasoning tasks, and that the ability to perform test-time compositional generation remains limited.