Grammar induction has made significant progress in recent years. However, it remains unclear how induced grammars can be applied to improve performance on downstream tasks. In this work, we introduce an unsupervised grammar induction method for language understanding and generation. We construct a grammar parser that induces constituency structures and dependency relations and is trained jointly with downstream tasks, without additional syntax annotations. The induced grammar features are then incorporated into the Transformer as a syntactic mask that guides self-attention. We evaluate our method on multiple machine translation and natural language understanding tasks, where it outperforms both the original Transformer and other models enhanced with external parsers. Experimental results show that our method is effective in both from-scratch and pre-trained scenarios. Moreover, our study highlights the benefit of explicitly modeling the grammatical structure of text for neural network models.
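To make the masking mechanism concrete, the following is a minimal PyTorch-style sketch of how an induced dependency structure could be turned into a binary mask that restricts self-attention. The function names, the hard 0/1 mask format, and the head-index encoding are illustrative assumptions for this sketch, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def dependency_mask(heads):
    """Build a (seq_len, seq_len) attention mask from head indices,
    where heads[i] is the head of token i (-1 marks the root).
    Each token may attend to itself, its head, and its dependents."""
    n = len(heads)
    mask = torch.eye(n)  # self-attention is always allowed
    for i, h in enumerate(heads):
        if h >= 0:
            mask[i, h] = 1.0  # child attends to its head
            mask[h, i] = 1.0  # head attends to its child
    return mask

def syntactically_masked_attention(q, k, v, syntax_mask):
    """Scaled dot-product attention in which a binary syntactic mask
    (1 = attend, 0 = block) restricts which token pairs interact.

    q, k, v:     (batch, n_heads, seq_len, d_head)
    syntax_mask: broadcastable to (batch, n_heads, seq_len, seq_len)
    """
    d_head = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_head ** 0.5
    # Block attention between tokens unrelated under the induced grammar.
    scores = scores.masked_fill(syntax_mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)

# Toy usage: 4 tokens with head indices from a hypothetical induced parse.
heads = [1, -1, 1, 2]
mask = dependency_mask(heads).view(1, 1, 4, 4)
q = k = v = torch.randn(1, 2, 4, 8)  # batch=1, n_heads=2, seq=4, d_head=8
out = syntactically_masked_attention(q, k, v, mask)
```

In the paper's setting the mask would be derived from both the induced constituency structures and dependency relations; the hard binary mask above only illustrates the general idea of gating self-attention with induced syntax.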