Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data. Previous work has primarily considered silver-standard data augmentation or zero-shot methods; exploiting few-shot gold data remains comparatively unexplored. We propose a new approach to cross-lingual semantic parsing that explicitly minimizes cross-lingual divergence between probabilistic latent variables using Optimal Transport. We demonstrate how this direct guidance improves parsing from natural languages using fewer examples and less training. We evaluate our method on two datasets, MTOP and MultiATIS++SQL, establishing state-of-the-art results under a few-shot cross-lingual regime. Ablation studies further reveal that our method improves performance even without parallel input translations. In addition, we show that our model better captures cross-lingual structure in the latent space, improving semantic representation similarity.
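As a rough illustration of what minimizing a cross-lingual divergence between probabilistic latent variables with Optimal Transport could look like, the sketch below computes the closed-form squared 2-Wasserstein distance between two diagonal Gaussians. This is an assumption made for illustration only: the abstract does not state the form of the latent distributions or which transport cost is used, and the function name `w2_squared_diag_gaussians` and the example means and standard deviations are hypothetical.

```python
import numpy as np


def w2_squared_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    For N(mu_a, diag(sigma_a^2)) and N(mu_b, diag(sigma_b^2)) the distance
    has the closed form ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2,
    where sigma_* are standard deviations (not variances).
    """
    mu_a = np.asarray(mu_a, dtype=float)
    mu_b = np.asarray(mu_b, dtype=float)
    sigma_a = np.asarray(sigma_a, dtype=float)
    sigma_b = np.asarray(sigma_b, dtype=float)
    mean_term = np.sum((mu_a - mu_b) ** 2, axis=-1)
    scale_term = np.sum((sigma_a - sigma_b) ** 2, axis=-1)
    return mean_term + scale_term


# Hypothetical latent posteriors for an English utterance and its
# counterpart in another language (values invented for illustration).
mu_en, sigma_en = np.array([0.0, 1.0]), np.array([1.0, 0.5])
mu_xx, sigma_xx = np.array([0.5, 1.0]), np.array([1.0, 1.0])

# A penalty like this could be added to the parsing loss so that the two
# latent distributions are pulled together during training.
alignment_penalty = w2_squared_diag_gaussians(mu_en, sigma_en, mu_xx, sigma_xx)
```

In a training setup of this kind, the penalty would be zero when both languages induce the same latent distribution and would grow as the means or scales drift apart.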