Machine Translation (MT) has been widely used for cross-lingual classification, either by translating the test set into English and running inference with a monolingual model (translate-test), or by translating the training set into the target languages and fine-tuning a multilingual model (translate-train). However, most research in the area focuses on the multilingual models rather than the MT component. We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine-translated text, translate-test can do substantially better than previously assumed. The optimal approach, however, is highly task-dependent, as we identify various sources of cross-lingual transfer gap that affect different tasks and approaches differently. Our work calls into question the dominance of multilingual models for cross-lingual classification, and prompts researchers to pay more attention to MT-based baselines.
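The two MT-based pipelines described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `translate_test`, `translate_train`, `toy_translate`, and `toy_english_classifier` are hypothetical, and the word-lookup translator and keyword classifier are toy stand-ins for a real MT system and a fine-tuned model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interfaces, introduced only for this illustration.
Translator = Callable[[str, str, str], str]  # (text, src_lang, tgt_lang) -> text
Classifier = Callable[[str], str]            # text -> label


def translate_test(
    texts: Sequence[str],
    src_lang: str,
    translate: Translator,
    english_classifier: Classifier,
) -> List[str]:
    """Translate-test: translate each test input into English,
    then classify it with a monolingual (English) model."""
    return [english_classifier(translate(t, src_lang, "en")) for t in texts]


def translate_train(
    english_train: Sequence[Tuple[str, str]],
    target_langs: Sequence[str],
    translate: Translator,
) -> List[Tuple[str, str, str]]:
    """Translate-train: translate the English training set into each
    target language. The returned (text, label, lang) triples would then
    be used to fine-tune a multilingual model (not shown here)."""
    return [
        (translate(text, "en", lang), label, lang)
        for text, label in english_train
        for lang in target_langs
    ]


# Toy stand-ins (assumptions): a word-level lexicon instead of an MT system,
# and a keyword rule instead of a trained classifier.
_TOY_LEXICON: Dict[Tuple[str, str], Dict[str, str]] = {
    ("es", "en"): {"bueno": "good", "malo": "bad"},
    ("en", "es"): {"good": "bueno", "bad": "malo"},
}


def toy_translate(text: str, src_lang: str, tgt_lang: str) -> str:
    table = _TOY_LEXICON[(src_lang, tgt_lang)]
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(table.get(word, word) for word in text.split())


def toy_english_classifier(text: str) -> str:
    return "positive" if "good" in text.split() else "negative"


if __name__ == "__main__":
    print(translate_test(["muy bueno", "muy malo"], "es",
                         toy_translate, toy_english_classifier))
    print(translate_train([("good", "positive")], ["es"], toy_translate))
```

In this sketch the classifier in translate-test only ever sees MT output at inference time, which is where the mismatch with training on original text mentioned above would arise.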