In multilingual nations like India, access to legal information is often hindered by language barriers, as much of the legal and judicial documentation remains in English. Legal Machine Translation (L-MT) offers a scalable solution to this challenge by enabling accurate and accessible translations of legal documents. This paper presents our work for the JUST-NLP 2025 Legal MT shared task, focusing on English-Hindi translation using Transformer-based approaches. We experiment with two complementary strategies: fine-tuning a pre-trained OPUS-MT model for domain-specific adaptation, and training a Transformer model from scratch on the provided legal corpus. Performance is evaluated using standard MT metrics, including SacreBLEU, chrF++, TER, ROUGE, BERTScore, METEOR, and COMET. Our fine-tuned OPUS-MT model achieves a SacreBLEU score of 46.03, significantly outperforming both the baseline and the from-scratch model. The results highlight the effectiveness of domain adaptation in enhancing translation quality and demonstrate the potential of L-MT systems to improve access to justice and legal transparency in multilingual contexts.