Machine Translation (MT) has advanced from rule-based and statistical methods to neural approaches based on the Transformer architecture. While these methods have achieved impressive results for high-resource languages, low-resource varieties such as Sylheti remain underexplored. In this work, we investigate Bengali-to-Sylheti translation by fine-tuning multilingual Transformer models and comparing them with zero-shot large language models (LLMs). Experimental results demonstrate that fine-tuned models significantly outperform LLMs, with mBART-50 achieving the highest translation adequacy and MarianMT showing the strongest character-level fidelity. These findings highlight the importance of task-specific adaptation for underrepresented languages and contribute to ongoing efforts toward inclusive language technologies.
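The fine-tuning setup described above can be illustrated with a minimal sketch using the HuggingFace Transformers library. This is not the paper's actual configuration: the checkpoint name (facebook/mbart-large-50-many-to-many-mmt), the reuse of the Bengali language tag `bn_IN` for the Sylheti target side (Sylheti has no dedicated mBART-50 tag), the in-memory corpus `bn_syl_pairs`, and all hyperparameters are illustrative assumptions.

```python
from torch.utils.data import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    MBart50TokenizerFast,
    MBartForConditionalGeneration,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

MODEL_NAME = "facebook/mbart-large-50-many-to-many-mmt"
# Assumption: Sylheti has no language tag in mBART-50, so the Bengali code
# "bn_IN" is reused for the target side as well.
tokenizer = MBart50TokenizerFast.from_pretrained(
    MODEL_NAME, src_lang="bn_IN", tgt_lang="bn_IN"
)
model = MBartForConditionalGeneration.from_pretrained(MODEL_NAME)


class BnSylDataset(Dataset):
    """Tokenizes parallel Bengali (source) / Sylheti (target) sentence pairs."""

    def __init__(self, pairs, max_len=128):
        self.pairs = pairs
        self.max_len = max_len

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        pair = self.pairs[idx]
        # text_target tokenizes the Sylheti side as decoder labels.
        return tokenizer(
            pair["bn"],
            text_target=pair["syl"],
            max_length=self.max_len,
            truncation=True,
        )


# Hypothetical parallel corpus; replace the placeholders with real sentence pairs.
bn_syl_pairs = [{"bn": "...", "syl": "..."}]
train_ds = BnSylDataset(bn_syl_pairs)

# Placeholder hyperparameters, not the values used in the paper.
args = Seq2SeqTrainingArguments(
    output_dir="mbart50-bn-syl",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=3e-5,
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    # DataCollatorForSeq2Seq pads inputs and sets label padding to -100
    # so padded positions are ignored by the loss.
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

A MarianMT baseline would follow the same pattern with `MarianMTModel` and `MarianTokenizer`, while the zero-shot LLM comparison requires only prompting, with no parameter updates.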