Large Language Models (LLMs) have substantially advanced machine translation, but their effectiveness in the financial domain remains largely underexplored. To investigate this, we constructed FFN, a fine-grained Chinese-English parallel corpus of financial news. We collected financial news articles published between January 1, 2014, and December 31, 2023, from mainstream media websites such as CNN, FOX, and China Daily. The dataset consists of 1,013 main texts and 809 titles, all of which have been manually corrected. We measured the translation quality of two LLMs, ChatGPT and ERNIE-bot, using BLEU, TER, and chrF scores as evaluation metrics. For comparison, we also trained an OpenNMT model on our dataset. We detail the problems exhibited by the LLMs and provide in-depth analysis, intending to stimulate further research and solutions in this largely uncharted territory. Our research underlines the need to optimize LLMs for financial translation to ensure accuracy and quality.
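To make the character-level metric concrete, the following is a minimal sketch of a chrF-style score: an F-beta score over character n-gram overlap between a system translation and a reference. It is a simplified illustration, not the scoring tool used in this work: the function names `char_ngrams` and `simple_chrf`, the defaults `max_n=6` and `beta=2.0`, the whitespace handling, and the example sentences are all assumptions made for this sketch.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n, ignoring spaces (an assumption of this sketch)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF-style score in [0, 100].

    Precision and recall are averaged over n-gram orders 1..max_n,
    then combined into an F-beta score (beta > 1 weights recall more).
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Skip orders that are longer than one of the strings.
            continue
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # Hypothetical sentence pair, not taken from the FFN corpus.
    system_output = "The central bank cut interest rates."
    reference = "The central bank lowered interest rates."
    print(f"chrF-style score: {simple_chrf(system_output, reference):.2f}")
```

Because the score works on characters rather than words, a near-miss such as an inflected or partially correct financial term still earns partial credit, which is one reason character-level metrics are often reported alongside BLEU and TER.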