The construction of high-quality parallel corpora for translation research has evolved from simple sentence alignment into complex, multi-layered annotation tasks. This shift poses significant challenges for structurally divergent language pairs such as Arabic--English, where standard automated tools frequently fail to capture deep linguistic shifts and semantic nuances. This paper introduces a novel LLM-assisted interactive tool designed to narrow the gap between scalable automation and the precision of expert human judgment. Unlike traditional statistical aligners, the system employs a template-based Prompt Manager that uses large language models (LLMs) to perform sentence segmentation and alignment under strict JSON output constraints. Automated preprocessing is integrated into a human-in-the-loop workflow in which researchers refine the alignments and apply custom translation-technique annotations through a stand-off architecture. The tool thereby balances annotation efficiency with the linguistic precision needed to analyze complex translation phenomena in specialized domains.