Small Open Models Achieve Near Parity with Large Models in Low Resource Literary Translation at a Fraction of the Cost

Literary translation has recently gained attention as a distinct and complex task in machine translation research. However, literary translation with small open models remains an open problem. We contribute to this line of research by introducing the TinyFabulist Translation Framework (TF2), a unified framework for dataset creation, fine-tuning, and evaluation in English-to-Romanian literary translation. TF2 centers on the creation and open release of a compact, fine-tuned language model (TF2-12B) and large-scale synthetic parallel datasets (DS-TF2-EN-RO-3M and DS-TF2-EN-RO-15K). Building on DS-TF1-EN-3M (TF1), the largest collection of synthetic English fables to date, we address the need for rich, high-quality literary datasets in low-resource languages such as Romanian. Our pipeline first generates 15k high-quality Romanian reference translations from the TF1 pool using a high-performing LLM. We then apply a two-stage fine-tuning process to a 12B-parameter open-weight model: (i) instruction tuning to capture genre-specific narrative style, and (ii) adapter compression for efficient deployment. Evaluation combines corpus-level BLEU with a five-dimension LLM-based rubric (accuracy, fluency, coherence, style, and cultural adaptation) to provide a nuanced assessment of translation quality. Results show that our fine-tuned model achieves strong fluency and adequacy, narrowing the gap with top-performing proprietary models under automated and human-anchored evaluation, while remaining open, accessible, and significantly more cost-effective. Alongside the fine-tuned model and both datasets, we publicly release all scripts and evaluation prompts. TF2 thus provides an end-to-end, reproducible pipeline for research on cost-efficient translation, cross-lingual narrative generation, and the broad adoption of open models for culturally significant literary content in low-resource settings.
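As an illustration of the two evaluation signals described above, the following minimal sketch computes a corpus-level BLEU score and averages per-dimension rubric scores. It is not the released TF2 evaluation code. The whitespace tokenization, the single reference per segment, the function names, the dictionary keys, and the numeric rubric scale are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical keys for the five rubric dimensions named in the abstract.
RUBRIC_DIMENSIONS = ("accuracy", "fluency", "coherence", "style", "cultural_adaptation")


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per segment, whitespace tokens.

    N-gram matches and totals are pooled over the whole corpus before the
    precisions are computed, which is what distinguishes corpus-level BLEU
    from an average of sentence-level scores.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped matches: each hypothesis n-gram counts at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing in this sketch
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


def mean_rubric_scores(judgments):
    """Average each rubric dimension over a list of per-translation score dicts."""
    return {d: sum(j[d] for j in judgments) / len(judgments) for d in RUBRIC_DIMENSIONS}
```

In practice an established implementation such as sacreBLEU would normally be preferred for reportable BLEU scores, since tokenization choices affect the result. The sketch only shows how the pooled n-gram statistics and the rubric averages fit together.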
