Cultural heritage is the enduring record of human thought and history. Despite significant efforts to preserve cultural relics, many ancient artefacts have been irreversibly damaged by natural deterioration and human actions. Deep learning has emerged as a valuable tool for restoring many kinds of cultural heritage, including ancient texts. Previous research has approached ancient text restoration from either a visual or a textual perspective, often overlooking the potential of combining multimodal information. This paper proposes a novel Multimodal Multitask Restoring Model (MMRM) for restoring ancient texts, with particular emphasis on ideographs. The model combines context understanding with the residual visual information of damaged ancient artefacts, enabling it to predict damaged characters and generate restored images simultaneously. We evaluated the MMRM in experiments on both simulated datasets and authentic ancient inscriptions. The results show that the proposed method gives insightful restoration suggestions in both simulation experiments and real-world scenarios. To the best of our knowledge, this work is the first application of multimodal deep learning to ancient text restoration, and it will contribute to the understanding of ancient society and culture in the digital humanities.
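The abstract does not specify how the two modalities are fused or how the two tasks are weighted, so the following is only an illustrative sketch of the general idea, not the MMRM itself. It assumes a simple weighted sum of per-character scores from a context model and a visual model, and a joint loss that adds a character cross-entropy term to an image reconstruction error. The function names, the `visual_weight` and `image_weight` parameters, and the toy scores are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(context_scores, visual_scores, visual_weight=0.5):
    """Hypothetical late fusion: weighted sum of the scores from each modality."""
    return [
        (1.0 - visual_weight) * c + visual_weight * v
        for c, v in zip(context_scores, visual_scores)
    ]


def multitask_loss(char_probs, target_index, restored_pixels, target_pixels,
                   image_weight=1.0):
    """Assumed joint objective: character cross-entropy plus weighted pixel MSE."""
    cross_entropy = -math.log(char_probs[target_index])
    mse = sum((r - t) ** 2 for r, t in zip(restored_pixels, target_pixels))
    mse /= len(target_pixels)
    return cross_entropy + image_weight * mse


# Toy example: three candidate characters for one damaged position.
candidates = ["天", "夫", "大"]
context_scores = [2.0, 1.0, 0.5]  # made-up scores from the surrounding text
visual_scores = [0.5, 2.5, 0.2]   # made-up scores from the residual strokes

fused = fuse_scores(context_scores, visual_scores)
probs = softmax(fused)
best = candidates[probs.index(max(probs))]
loss = multitask_loss(probs, 1, [0.0, 1.0], [0.0, 1.0])
```

In this toy case the context alone would favour the first candidate, while the fused scores favour the second, which illustrates why residual visual evidence can change a prediction made from text alone.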