Tombstones are historically and culturally rich artifacts, encapsulating individual lives, community memory, historical narratives, and artistic expression. Yet many tombstones today face significant preservation challenges, including physical erosion, vandalism, environmental degradation, and political shifts. In this paper, we introduce a novel multi-modal framework for tombstone digitization, aiming to improve the interpretation, organization, and retrieval of tombstone content. Our approach leverages vision-language models (VLMs) to translate tombstone images into structured Tombstone Meaning Representations (TMRs), capturing both image and text information. To further enrich semantic parsing, we incorporate retrieval-augmented generation (RAG) to integrate externally dependent elements such as toponyms, occupation codes, and ontological concepts. Compared to traditional OCR-based pipelines, our method improves parsing accuracy from an F1 score of 36.1 to 89.5. We additionally evaluate the model's robustness across diverse linguistic and cultural inscriptions, and simulate physical degradation through image fusion to assess performance under noisy or damaged conditions. Our work represents the first attempt to formalize tombstone understanding using large vision-language models, with implications for heritage preservation.
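To make the pipeline concrete, below is a minimal Python sketch of what a TMR schema and the RAG lookup step could look like. Everything here is an illustrative assumption rather than the paper's actual implementation: the TMR field names, the toy gazetteer and occupation table, the example codes, and the `parse_tombstone` stub that stands in for the real VLM call.

```python
# Hypothetical sketch of a Tombstone Meaning Representation (TMR) and a
# RAG-style retrieval step. Field names, lookup tables, and codes are
# illustrative assumptions, not the paper's actual schema or API.
from dataclasses import dataclass, field


@dataclass
class TMR:
    """Structured representation parsed from a tombstone image."""
    name: str
    birth_year: int | None = None
    death_year: int | None = None
    birthplace: str | None = None        # normalized toponym
    occupation_code: str | None = None   # e.g., a HISCO-style code
    epitaph: str = ""
    iconography: list[str] = field(default_factory=list)


# Tiny in-memory stand-ins for the external resources (a gazetteer and an
# occupation ontology) that the RAG step would query; entries are invented.
GAZETTEER = {"groningen": "Groningen, NL", "emden": "Emden, DE"}
OCCUPATIONS = {"blacksmith": "83110", "teacher": "13200"}  # illustrative codes


def retrieve_context(raw_text: str) -> dict[str, str]:
    """Naive retrieval: surface-match inscription tokens against resources."""
    hits: dict[str, str] = {}
    for token in raw_text.lower().split():
        if token in GAZETTEER:
            hits[token] = GAZETTEER[token]
        if token in OCCUPATIONS:
            hits[token] = OCCUPATIONS[token]
    return hits


def parse_tombstone(ocr_text: str) -> TMR:
    """Stand-in for the VLM call: prompt = inscription + retrieved context."""
    context = retrieve_context(ocr_text)
    # A real pipeline would send the image plus `context` to a VLM and
    # decode its structured output into a TMR; here we return a stub.
    return TMR(name="<parsed by VLM>",
               epitaph=ocr_text,
               birthplace=context.get("groningen"),
               occupation_code=context.get("blacksmith"))


print(parse_tombstone("Here rests a blacksmith of Groningen"))
```

In a full system, the retrieved gazetteer entries and ontology codes would presumably be injected into the VLM prompt so the model can ground inscription text in external resources, which is the role RAG plays in the framework described above.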