The intersection of computer vision and machine learning has emerged as a promising avenue for historical research, enabling deeper exploration of the past. However, machine learning approaches in historical palaeography are often criticized for their perceived ``black box'' nature. In response, we introduce NeuroPapyri, a deep learning-based model designed to analyze images of ancient Greek papyri. To address concerns about transparency and interpretability, the model incorporates an attention mechanism that not only improves performance but also provides a visual representation of the image regions that contribute most to its decisions. Designed for images of papyrus documents containing lines of handwritten text, the model uses individual attention maps to indicate the presence or absence of specific characters in the input image. This paper presents the NeuroPapyri model, including its architecture and training methodology. Evaluation results demonstrate NeuroPapyri's efficacy in document retrieval, showcasing its potential to advance the analysis of historical manuscripts.
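The idea of per-character attention maps can be illustrated with a minimal sketch. This is not the NeuroPapyri architecture, which the abstract does not specify; it assumes one learned query vector per character class, a softmax over the positions of a flattened feature map, and a sigmoid presence score computed from the attention-pooled feature. All names (`character_presence`, `queries`, `weights`, `biases`) and the toy values are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def character_presence(features, queries, weights, biases):
    """Hypothetical per-character attention head.

    features: list of N position vectors (a flattened feature map).
    queries, weights: one vector per character class (assumed learned).
    biases: one float per character class.
    Returns (attention_maps, presence_probabilities).
    """
    dim = len(features[0])
    maps, probs = [], []
    for q, w, b in zip(queries, weights, biases):
        # Attention map: how strongly each position matches this character.
        attn = softmax([dot(q, f) for f in features])
        # Attention-weighted average of the position features.
        pooled = [sum(a * f[d] for a, f in zip(attn, features)) for d in range(dim)]
        # Sigmoid turns the class logit into a presence probability.
        logit = dot(w, pooled) + b
        probs.append(1.0 / (1.0 + math.exp(-logit)))
        maps.append(attn)
    return maps, probs

# Toy input: three positions, two feature channels, two character classes.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
queries = [[5.0, 0.0], [0.0, 5.0]]
weights = [[4.0, 0.0], [0.0, 4.0]]
biases = [-2.0, -2.0]
maps, probs = character_presence(features, queries, weights, biases)
```

In this toy setup each map in `maps` sums to one over the positions, so it can be reshaped to the feature-map grid and overlaid on the line image to show which regions drove the score for a given character.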