Paleography is the study of ancient and historical handwriting; its key objectives include dating manuscripts and understanding the evolution of writing. Identifying the individual scribes who contributed to a medieval manuscript can help estimate when a document was written and trace the development of scripts and writing styles. Although digital technologies have made significant progress in this field, the general problem remains unsolved and continues to pose open challenges. ... We previously proposed an approach focused on identifying specific letters or abbreviations that characterize each writer. In that study, we considered the letter "a" because, following the suggestions of expert paleographers, it is widely present on all pages of text and highly distinctive. We used template matching to detect the occurrences of the character "a" on each page and a convolutional neural network (CNN) to attribute each instance to the correct scribe. Building on the encouraging results of that system, and aware of the limitations of template matching, which requires an appropriate threshold to work, we experimented within the same framework with the YOLO object detection model to identify the scribes who contributed to different medieval books. We used the fifth version of YOLO (YOLOv5), which completely replaced the template matching and CNN used in the previous work. The experimental results show that YOLO extracts a greater number of instances of the target letter, leading to a more accurate second-stage classification. Furthermore, the YOLO confidence score provides a foundation for developing a system that applies a rejection threshold, enabling reliable writer identification even in unseen manuscripts.
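To make the role of the confidence score concrete, the following is a minimal sketch of how per-letter detections could be combined into a page-level attribution with a rejection threshold. It is an illustration under stated assumptions, not the method used in this work: the abstract does not specify the aggregation rule, so the confidence-weighted vote, the function name `identify_scribe`, and the parameters `reject_below` and `min_votes` are all hypothetical. Each detection is assumed to arrive as a pair of predicted scribe label and confidence in [0, 1], already extracted from the detector's output.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple

# Hypothetical input format: (predicted scribe label, detection confidence in [0, 1]).
Detection = Tuple[str, float]


def identify_scribe(
    detections: Iterable[Detection],
    reject_below: float = 0.5,   # assumed rejection threshold, not a value from the paper
    min_votes: int = 1,          # assumed minimum number of surviving detections
) -> Optional[str]:
    """Attribute a page to a scribe by confidence-weighted voting.

    Detections whose confidence is below `reject_below` are discarded.
    If fewer than `min_votes` detections survive, the page is rejected
    and None is returned instead of a forced guess.
    """
    scores = defaultdict(float)
    kept = 0
    for scribe, confidence in detections:
        if confidence < reject_below:
            continue  # too uncertain to count as evidence
        scores[scribe] += confidence
        kept += 1
    if kept < min_votes:
        return None  # reject: not enough reliable letter instances
    # The scribe with the largest accumulated confidence wins.
    return max(scores, key=scores.get)
```

For a page whose detections are `[("A", 0.9), ("B", 0.6), ("A", 0.7), ("B", 0.3)]`, the last detection falls below the default threshold and is dropped, and scribe "A" wins on accumulated confidence. A page with only low-confidence detections returns `None`, which is the behavior one would want for a manuscript written by a scribe the detector has never seen.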