The recent Artificial Intelligence (AI) revolution has opened transformative possibilities for the humanities, particularly in unlocking the visual-artistic content embedded in historical illuminated manuscripts. While digital archives now offer unprecedented access to these materials, systematically locating, extracting, and analyzing illustrations at scale remains a major challenge. We present a general and scalable AI-based pipeline for large-scale visual analysis of illuminated manuscripts. The framework integrates modern deep-learning models for page-level illustration detection, illustration extraction, and multimodal description, enabling scholars to search, cluster, and study visual materials and artistic trends across entire corpora. We demonstrate the applicability of this approach on large heterogeneous collections, including the holdings of the Vatican Library and richly illuminated manuscripts such as the Bible of Borso d'Este. The system reveals meaningful visual patterns and cross-manuscript relationships by embedding illustrations into a shared representation space and analyzing their similarity structure (see Figure 4). By harnessing recent advances in computer vision and vision-language models, our framework enables new forms of large-scale visual scholarship in historical studies, art history, and cultural heritage, making it possible to explore iconography, stylistic trends, and cultural connections in ways that were previously impractical.
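The similarity analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the embedding model or the similarity measure, so the cosine similarity used here, the helper names `cosine_similarity_matrix` and `nearest_neighbors`, and the toy three-dimensional vectors are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Return pairwise cosine similarities between row-vector embeddings."""
    x = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    # Guard against division by zero for all-zero vectors.
    unit = x / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def nearest_neighbors(embeddings, query_index, k=3):
    """Return the k most similar items to the query as (index, similarity) pairs."""
    sims = cosine_similarity_matrix(embeddings)[query_index].copy()
    sims[query_index] = -np.inf  # exclude the query itself
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]


# Hypothetical toy embeddings standing in for illustration vectors
# (a real pipeline would obtain these from a vision or vision-language model).
toy_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
])

if __name__ == "__main__":
    for idx, sim in nearest_neighbors(toy_embeddings, query_index=0, k=2):
        print(f"illustration {idx}: cosine similarity {sim:.3f}")
```

In a setting like the one described, the rows would be embeddings of extracted illustrations, and the neighbor lists could feed search or clustering across manuscripts.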