AutArch: An AI-assisted workflow for object detection and automated recording in archaeological catalogues

Compiling large datasets from published resources, such as archaeological find catalogues, presents fundamental challenges: identifying relevant content and manually recording it is a time-consuming, repetitive and error-prone task. For the data to be useful, it must be of comparable quality and adhere to the same recording standards, which is hardly ever the case in archaeology. Here, we present a new data collection method that exploits recent advances in Artificial Intelligence. Our software combines an object detection neural network with further classification networks to speed up, automate, and standardise data collection from legacy resources, such as archaeological drawings and photographs in large unsorted PDF files. The AI-assisted workflow detects common objects found in archaeological catalogues, such as graves, skeletons, ceramics, ornaments, stone tools and maps. It then spatially relates and analyses these objects on the page to extract real-life attributes, such as the size and orientation of a grave based on the north arrow and the scale. A graphical interface allows for and assists with manual validation. We demonstrate the benefits of this approach by collecting a range of shapes and numerical attributes from richly illustrated archaeological catalogues, and benchmark it in a real-world experiment with ten users. Moreover, we record geometric whole-outlines through contour detection, an alternative to landmark-based geometric morphometrics that is not achievable by hand.
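To illustrate how a scale and a north arrow on the same page could turn pixel measurements into real-life attributes, here is a minimal sketch. It is not taken from the AutArch code base: the function name `real_world_attributes`, the `GraveMeasurement` record, and the convention that all angles are measured clockwise from the page's "up" direction are assumptions made for this example. It also assumes that the scale bar's pixel length and the real length it represents are already known.

```python
import math
from dataclasses import dataclass


@dataclass
class GraveMeasurement:
    """Hypothetical record for the real-world attributes of one grave drawing."""
    length_m: float
    width_m: float
    bearing_deg: float  # long-axis orientation, clockwise from north, in [0, 180)


def real_world_attributes(
    length_px: float,
    width_px: float,
    axis_angle_deg: float,
    north_angle_deg: float,
    scale_px: float,
    scale_m: float,
) -> GraveMeasurement:
    """Convert page-space measurements to metres and a compass bearing.

    Assumed convention: both angles are measured clockwise from the
    page's "up" direction. `scale_px` is the drawn length of the scale
    bar in pixels and `scale_m` the real length it represents.
    """
    if scale_px <= 0 or scale_m <= 0:
        raise ValueError("scale bar length must be positive")

    metres_per_pixel = scale_m / scale_px

    # A grave's long axis has no head or tail here, so the bearing is
    # reduced modulo 180 degrees rather than 360.
    bearing = (axis_angle_deg - north_angle_deg) % 180.0

    return GraveMeasurement(
        length_m=length_px * metres_per_pixel,
        width_m=width_px * metres_per_pixel,
        bearing_deg=bearing,
    )


if __name__ == "__main__":
    # A 400 x 200 px grave outline, long axis pointing right on the page,
    # north arrow rotated 30 degrees clockwise, 100 px scale bar = 0.5 m.
    print(real_world_attributes(400, 200, 90, 30, 100, 0.5))
```

Reducing the bearing modulo 180 degrees treats the long axis as undirected. A workflow that also knows the skeleton's head position could keep the full 0 to 360 degree range instead.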

