Paleoradiology, the use of modern imaging technologies to study archaeological and anthropological remains, offers new windows on millennial-scale patterns of human health. Unfortunately, the radiographs collected during field campaigns are heterogeneous: bones are disarticulated, positioning is ad hoc, and laterality markers are often absent. Factors such as age at death, age of the bone, sex, and imaging equipment introduce further variability. Content navigation, such as identifying the subset of images with a specific projection view, is therefore time-consuming and difficult, making efficient triaging a bottleneck for expert analysis. We report a zero-shot prompting strategy that leverages a state-of-the-art Large Vision-Language Model (LVLM) to automatically identify the main bone, projection view, and laterality in such images. Our pipeline converts raw DICOM files to bone-windowed PNGs, submits them to the LVLM with a carefully engineered prompt, and receives structured JSON outputs, which are extracted and written to a spreadsheet in preparation for validation. On a random sample of 100 images reviewed by an expert board-certified paleoradiologist, the system achieved 92% accuracy for main bone, 80% for projection view, and 100% for laterality, flagging ambiguous cases with low- or medium-confidence labels. These results suggest that LVLMs can substantially accelerate code-word development for large paleoradiology datasets, enabling efficient content navigation in future anthropology workflows.
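The two post-processing steps of the pipeline — mapping raw Hounsfield-unit values through a bone window before PNG export, and flattening the LVLM's structured JSON replies into spreadsheet rows — could be sketched as below. This is a minimal illustration, not the study's implementation: the window parameters (center ≈ 400 HU, width ≈ 1800 HU) are typical bone-window values chosen for the example, and the JSON field names (`main_bone`, `projection_view`, `laterality`, `confidence`) are hypothetical placeholders for whatever schema the engineered prompt actually requests.

```python
import csv
import io
import json


def window_to_uint8(hu, center=400.0, width=1800.0):
    """Map one Hounsfield-unit value into the 0-255 range via a bone window.

    NOTE: center/width are illustrative bone-window defaults, not the
    study's exact settings. Values below the window map to 0, values
    above it to 255, and values inside it scale linearly.
    """
    lo, hi = center - width / 2.0, center + width / 2.0
    clipped = min(max(hu, lo), hi)
    return round((clipped - lo) / (hi - lo) * 255)


def rows_from_llm_json(raw_replies):
    """Flatten structured JSON replies from the LVLM into spreadsheet rows.

    NOTE: the field names below are hypothetical; the real schema is
    whatever the prompt instructs the model to emit.
    """
    fields = ["image", "main_bone", "projection_view", "laterality", "confidence"]
    rows = []
    for reply in raw_replies:
        record = json.loads(reply)
        # Missing keys become empty cells rather than raising.
        rows.append([record.get(f, "") for f in fields])
    return fields, rows


# Example: two mock replies written to an in-memory CSV "spreadsheet".
replies = [
    '{"image": "t001.dcm", "main_bone": "femur", '
    '"projection_view": "AP", "laterality": "left", "confidence": "high"}',
    '{"image": "t002.dcm", "main_bone": "humerus", '
    '"projection_view": "lateral", "laterality": "right", "confidence": "low"}',
]
header, rows = rows_from_llm_json(replies)
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(header)
writer.writerows(rows)
```

In a real pipeline the windowing function would be applied element-wise to the pixel array decoded from each DICOM file before saving the PNG, and the CSV buffer would be written to disk for the expert-validation pass.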