This paper proposes the RePAIR dataset, a challenging benchmark for testing modern computational and data-driven methods for puzzle-solving and reassembly tasks. The dataset has properties that are uncommon in current benchmarks for 2D and 3D puzzle solving. The fragments and fractures are realistic, caused by the collapse of a fresco during a World War II bombing at the Pompeii archaeological park. The fragments are also eroded, with missing pieces, irregular shapes, and different dimensions, which further challenges reassembly algorithms. The dataset is multi-modal, providing high-resolution images with characteristic pictorial elements, detailed 3D scans of the fragments, and metadata annotated by archaeologists. The ground truth was generated through several years of continuous fieldwork, including the excavation and cleaning of each fragment, followed by manual puzzle solving by archaeologists on a subset of approximately 1,000 of the 16,000 available pieces. After all the fragments were digitized in 3D, a benchmark was prepared to challenge current reassembly and puzzle-solving methods, which often address simpler synthetic scenarios. The tested baselines show that a clear gap remains in solving this computationally complex problem.