The evolution of digital manufacturing requires intelligent Question Answering (QA) systems that can integrate and analyze complex multi-modal data, such as text, images, formulas, and tables. Conventional Retrieval-Augmented Generation (RAG) methods often handle this complexity poorly, resulting in subpar performance. We introduce ManuRAG, a multi-modal RAG framework designed for manufacturing QA that incorporates specialized techniques to improve answer accuracy, reliability, and interpretability. We evaluate ManuRAG on three datasets totaling 1,515 QA pairs, covering mathematical, multiple-choice, and review-based questions on manufacturing principles and practices. Experimental results show that ManuRAG outperforms existing methods on all three datasets. Furthermore, ManuRAG's adaptable design makes it applicable to other domains, including law, healthcare, and finance, positioning it as a versatile tool for domain-specific QA.