Retrieval-augmented generation (RAG) enhances LLM reasoning in knowledge-intensive tasks, but existing RAG pipelines incur substantial retrieval and generation overhead when applied to large-scale entity matching. To address this limitation, we introduce CE-RAG4EM, a cost-efficient RAG architecture that reduces computation through blocking-based batch retrieval and generation. We also present a unified framework for analyzing and evaluating RAG systems for entity matching, focusing on blocking-aware optimizations and retrieval granularity. Extensive experiments suggest that CE-RAG4EM can achieve comparable or improved matching quality while substantially reducing end-to-end runtime relative to strong baselines. Our analysis further reveals that key configuration parameters introduce an inherent trade-off between matching quality and computational overhead, offering practical guidance for designing efficient and scalable RAG systems for entity matching and data integration.
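The idea of blocking-based batch retrieval and generation can be illustrated with a minimal sketch. This is not the CE-RAG4EM implementation, which the abstract does not specify. It assumes a toy blocking key (the first token of a `name` field) and two caller-supplied functions, `retrieve` and `judge`, which are hypothetical stand-ins for a knowledge retriever and an LLM call. The point it shows is that retrieval and generation are invoked once per block rather than once per candidate pair.

```python
from collections import defaultdict
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Pair = Tuple[Record, Record]


def blocking_key(record: Record) -> str:
    """Hypothetical blocking key: the lowercased first token of the name."""
    tokens = record["name"].lower().split()
    return tokens[0] if tokens else ""


def build_blocks(records: List[Record]) -> Dict[str, List[Record]]:
    """Group records that share a blocking key."""
    blocks: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    return dict(blocks)


def match_with_batched_retrieval(
    records: List[Record],
    retrieve: Callable[[str], str],
    judge: Callable[[List[Pair], str], List[bool]],
) -> List[Tuple[str, str, bool]]:
    """Run one retrieval and one generation call per block.

    `retrieve` and `judge` are assumed interfaces supplied by the caller:
    `retrieve` maps a block key to context text, and `judge` returns one
    match decision per pair, given the shared context.
    """
    results: List[Tuple[str, str, bool]] = []
    for key, block in build_blocks(records).items():
        if len(block) < 2:
            continue  # a singleton block has no candidate pairs
        context = retrieve(key)               # shared by every pair in the block
        pairs = list(combinations(block, 2))
        decisions = judge(pairs, context)     # one batched call for the block
        for (left, right), is_match in zip(pairs, decisions):
            results.append((left["id"], right["id"], is_match))
    return results
```

Under these assumptions, a block containing n records costs one retrieval and one generation call instead of n(n-1)/2 of each. Larger blocks amortize more overhead but put more pairs into a single prompt, which is one way the trade-off between matching quality and overhead noted in the abstract could arise.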