Domain-specific question answering is an evolving field that requires specialized solutions to address its unique challenges. In this paper, we show that a hybrid approach combining a fine-tuned dense retriever with keyword-based sparse search significantly enhances performance. Our system uses a linear combination of relevance signals: cosine similarity from dense retrieval, BM25 scores, and URL host matching, each with a tunable boost parameter. Experimental results indicate that this hybrid method outperforms our single-retriever system, achieving improved accuracy while maintaining robust contextual grounding. These findings suggest that integrating multiple retrieval methods with weighted scoring effectively addresses the complexities of domain-specific question answering in enterprise settings.
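To make the weighted scoring concrete, the following is a minimal sketch of a linear combination of the three signals named above. It is illustrative only and not the paper's implementation: the names `Boosts`, `host_match`, `hybrid_score`, and `rank`, the default boost values, and the example hosts are all hypothetical. It also assumes the BM25 score has already been scaled to a range comparable with cosine similarity, which the abstract does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Boosts:
    # Hypothetical tunable weights; placeholder values, not taken from the paper.
    dense: float = 1.0
    bm25: float = 0.5
    host: float = 0.2


def host_match(url: str, preferred_hosts: set[str]) -> float:
    """Return 1.0 if the URL's host is in the preferred set, else 0.0."""
    host = urlparse(url).hostname or ""
    return 1.0 if host in preferred_hosts else 0.0


def hybrid_score(cosine: float, bm25: float, url: str,
                 preferred_hosts: set[str], boosts: Boosts) -> float:
    """Linear combination of dense, sparse, and host-match signals.

    Assumes `bm25` was normalized upstream (an assumption of this sketch).
    """
    return (boosts.dense * cosine
            + boosts.bm25 * bm25
            + boosts.host * host_match(url, preferred_hosts))


def rank(candidates: list[dict], preferred_hosts: set[str],
         boosts: Boosts) -> list[dict]:
    """Sort candidates by hybrid score, highest first."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(c["cosine"], c["bm25"], c["url"],
                                   preferred_hosts, boosts),
        reverse=True,
    )


if __name__ == "__main__":
    docs = [
        {"id": "A", "cosine": 0.80, "bm25": 0.0,
         "url": "https://other.example.org/x"},
        {"id": "B", "cosine": 0.70, "bm25": 0.1,
         "url": "https://docs.example.com/y"},
    ]
    for doc in rank(docs, {"docs.example.com"}, Boosts()):
        print(doc["id"])
```

In this toy input, document B has the lower cosine similarity but ranks first once the keyword and host signals are added, which is the kind of reordering a single dense retriever cannot produce. In practice the boost values would be tuned on held-out queries.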