Question answering (QA) has become an important application of large language models (LLMs). General-purpose pre-trained LLMs, however, are not trained to understand the knowledge or terminology of a specific domain, such as finance, healthcare, education, or customer service for a product. To better support domain-specific understanding, we build an in-house question-answering system for Adobe products. We propose a novel framework for compiling a large question-answer database and develop an approach for retrieval-aware fine-tuning of an LLM. We show that fine-tuning the retriever leads to major improvements in the final generation. Our overall approach reduces hallucinations during generation while keeping the latest retrieved information in context for grounding.