Large language models (LLMs) with retrieval-augmented generation (RAG) have recently been the optimal choice for scalable generative AI solutions. However, the use cases that combine RAG with LLMs have been either generic or extremely domain-specific, which calls into question the scalability and generalizability of RAG-LLM approaches. In this work, we propose a unique LLM-based system in which multiple LLMs can be invoked to enable data authentication, user query routing, data retrieval, and custom prompting for question answering over data tables that are large and highly varied. Our system is tuned to extract information from enterprise-level data products and furnish real-time responses in under 10 seconds. One prompt manages user-to-data authentication, followed by three prompts that route the query, fetch the data, and generate customizable natural language responses. Additionally, we propose a five-metric scoring module that detects and reports hallucinations in the LLM responses. Our proposed system and scoring metrics achieve confidence scores above 90% across hundreds of user queries in the sustainability, financial health, and social media domains. Extensions to the proposed extreme RAG architectures can enable querying of heterogeneous sources using LLMs.
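The four-prompt flow described in the abstract (one authentication prompt, then routing, data fetching, and response generation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the `LLM` callable type, the `TABLES` toy data and its access lists, the prompt wording, the function names, and the choice to cross-check each model reply against ground truth before acting on it. A rule-based `stub_llm` stands in for a real model so that the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]

# Toy data products; table names, access lists, and rows are invented.
TABLES: Dict[str, dict] = {
    "emissions": {
        "users": {"ana"},
        "rows": [{"site": "A", "co2_t": 12.5}, {"site": "B", "co2_t": 7.0}],
    },
    "revenue": {
        "users": {"ana", "bo"},
        "rows": [{"quarter": "Q1", "usd_m": 3.2}],
    },
}


def authenticate(llm: LLM, user: str) -> List[str]:
    """Prompt 1: ask which tables the user may read."""
    acl = {name: sorted(t["users"]) for name, t in TABLES.items()}
    reply = llm(f"[AUTH] user={user} acl={acl}. List permitted tables.")
    names = [part.strip() for part in reply.split(",")]
    # Assumption: the model's answer is intersected with the real access list.
    return [n for n in names if n in TABLES and user in TABLES[n]["users"]]


def route(llm: LLM, question: str, permitted: List[str]) -> str:
    """Prompt 2: pick the table that should answer the question."""
    reply = llm(f"[ROUTE] tables={permitted} question={question!r}").strip()
    if reply not in permitted:
        raise PermissionError(f"table {reply!r} is not permitted for this user")
    return reply


def fetch(llm: LLM, question: str, table: str) -> List[dict]:
    """Prompt 3: pick the relevant column and return only that column."""
    rows = TABLES[table]["rows"]
    columns = sorted(rows[0])
    reply = llm(f"[FETCH] columns={columns} question={question!r}").strip()
    if reply not in columns:
        raise ValueError(f"unknown column {reply!r}")
    return [{reply: row[reply]} for row in rows]


def respond(llm: LLM, question: str, rows: List[dict], style: str) -> str:
    """Prompt 4: generate the answer from the fetched rows in a chosen style."""
    return llm(f"[ANSWER] style={style} data={rows} question={question!r}")


def answer(llm: LLM, user: str, question: str, style: str = "concise") -> str:
    """Chain the four prompts: authenticate, route, fetch, respond."""
    permitted = authenticate(llm, user)
    table = route(llm, question, permitted)
    rows = fetch(llm, question, table)
    return respond(llm, question, rows, style)


def stub_llm(prompt: str) -> str:
    """Canned replies keyed on the stage tag; a real model would go here."""
    if prompt.startswith("[AUTH]"):
        return "emissions, revenue"
    if prompt.startswith("[ROUTE]"):
        return "emissions"
    if prompt.startswith("[FETCH]"):
        return "co2_t"
    return "Total CO2: 19.5 t"
```

Calling `answer(stub_llm, "ana", "Total CO2?")` walks through all four stages, while the same call for user `"bo"` raises `PermissionError` at the routing stage because the stub routes to a table outside that user's access list. Keeping each stage as a separate function mirrors the one-prompt-per-task split in the abstract and leaves room to attach a scoring step after `respond`; the five hallucination metrics are not specified in the abstract, so they are not sketched here.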