Augmenting Large Language Models (LLMs) for Question Answering (QA) with domain-specific data has attracted wide attention. However, domain data often exists in a hybrid format, including text and semi-structured tables, which poses challenges for the seamless integration of information. Table-to-text generation offers a promising solution by transforming hybrid data into a uniformly text-formatted corpus. Although this technique has been widely studied by the NLP community, there is currently no comparative analysis of how corpora generated by different table-to-text methods affect the performance of QA systems. In this paper, we address this research gap in two steps. First, we integrate table-to-text generation into the framework of enhancing LLM-based QA systems with domain hybrid data. Then, we apply this framework to real-world industrial data and conduct extensive experiments on two types of QA systems (DSFT and RAG frameworks) with four representative methods: Markdown format, Template serialization, TPLM-based method, and LLM-based method. Based on the experimental results, we present several empirical findings and explore the underlying reasons behind the success of some methods. We hope the findings of this work will provide a valuable reference for the academic and industrial communities in developing robust QA systems.
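To make the contrast between the serialization styles concrete, the following minimal sketch shows how a single table might be rendered under two of the four approaches (Markdown format and Template serialization). The table content, column names, and template wording are hypothetical illustrations, not data or templates from the paper.

```python
# Hypothetical example table (not from the paper's industrial dataset).
headers = ["Product", "Release Year", "Warranty"]
rows = [
    ["Model A", "2021", "2 years"],
    ["Model B", "2023", "3 years"],
]


def to_markdown(headers, rows):
    """Markdown format: keep the table structure as a Markdown table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines)


def to_template(headers, rows):
    """Template serialization: flatten each row into a hand-written sentence."""
    sentences = []
    for row in rows:
        record = dict(zip(headers, row))
        sentences.append(
            f"{record['Product']} was released in {record['Release Year']} "
            f"and ships with a {record['Warranty']} warranty."
        )
    return "\n".join(sentences)


if __name__ == "__main__":
    print(to_markdown(headers, rows))
    print()
    print(to_template(headers, rows))
```

TPLM-based and LLM-based methods would instead feed the (linearized) table to a table-pretrained language model or a general-purpose LLM to produce free-form descriptive text; those generations depend on the chosen model and prompt and are therefore not sketched here.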