Recognizing the promise of natural language interfaces to databases, prior studies have emphasized the development of text-to-SQL systems. While substantial progress has been made in this field, existing research has concentrated on generating SQL statements from text queries. The broader challenge, however, lies in inferring new information about the returned data. Our research makes two major contributions to address this gap. First, we introduce a novel Internet-of-Things (IoT) text-to-SQL dataset comprising 10,985 text-SQL pairs and 239,398 rows of network traffic activity. The dataset includes query types that are scarce in prior text-to-SQL datasets, notably temporal queries. It is sourced from a smart building's IoT ecosystem and covers sensor readings and network traffic data. Second, our dataset allows two-stage processing, in which the data returned by a generated SQL query (network traffic) can be classified as malicious or not. Our results show that jointly training a model to generate queries and to infer information about the returned data can improve overall text-to-SQL performance, nearly matching that of substantially larger models. We also show that current large language models (e.g., GPT-3.5) struggle to infer new information about returned data; our dataset therefore provides a novel test bed for integrating complex domain-specific reasoning into LLMs.
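The two-stage processing described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table name network_traffic, its columns, the sample rows, the hand-written SQL standing in for a model-generated query, and the port-based rule standing in for a learned classifier. None of these come from the dataset itself.

```python
import sqlite3

# Hypothetical schema and rows; the real dataset's columns may differ.
ROWS = [
    ("2023-01-01 10:00:05", "10.0.0.5", 443, 1200),
    ("2023-01-01 10:00:40", "10.0.0.9", 23, 64),
    ("2023-01-01 11:30:00", "10.0.0.5", 443, 900),
]


def build_db():
    """Create an in-memory database holding the illustrative traffic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE network_traffic "
        "(ts TEXT, src_ip TEXT, dst_port INTEGER, n_bytes INTEGER)"
    )
    conn.executemany("INSERT INTO network_traffic VALUES (?, ?, ?, ?)", ROWS)
    return conn


# Stage 1: a hand-written stand-in for model-generated SQL answering the
# temporal question "Show traffic between 10:00 and 10:01 on 1 January 2023."
GENERATED_SQL = """
SELECT ts, src_ip, dst_port, n_bytes
FROM network_traffic
WHERE ts >= '2023-01-01 10:00:00' AND ts < '2023-01-01 10:01:00'
ORDER BY ts
"""


def classify(row):
    """Stage 2 placeholder (an assumption, not the paper's classifier):
    flag traffic to the Telnet port as malicious."""
    _, _, dst_port, _ = row
    return "malicious" if dst_port == 23 else "benign"


def two_stage(conn, sql):
    """Run the generated query, then label each returned row."""
    rows = conn.execute(sql).fetchall()
    return [(row, classify(row)) for row in rows]


results = two_stage(build_db(), GENERATED_SQL)
for row, label in results:
    print(row, label)
```

In a setting like the one the abstract describes, both stages would presumably be handled by trained models; the sketch only shows how the output of the query stage feeds the inference stage.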