Natural Language Querying for Time Series Databases (NLQ4TSDB) aims to help non-expert users retrieve meaningful events, intervals, and summaries from massive temporal records. However, existing Text-to-SQL methods are not designed for continuous morphological intents such as shapes or anomalies, while time series models struggle to handle ultra-long histories. To address these challenges, we propose Sonar-TS, a neuro-symbolic framework that tackles NLQ4TSDB via a Search-Then-Verify pipeline. Analogous to active sonar, it first uses a feature index to ping candidate windows via SQL, then runs generated Python programs to lock on to those candidates and verify them against the raw signals. To enable effective evaluation, we introduce NLQTSBench, the first large-scale benchmark designed for NLQ over TSDB-scale histories. Our experiments highlight the unique challenges of this domain and demonstrate that Sonar-TS effectively handles complex temporal queries where traditional methods fail. This work presents the first systematic study of NLQ4TSDB, offering a general framework and an evaluation standard to facilitate future research.
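To make the Search-Then-Verify idea concrete, the following is a minimal sketch and not the Sonar-TS implementation. The table schema, the choice of window features, the function names (`build_feature_index`, `search`, `verify_spike`), and the thresholds are all assumptions made for illustration. A coarse SQL filter over per-window features proposes candidates, and a Python check against the raw samples confirms or rejects each one.

```python
import sqlite3


def build_feature_index(signal, window=10):
    """Summarize fixed-size windows into a small SQL-queryable table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE features (start INTEGER, stop INTEGER, "
        "min_val REAL, max_val REAL, mean_val REAL)"
    )
    for start in range(0, len(signal), window):
        chunk = signal[start:start + window]
        conn.execute(
            "INSERT INTO features VALUES (?, ?, ?, ?, ?)",
            (start, start + len(chunk), min(chunk), max(chunk),
             sum(chunk) / len(chunk)),
        )
    return conn


def search(conn, min_range):
    """Search step: cheap SQL filter that 'pings' windows with a large value range."""
    rows = conn.execute(
        "SELECT start, stop FROM features "
        "WHERE max_val - min_val >= ? ORDER BY start",
        (min_range,),
    )
    return rows.fetchall()


def verify_spike(signal, start, stop, min_height):
    """Verify step: inspect raw samples for an isolated spike (rise, then fall)."""
    chunk = signal[start:stop]
    peak = max(range(len(chunk)), key=chunk.__getitem__)
    if peak == 0 or peak == len(chunk) - 1:
        return False  # peak on the window edge: both sides cannot be checked
    baseline = (chunk[0] + chunk[-1]) / 2
    tall_enough = chunk[peak] - baseline >= min_height
    strict_peak = chunk[peak - 1] < chunk[peak] > chunk[peak + 1]
    return tall_enough and strict_peak


# Toy signal: flat, then a spike, then a level shift, then flat at the new level.
signal = (
    [0.0] * 10
    + [0.0, 0.0, 0.0, 0.0, 1.0, 6.0, 1.0, 0.0, 0.0, 0.0]
    + [0.0] * 5 + [5.0] * 5
    + [5.0] * 10
)

conn = build_feature_index(signal, window=10)
candidates = search(conn, min_range=4.0)
hits = [(s, e) for s, e in candidates if verify_spike(signal, s, e, min_height=4.0)]
print("candidates:", candidates)
print("verified:", hits)
```

In this toy setup the SQL filter returns both the spike window and the level-shift window, since both have a large value range, and only the verification pass on raw samples separates them. That division of labor is the point of the sketch: the index keeps the search cheap over long histories, while shape-level judgments are deferred to code that sees the actual signal.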