Analytical workloads exhibit substantial semantic repetition, yet most production caches key entries by SQL surface form (text or AST), which fragments reuse across BI tools, notebooks, and NL interfaces. We introduce a safety-first middleware cache for dashboard-style OLAP over star schemas that canonicalizes both SQL and NL into a unified key space, the OLAP Intent Signature, which captures measures, grouping levels, filters, and time windows. Reuse requires an exact intent match under strict schema validation and confidence-gated NL acceptance; two correctness-preserving derivations (roll-up and filter-down) extend coverage without approximate matching. Across TPC-DS, SSB, and NYC TLC (1,395 queries), we achieve an 82% hit rate versus 28% for text keys and 56% for AST keys, with zero false hits; derivations double the hit rate on hierarchical queries.
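To make the idea of an intent-keyed cache concrete, the following is a minimal sketch and not the system's actual implementation. It assumes a signature made of the four components named above (measures, grouping levels, filters, time window) and treats roll-up as dropping grouping columns over additive measures, which is a simplification of hierarchy-aware roll-up. All names (`IntentSignature`, `make_signature`, `can_roll_up`, `roll_up`) are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Aggregates that can be re-aggregated by summing partial results (assumption).
ADDITIVE = {"SUM", "COUNT"}


@dataclass(frozen=True)
class IntentSignature:
    """Hypothetical canonical cache key: order- and case-insensitive."""
    measures: FrozenSet[Tuple[str, str]]      # (aggregate, column)
    group_by: FrozenSet[str]                  # grouping levels
    filters: FrozenSet[Tuple[str, str, str]]  # (column, operator, literal)
    time_window: Optional[Tuple[str, str]]    # (start, end), e.g. ISO dates


def make_signature(measures, group_by, filters, time_window=None):
    """Normalize surface differences (case, whitespace, ordering)."""
    def norm(s):
        return s.strip().lower()
    return IntentSignature(
        measures=frozenset((agg.strip().upper(), norm(col)) for agg, col in measures),
        group_by=frozenset(norm(g) for g in group_by),
        filters=frozenset((norm(c), op.strip(), str(v).strip()) for c, op, v in filters),
        time_window=tuple(time_window) if time_window else None,
    )


def can_roll_up(cached, requested):
    """True if `requested` is a strictly coarser grouping of `cached`
    with identical measures, filters, and time window, all additive."""
    return (
        requested.group_by < cached.group_by
        and requested.measures == cached.measures
        and requested.filters == cached.filters
        and requested.time_window == cached.time_window
        and all(agg in ADDITIVE for agg, _ in cached.measures)
    )


def roll_up(rows, keep, measure_keys):
    """Re-aggregate cached rows (dicts) to the grouping columns in `keep`
    by summing each measure."""
    out = {}
    for row in rows:
        key = tuple(row[k] for k in keep)
        acc = out.setdefault(key, {m: 0 for m in measure_keys})
        for m in measure_keys:
            acc[m] += row[m]
    return out


# Two surface forms of the same intent map to the same key.
fine = make_signature([("sum", "Sales.Amount")], ["region", "month"], [("year", "=", 2023)])
same = make_signature([("SUM", " sales.amount ")], ["Month", "Region"], [("YEAR", "=", "2023")])
coarse = make_signature([("SUM", "sales.amount")], ["region"], [("year", "=", "2023")])

cached_rows = [
    {"region": "east", "month": "jan", "amt": 10},
    {"region": "east", "month": "feb", "amt": 5},
    {"region": "west", "month": "jan", "amt": 7},
]
derived = roll_up(cached_rows, ["region"], ["amt"]) if can_roll_up(fine, coarse) else None
```

Because the signature is a frozen dataclass of frozensets, it is hashable and can serve directly as a dictionary key; an exact hit is then a plain lookup, and the derivation check runs only on a miss. Non-additive aggregates such as `AVG` are deliberately excluded from this sketch because summing partial results would not preserve correctness.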