Production log analytics in self-hosted, resource-constrained environments requires natural-language access to massive log streams without the cost of routing every query through a large language model. We present LogRouter, an end-to-end log question-answering system deployed on TUBITAK BILGEM's national big data platform that combines a PySpark-based Drain3 ingestion pipeline, GPU-accelerated embeddings, and dual-index storage in Apache Druid and PostgreSQL with pgvector. A two-level cost-aware router governs execution: Level 1 (L1) dispatches each query along one of four paths (direct response, Druid keyword search, template lookup with SQL generation, or pgvector semantic retrieval), and Level 2 (L2) selects either a 14B-class or a 32B-class generator for the semantic path. A dedicated coder LLM handles text-to-SQL generation. We evaluate the system on four LogHub datasets (Linux, Apache, Windows, and Mac; 70 questions in total) under two configurations: an online configuration that runs the full pipeline and an offline configuration that isolates the generator. The router reaches 88.4% mean accuracy across datasets and 94.7% on Linux, while the full pipeline attains a mean ROUGE-1 of 0.373, BERTScore of 0.879, RAGAS Faithfulness of 0.779, and an end-to-end latency of 18.6 s. In a like-for-like offline comparison, the routed system reduces mean latency by 55% versus a Fixed-32B baseline (46.3 s vs. 102.1 s) while keeping Answer Correctness within 5.8 points of that baseline and exceeding a Fixed-14B baseline on RAGAS Faithfulness on every dataset. Cost-aware dispatching is therefore a practical mechanism for production log QA: routing recovers most of the quality of an always-32B configuration at less than half the latency, and the L1 keyword vocabulary makes that routing decision with high precision without a learned classifier.
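To make the two-level dispatch concrete, the following is a minimal sketch of how a keyword-based Level-1 router and a Level-2 generator selector might be wired together. It is an illustration under stated assumptions, not the system's implementation: the keyword lists, the "hard question" cues, the 20-word threshold, and all function names are hypothetical, since the abstract gives neither the L1 vocabulary nor the L2 selection criteria.

```python
import re
from enum import Enum
from typing import Optional, Tuple


class Path(Enum):
    """The four execution paths named in the abstract."""
    DIRECT = "direct_response"
    KEYWORD = "druid_keyword_search"
    TEMPLATE_SQL = "template_lookup_sql"
    SEMANTIC = "pgvector_semantic_retrieval"


# Hypothetical vocabulary: placeholder phrases, not the paper's actual L1 list.
L1_VOCAB = {
    Path.DIRECT: ("hello", "thanks", "who are you"),
    Path.TEMPLATE_SQL: ("how many", "count", "average", "top", "per hour"),
    Path.KEYWORD: ("find", "show", "containing", "grep"),
}

# Hypothetical cues suggesting a question needs the larger generator.
L2_HARD_CUES = ("why", "root cause", "explain", "compare", "correlate")


def _contains(query: str, phrase: str) -> bool:
    """Whole-word match, so 'top' does not fire on 'stopped'."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", query) is not None


def route_level1(query: str) -> Path:
    """Pick the first path whose vocabulary matches; default to semantic."""
    q = query.lower()
    for path in (Path.DIRECT, Path.TEMPLATE_SQL, Path.KEYWORD):
        if any(_contains(q, kw) for kw in L1_VOCAB[path]):
            return path
    return Path.SEMANTIC


def route_level2(query: str, max_easy_words: int = 20) -> str:
    """Choose a generator size for the semantic path (assumed heuristic)."""
    q = query.lower()
    is_long = len(q.split()) > max_easy_words
    has_cue = any(_contains(q, cue) for cue in L2_HARD_CUES)
    return "32B" if is_long or has_cue else "14B"


def route(query: str) -> Tuple[Path, Optional[str]]:
    """Level 2 is consulted only when Level 1 selects the semantic path."""
    path = route_level1(query)
    generator = route_level2(query) if path is Path.SEMANTIC else None
    return path, generator
```

Under these assumed rules, `route("How many failed logins occurred per hour?")` takes the template/SQL path with no generator chosen, while `route("Why did the sshd service restart repeatedly?")` falls through to semantic retrieval with the larger generator. The point of the design is that the cheap paths are decided by string matching alone, so no model call is spent on the routing decision itself.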