Big data platforms are widely used in modern enterprises, and production intelligent assistants are increasingly important for helping users quickly find actionable guidance and for reducing operational burden. Although recent LLM+RAG assistants provide a natural interface, they face three practical challenges in real deployments: (1) limited scenario coverage across general consultation and domain-specific troubleshooting workflows; (2) inefficient knowledge access caused by inadequate multi-hop retrieval and flat knowledge organization; and (3) high maintenance cost, because escalated tickets are unstructured and hard to convert into assistant improvements and reusable standard operating procedures (SOPs). In this paper, we present SiriusHelper, a deployed intelligent assistant for big data platforms. SiriusHelper serves as a unified online assistant that automatically identifies user intent and routes each query to the appropriate handling path, including dedicated expert workflows for specialized scenarios (e.g., SQL execution diagnosis). To support complex troubleshooting, SiriusHelper combines a DeepSearch-driven mechanism with a priority-based hierarchical knowledge base, enabling multi-hop retrieval without context overload and thereby improving answer reliability and latency. To reduce expert overhead, SiriusHelper further introduces automated ticket understanding and SOP distillation: it diagnoses why the assistant failed (e.g., missing knowledge or wrong routing) and extracts domain-specific SOPs to continuously enrich the knowledge base. Experiments and online deployment on the Tencent Big Data platform show that SiriusHelper outperforms representative alternatives and reduces online ticket volume by 20.8%.
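To make the idea of multi-hop retrieval over a priority-based knowledge base concrete, the following is a minimal illustrative sketch, not SiriusHelper's actual implementation. All names here (`Entry`, `priority`, `follow_up`, `multi_hop_retrieve`, and the sample `KB`) are hypothetical, and exact topic matching stands in for whatever retrieval the DeepSearch-driven mechanism uses. The sketch assumes each entry carries a priority tier and optional links to related topics, and it caps the number of entries kept per topic so that the context does not grow without bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical knowledge-base entry (not the paper's schema)."""
    key: str                # topic the entry answers
    text: str               # content handed to the assistant
    priority: int           # lower value = more trusted tier (assumption)
    follow_up: tuple = ()   # topics to visit on the next hop


def multi_hop_retrieve(kb, query_topic, max_hops=3, budget=2):
    """Follow follow_up links hop by hop.

    For each visited topic, keep at most `budget` entries, choosing the
    most trusted (lowest priority value) first, to limit context size.
    """
    context, frontier, seen = [], [query_topic], set()
    for _ in range(max_hops):
        next_frontier = []
        for topic in frontier:
            if topic in seen:
                continue  # never revisit a topic, which also avoids cycles
            seen.add(topic)
            hits = sorted(
                (e for e in kb if e.key == topic),
                key=lambda e: e.priority,
            )[:budget]
            context.extend(hits)
            for e in hits:
                next_frontier.extend(e.follow_up)
        if not next_frontier:
            break
        frontier = next_frontier
    return context


# Made-up sample data for illustration only.
KB = [
    Entry("sql_timeout", "SOP: check queue load", 0, ("queue_full",)),
    Entry("sql_timeout", "forum note", 2),
    Entry("sql_timeout", "platform doc", 1),
    Entry("queue_full", "SOP: request more resources", 0),
]

result = [e.text for e in multi_hop_retrieve(KB, "sql_timeout")]
```

With the sample data, the first hop keeps the two most trusted entries for `sql_timeout` and drops the lowest-priority forum note; the second hop follows the link to `queue_full`. The per-topic budget is one simple way to trade recall for a bounded context; a deployed system would likely use a more refined policy.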