Software engineers frequently struggle to access documentation and telemetry data scattered across Troubleshooting Guides (TSGs), incident reports, code repositories, and various internal tools developed by multiple stakeholders. On-call duty is unavoidable, and incident resolution is made harder still by obscure legacy sources and strict time constraints. To improve the efficiency of on-call engineers (OCEs) and streamline their daily workflows, we introduce DECO, a comprehensive framework for developing, deploying, and managing enterprise-grade chatbots tailored to improving productivity in engineering routines. This paper details the design and implementation of the DECO framework, emphasizing its innovative NL2SearchQuery functionality and its hierarchical planner. These features support efficient, customized retrieval-augmented generation (RAG) algorithms that not only extract relevant information from diverse sources but also select the most pertinent toolkits in response to user queries. As a result, DECO can address complex technical questions and provide seamless, automated access to internal resources. DECO also incorporates a robust mechanism for converting unstructured incident logs into user-friendly, structured guides, which helps bridge the documentation gap. User feedback indicates that DECO simplifies complex engineering tasks, accelerates incident resolution, and improves organizational productivity. Since its launch in September 2023, DECO has seen extensive engagement, with tens of thousands of interactions from hundreds of active users across multiple organizations within the company.