Software engineers frequently struggle to access scattered documentation and telemetry data, including troubleshooting guides (TSGs), incident reports, code repositories, and various internal tools developed by multiple stakeholders. On-call duties are inevitable, and incident resolution becomes even more daunting when legacy sources are obscure and time constraints are strict. To make on-call engineers (OCEs) more efficient and streamline their daily workflows, we introduced DECO, a comprehensive framework for developing, deploying, and managing enterprise-grade copilots tailored to improve productivity in engineering routines. This paper details the design and implementation of the DECO framework, emphasizing its novel NL2SearchQuery functionality and its lightweight agentic framework. These features support efficient, customized retrieval-augmented generation (RAG) algorithms that not only extract relevant information from diverse sources but also select the most pertinent skills in response to user queries. As a result, DECO can address complex technical questions and provide seamless, automated access to internal resources. Additionally, DECO incorporates a robust mechanism for converting unstructured incident logs into user-friendly, structured guides, effectively bridging the documentation gap. Since its launch in September 2023, DECO has demonstrated its effectiveness through widespread adoption, enabling tens of thousands of interactions and engaging hundreds of monthly active users (MAU) across dozens of organizations within the company.