We introduce PyRCA, an open-source Python machine learning library for Root Cause Analysis (RCA) in Artificial Intelligence for IT Operations (AIOps). It provides a holistic framework for uncovering complicated causal dependencies among metrics and automatically locating the root causes of incidents. It offers a unified interface for multiple commonly used RCA models, covering both graph construction and scoring tasks. The library aims to provide IT operations staff, data scientists, and researchers with a one-step solution for rapid model development, model evaluation, and deployment to online applications. In particular, it includes various causal discovery methods to support causal graph construction, and multiple types of root cause scoring methods inspired by approaches such as Bayesian analysis, graph analysis, and causal analysis. Its GUI dashboard offers practitioners an intuitive point-and-click interface through which they can easily inject expert knowledge. By visualizing causal graphs and the root causes of incidents, practitioners can quickly gain insights and improve their workflow efficiency. This technical report introduces PyRCA's architecture and major functionalities, and presents benchmark performance numbers in comparison to various baseline models. Additionally, we demonstrate PyRCA's capabilities through several example use cases.
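To make the two-stage structure described above (a causal graph over metrics, followed by root cause scoring) more concrete, the following is a minimal sketch using only the Python standard library. It does not use PyRCA's API and is not one of PyRCA's models. The metric names, the hand-written graph, the z-score threshold, and the helper functions `anomaly_scores` and `rank_root_causes` are all hypothetical, chosen for illustration. The scoring rule is a simple heuristic: a metric is a root cause candidate if it is anomalous and none of its causal parents are.

```python
from statistics import mean, stdev


def anomaly_scores(normal, incident):
    """Absolute z-score of each metric's incident mean against its normal-period baseline."""
    scores = {}
    for name, baseline in normal.items():
        sigma = stdev(baseline) or 1e-9  # guard against a constant baseline
        scores[name] = abs(mean(incident[name]) - mean(baseline)) / sigma
    return scores


def rank_root_causes(parents, scores, threshold=3.0):
    """Return anomalous metrics with no anomalous causal parent, highest score first."""
    anomalous = {m for m, s in scores.items() if s > threshold}
    candidates = [
        m for m in anomalous
        if not any(p in anomalous for p in parents.get(m, []))
    ]
    return sorted(candidates, key=lambda m: scores[m], reverse=True)


# Hypothetical causal graph, given as child -> list of parents.
# Here it is written by hand; in practice it could come from causal
# discovery or from expert knowledge.
parents = {
    "api_latency": ["db_latency", "cpu"],
    "error_rate": ["api_latency"],
}

# Hypothetical metric samples from a normal period and an incident window.
normal = {
    "db_latency": [10, 11, 9, 10, 10],
    "cpu": [40, 42, 41, 39, 38],
    "api_latency": [100, 102, 98, 101, 99],
    "error_rate": [1.0, 1.2, 0.8, 1.1, 0.9],
}
incident = {
    "db_latency": [30, 32, 31],
    "cpu": [40, 41, 39],
    "api_latency": [180, 190, 185],
    "error_rate": [5.0, 6.0, 5.5],
}

scores = anomaly_scores(normal, incident)
ranking = rank_root_causes(parents, scores)
print(ranking)
```

In this toy data, `db_latency`, `api_latency`, and `error_rate` all deviate strongly from their baselines, but only `db_latency` has no anomalous parent, so it is the sole candidate returned. Keeping graph construction and scoring as separate steps means either one can be swapped independently, which mirrors the division of tasks the abstract describes.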