Deep learning (DL) libraries such as Transformers and Megatron are now widely adopted in modern AI programs. However, when these libraries introduce defects, ranging from silent computation errors to subtle performance regressions, downstream users often struggle to assess whether their own programs are affected. Such impact analysis requires not only understanding the defect semantics but also checking whether the client code satisfies complex triggering conditions involving configuration flags, runtime environments, and indirect API usage. We present DepRadar, an agent coordination framework for fine-grained defect and impact analysis in DL library updates. DepRadar coordinates four specialized agents across three steps: (1) the PR Miner and Code Diff Analyzer extract structured defect semantics from commits or pull requests (PRs); (2) the Orchestrator Agent synthesizes these signals into a unified defect pattern with trigger conditions; and (3) the Impact Analyzer checks downstream programs to determine whether the defect can be triggered. To improve accuracy and explainability, DepRadar integrates static analysis with DL-specific domain rules for defect reasoning and client-side tracing. We evaluate DepRadar on 157 PRs and 70 commits across two representative DL libraries. It achieves 90% precision in defect identification and generates high-quality structured fields (average field score of 1.6). On 122 client programs, DepRadar identifies affected cases with 90% recall and 80% precision, substantially outperforming the baselines.
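The three-step flow described in the abstract can be illustrated with a minimal sketch. This is not DepRadar's implementation: each agent is replaced here by a plain Python function, the defect pattern is reduced to an API name plus keyword-argument trigger conditions, and the impact check is a simple static pass using Python's standard `ast` module. The library `somelib`, the function `build_model`, the flag `use_fast_path`, and all helper names are hypothetical and exist only for illustration.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class DefectPattern:
    """Hypothetical unified defect pattern (output of step 2)."""
    library: str
    affected_api: str                          # name of the defective function
    trigger_kwargs: dict = field(default_factory=dict)  # flag values that trigger it
    summary: str = ""


def merge_signals(pr_signal: dict, diff_signal: dict) -> DefectPattern:
    """Stand-in for the orchestration step: merge PR-level and diff-level signals."""
    return DefectPattern(
        library=pr_signal["library"],
        affected_api=diff_signal["affected_api"],
        trigger_kwargs=dict(diff_signal.get("trigger_kwargs", {})),
        summary=pr_signal.get("summary", ""),
    )


def is_affected(client_source: str, pattern: DefectPattern) -> bool:
    """Stand-in for the impact analysis step.

    Returns True if the client calls the affected API with literal keyword
    arguments that match every trigger condition. Indirect usage (wrappers,
    config files, non-literal arguments) is deliberately out of scope here.
    """
    tree = ast.parse(client_source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Attribute):
            name = func.attr          # e.g. somelib.build_model(...)
        elif isinstance(func, ast.Name):
            name = func.id            # e.g. build_model(...)
        else:
            continue
        if name != pattern.affected_api:
            continue
        # Collect keyword arguments whose values are literals.
        literal_kwargs = {
            kw.arg: kw.value.value
            for kw in node.keywords
            if kw.arg is not None and isinstance(kw.value, ast.Constant)
        }
        if all(
            key in literal_kwargs and literal_kwargs[key] == value
            for key, value in pattern.trigger_kwargs.items()
        ):
            return True
    return False


# Step 1 (stand-in): signals that a PR miner and a diff analyzer might emit.
pr_signal = {"library": "somelib", "summary": "wrong output when use_fast_path=True"}
diff_signal = {"affected_api": "build_model", "trigger_kwargs": {"use_fast_path": True}}

# Step 2: build the unified pattern.
pattern = merge_signals(pr_signal, diff_signal)

# Step 3: check two hypothetical client programs.
AFFECTED_CLIENT = "import somelib\nmodel = somelib.build_model(use_fast_path=True)\n"
SAFE_CLIENT = "import somelib\nmodel = somelib.build_model(use_fast_path=False)\n"

print(is_affected(AFFECTED_CLIENT, pattern))
print(is_affected(SAFE_CLIENT, pattern))
```

The sketch only matches direct calls with literal keyword arguments. The triggering conditions named in the abstract (runtime environments, indirect API usage) would need considerably richer analysis than this.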