Unit testing of Deep Learning (DL) libraries is challenging due to their complex numerical semantics and implicit tensor constraints. Traditional Search-Based Software Testing (SBST) often suffers from semantic blindness, failing to satisfy the constraints of high-dimensional tensors, whereas Large Language Models (LLMs) struggle with cross-file context and unstable code modifications. This paper proposes ATTest, an agent-driven tensor testing framework for module-level unit test generation. ATTest orchestrates a seven-stage pipeline, encompassing constraint extraction and an iterative "generation-validation-repair" loop, to maintain testing stability and mitigate context-window saturation. An evaluation on PyTorch and TensorFlow demonstrates that ATTest significantly outperforms state-of-the-art baselines such as PynguinML, achieving average branch coverage of 55.60% and 54.77%, respectively. The results illustrate how agent-driven workflows bridge the semantic gap in numerical libraries while ensuring auditable test synthesis. Source code: https://github.com/iSEngLab/ATTest.git