Artificial Intelligence (AI) and Large Language Models (LLMs) are increasingly used in autonomous software testing; however, AI-generated test artifacts often suffer from hallucinations, compliance violations, security risks, and limited explainability. To improve the reliability, transparency, and trustworthiness of these artifacts, this research introduces the Governance-Aware Autonomous Testing Framework (GATF). The framework extends the autonomous testing lifecycle with governance validation, explainability analysis, probabilistic risk assessment, compliance monitoring, and audit governance. Experiments were performed on the Defects4J and PROMISE software engineering datasets. The proposed framework reduced governance-related risks by 89.6% and achieved 94.3% governance accuracy, 96.5% artifact reliability, 94.2% compliance accuracy, and 90.8% explainability performance. The results show that governance-aware autonomous testing systems can significantly improve reliability, transparency, and operational security compared with conventional AI-based testing systems. The proposed architecture is scalable and reliable, and it provides a safe environment for software testing.