Reflections on Surrogate-Assisted Search-Based Testing: A Taxonomy and Two Replication Studies based on Industrial ADAS and Simulink Models

Surrogate-assisted search-based testing (SA-SBT) aims to reduce the computational time for testing compute-intensive systems. Surrogates enhance testing techniques by improving test case generation, focusing the testing budget on the most critical portions of the input domain. In addition, they can serve as approximations of the system under test (SUT) to predict test results instead of executing the tests on compute-intensive SUTs. This article reflects on existing SA-SBT techniques, particularly those applied to system-level testing, which is often facilitated by simulators or complex test beds. Our objective is to synthesize the different heuristic algorithms and evaluation methods employed in existing SA-SBT techniques and to present a comprehensive view of SA-SBT solutions. In addition, by critically reviewing our previous work on SA-SBT, we aim to identify the limitations in our proposed algorithms and evaluation methods and to propose potential improvements. We present a taxonomy that categorizes and contrasts existing SA-SBT solutions and highlights key research gaps. To identify the evaluation challenges, we conduct two replication studies of our past SA-SBT solutions: one study uses an industrial advanced driver assistance system (ADAS), and the other relies on a Simulink model benchmark. We compare our results with those of the original studies and identify the difficulties in evaluating SA-SBT techniques, including the impact of different contextual factors on the generalizability of results and the validity of our evaluation metrics. Based on our taxonomy and replication studies, we propose future research directions, including reconsidering the current evaluation metrics used for SA-SBT solutions, utilizing surrogates for fault localization and repair in addition to testing, and creating frameworks for large-scale experiments by applying SA-SBT to multiple SUTs and simulators.

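To make the role of a surrogate concrete, the following is a minimal sketch of one way a surrogate can focus a limited testing budget, as described in the abstract. It is not an algorithm from the article: the function `expensive_sut`, the nearest-neighbour surrogate, the one-dimensional input domain, and all parameter values are assumptions chosen only for illustration.

```python
import random


def expensive_sut(x):
    """Hypothetical stand-in for a costly simulation.

    Returns a fitness value; lower means the test is closer to a failure.
    """
    return (x - 0.7) ** 2


def surrogate_predict(archive, x, k=3):
    """Predict fitness at x as the mean fitness of the k nearest executed tests."""
    nearest = sorted(archive, key=lambda entry: abs(entry[0] - x))[:k]
    return sum(fitness for _, fitness in nearest) / len(nearest)


def surrogate_assisted_search(budget=20, candidates_per_iter=50, seed=0):
    """Spend at most `budget` real executions, screening candidates cheaply first."""
    rng = random.Random(seed)
    # Seed the archive with a few real executions so the surrogate has data.
    archive = [(x, expensive_sut(x)) for x in (rng.random() for _ in range(5))]
    executions = len(archive)
    while executions < budget:
        candidates = [rng.random() for _ in range(candidates_per_iter)]
        # Rank candidates with the surrogate; execute only the most promising one.
        best = min(candidates, key=lambda c: surrogate_predict(archive, c))
        archive.append((best, expensive_sut(best)))
        executions += 1
    # Report the most critical test found among real executions only.
    return min(archive, key=lambda entry: entry[1]), executions


best, executions = surrogate_assisted_search()
```

In this sketch the surrogate is only used to decide which candidate to execute; the reported result always comes from a real execution, so surrogate prediction errors cost budget but do not produce spurious failures.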