Finite automata (FA) are a fundamental computational abstraction, widely used in practice for tasks in computer science, linguistics, biology, electrical engineering, and artificial intelligence. Given an input word, an FA maps the word to a result: in the simple case "accept" or "reject", but in general one of a finite set of results. A question that then arises is: why did the automaton produce this result? Another question is: how can we modify the input word so that it is no longer accepted? One may think that the automaton itself is an adequate explanation of its behaviour, but automata can be very complex and difficult to make sense of directly. In this paper, we investigate how to explain the behaviour of an FA on an input word in terms of the word's characters. In particular, we are interested in minimal explanations: what is the minimal set of input characters that explains the result, and what are the minimal changes needed to alter the result? We propose an efficient method to determine all minimal explanations for the behaviour of an FA on a particular word. Because every minimal explanation is found, rather than an arbitrary one, we can give unbiased explanations of which input features are responsible for the result. Experiments show that our approach scales well, even when the underlying problem is challenging.
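To make the two kinds of minimal explanation concrete, the following is a brute-force sketch on a toy automaton. It is not the efficient method proposed in the paper. It rests on several assumptions that are not taken from the text: the word length is held fixed, a set of positions "explains" the result if fixing the characters at those positions forces the same result whatever the other positions contain, a set of positions is a "minimal change" if rewriting only those positions can flip the result, and "minimal" means subset-minimal. The example DFA (it accepts words over {a, b} that contain "ab") and all function names are hypothetical.

```python
from itertools import combinations, product

# Hypothetical example DFA over {a, b}: accepts words containing "ab".
ALPHABET = "ab"
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 2, (2, "b"): 2,
}
START, ACCEPTING = 0, {2}


def accepts(word):
    """Run the DFA on `word` and report whether it ends in an accepting state."""
    state = START
    for ch in word:
        state = DELTA[(state, ch)]
    return state in ACCEPTING


def outcomes(word, free):
    """Set of results over all same-length words that agree with `word`
    everywhere except (possibly) at the positions in `free`."""
    free = sorted(free)
    results = set()
    for repl in product(ALPHABET, repeat=len(free)):
        chars = list(word)
        for i, ch in zip(free, repl):
            chars[i] = ch
        results.add(accepts(chars))
    return results


def minimal_sets(n, pred):
    """All subset-minimal sets of positions satisfying a monotone predicate.
    Enumerates by increasing size and skips supersets of sets already found."""
    found = []
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            s = frozenset(subset)
            if not any(f <= s for f in found) and pred(s):
                found.append(s)
    return found


def explanations(word):
    """Return (minimal sufficient position sets, minimal result-flipping position sets)."""
    positions = frozenset(range(len(word)))
    # Sufficient: with these positions fixed, only one result is possible.
    sufficient = minimal_sets(
        len(word), lambda s: len(outcomes(word, positions - s)) == 1
    )
    # Flipping: rewriting only these positions can reach the other result.
    flipping = minimal_sets(len(word), lambda s: len(outcomes(word, s)) == 2)
    return sufficient, flipping
```

For the word "abab", `explanations("abab")` yields the sufficient sets {0, 1}, {0, 3} and {2, 3}, and the flipping sets {0, 2}, {0, 3} and {1, 3} (positions are 0-indexed). The enumeration is exponential in the word length, so it only illustrates the definitions on small inputs.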