We study the learnability of symbolic finite state automata (SFAs), a model that has proven useful in many software verification applications. The state-of-the-art literature on this topic follows the query learning paradigm, and all results obtained so far are positive. We provide a necessary condition for efficient learnability of SFAs in this paradigm, from which we obtain the first negative result. The main focus of our work is the learnability of SFAs under the paradigm of identification in the limit using polynomial time and data, and its strengthening, efficient identifiability; both are concerned with the existence of a systematic set of characteristic samples from which a learner can correctly infer the target language. We provide a necessary condition for identification of SFAs in the limit using polynomial time and data, and a sufficient condition for efficient learnability of SFAs. From these conditions we derive a positive and a negative result. The performance of a learning algorithm is typically bounded as a function of the size of the representation of the target language. Since SFAs in general do not have a canonical form, and there are trade-offs between the complexity of the predicates on the transitions and the number of transitions, we start by defining size measures for SFAs. We then revisit the complexity of procedures on SFAs and analyze them according to these measures, paying attention to special forms of SFAs (normalized SFAs and neat SFAs) as well as to SFAs over a monotonic effective Boolean algebra. This is an extended version of the paper with the same title published in CSL'22.