We study the learnability of symbolic finite state automata (SFAs), a model that has proven useful in many applications in software verification. The state-of-the-art literature on this topic follows the query learning paradigm, and so far all obtained results are positive. We provide a necessary condition for efficient learnability of SFAs in this paradigm, from which we obtain the first negative result. The main focus of our work is the learnability of SFAs under the paradigm of identification in the limit using polynomial time and data, and its strengthening, efficient identifiability; both are concerned with the existence of a systematic set of characteristic samples from which a learner can correctly infer the target language. We provide a necessary condition for identification of SFAs in the limit using polynomial time and data, and a sufficient condition for efficient learnability of SFAs. From these conditions we derive a positive and a negative result. The performance of a learning algorithm is typically bounded as a function of the size of the representation of the target language. Since SFAs, in general, do not have a canonical form, and there are trade-offs between the complexity of the predicates on the transitions and the number of transitions, we start by defining size measures for SFAs. We revisit the complexity of procedures on SFAs and analyze them according to these measures, paying attention to the special forms of SFAs, namely normalized SFAs and neat SFAs, as well as to SFAs over a monotonic effective Boolean algebra. This is an extended version of the paper with the same title published in CSL'22.