In this work, we introduce DeepDFA, a novel approach to identifying Deterministic Finite Automata (DFAs) from traces, harnessing a differentiable yet discrete model. Inspired by both the probabilistic relaxation of DFAs and Recurrent Neural Networks (RNNs), our model offers interpretability post-training, alongside reduced complexity and enhanced training efficiency compared to traditional RNNs. Moreover, by leveraging gradient-based optimization, our method surpasses combinatorial approaches in both scalability and noise resilience. Validation experiments conducted on target regular languages of varying size and complexity demonstrate that our approach is accurate, fast, and robust to noise in both the input symbols and the output labels of training data, integrating the strengths of both logical grammar induction and deep learning.
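To make the idea of a probabilistic relaxation of a DFA concrete, the following is a minimal sketch and not the DeepDFA architecture itself. It assumes that each transition and each accept/reject decision is parameterized by logits passed through a temperature-scaled softmax, so that the model is smooth at high temperature and approaches a discrete DFA as the temperature goes to zero. All names here (`softmax`, `accept_probability`, `PARITY_TRANS`, `PARITY_ACCEPT`) are hypothetical and chosen for illustration only; the logits are hand-set rather than learned.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; low temperatures approach a one-hot argmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def accept_probability(trace, trans_logits, accept_logits, temperature=1.0):
    """Run a relaxed DFA over a trace and return its acceptance probability.

    trans_logits[a][q] : logits over next states for symbol a in state q
    accept_logits[q]   : [reject, accept] logits for state q
    State 0 is assumed to be the initial state.
    """
    n_states = len(accept_logits)
    belief = [1.0] + [0.0] * (n_states - 1)  # distribution over states
    for symbol in trace:
        new_belief = [0.0] * n_states
        for q, mass in enumerate(belief):
            row = softmax(trans_logits[symbol][q], temperature)
            for q_next, p in enumerate(row):
                new_belief[q_next] += mass * p
        belief = new_belief
    # Expected acceptance under the final state distribution.
    return sum(
        mass * softmax(accept_logits[q], temperature)[1]
        for q, mass in enumerate(belief)
    )


# Hypothetical hand-set parameters for the "even number of 1s" language.
# State 0 = even (accepting), state 1 = odd (rejecting).
PARITY_TRANS = [
    [[5.0, 0.0], [0.0, 5.0]],  # symbol 0: stay in the current state
    [[0.0, 5.0], [5.0, 0.0]],  # symbol 1: flip the state
]
PARITY_ACCEPT = [[0.0, 5.0], [5.0, 0.0]]

soft = accept_probability([1, 1], PARITY_TRANS, PARITY_ACCEPT, temperature=1.0)
hard_yes = accept_probability([1, 1, 0], PARITY_TRANS, PARITY_ACCEPT, temperature=0.01)
hard_no = accept_probability([1, 0], PARITY_TRANS, PARITY_ACCEPT, temperature=0.01)
```

At temperature 1.0 the acceptance probability is strictly between 0 and 1, which is what makes the model amenable to gradient-based training; at temperature 0.01 the same parameters behave, up to floating-point precision, like an ordinary DFA whose transition table can be read off by taking the argmax of each row.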