Neural finite-state transducers (NFSTs) form an expressive family of neurosymbolic sequence transduction models. An NFST models each string pair as having been generated by a latent path in a finite-state transducer. Because NFSTs are deep generative models, both training and inference require inference networks that approximate posterior distributions over these latent variables. In this paper, we focus on the resulting challenge of imputing the latent alignment path that explains a given pair of input and output strings (e.g., during training). We train three autoregressive approximate models for amortized inference of the path, which can then be used as proposal distributions for importance sampling. All three models perform lookahead. Our most sophisticated (and novel) model leverages the FST structure to consider the graph of future paths; unfortunately, we find that it loses out to the simpler approaches -- except on an artificial task that we concocted to confuse the simpler approaches.
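To make the inference setup concrete, here is a minimal sketch -- not the paper's implementation -- of importance sampling over latent FST paths with an autoregressive proposal q(a | x, y). The `Arc` and `Fst` classes and the `log_p_joint` and `proposal_step` callables are hypothetical stand-ins for whatever FST representation and learned scorers one actually uses.

```python
import math
import random
from dataclasses import dataclass

# Hedged sketch of the inference problem described above: an NFST assigns a
# joint score p(a, x, y) to each latent path a through an FST, and we estimate
# the marginal p(x, y) = sum_a p(a, x, y) by drawing paths from a learned
# autoregressive proposal q(a | x, y) and importance-weighting them by p / q.

@dataclass(frozen=True)
class Arc:
    label: str        # input/output symbols carried by this transition
    next_state: int

@dataclass
class Fst:
    start: int
    finals: set
    arcs_by_state: dict  # state -> list[Arc]

    def is_final(self, state):
        return state in self.finals

    def arcs(self, state):
        return self.arcs_by_state[state]

def sample_path(fst, x, y, proposal_step):
    """Draw one accepting path a ~ q(. | x, y); return (path, log q(a | x, y))."""
    state, path, log_q = fst.start, [], 0.0
    while not fst.is_final(state):
        arcs = fst.arcs(state)
        probs = proposal_step(state, path, x, y)   # q's distribution over next arcs
        i = random.choices(range(len(arcs)), weights=probs)[0]
        log_q += math.log(probs[i])
        path.append(arcs[i])
        state = arcs[i].next_state
    return path, log_q

def log_marginal_estimate(fst, x, y, log_p_joint, proposal_step, k=64):
    """Estimate log p(x, y) = log sum_a p(a, x, y) from k importance samples."""
    log_w = []
    for _ in range(k):
        path, log_q = sample_path(fst, x, y, proposal_step)
        log_w.append(log_p_joint(path, x, y) - log_q)   # log(p / q) weight
    m = max(log_w)                                      # log-sum-exp for stability
    return m + math.log(sum(math.exp(w - m) for w in log_w) / k)
```

The quality of this estimator depends entirely on how well q matches the true posterior over paths, which is why the abstract's three proposal models (and their lookahead) matter.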