We investigate a novel modeling approach for end-to-end neural network training using hidden Markov models (HMMs), in which the transition probabilities between hidden states are modeled and learned explicitly. Most contemporary sequence-to-sequence models allow for from-scratch training by summing over all possible label segmentations in a given topology. In our approach, explicit, learnable probabilities govern the transitions between segments, as opposed to a blank label that implicitly encodes duration statistics. We implement a GPU-based forward-backward algorithm that enables the simultaneous training of label and transition probabilities. We investigate recognition results as well as the Viterbi alignments of our models. We find that while training the transition model does not improve recognition performance, it has a positive impact on alignment quality. The generated alignments are shown to be viable targets in state-of-the-art Viterbi training.
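To make the idea of summing over all label segmentations with explicit transition probabilities concrete, here is a minimal sketch of the forward pass in log space. It is not the GPU implementation described above. It is a plain-Python illustration under assumptions that are not taken from the abstract: a strictly left-to-right topology with one state per label, a single shared pair of loop/forward transition probabilities, and a hypothetical input `log_label[t][s]` holding the log label probability of state `s` at frame `t`. The names `log_add` and `full_sum_log_score` are illustrative only.

```python
import math

NEG_INF = float("-inf")


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = (a, b) if a > b else (b, a)
    return hi + math.log1p(math.exp(lo - hi))


def full_sum_log_score(log_label, log_loop, log_fwd):
    """Log of the sum over all monotonic segmentations of T frames into S states.

    log_label[t][s]: log label probability of state s at frame t (assumed input).
    log_loop: log probability of staying in the current state (assumed shared).
    log_fwd:  log probability of moving on to the next state (assumed shared).
    """
    num_frames, num_states = len(log_label), len(log_label[0])
    alpha = [NEG_INF] * num_states
    alpha[0] = log_label[0][0]  # every path starts in the first state
    for t in range(1, num_frames):
        new_alpha = [NEG_INF] * num_states
        for s in range(num_states):
            stay = alpha[s] + log_loop
            move = alpha[s - 1] + log_fwd if s > 0 else NEG_INF
            new_alpha[s] = log_add(stay, move) + log_label[t][s]
        alpha = new_alpha
    return alpha[num_states - 1]  # every path ends in the last state


# Usage: 4 frames, 2 states, uniform label probabilities of 0.5.
label_scores = [[math.log(0.5)] * 2 for _ in range(4)]
score = full_sum_log_score(label_scores, math.log(0.6), math.log(0.4))
```

In this sketch the transition terms `log_loop` and `log_fwd` enter the recursion as separate quantities alongside the label scores, which is what would allow both to receive gradients in training. Replacing `log_add` with `max` would give the score of the single best path, as used for a Viterbi alignment.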