Conventional keyword search systems operate on automatic speech recognition (ASR) outputs, which makes their indexing and search pipelines complex. This has led to interest in ASR-free approaches that simplify the search procedure. We recently proposed a neural ASR-free keyword search model that achieves competitive performance while maintaining an efficient and simplified pipeline: queries and documents are encoded with a pair of recurrent neural network encoders, and the encodings are combined with a dot product. In this article, we extend that work with multilingual pretraining and a detailed analysis of the model. Our experiments show that the proposed multilingual training significantly improves model performance. Although the proposed model does not match a strong conventional ASR-based keyword search system on short queries and on queries comprising in-vocabulary words, it outperforms the ASR-based system on long queries and on queries that do not appear in the training data.
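The dual-encoder design described above can be illustrated with a minimal sketch. It is not the authors' implementation: the plain tanh recurrence, the one-hot character input for queries, the per-frame scoring, the sigmoid on the dot product, and every name and dimension (`SimpleRNN`, `encode_query`, `score_document`, `feat_dim`, `hidden_dim`) are assumptions made for illustration, and the weights are random rather than trained. The sketch only shows how a query encoding and document encodings might be combined with a dot product.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SimpleRNN:
    """Plain tanh RNN: h_t = tanh(W x_t + U h_{t-1}). Hypothetical stand-in encoder."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.w = make_matrix(hidden_dim, input_dim, rng)
        self.u = make_matrix(hidden_dim, hidden_dim, rng)

    def run(self, inputs):
        """Return the hidden state after every input step."""
        h = [0.0] * self.hidden_dim
        states = []
        for x in inputs:
            wx = matvec(self.w, x)
            uh = matvec(self.u, h)
            h = [math.tanh(a + b) for a, b in zip(wx, uh)]
            states.append(h)
        return states


def one_hot(index, size):
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_query(rnn, text, alphabet):
    """Assumption: the query is a character sequence, summarized by the last state."""
    inputs = [one_hot(alphabet.index(ch), len(alphabet)) for ch in text]
    return rnn.run(inputs)[-1]


def score_document(query_vec, frame_states):
    """Dot product of the query vector with each frame encoding, squashed to (0, 1)."""
    scores = []
    for h in frame_states:
        dot = sum(q * x for q, x in zip(query_vec, h))
        scores.append(1.0 / (1.0 + math.exp(-dot)))
    return scores


rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
feat_dim, hidden_dim = 4, 8  # toy sizes, chosen arbitrarily

query_rnn = SimpleRNN(len(alphabet), hidden_dim, rng)
doc_rnn = SimpleRNN(feat_dim, hidden_dim, rng)

# Random vectors stand in for the acoustic features of a 20-frame document.
frames = [[rng.uniform(-1.0, 1.0) for _ in range(feat_dim)] for _ in range(20)]

query_vec = encode_query(query_rnn, "keyword", alphabet)
frame_scores = score_document(query_vec, doc_rnn.run(frames))
```

Under these assumptions the document encodings do not depend on the query, so they could be computed once and reused for every query, leaving only the dot products to be evaluated at search time.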