We propose a utility component for dialog systems that receives the last two utterances of a user and detects whether the last utterance is an error correction of the second-to-last utterance. If it is, the component corrects the second-to-last utterance according to the correction in the last utterance and outputs the extracted pairs of reparandum and repair entity. This component offers two advantages: it learns the concept of corrections, so that corrections do not have to be collected for every new domain, and it extracts reparandum and repair pairs, which makes it possible to learn from them. For error correction, we present one sequence labeling approach and two sequence-to-sequence approaches. For error correction detection, these three error correction approaches can also be used; in addition, we present a sequence classification approach. One error correction detection approach and one error correction approach can be combined into a pipeline, or the error correction approaches can be trained and used end-to-end to avoid having two components. We modified the EPIC-KITCHENS-100 dataset to evaluate the approaches for correcting entity phrases in request dialogs. For error correction detection and correction, we obtained an accuracy of 96.40 % on synthetic validation data and an accuracy of 77.81 % on human-created real-world test data.
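To make the input and output of such a component concrete, the following minimal Python sketch illustrates the pipeline variant: a detection step decides whether the last utterance is a correction, and a correction step extracts reparandum and repair pairs and applies them to the second-to-last utterance. Everything in it is an illustrative assumption rather than the presented method: the names `CorrectionResult`, `toy_extract_pairs`, `toy_detect`, `apply_pairs`, and `correct` are hypothetical, and the regular expression is only a stand-in for the learned sequence labeling, sequence-to-sequence, or sequence classification models.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (reparandum, repair)


@dataclass
class CorrectionResult:
    is_correction: bool
    corrected_request: str
    pairs: List[Pair]


# Toy pattern (assumption): only covers "<repair> instead of <reparandum>"
# and "<repair> not <reparandum>", optionally prefixed by "no" / "I mean(t)".
_PATTERN = re.compile(
    r"(?:no[,\s]+)?(?:i meant? )?(?P<repair>.+?) (?:instead of|not) "
    r"(?P<reparandum>.+?)[.!]?",
    re.IGNORECASE,
)


def toy_extract_pairs(request: str, last: str) -> List[Pair]:
    """Hypothetical stand-in for a learned correction model."""
    match = _PATTERN.fullmatch(last.strip())
    if match is None:
        return []
    reparandum, repair = match.group("reparandum"), match.group("repair")
    # Only accept the pair if the reparandum really occurs in the request.
    return [(reparandum, repair)] if reparandum in request else []


def toy_detect(request: str, last: str) -> bool:
    """Hypothetical stand-in for a learned detection model."""
    return bool(toy_extract_pairs(request, last))


def apply_pairs(request: str, pairs: List[Pair]) -> str:
    """Replace the first occurrence of each reparandum with its repair."""
    for reparandum, repair in pairs:
        request = request.replace(reparandum, repair, 1)
    return request


def correct(
    request: str,
    last: str,
    detect: Callable[[str, str], bool] = toy_detect,
    extract: Callable[[str, str], List[Pair]] = toy_extract_pairs,
) -> CorrectionResult:
    """Pipeline: detect first, then extract pairs and apply them."""
    if not detect(request, last):
        return CorrectionResult(False, request, [])
    pairs = extract(request, last)
    return CorrectionResult(True, apply_pairs(request, pairs), pairs)


if __name__ == "__main__":
    print(correct("put the cleaned knives into the drawer",
                  "no, the cupboard instead of the drawer"))
```

In this sketch, `detect` and `extract` are passed in as callables so that either step could be swapped for a trained model. An end-to-end variant would drop `detect` and treat an empty list of pairs as "no correction".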