RECOVER: Robust Entity Correction via agentic Orchestration of hypothesis Variants for Evidence-based Recovery

Recognizing entities in Automatic Speech Recognition (ASR) is challenging for rare and domain-specific terms. In domains such as finance, medicine, and air traffic control, these errors are costly. When an entity is entirely absent from the ASR output, post-ASR correction becomes difficult. To address this, we introduce RECOVER, an agentic correction framework that operates as a tool-using agent. It uses multiple ASR hypotheses as evidence, retrieves relevant entities, and applies Large Language Model (LLM) correction under constraints. The hypotheses are combined using four strategies: 1-Best, Entity-Aware Select, Recognizer Output Voting Error Reduction (ROVER) Ensemble, and LLM-Select. Evaluated across five diverse datasets, RECOVER achieves 8-46% relative reductions in entity-phrase word error rate (E-WER) and increases recall by up to 22 percentage points. LLM-Select achieves the best overall entity-correction performance while maintaining overall WER.
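To make the reported metric concrete, the following is a minimal sketch of how a word error rate restricted to entity phrases, and a relative reduction between two such rates, might be computed. It assumes that E-WER is the word-level edit distance summed over aligned reference/hypothesis entity phrases and divided by the number of reference entity words; the abstract does not give the exact definition, so the function names `word_edit_distance`, `phrase_wer`, and `relative_reduction`, and the example phrases, are hypothetical.

```python
from typing import List, Sequence


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phrase_wer(ref_phrases: List[str], hyp_phrases: List[str]) -> float:
    """Assumed E-WER: word errors over paired entity phrases / reference words."""
    errors = sum(word_edit_distance(r.split(), h.split())
                 for r, h in zip(ref_phrases, hyp_phrases))
    total = sum(len(r.split()) for r in ref_phrases)
    return errors / total


def relative_reduction(baseline: float, corrected: float) -> float:
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - corrected) / baseline


# Hypothetical entity phrases: one substituted word out of three reference words.
example_rate = phrase_wer(["acme corp", "metoprolol"], ["acme core", "metoprolol"])
```

Under this definition, a drop in E-WER from 0.50 to 0.27 would be a 46% relative reduction, which is how a range such as 8-46% should be read: it is relative to each baseline, not an absolute change in percentage points.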

