Accurate recognition of specific categories, such as persons' names, dates, or other identifiers, is critical in many Automatic Speech Recognition (ASR) applications. Because these categories represent personal information, ethical use of such data, including its collection, transcription, training, and evaluation, demands special care. One way to protect the security and privacy of individuals is to redact Personally Identifiable Information (PII) or to exclude it from collection altogether. However, ASR models trained on such data tend to recognize these categories less accurately. We improve the recognition of PII categories by using text injection to include fake textual substitutes for PII in the training data. We demonstrate substantial improvements in recall of names and dates in medical notes while also improving overall word error rate (WER). For alphanumeric digit sequences, we show improvements in Character Error Rate and Sentence Accuracy.
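As an illustration of what "fake textual substitutes" for PII might look like in practice, the following is a minimal sketch, not the authors' pipeline. It assumes that redacted transcripts mark removed spans with placeholder tags such as `[NAME]` and `[DATE]`; the tag format, the substitute pools, and the function names (`fill_placeholders`, `make_text_only_examples`) are all hypothetical. The abstract does not say how the substitutes are generated.

```python
import random
import re

# Hypothetical substitute pools; the abstract does not specify how fake values are produced.
FAKE_NAMES = ["Alex Morgan", "Priya Natarajan", "Samuel Okoye"]
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def fake_date(rng: random.Random) -> str:
    """Return a made-up date string such as 'March 14, 1987'."""
    return f"{rng.choice(MONTHS)} {rng.randint(1, 28)}, {rng.randint(1950, 2023)}"


# One generator per PII category (assumed tag names).
GENERATORS = {
    "NAME": lambda rng: rng.choice(FAKE_NAMES),
    "DATE": fake_date,
}

# Matches the assumed redaction tags, e.g. "[NAME]" or "[DATE]".
PLACEHOLDER = re.compile(r"\[(NAME|DATE)\]")


def fill_placeholders(text: str, rng: random.Random) -> str:
    """Replace each redaction tag with a freshly sampled fake value."""
    return PLACEHOLDER.sub(lambda m: GENERATORS[m.group(1)](rng), text)


def make_text_only_examples(redacted, n_variants, seed=0):
    """Produce n_variants filled-in copies of every redacted sentence."""
    rng = random.Random(seed)
    return [
        fill_placeholders(line, rng)
        for line in redacted
        for _ in range(n_variants)
    ]


if __name__ == "__main__":
    for sentence in make_text_only_examples(
        ["Patient [NAME] was seen on [DATE]."], n_variants=3
    ):
        print(sentence)
```

The resulting sentences are text only. Under this sketch they would be passed to whatever text-injection training path the ASR model provides; that part is model-specific and is not shown here.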