Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems. However, little to no work has proposed methods to defend against these attacks. Our defense relies on the intuition that redundant information often exists in large corpora. To find it, we introduce a method that uses query augmentation to search for a diverse set of passages that could answer the original question but are less likely to have been poisoned. We integrate these new passages into the model through a novel confidence method that compares the predicted answer to its appearance in the retrieved contexts, which we call \textit{Confidence from Answer Redundancy} (CAR). Together, these methods provide a simple but effective defense against poisoning attacks, yielding gains of nearly 20\% exact match across varying levels of data poisoning/knowledge conflicts.
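The idea of comparing a predicted answer to its appearance in the retrieved contexts can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`answer_redundancy`, `is_confident`, `resolve`), the SQuAD-style normalization, the `min_support` threshold, and the majority-vote fallback are all assumptions made for illustration, since the abstract does not specify these details.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def answer_redundancy(answer: str, passages: list[str]) -> int:
    """Count the retrieved passages that contain the predicted answer."""
    target = normalize(answer)
    if not target:
        return 0
    # Pad with spaces so the match respects token boundaries.
    return sum(f" {target} " in f" {normalize(p)} " for p in passages)


def is_confident(answer: str, passages: list[str], min_support: int = 3) -> bool:
    """Hypothetical rule: confident if enough passages repeat the answer."""
    return answer_redundancy(answer, passages) >= min_support


def resolve(predictions: list[tuple[str, list[str]]], min_support: int = 3) -> str:
    """Choose a final answer from (answer, passages) pairs.

    The first pair is assumed to come from the original query and the rest
    from augmented queries. Confident answers are majority-voted; if none is
    confident, the original query's answer is returned.
    """
    confident = [
        normalize(answer)
        for answer, passages in predictions
        if is_confident(answer, passages, min_support)
    ]
    if confident:
        return Counter(confident).most_common(1)[0][0]
    return normalize(predictions[0][0])
```

In this sketch, an answer that appears in only one retrieved passage (as might happen when a single poisoned passage drives the prediction) receives low support, while an answer repeated across passages retrieved for an augmented query is preferred.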