Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems. However, little to no work has proposed methods to defend against these attacks. Our defense relies on the intuition that redundant information often exists in large corpora. To exploit this redundancy, we introduce a method that uses query augmentation to search for a diverse set of passages that could answer the original question but are less likely to have been poisoned. We integrate these new passages into the model through a novel confidence method that compares the predicted answer to its appearances in the retrieved contexts, which we call Confidence from Answer Redundancy (CAR). Together, these methods provide a simple but effective defense against poisoning attacks, yielding gains of nearly 20% exact match across varying levels of data poisoning/knowledge conflicts.
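The idea of comparing a predicted answer to its appearances in the retrieved contexts can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`normalize`, `answer_redundancy`, `pick_answer`), the SQuAD-style normalization, the substring matching, the 0.5 threshold, and the fallback rule are all assumptions chosen for illustration only.

```python
import re
import string
from typing import List, Tuple


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def answer_redundancy(answer: str, contexts: List[str]) -> float:
    """Fraction of retrieved contexts that contain the predicted answer.

    Hypothetical stand-in for a redundancy-based confidence score.
    Uses plain substring matching, so it can over-count short answers.
    """
    norm_answer = normalize(answer)
    if not norm_answer or not contexts:
        return 0.0
    hits = sum(norm_answer in normalize(ctx) for ctx in contexts)
    return hits / len(contexts)


def pick_answer(
    candidates: List[Tuple[str, List[str]]], threshold: float = 0.5
) -> str:
    """Choose among (answer, contexts) pairs; the first pair is the original query.

    Assumed decision rule: keep the original prediction when its redundancy
    reaches the threshold, otherwise fall back to the candidate (original or
    augmented query) whose answer is best supported by its own contexts.
    """
    if not candidates:
        raise ValueError("need at least one (answer, contexts) candidate")
    original_answer, original_contexts = candidates[0]
    if answer_redundancy(original_answer, original_contexts) >= threshold:
        return original_answer
    best_answer, _ = max(
        candidates, key=lambda pair: answer_redundancy(pair[0], pair[1])
    )
    return best_answer


# Toy example: the original query retrieved a poisoned passage,
# while an augmented query retrieved mostly clean ones.
poisoned = [
    "London is the capital of France.",
    "Paris is the capital of France.",
    "France's capital is Paris.",
]
clean = [
    "Paris is the capital of France.",
    "The capital of France is Paris.",
    "Lyon is a city in France.",
]
chosen = pick_answer([("London", poisoned), ("Paris", clean)])
```

In this toy case the original prediction is supported by only one of its three contexts, so the sketch falls back to the better-supported answer from the augmented query. A real system would obtain the answers and contexts from a retriever and reader model, which are omitted here.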