Larger language models, such as GPT-3, have been shown to excel at many tasks. However, we demonstrate that out-of-the-ordinary questions can catch the model off guard. This work focuses on answering negated complementary questions in commonsense scenarios. We illustrate how such questions adversely affect model responses, and we propose a model-agnostic methodology to improve performance in negated complementary scenarios. Our method outperforms few-shot generation from GPT-3 by more than 11 points and, more importantly, highlights the significance of studying how large language models respond to negated complementary questions. The code, data, and experiments are available at https://github.com/navidre/negated_complementary_commonsense.