Large language models can be prompted to produce text. They can also be prompted to produce "explanations" of their output. But these are not really explanations, because they do not accurately reflect the mechanical process underlying the prediction. The illusion that they reflect the model's reasoning process can result in significant harms. These "explanations" can nonetheless be valuable, but as aids to critical thinking rather than as a means of understanding the model. I propose a recontextualisation of these "explanations", using the term "exoplanations" to draw attention to their exogenous nature. I discuss some implications for design and technology, such as the inclusion of appropriate guardrails and responses when models are prompted to generate explanations.