Large language models (LLMs) have rapidly improved text embeddings for a wide array of natural-language processing tasks. However, their opacity and their spread into scientific domains such as neuroscience have created a growing need for interpretability. Here, we ask whether interpretable embeddings can be obtained through LLM prompting. We introduce question-answering embeddings (QA-Emb), in which each feature represents the answer to a yes/no question posed to an LLM. Training QA-Emb reduces to selecting a set of underlying questions rather than learning model weights. We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli. QA-Emb significantly outperforms an established interpretable baseline while requiring very few questions. This paves the way towards building flexible feature spaces that can concretize and evaluate our understanding of semantic brain representations. We additionally find that QA-Emb can be effectively approximated with an efficient model, and we explore broader applications in simple NLP tasks.
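To make the construction concrete, the following is a minimal sketch of a question-answering embedding, not the authors' implementation. The question list, the `toy_answerer` function (a keyword lookup standing in for an LLM call), the example texts, the synthetic response values, and the ridge fit are all illustrative assumptions; the abstract does not specify the questions, the LLM, or the downstream regression model.

```python
from typing import Callable, List

import numpy as np

# Hypothetical yes/no questions. In QA-Emb, "training" amounts to choosing this set.
QUESTIONS: List[str] = [
    "Does the text mention a person?",
    "Does the text describe a location?",
    "Does the text involve numbers?",
]

# Keyword table used only by the toy answerer below (an assumption for illustration).
_KEYWORDS = {
    QUESTIONS[0]: ("she", "he", "they"),
    QUESTIONS[1]: ("park", "city", "room"),
    QUESTIONS[2]: ("one", "two", "three"),
}


def toy_answerer(question: str, text: str) -> bool:
    """Stand-in for prompting an LLM with a yes/no question; NOT a real model."""
    words = text.lower().split()
    return any(keyword in words for keyword in _KEYWORDS[question])


def qa_embed(
    texts: List[str],
    questions: List[str],
    answer: Callable[[str, str], bool] = toy_answerer,
) -> np.ndarray:
    """Return a (num_texts, num_questions) binary matrix: 1.0 for yes, 0.0 for no."""
    return np.array(
        [[1.0 if answer(q, t) else 0.0 for q in questions] for t in texts]
    )


def fit_ridge(X: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression: one weight per question, so each is readable."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


texts = ["She walked to the park", "Two plus three", "They counted one city"]
X = qa_embed(texts, QUESTIONS)

# Synthetic response values for one hypothetical voxel (made up for the sketch).
y = np.array([0.5, -0.2, 0.9])
weights = fit_ridge(X, y)

for question, weight in zip(QUESTIONS, weights):
    print(f"{weight:+.3f}  {question}")
```

Because every column of `X` corresponds to a named question, each fitted weight can be read directly as the contribution of that question to the predicted response. Replacing `toy_answerer` with an actual LLM call would leave the rest of the sketch unchanged.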