Continuous prompts are widely used to improve performance across a broad range of natural language tasks. However, the underlying mechanism of this improvement remains poorly understood. Previous studies interpret continuous prompts through individual words, which fails to capture their full semantics. Drawing inspiration from Concept Bottleneck Models, we propose a framework for interpreting continuous prompts by decomposing them into human-readable concepts. Specifically, to ensure the feasibility of the decomposition, we show that a corresponding concept embedding matrix and a coefficient matrix can always be found to replace the prompt embedding matrix. We then employ GPT-4o to generate a concept pool and select candidate concepts that are discriminative and representative using a novel submodular optimization algorithm. Experiments demonstrate that our framework achieves results comparable to the original P-tuning and word-based approaches using only a few concepts, while providing more plausible interpretations. Our code is available at https://github.com/qq31415926/CD.
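The feasibility claim above, that a concept embedding matrix and a coefficient matrix can replace the prompt embedding matrix, can be illustrated with a minimal least-squares sketch. This is not the paper's algorithm; the dimensions (`m` prompt tokens, `d`-dimensional embeddings, `k` concepts) and the random matrices are hypothetical, and we assume the concept matrix has full column-space rank (guaranteed here by drawing `k >= d` random concept rows), in which case the factorization is exact.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m prompt tokens, d-dim embeddings, k candidate concepts.
# With k >= d and generic (random) concept rows, C has rank d, so an exact
# coefficient matrix A with A @ C == P always exists.
m, d, k = 8, 32, 40

P = rng.normal(size=(m, d))   # continuous prompt embedding matrix
C = rng.normal(size=(k, d))   # concept embedding matrix (one row per concept)

# Solve A @ C ≈ P for the coefficient matrix A via least squares:
# transpose to the standard form C.T @ A.T = P.T expected by lstsq.
A = np.linalg.lstsq(C.T, P.T, rcond=None)[0].T

err = float(np.linalg.norm(P - A @ C))
print(err)  # ≈ 0: the prompt matrix is exactly reconstructed
```

In the actual framework the concept rows come from GPT-4o-generated concepts rather than random vectors, and the submodular selection step chooses which few rows of `C` to keep; this sketch only demonstrates that the decomposition itself is always attainable.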