Large Language Models (LLMs) offer strong generative capabilities, but many applications require explicit and \textit{fine-grained} control over specific textual concepts, such as humor, persuasiveness, or formality. Prior approaches in prompting and representation engineering provide coarse or single-attribute control, but systematic evaluation of multi-attribute settings remains limited. We introduce an evaluation framework for fine-grained controllability in both single- and dual-concept settings, focusing on linguistically distinct concept pairs (e.g., persuasiveness vs.~humor). Surprisingly, across multiple LLMs and generative tasks, we find that performance often drops in the dual-concept setting, even though the chosen concepts should in principle be separable. This reveals a fundamental limitation of naive prompting-based control: models struggle with compositionality even when concepts are intuitively independent. Our framework provides systematic evidence of this gap and offers a principled way to measure the multi-concept control capabilities of future methods.